GPTKB: Comprehensively Materializing Factual LLM Knowledge

From International Center for Computational Logic

Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Simon Razniewski
GPTKB: Comprehensively Materializing Factual LLM Knowledge
Technical Report, arXiv.org, November 2024
  • Abstract
    LLMs have majorly advanced NLP and AI, and next to their ability to perform a wide range of procedural tasks, a major success factor is their internalized factual knowledge. Since (Petroni et al., 2019), analyzing this knowledge has gained attention. However, most approaches investigate one question at a time via modest-sized pre-defined samples, introducing an availability bias (Tversky and Kahneman, 1973) that prevents the discovery of knowledge (or beliefs) of LLMs beyond the experimenter's predisposition.

    To address this challenge, we propose a novel methodology for comprehensively materializing an LLM's factual knowledge through recursive querying and result consolidation.

    As a prototype, we employ GPT-4o-mini to construct GPTKB, a large-scale knowledge base (KB) comprising 105 million triples for over 2.9 million entities, achieved at 1% of the cost of previous KB projects. This work marks a milestone in two areas: for LLM research, it provides, for the first time, constructive insights into the scope and structure of LLMs' knowledge (or beliefs); for KB construction, it pioneers new pathways for the long-standing challenge of general-domain KB construction. GPTKB is accessible at https://gptkb.org.
  • Further Information: Link
  • Research Group: Knowledge-aware Artificial Intelligence
@techreport{HNGR2024,
  author      = {Yujia Hu and Tuan-Phong Nguyen and Shrestha Ghosh and Simon
                 Razniewski},
  title       = {GPTKB: Comprehensively Materializing Factual {LLM} Knowledge},
  institution = {arXiv.org},
  year        = {2024},
  month       = {November}
}