Overview
This interface presents GPTKB v1.5 (Hu et al. arXiv, 2025b), a large general-domain knowledge base (KB) constructed entirely from a large language model (LLM). It demonstrates the feasibility of large-scale KB construction from LLMs, while highlighting specific challenges in entity recognition, entity and property canonicalization, and taxonomy construction (Hu et al. ACL, 2025a).
Based on GPT-4.1, GPTKB v1.5 contains 100 million triples for more than 6.1 million entities, at a cost 10x lower than previous KB construction (KBC) projects. We also provide GPTKB v1.1 for download.
GPTKB is a landmark for two fields:
- For NLP, for the first time, it provides constructive insights into the knowledge (or beliefs) of LLMs.
- For the Semantic Web, it shows novel ways forward for the long-standing challenge of general-domain KB construction.
Main papers
If you use this data, please cite the following paper(s):
Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Simon Razniewski
Enabling LLM Knowledge Analysis via Extensive Materialization
ACL, 2025
Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Moritz Müller, Simon Razniewski
GPTKB v1.5: A Massive Knowledge Base for Exploring Factual LLM Knowledge
arXiv, 2025