OpenAI WebText

GPTKB entity

Statements (18)
Predicate Object
gptkbp:instanceOf gptkb:dataset
gptkbp:access not publicly released
gptkbp:createdBy gptkb:OpenAI
gptkbp:excludes gptkb:Wikipedia
https://www.w3.org/2000/01/rdf-schema#label OpenAI WebText
gptkbp:language English
gptkbp:notableCollection web scraping
gptkbp:releaseDate 2019
gptkbp:size over 8 million documents
gptkbp:source web pages
gptkbp:usedFor natural language processing
gptkbp:usedFor text generation
gptkbp:usedFor machine learning research
gptkbp:usedFor language model training
gptkbp:usedIn gptkb:GPT-2
gptkbp:usedIn gptkb:GPT-3
gptkbp:bfsParent gptkb:OpenWebText
gptkbp:bfsLayer 7