WebText

GPTKB entity

Statements (19)
Predicate Object
gptkbp:instanceOf gptkb:dataset
gptkbp:createdBy gptkb:OpenAI
gptkbp:excludes gptkb:Wikipedia
gptkbp:excludes OpenAI's own content
gptkbp:excludes content behind paywalls
gptkbp:filteredFrom Reddit outbound links
https://www.w3.org/2000/01/rdf-schema#label WebText
gptkbp:language English
gptkbp:notPubliclyAvailable true
gptkbp:relatedTo gptkb:GPT-2_dataset
gptkbp:relatedTo language modeling
gptkbp:releaseYear 2019
gptkbp:size over 8 million documents
gptkbp:source web pages
gptkbp:usedFor unsupervised learning
gptkbp:usedFor language model training
gptkbp:usedIn gptkb:GPT-2
gptkbp:bfsParent gptkb:GPT-3
gptkbp:bfsLayer 5