Gutenberg (PG-19)

GPTKB entity

Statements (22)
Predicate Object
gptkbp:instanceOf gptkb:dataset
gptkbp:contains public domain books
English books
gptkbp:domain natural language processing
gptkbp:format plain text
gptkbp:fullName Project Gutenberg Dataset (PG-19)
https://www.w3.org/2000/01/rdf-schema#label Gutenberg (PG-19)
gptkbp:language English
gptkbp:license public domain
gptkbp:notablePublication Compressive Transformers for Long-Range Sequence Modelling (Rae et al., 2019)
gptkbp:numberOfBooks 28,752
gptkbp:period books published before 1919
gptkbp:releaseYear 2019
gptkbp:size over 2 billion words
gptkbp:source gptkb:Project_Gutenberg
gptkbp:url https://github.com/deepmind/pg19
gptkbp:usedFor gptkb:machine_learning
language modeling
text analysis
gptkbp:bfsParent gptkb:The_Pile
gptkbp:bfsLayer 7