Statements (22)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:dataset |
| gptkbp:contains | public domain books<br>English books |
| gptkbp:domain | natural language processing |
| gptkbp:format | plain text |
| gptkbp:fullName | Project Gutenberg Dataset (PG-19) |
| gptkbp:language | English |
| gptkbp:license | public domain |
| gptkbp:notablePublication | Compressive Transformers for Long-Range Sequence Modelling (Rae et al., 2019) |
| gptkbp:numberOfBooks | 28,595 |
| gptkbp:period | books published before 1919 |
| gptkbp:releaseYear | gptkb:University_of_Edinburgh<br>2018 |
| gptkbp:size | over 2 billion words |
| gptkbp:source | gptkb:Project_Gutenberg |
| gptkbp:url | https://github.com/deepmind/pg19 |
| gptkbp:usedFor | gptkb:machine_learning<br>language modeling<br>text analysis |
| gptkbp:bfsParent | gptkb:The_Pile |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | Gutenberg (PG-19) |
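
The statement count in the heading is larger than the number of predicate rows because some predicates carry more than one object. A minimal Python sketch of that relationship follows. It uses a hand-copied excerpt of the table, not the full data, and the names `STATEMENTS` and `group_by_predicate` are hypothetical. Reading the `gptkbp:usedFor` cell as three separate objects is an assumption.

```python
from collections import defaultdict

# Hypothetical excerpt of the table above as (predicate, object) pairs.
# Each pair is one statement; a predicate may appear more than once.
STATEMENTS = [
    ("gptkbp:contains", "public domain books"),
    ("gptkbp:contains", "English books"),
    ("gptkbp:language", "English"),
    ("gptkbp:usedFor", "gptkb:machine_learning"),
    ("gptkbp:usedFor", "language modeling"),
    ("gptkbp:usedFor", "text analysis"),
]


def group_by_predicate(statements):
    """Collect the objects of each predicate, preserving input order."""
    grouped = defaultdict(list)
    for predicate, obj in statements:
        grouped[predicate].append(obj)
    return dict(grouped)


grouped = group_by_predicate(STATEMENTS)

# One table row per predicate, one statement per (predicate, object) pair.
row_count = len(grouped)
statement_count = sum(len(objects) for objects in grouped.values())
```

For this excerpt, `row_count` is the number of distinct predicates and `statement_count` is the number of pairs, so the two differ whenever a row holds several objects.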