Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:machine_learning_dataset |
| gptkbp:contains | gptkb:academic_journal, gptkb:Wikipedia, books, news articles, web data, text data, multilingual data, GitHub code |
| gptkbp:curatedBy | gptkb:BigScience_Workshop |
| gptkbp:license | various open licenses |
| gptkbp:notableCollection | gptkb:OpenSubtitles, gptkb:Common_Crawl, gptkb:OSCAR, gptkb:The_Pile |
| gptkbp:openToPublic | true |
| gptkbp:releaseYear | 2022 |
| gptkbp:size | 1.6 terabytes |
| gptkbp:supportsLanguage | 46 languages |
| gptkbp:usedFor | training BLOOM language model |
| gptkbp:bfsParent | gptkb:BigScience |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | BLOOM training dataset |