Statements (19)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:dataset |
| gptkbp:availableOn | public dataset |
| gptkbp:contains | news articles |
| gptkbp:createdBy | gptkb:Common_Crawl |
| gptkbp:firstReleased | 2016 |
| gptkbp:format | gptkb:WARC |
| gptkbp:frequency | monthly |
| gptkbp:language | English |
| gptkbp:license | gptkb:CC-BY_4.0 |
| gptkbp:size | billions of words |
| gptkbp:source | news websites, web crawls |
| gptkbp:url | https://commoncrawl.org/2016/10/news-dataset-available/ |
| gptkbp:usedFor | gptkb:machine_learning, natural language processing, language model training |
| gptkbp:bfsParent | gptkb:RoBERTa |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | CC-News |