Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:language_corpus |
| gptkbp:access | downloadable |
| gptkbp:basedOn | gptkb:Common_Crawl |
| gptkbp:citation | https://arxiv.org/abs/2106.00874 |
| gptkbp:contains | deduplicated data, filtered data, web-crawled text |
| gptkbp:createdBy | gptkb:Language_Resources_and_Evaluation_(LREC) |
| gptkbp:firstReleased | 2019 |
| gptkbp:fullName | gptkb:Open_Super-large_Crawled_Aggregated_coRpus |
| gptkbp:latestReleaseVersion | 23.01 |
| gptkbp:license | CC BY 4.0 |
| gptkbp:maintainedBy | gptkb:Université_de_Lorraine |
| gptkbp:relatedTo | gptkb:Common_Crawl, gptkb:CCNet, gptkb:The_Pile, gptkb:mC4 |
| gptkbp:size | multi-terabyte |
| gptkbp:supportsLanguage | multiple languages |
| gptkbp:type | gptkb:text |
| gptkbp:usedBy | gptkb:researchers, universities, AI companies |
| gptkbp:usedFor | gptkb:machine_learning, natural language processing, language modeling, text analysis |
| gptkbp:website | https://oscar-corpus.com/ |
| gptkbp:bfsParent | gptkb:AOL_Instant_Messenger |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | OSCAR |