Statements (21)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:web_crawl_dataset |
| gptkbp:availableOn | lemurproject.org/clueweb12 |
| gptkbp:contains | over 700 million web pages |
| gptkbp:createdBy | gptkb:Carnegie_Mellon_University |
| gptkbp:format | gptkb:WARC |
| gptkbp:language | gptkb:Chinese<br>English |
| gptkbp:license | research use only |
| gptkbp:notableCollection | ClueWeb12-A<br>ClueWeb12-B13 |
| gptkbp:releaseYear | 2012 |
| gptkbp:size | about 27 TB (uncompressed) |
| gptkbp:predecessor | gptkb:ClueWeb09_dataset |
| gptkbp:usedFor | web mining<br>natural language processing research<br>information retrieval research |
| gptkbp:usedIn | gptkb:TREC_Web_Track<br>NTCIR Web Track |
| gptkbp:bfsParent | gptkb:Lemur_Project |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | ClueWeb12 dataset |