gptkbp:instanceOf | gptkb:text
gptkbp:access | licensed
gptkbp:contains | newswire text
gptkbp:firstReleased | 2003
gptkbp:format | gptkb:SGML, plain text
gptkbp:frequency | periodically updated
gptkbp:hasVersion | gptkb:Arabic_Gigaword, gptkb:Chinese_Gigaword, gptkb:English_Gigaword, gptkb:French_Gigaword, gptkb:Spanish_Gigaword
https://www.w3.org/2000/01/rdf-schema#label | Gigaword corpus
gptkbp:language | gptkb:Arabic, gptkb:Chinese, gptkb:French, gptkb:Spanish, English
gptkbp:notableFor | gptkb:Associated_Press, gptkb:Los_Angeles_Times, gptkb:The_New_York_Times, gptkb:Xinhua_News_Agency, gptkb:Agence_France-Presse, gptkb:Washington_Post, gptkb:Central_News_Agency_(Taiwan), Agence Tunis Afrique Presse
gptkbp:publisher | gptkb:Linguistic_Data_Consortium
gptkbp:size | billions of words
gptkbp:source | news agencies
gptkbp:usedFor | gptkb:machine_learning, machine translation, natural language processing, language modeling, text summarization
gptkbp:usedIn | research, academic studies, industry applications
gptkbp:bfsParent | gptkb:Linguistic_Data_Consortium
gptkbp:bfsLayer | 7