gptkbp:instanceOf
|
gptkb:benchmark
|
gptkbp:evaluationMetric
|
gptkb:Matthews_correlation_coefficient
F1 score
accuracy
|
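The three metrics listed above (Matthews correlation coefficient, F1 score, accuracy) have standard definitions; as a reference sketch in terms of confusion-matrix counts (TP, TN, FP, FN for true/false positives/negatives):

```latex
\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},
\qquad
F_1 = \frac{2\,TP}{2\,TP + FP + FN},
\]
\[
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.
\]
```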
gptkbp:containsTask
|
gptkb:RTE
gptkb:COPA
gptkb:BoolQ
gptkb:AX-b
gptkb:MultiRC
gptkb:ReCoRD
gptkb:WiC
gptkb:WSC
CB
|
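A minimal sketch of loading one of the tasks listed above (BoolQ), assuming the Hugging Face `datasets` library and its conventional "super_glue" configuration names; those identifiers come from the library, not from this entry, and newer library versions may handle script-based datasets differently.

```python
# Sketch: load the BoolQ task of SuperGLUE via Hugging Face `datasets`.
# "super_glue" / "boolq" are the library's conventional config names (assumption).
from datasets import load_dataset

boolq = load_dataset("super_glue", "boolq")  # splits: train / validation / test
example = boolq["train"][0]                  # fields include question, passage, label
print(example["question"])
print(example["label"])
```

The same call with "rte", "copa", "multirc", "record", "wic", "wsc", "cb", or "axb" selects the other tasks.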
gptkbp:creator
|
gptkb:Samuel_Bowman
gptkb:Alex_Wang
gptkb:Amanpreet_Singh
gptkb:Felix_Hill
gptkb:Julian_Michael
gptkb:Omer_Levy
gptkb:Nikita_Nangia
gptkb:Yada_Pruksachatkun
|
gptkbp:designedFor
|
evaluating general language understanding
|
gptkbp:difficultyComparedToGLUE
|
harder
|
gptkbp:domain
|
gptkb:artificial_intelligence
natural language processing
|
gptkbp:fullName
|
gptkb:Super_General_Language_Understanding_Evaluation
|
https://www.w3.org/2000/01/rdf-schema#label
|
SuperGLUE
|
gptkbp:introducedIn
|
2019
|
gptkbp:language
|
English
|
gptkbp:license
|
gptkb:Apache_License_2.0
|
gptkbp:taskType
|
question answering
reading comprehension
coreference resolution
word sense disambiguation
natural language inference
|
gptkbp:notableModel
|
gptkb:BART
gptkb:T5
gptkb:GPT-3
gptkb:ALBERT
gptkb:DeBERTa
gptkb:RoBERTa
|
gptkbp:notablePublication
|
gptkb:SuperGLUE:_A_Stickier_Benchmark_for_General-Purpose_Language_Understanding_Systems
https://arxiv.org/abs/1905.00537
|
gptkbp:predecessor
|
gptkb:GLUE
|
gptkbp:purpose
|
benchmarking progress in language understanding
|
gptkbp:usedBy
|
AI researchers
NLP practitioners
|
gptkbp:usedFor
|
evaluating large language models
measuring generalization in NLP
|
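As an illustration of the usedFor entries above, a hedged scoring sketch with scikit-learn that computes the metrics named earlier on placeholder predictions; the label lists are illustrative, not real SuperGLUE outputs.

```python
# Sketch: score a model's binary predictions with accuracy, F1, and MCC.
# y_true / y_pred are illustrative placeholders only.
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

print("accuracy:", accuracy_score(y_true, y_pred))
print("F1:      ", f1_score(y_true, y_pred))
print("MCC:     ", matthews_corrcoef(y_true, y_pred))
```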
gptkbp:website
|
https://super.gluebenchmark.com/
|
gptkbp:bfsParent
|
gptkb:GLUE
gptkb:DeBERTa
|
gptkbp:bfsLayer
|
6
|