gptkbp:instanceOf
|
gptkb:benchmark
natural language understanding benchmark
|
gptkbp:assesses
|
gptkb:Matthews_correlation_coefficient
F1 score
accuracy
|
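A minimal sketch of how these three metrics are typically computed, using scikit-learn; the label vectors are illustrative placeholders, not output from any SuperGLUE task.

```python
# Sketch: the three metric families SuperGLUE reports, via scikit-learn.
# y_true / y_pred are hypothetical placeholders for gold labels and predictions.
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("accuracy:", accuracy_score(y_true, y_pred))     # most classification tasks
print("F1:      ", f1_score(y_true, y_pred))           # e.g. MultiRC, ReCoRD
print("MCC:     ", matthews_corrcoef(y_true, y_pred))  # the AX-b diagnostic
```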
gptkbp:containsTask
|
gptkb:RTE
gptkb:COPA
gptkb:BoolQ
gptkb:AX-b
gptkb:MultiRC
gptkb:ReCoRD
gptkb:WiC
gptkb:WSC
gptkb:CB
gptkb:AX-g
|
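These task names correspond to configs of the public `super_glue` dataset on the Hugging Face hub; a sketch, assuming the `datasets` package is installed (field names follow the BoolQ dataset card):

```python
# Sketch: fetching one SuperGLUE task from the Hugging Face hub.
from datasets import load_dataset

# BoolQ: yes/no questions paired with short Wikipedia passages.
boolq = load_dataset("super_glue", "boolq")

print(boolq)              # train / validation / test splits
print(boolq["train"][0])  # {'question': ..., 'passage': ..., 'idx': ..., 'label': ...}
```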
gptkbp:designedFor
|
evaluating general-purpose language understanding systems
|
gptkbp:difficulty
|
harder than GLUE
|
gptkbp:domain
|
natural language processing
|
gptkbp:focusesOn
|
general language understanding
|
https://www.w3.org/2000/01/rdf-schema#label
|
SuperGLUE benchmark
|
gptkbp:introduced
|
gptkb:Alex_Wang
gptkb:Yada_Pruksachatkun
gptkb:Nikita_Nangia
gptkb:Amanpreet_Singh
gptkb:Julian_Michael
gptkb:Felix_Hill
gptkb:Omer_Levy
gptkb:Samuel_Bowman
|
gptkbp:introducedIn
|
2019
|
gptkbp:language
|
English
|
gptkbp:license
|
gptkb:Apache_License_2.0
|
gptkbp:notablePublication
|
gptkb:SuperGLUE:_A_Stickier_Benchmark_for_General-Purpose_Language_Understanding_Systems
https://arxiv.org/abs/1905.00537
|
gptkbp:predecessor
|
gptkb:GLUE_benchmark
|
gptkbp:usedBy
|
AI researchers
NLP community
|
gptkbp:usedFor
|
benchmarking language models
comparing model performance
|
gptkbp:website
|
https://super.gluebenchmark.com/
|
gptkbp:bfsParent
|
gptkb:COPA_(Choice_of_Plausible_Alternatives)
gptkb:A._Alex_Wang
gptkb:GLUE_team
|
gptkbp:bfsLayer
|
7
|
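Since this record is effectively a set of subject-predicate-object statements, it can be handled as RDF; a sketch using rdflib, where the gptkb:/gptkbp: namespace URIs are assumptions for illustration, not values taken from this record:

```python
# Sketch: this record as RDF triples, queried with rdflib.
# The two namespace URIs below are hypothetical, not taken from the source.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

GPTKB = Namespace("https://gptkb.example/entity/")     # assumed base URI
GPTKBP = Namespace("https://gptkb.example/property/")  # assumed base URI

g = Graph()
subj = GPTKB["SuperGLUE_benchmark"]
g.add((subj, RDFS.label, Literal("SuperGLUE benchmark")))
g.add((subj, GPTKBP["predecessor"], GPTKB["GLUE_benchmark"]))
g.add((subj, GPTKBP["introducedIn"], Literal("2019")))

# List every predicate/object pair attached to the entity.
for p, o in g.predicate_objects(subj):
    print(p, o)
```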