Statements (25)
Predicate | Object |
---|---|
gptkbp:instanceOf | benchmark |
gptkbp:availableOn | https://github.com/hendrycks/test |
gptkbp:contains | 57 tasks |
gptkbp:difficulty | high school to professional |
gptkbp:focusesOn | language model evaluation |
gptkbp:format | multiple choice |
gptkbp:fullName | gptkb:Massive_Multitask_Language_Understanding |
https://www.w3.org/2000/01/rdf-schema#label | MMLU |
gptkbp:introduced | Dan Hendrycks et al., UC Berkeley |
gptkbp:introducedIn | 2021 |
gptkbp:license | gptkb:MIT_License |
gptkbp:measures | multitask accuracy |
gptkbp:notableUser | gptkb:GPT-3, gptkb:GPT-4, gptkb:LLaMA, gptkb:PaLM |
gptkbp:numberOfQuestions | 15,908 |
gptkbp:tasksCover | gptkb:STEM, humanities, social sciences, other professional domains |
gptkbp:usedFor | evaluating general knowledge of language models |
gptkbp:bfsParent | gptkb:Mixtral |
gptkbp:bfsLayer | 6 |
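
The statements above describe MMLU as a multiple-choice benchmark that measures multitask accuracy. As a rough illustration only, the sketch below scores a handful of made-up answers per task and then averages them. The function name `multitask_accuracy`, the record layout, and the sample data are all hypothetical and are not taken from the MMLU repository. Whether a reported score is the unweighted mean over tasks (macro) or the share of all questions answered correctly (micro) depends on the evaluation script, so the sketch returns both.

```python
from collections import defaultdict


def multitask_accuracy(records):
    """Score multiple-choice predictions grouped by task.

    records: iterable of (task, predicted_choice, correct_choice) tuples.
    Returns (per_task_accuracy, macro_average, micro_average).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, predicted, answer in records:
        total[task] += 1
        correct[task] += int(predicted == answer)
    if not total:
        raise ValueError("no records to score")

    per_task = {task: correct[task] / total[task] for task in total}
    # Macro: every task counts equally, regardless of its question count.
    macro = sum(per_task.values()) / len(per_task)
    # Micro: every question counts equally across all tasks.
    micro = sum(correct.values()) / sum(total.values())
    return per_task, macro, micro


# Hypothetical toy data: two tasks, answers labeled A-D.
records = [
    ("astronomy", "A", "A"),
    ("astronomy", "B", "C"),
    ("law", "D", "D"),
    ("law", "C", "C"),
    ("law", "A", "B"),
    ("law", "B", "B"),
]
per_task, macro, micro = multitask_accuracy(records)
print(per_task, macro, micro)
```

With tasks of unequal size the two averages differ, which is why it helps to state which one a reported figure uses.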