Statements (25)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:performance |
| gptkbp:availableOn | https://github.com/hendrycks/test |
| gptkbp:contains | 57 tasks |
| gptkbp:difficulty | high school to professional |
| gptkbp:focusesOn | language model evaluation |
| gptkbp:format | multiple choice |
| gptkbp:fullName | gptkb:Massive_Multitask_Language_Understanding |
| gptkbp:introduced | gptkb:OpenAI, gptkb:AI2 |
| gptkbp:introducedIn | 2021 |
| gptkbp:license | gptkb:MIT_License |
| gptkbp:measures | multitask accuracy |
| gptkbp:notableUser | gptkb:GPT-3, gptkb:GPT-4, gptkb:LLaMA, gptkb:PaLM |
| gptkbp:numberOfQuestions | 15,908 |
| gptkbp:tasksCover | gptkb:STEM, humanities, social sciences, other professional domains |
| gptkbp:usedFor | evaluating general knowledge of language models |
| gptkbp:bfsParent | gptkb:LLaMA_3 |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | MMLU |