Scaling Laws for Neural Language Models
GPTKB entity
Statements (29)
Predicate | Object
---|---
gptkbp:instanceOf | gptkb:academic_journal
gptkbp:arXivID | 2001.08361
gptkbp:author | gptkb:Ilya_Sutskever, gptkb:Benjamin_Chess, gptkb:Tom_B._Brown, gptkb:Scott_Gray, gptkb:Alec_Radford, gptkb:Dario_Amodei, gptkb:Jared_Kaplan, gptkb:Rewon_Child, gptkb:Sam_McCandlish, gptkb:Tom_Henighan, gptkb:Jeffrey_Wu
gptkbp:citation | high
gptkbp:foundIn | Performance of neural language models improves predictably as model size, dataset size, and compute increase; There are diminishing returns to increasing model size or dataset size alone; Test loss scales as a power-law with respect to model size, dataset size, and compute (schematic form sketched below the table); Optimal allocation of compute between model size and dataset size can be derived
https://www.w3.org/2000/01/rdf-schema#label | Scaling Laws for Neural Language Models
gptkbp:influenced | subsequent research on large language models
gptkbp:memberSchool | gptkb:OpenAI
gptkbp:publicationYear | 2020
gptkbp:publishedIn | gptkb:arXiv
gptkbp:topic | deep learning, scaling laws, language models
gptkbp:bfsParent | gptkb:NeurIPS_2022, gptkb:Jared_Kaplan
gptkbp:bfsLayer | 6
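As a reading aid for the gptkbp:foundIn statements above, the power-law finding can be sketched in the notation the paper uses: N for model parameters, D for dataset size in tokens, and C for training compute. The constants and exponents below are placeholders for the fitted values reported in the paper; this is a schematic of the functional form only, not the fitted result.

```latex
% Schematic single-variable power laws for test loss L, each taken in the
% regime where the other two factors are not the bottleneck.
% N_c, D_c, C_c and \alpha_N, \alpha_D, \alpha_C stand in for the fitted
% constants reported in the paper (values not reproduced here).
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}

% One common reading of the optimal-allocation finding: the compute-optimal
% model size grows as a power of the compute budget, with an exponent
% determined by the fitted scaling exponents.
N_{\mathrm{opt}}(C) \propto C^{\,\alpha_C / \alpha_N}
```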