Scaling Laws for Neural Language Models

GPTKB entity

Statements (29)
Predicate Object
gptkbp:instanceOf gptkb:research_paper
gptkbp:arXivID 2001.08361
gptkbp:author gptkb:Ilya_Sutskever
gptkb:Benjamin_Chess
gptkb:Tom_B._Brown
gptkb:Scott_Gray
gptkb:Alec_Radford
gptkb:Dario_Amodei
gptkb:Jared_Kaplan
gptkb:Rewon_Child
gptkb:Sam_McCandlish
gptkb:Tom_Henighan
gptkb:Jeffrey_Wu
gptkbp:citation high
gptkbp:foundIn Performance of neural language models improves predictably as model size, dataset size, and compute increase
There are diminishing returns to increasing model size or dataset size alone
Test loss scales as a power-law with respect to model size, dataset size, and compute
Optimal allocation of compute between model size and dataset size can be derived
https://www.w3.org/2000/01/rdf-schema#label Scaling Laws for Neural Language Models
gptkbp:influenced subsequent research on large language models
gptkbp:memberSchool gptkb:OpenAI
gptkbp:publicationYear 2020
gptkbp:publishedIn gptkb:arXiv
gptkbp:topic deep learning
scaling laws
language models
gptkbp:bfsParent gptkb:NeurIPS_2022
gptkb:Jared_Kaplan
gptkbp:bfsLayer 6
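
Note: the gptkbp:foundIn statements above summarize the paper's central quantitative claim. As a reader's aid, a minimal sketch of the reported functional forms follows, where L is test loss, N is the number of non-embedding parameters, D is the dataset size in tokens, and C_min is the minimal compute budget; the constants N_c, D_c, C_c are empirical fits, and the exponent values are the approximate figures reported in the paper, quoted here for orientation only.

  L(N)        \approx (N_c / N)^{\alpha_N},                \alpha_N \approx 0.076
  L(D)        \approx (D_c / D)^{\alpha_D},                \alpha_D \approx 0.095
  L(C_{\min}) \approx (C_c / C_{\min})^{\alpha_C^{\min}},  \alpha_C^{\min} \approx 0.050

Under these fits, the compute-optimal allocation statement corresponds to spending most of any additional compute on larger models rather than on more data, roughly N_{\text{opt}} \propto C_{\min}^{0.73} as reported in the paper.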