Adaptive Moment Estimation

GPTKB entity

Statements (46)
Predicate                                    Object
gptkbp:instanceOf                            mathematical optimization
gptkbp:abbreviation                          gptkb:Adam
gptkbp:advantage                             low memory requirements
                                             well suited for large data sets
                                             well suited for problems with many parameters
                                             efficient computation
gptkbp:disadvantage                          sometimes generalizes worse than SGD
                                             may lead to non-convergent solutions
gptkbp:category                              gradient-based optimization
                                             stochastic optimization
gptkbp:citation                              tens of thousands of papers
gptkbp:defaultBeta1                          0.9
gptkbp:defaultBeta2                          0.999
gptkbp:defaultEpsilon                        1e-8
gptkbp:defaultLearningRate                   0.001
gptkbp:hasConcept                            computes adaptive learning rates for each parameter
https://www.w3.org/2000/01/rdf-schema#label  Adaptive Moment Estimation
gptkbp:influenced                            optimization in neural networks
                                             training of deep learning models
gptkbp:license                               open source
gptkbp:openSource                            gptkb:TensorFlow
                                             gptkb:Chainer
                                             gptkb:Keras
                                             gptkb:MXNet
                                             gptkb:CNTK
                                             gptkb:Caffe
                                             gptkb:Theano
                                             gptkb:JAX
                                             gptkb:PyTorch
gptkbp:proposedBy                            gptkb:Diederik_P._Kingma
                                             gptkb:Jimmy_Ba
gptkbp:publicationYear                       2014
gptkbp:publishedIn                           gptkb:arXiv
gptkbp:relatedTo                             gptkb:AdaGrad
                                             gptkb:RMSProp
                                             Stochastic Gradient Descent
gptkbp:url                                   https://arxiv.org/abs/1412.6980
gptkbp:usedIn                                gptkb:machine_learning
                                             deep learning
gptkbp:uses                                  first moment estimates
                                             second moment estimates
gptkbp:variant                               gptkb:AMSGrad
                                             gptkb:AdaMax
                                             gptkb:AdamW
gptkbp:bfsParent                             gptkb:Adam_optimizer
gptkbp:bfsLayer                              5
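
The `gptkbp:uses` and default-hyperparameter statements above describe the update rule of Kingma & Ba (2014): bias-corrected exponential moving averages of the gradient (first moment) and of its square (second moment) yield an adaptive step for each parameter. A minimal NumPy sketch of one such step, using the defaults listed in this entry (function name and the toy objective below are illustrative, not part of the entry):

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014). t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step
    return theta, m, v

# Toy usage: minimize f(theta) = theta**2, whose gradient is 2 * theta.
theta = np.array([1.0])
m, v = np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

The bias correction terms (1 − β₁ᵗ) and (1 − β₂ᵗ) matter mainly in early iterations, when m and v are still close to their zero initialization; this is what distinguishes Adam from a plain combination of momentum and RMSProp.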