Statements (37)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:model, mathematical optimization |
| gptkbp:advantage | computationally efficient, requires little memory, well suited for problems with large data, well suited for problems with many parameters |
| gptkbp:disadvantage | may converge to suboptimal solutions, can have poor generalization in some cases |
| gptkbp:category | gradient-based optimization, stochastic optimization |
| gptkbp:commonIn | computer vision, deep learning, natural language processing |
| gptkbp:defaultBeta1 | 0.9 |
| gptkbp:defaultBeta2 | 0.999 |
| gptkbp:defaultEpsilon | 1e-8 |
| gptkbp:defaultLearningRate | 0.001 |
| gptkbp:fullName | gptkb:Adaptive_Moment_Estimation |
| https://www.w3.org/2000/01/rdf-schema#label | Adam optimization algorithm |
| gptkbp:introduced | gptkb:Diederik_P._Kingma, gptkb:Jimmy_Ba |
| gptkbp:introducedIn | 2014 |
| gptkbp:parameter | learning rate, beta1, beta2, epsilon |
| gptkbp:publishedIn | gptkb:arXiv:1412.6980 |
| gptkbp:relatedTo | gptkb:AdaGrad, gptkb:RMSProp, SGD |
| gptkbp:usedFor | training neural networks |
| gptkbp:uses | momentum, adaptive learning rates, exponentially decaying averages of past gradients, exponentially decaying averages of past squared gradients (see the update-rule sketch below the table) |
| gptkbp:bfsParent | gptkb:Adam:_A_Method_for_Stochastic_Optimization |
| gptkbp:bfsLayer | 7 |
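
The gptkbp:uses row describes Adam's mechanism in words; for reference, the corresponding update rule from the cited paper (gptkb:arXiv:1412.6980) is sketched below. Here $g_t$ is the gradient at step $t$, $\theta_t$ the parameters, and the defaults from the table are $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$.

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t && \text{(decaying average of past gradients, i.e. momentum)} \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 && \text{(decaying average of past squared gradients)} \\
\hat{m}_t &= m_t / (1 - \beta_1^t), \quad \hat{v}_t = v_t / (1 - \beta_2^t) && \text{(bias correction)} \\
\theta_t &= \theta_{t-1} - \alpha\, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon) && \text{(adaptive per-parameter step)}
\end{aligned}
$$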
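
As a minimal, self-contained NumPy sketch of one Adam step with the default hyperparameters from the table; the function name `adam_update` and its array-based interface are illustrative choices, not something the source specifies.

```python
import numpy as np

def adam_update(params, grads, m, v, t,
                lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (Kingma & Ba, arXiv:1412.6980); t is the 1-based step count."""
    # Exponentially decaying average of past gradients (momentum / first moment).
    m = beta1 * m + (1 - beta1) * grads
    # Exponentially decaying average of past squared gradients (second moment).
    v = beta2 * v + (1 - beta2) * grads ** 2
    # Bias correction compensates for m and v being initialized at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adaptive per-parameter step: coordinates with large recent gradients take smaller steps.
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
x, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    grad = 2 * x  # gradient of x^2
    x, m, v = adam_update(x, grad, m, v, t)
print(x)  # close to 0
```

The two running averages `m` and `v` correspond directly to the "momentum" and "exponentially decaying averages of past squared gradients" objects in the gptkbp:uses row, and the division by `np.sqrt(v_hat)` is what the table calls adaptive learning rates.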