Adaptive Moment Estimation

GPTKB entity

Statements (46)
Predicate                                    Object
gptkbp:instanceOf                            mathematical optimization
gptkbp:abbreviation                          gptkb:Adam
gptkbp:advantage                             low memory requirements
                                             well suited for large data sets
                                             well suited for problems with many parameters
                                             efficient computation
gptkbp:disadvantage                          sometimes generalizes worse than SGD
                                             may lead to non-convergent solutions
gptkbp:category                              gradient-based optimization
                                             stochastic optimization
gptkbp:citation                              tens of thousands of papers
gptkbp:defaultBeta1                          0.9
gptkbp:defaultBeta2                          0.999
gptkbp:defaultEpsilon                        1e-8
gptkbp:defaultLearningRate                   0.001
gptkbp:hasConcept                            computes adaptive learning rates for each parameter
https://www.w3.org/2000/01/rdf-schema#label  Adaptive Moment Estimation
gptkbp:influenced                            optimization in neural networks
                                             training of deep learning models
gptkbp:license                               open source
gptkbp:openSource                            gptkb:TensorFlow
                                             gptkb:Chainer
                                             gptkb:Keras
                                             gptkb:MXNet
                                             gptkb:CNTK
                                             gptkb:Caffe
                                             gptkb:Theano
                                             gptkb:JAX
                                             gptkb:PyTorch
gptkbp:proposedBy                            gptkb:Diederik_P._Kingma
                                             gptkb:Jimmy_Ba
gptkbp:publicationYear                       2014
gptkbp:publishedIn                           gptkb:arXiv
gptkbp:relatedTo                             gptkb:AdaGrad
                                             gptkb:RMSProp
                                             Stochastic Gradient Descent
gptkbp:url                                   https://arxiv.org/abs/1412.6980
gptkbp:usedIn                                gptkb:machine_learning
                                             deep learning
gptkbp:uses                                  first moment estimates
                                             second moment estimates
gptkbp:variant                               gptkb:AMSGrad
                                             gptkb:AdaMax
                                             gptkb:AdamW
gptkbp:bfsParent                             gptkb:Adam_optimizer
gptkbp:bfsLayer                              5
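
The `gptkbp:uses` and default-hyperparameter statements above describe the update rule of Kingma & Ba (2014): bias-corrected exponential moving averages of the gradient (first moment) and of its square (second moment) yield an adaptive step for each parameter. A minimal NumPy sketch of one such step, using the defaults listed in this entry (function name and the toy objective below are illustrative, not part of the entry):

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014). t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step
    return theta, m, v

# Toy usage: minimize f(theta) = theta**2, whose gradient is 2 * theta.
theta = np.array([1.0])
m, v = np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

The bias correction terms (1 − β₁ᵗ) and (1 − β₂ᵗ) matter mainly in early iterations, when m and v are still close to their zero initialization; this is what distinguishes Adam from a plain combination of momentum and RMSProp.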