Adam optimizer

GPTKB entity

Statements (45)
Predicate Object
gptkbp:instanceOf optimization algorithm
gptkbp:advantage computationally efficient
little memory requirements
well suited for problems with large data
well suited for problems with many parameters
gptkbp:category gradient-based optimization
stochastic optimization
gptkbp:citation high
gptkbp:commonIn neural networks
gptkbp:defaultBeta1 0.9
gptkbp:defaultBeta2 0.999
gptkbp:defaultEpsilon 1e-8
gptkbp:defaultLearningRate 0.001
gptkbp:fullName gptkb:Adaptive_Moment_Estimation
rdfs:label Adam optimizer
gptkbp:implementedIn gptkb:TensorFlow
gptkb:Keras
gptkb:JAX
gptkb:PyTorch
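
As an illustration of the defaults and implementations listed above, a minimal sketch using PyTorch's torch.optim.Adam (the toy model and data are assumptions made for this example, not part of the entity):

    import torch

    # Toy model and data, assumed purely for illustration
    model = torch.nn.Linear(10, 1)
    loss_fn = torch.nn.MSELoss()
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)

    # Keyword values mirror the defaults listed above:
    # lr=0.001, betas=(0.9, 0.999), eps=1e-8
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=0.001, betas=(0.9, 0.999), eps=1e-8)

    optimizer.zero_grad()        # clear any accumulated gradients
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # backpropagate to compute gradients
    optimizer.step()             # one Adam parameter update
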
gptkbp:introducedBy gptkb:Diederik_P._Kingma
gptkb:Jimmy_Ba
gptkbp:introducedIn 2014
gptkbp:limitation may converge to suboptimal solutions
sensitive to hyperparameters
sometimes generalizes worse than SGD
gptkbp:openSource yes
gptkbp:parameter learning rate
beta1
beta2
epsilon
gptkbp:publishedIn gptkb:arXiv:1412.6980
gptkbp:relatedTo gptkb:AdaGrad
gptkb:Momentum_optimizer
gptkb:RMSProp
SGD
gptkbp:updateRule parameter update based on moving averages of gradient and squared gradient
uses bias-corrected first and second moment estimates
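
Spelled out in the notation of the cited paper (arXiv:1412.6980), with gradient g_t, step size \alpha, and decay rates \beta_1, \beta_2, the update rule is:

    m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t
    v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2
    \hat{m}_t = m_t / (1 - \beta_1^t), \qquad \hat{v}_t = v_t / (1 - \beta_2^t)
    \theta_t = \theta_{t-1} - \alpha\, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)
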
gptkbp:usedIn gptkb:machine_learning
deep learning
gptkbp:uses momentum
exponentially decaying averages of past squared gradients
adaptive learning rates
exponentially decaying averages of past gradients
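
The exponentially decaying averages and adaptive per-parameter steps listed above can be seen in a minimal, self-contained NumPy sketch of a single Adam step (function and variable names are illustrative, not from any library):

    import numpy as np

    def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        """One Adam update; returns new parameters and updated moment estimates."""
        m = beta1 * m + (1 - beta1) * grad      # decaying average of past gradients (momentum)
        v = beta2 * v + (1 - beta2) * grad**2   # decaying average of past squared gradients
        m_hat = m / (1 - beta1**t)              # bias-corrected first moment estimate
        v_hat = v / (1 - beta2**t)              # bias-corrected second moment estimate
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)  # adaptive per-parameter step
        return theta, m, v

    # Usage: minimize f(theta) = theta^2 starting from theta = 5
    theta = np.array([5.0])
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, 10001):                   # t starts at 1 for the bias correction
        grad = 2 * theta                        # gradient of theta^2
        theta, m, v = adam_step(theta, grad, m, v, t)
    print(theta)                                # prints a value near 0
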
gptkbp:bfsParent gptkb:machine_learning
gptkbp:bfsLayer 4