gptkbp:instanceOf
|
mathematical optimization
|
gptkbp:address
|
large variance of the adaptive learning rate in the early steps of Adam training
|
gptkbp:advantage
|
better generalization
less sensitivity to the choice of learning rate
more stable early training without a hand-tuned warmup schedule
|
gptkbp:category
|
gradient-based optimization
stochastic optimization
|
gptkbp:citation
|
1000+
|
gptkbp:defaultBeta1
|
0.9
|
gptkbp:defaultBeta2
|
0.999
|
gptkbp:defaultEpsilon
|
1e-8
|
gptkbp:defaultLearningRate
|
0.001
|
gptkbp:fullName
|
Rectified Adam
|
https://www.w3.org/2000/01/rdf-schema#label
|
RAdam
|
gptkbp:improves
|
gptkb:Adam
|
gptkbp:openSource
|
gptkb:TensorFlow
gptkb:Keras
gptkb:PyTorch
|
gptkbp:parameter
|
learning rate
beta1
beta2
epsilon
|
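The parameters listed above can be tied together in a minimal sketch of one RAdam update for a single scalar parameter. It follows the update rule as described in arXiv:1908.03265, using the defaults recorded in this entry. The function name `radam_step` and its argument layout are hypothetical, chosen only for illustration. Adding `eps` to the denominator and using the threshold `rho_t > 4` are assumptions based on the paper's description; library implementations may differ in both.

```python
import math


def radam_step(theta, grad, m, v, t,
               lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update for a scalar parameter (illustrative sketch).

    theta: current parameter value
    grad:  gradient at theta
    m, v:  running first and second moment estimates
    t:     1-based step counter
    """
    # Exponential moving averages of the gradient and squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias-corrected first moment.
    m_hat = m / (1.0 - beta1 ** t)

    # Length of the approximated simple moving average.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)

    if rho_t > 4.0:
        # Variance is tractable: apply the rectified adaptive step.
        v_hat = math.sqrt(v / (1.0 - beta2 ** t))
        r_t = math.sqrt(((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
                        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t))
        theta = theta - lr * r_t * m_hat / (v_hat + eps)
    else:
        # Early steps: fall back to an unadapted momentum step.
        theta = theta - lr * m_hat

    return theta, m, v


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 101):
    x, m, v = radam_step(x, 2.0 * x, m, v, step)
```

With the default beta2 of 0.999, the first few steps take the unadapted branch, which is how the method avoids relying on a poorly estimated second moment at the start of training.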
gptkbp:proposedBy
|
gptkb:Pengcheng_He
gptkb:Weizhu_Chen
gptkb:Xiaodong_Liu
gptkb:Jianfeng_Gao
2019
Haoming Jiang
Liyuan Liu
|
gptkbp:publishedIn
|
arXiv:1908.03265
|
gptkbp:relatedTo
|
gptkb:Adam
gptkb:RMSProp
SGD
AdaBelief
AdaBound
|
gptkbp:repository
|
https://github.com/LiyuanLucasLiu/RAdam
|
gptkbp:usedIn
|
gptkb:machine_learning
deep learning
|