Statements (22)

Predicate | Object
---|---
gptkbp:instanceOf | mathematical optimization
gptkbp:advantage | handles sparse gradients
gptkbp:category | gradient-based optimization, stochastic optimization
gptkbp:citation | gptkb:Duchi,_J.,_Hazan,_E.,_&_Singer,_Y._(2011)._Adaptive_Subgradient_Methods_for_Online_Learning_and_Stochastic_Optimization._JMLR,_12,_2121-2159.
gptkbp:field | gptkb:machine_learning
gptkbp:fullName | gptkb:Adaptive_Gradient_Algorithm
https://www.w3.org/2000/01/rdf-schema#label | AdaGrad
gptkbp:introduced | gptkb:John_Duchi, gptkb:Elad_Hazan, gptkb:Yoram_Singer
gptkbp:introducedIn | 2011
gptkbp:limitation | accumulated squared gradients can lead to vanishing learning rates
gptkbp:purpose | improve learning rate adaptation
gptkbp:relatedTo | gptkb:Adam, gptkb:RMSProp
gptkbp:updateRule | per-parameter learning rate (sketched below)
gptkbp:usedFor | training neural networks, sparse data
gptkbp:bfsParent | gptkb:Adam_optimizer, gptkb:RMSprop
gptkbp:bfsLayer | 5
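The `gptkbp:updateRule` and `gptkbp:limitation` statements above describe the core mechanics of AdaGrad. A minimal sketch of that update in NumPy follows; the function name `adagrad_update` and the chosen hyperparameters (`lr`, `eps`) are illustrative assumptions, not part of the statements table.

```python
# Minimal AdaGrad sketch (illustrative): each parameter keeps its own
# accumulated sum of squared gradients, which shrinks its effective
# learning rate over time -- the per-parameter update rule and the
# vanishing-learning-rate limitation listed in the table.
import numpy as np

def adagrad_update(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad step; `accum` holds per-parameter sums of squared gradients."""
    accum += grads ** 2                             # accumulate squared gradients
    params -= lr * grads / (np.sqrt(accum) + eps)   # per-parameter learning rate
    return params, accum

# Usage: minimize f(x) = x^2 starting from x = 5
params = np.array([5.0])
accum = np.zeros_like(params)
for _ in range(200):
    grads = 2 * params                              # gradient of x^2
    params, accum = adagrad_update(params, grads, accum, lr=0.5)
print(params)  # approaches 0; step sizes shrink as accum grows
```

Because `accum` only grows, the effective step size decays monotonically, which is why later variants listed under `gptkbp:relatedTo` (RMSProp, Adam) replace the raw sum with an exponential moving average.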