Statements (22)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:mathematical_optimization |
| gptkbp:advantage | handles sparse gradients |
| gptkbp:category | gradient-based optimization, stochastic optimization |
| gptkbp:citation | gptkb:Duchi,_J.,_Hazan,_E.,_&_Singer,_Y._(2011)._Adaptive_Subgradient_Methods_for_Online_Learning_and_Stochastic_Optimization._JMLR,_12,_2121-2159. |
| gptkbp:field | gptkb:machine_learning |
| gptkbp:fullName | gptkb:Adaptive_Gradient_Algorithm |
| gptkbp:introduced | gptkb:Yoram_Singer, gptkb:Elad_Hazan, gptkb:John_Duchi |
| gptkbp:introducedIn | 2011 |
| gptkbp:limitation | accumulated squared gradients can lead to vanishing learning rates |
| gptkbp:purpose | improve learning rate adaptation |
| gptkbp:relatedTo | gptkb:Adam, gptkb:RMSProp |
| gptkbp:updateRule | per-parameter learning rate (see the sketch after this table) |
| gptkbp:usedFor | training neural networks, sparse data |
| gptkbp:bfsParent | gptkb:Adam_optimizer, gptkb:RMSprop |
| gptkbp:bfsLayer | 5 |
| https://www.w3.org/2000/01/rdf-schema#label | AdaGrad |
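The updateRule and limitation statements above can be made concrete with a minimal sketch of the AdaGrad step from Duchi, Hazan & Singer (2011): each parameter accumulates its squared gradients, and the global learning rate is divided by the square root of that accumulator, giving a per-parameter step size. The function name `adagrad_update` and the hyperparameter values (`lr=0.1`, `eps=1e-8`) below are illustrative assumptions, not part of this entry.

```python
import numpy as np

def adagrad_update(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad step with per-parameter learning rate lr / (sqrt(accum) + eps)."""
    # Accumulate squared gradients per parameter. This sum only grows,
    # which is why effective learning rates shrink over training --
    # the "vanishing learning rates" limitation stated in the table.
    accum += grads ** 2
    # Parameters with a large gradient history take small steps;
    # rarely updated (sparse) parameters keep large steps, which is
    # why AdaGrad handles sparse gradients well.
    params -= lr * grads / (np.sqrt(accum) + eps)
    return params, accum

# Usage: minimize f(x) = x^2 starting from x = 5.
x = np.array([5.0])
g_accum = np.zeros_like(x)
for _ in range(100):
    grad = 2 * x  # gradient of x^2
    x, g_accum = adagrad_update(x, grad, g_accum)
print(x)  # approaches 0
```

RMSProp and Adam, listed under relatedTo, address the vanishing-rate limitation by replacing the ever-growing sum with an exponential moving average of squared gradients.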