Stochastic gradient descent

GPTKB entity

Statements (35)
Predicate Object
gptkbp:instanceOf optimization algorithm
gptkbp:advantage faster convergence on large datasets
gptkbp:disadvantage may not converge to exact minimum
noisy updates
gptkbp:alsoKnownAs SGD
gptkbp:alternativeTo mini-batch gradient descent
batch gradient descent
gptkbp:category first-order optimization method
gptkbp:commonIn linear regression
logistic regression
neural network training
gptkbp:dependsOn data shuffling
learning rate schedule
loss surface
https://www.w3.org/2000/01/rdf-schema#label Stochastic gradient descent
gptkbp:hyperparameter learning rate
batch size
gptkbp:improvedBy gptkb:Nesterov_accelerated_gradient
momentum
weight decay
learning rate schedules
gptkbp:introducedIn 1951
gptkbp:isFoundationFor gptkb:Adam_optimizer
gptkb:RMSprop
Adagrad
gptkbp:proposedBy gptkb:Herbert_Robbins
gptkb:Sutton_Monro
gptkbp:requires differentiable objective function
gptkbp:updateRule parameter update using a single sample or a small mini-batch (see the formulation below)
gptkbp:usedFor minimizing loss functions
gptkbp:usedIn gptkb:machine_learning
deep learning
gptkbp:variantOf gradient descent
gptkbp:bfsParent gptkb:Stochastic_Gradient_Langevin_Dynamics
gptkbp:bfsLayer 7
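
The updateRule statement can be written out explicitly. A standard formulation in conventional notation (the symbols below are not part of the KB entry):

    \theta_{t+1} = \theta_t - \eta_t \, \nabla_\theta L(\theta_t; x_{i_t})

where \eta_t is the learning rate (the first hyperparameter listed) and i_t indexes a sample drawn at random, or a small mini-batch of indices in the mini-batch variant.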
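
A minimal sketch of the loop these statements describe, covering data shuffling, the learning-rate and batch-size hyperparameters, and the need for a differentiable objective. The function names and the squared-error example are illustrative assumptions, not part of the KB entry:

    import random

    def sgd(gradient, params, data, learning_rate=0.01, batch_size=1, epochs=10):
        """Minimal SGD loop: shuffle the data each epoch, then update the
        parameters from the gradient of one sample or a small mini-batch."""
        for _ in range(epochs):
            random.shuffle(data)  # dependsOn: data shuffling
            for start in range(0, len(data), batch_size):
                batch = data[start:start + batch_size]
                # requires: a differentiable objective function
                grad = gradient(params, batch)
                # hyperparameters: learning rate and batch size
                params = [p - learning_rate * g for p, g in zip(params, grad)]
        return params

    # Illustrative objective (an assumption, not from the KB entry):
    # fit y = w * x by minimizing mean squared error over each batch.
    def grad_mse(params, batch):
        (w,) = params
        return [sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)]

    data = [(float(x), 3.0 * x) for x in range(1, 20)]
    print(sgd(grad_mse, [0.0], data, learning_rate=0.001, batch_size=4, epochs=50))

With batch_size=1 this is the classic procedure proposed by Robbins and Monro in 1951; a larger batch_size gives the mini-batch gradient descent listed above as an alternative.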