TD(0)

GPTKB entity

Statements (28)
Predicate Object
gptkbp:instanceOf reinforcement learning algorithm
gptkbp:application control systems
robotics
game playing
gptkbp:category model-free methods
gptkbp:compatibleWith model of environment (optional; TD(0) itself is model-free)
gptkbp:convergesTo value function of the policy being evaluated (tabular case, with a suitably decaying learning rate and every state visited infinitely often)
gptkbp:fullName one-step temporal difference learning (TD(λ) with λ = 0)
gptkbp:isSpecialCaseOf gptkb:TD(λ)
temporal difference learning
https://www.w3.org/2000/01/rdf-schema#label TD(0)
gptkbp:input state transitions
reward signal
gptkbp:introduced gptkb:Richard_S._Sutton
gptkbp:introducedIn 1988
gptkbp:learns state value function
gptkbp:output updated value function
gptkbp:parameter discount factor (γ)
learning rate (α)
gptkbp:relatedTo gptkb:Monte_Carlo_methods
gptkb:SARSA
gptkb:Q-learning
gptkbp:updateRule V(s) ← V(s) + α [r + γ V(s') − V(s)]
gptkbp:usedIn gptkb:reinforcement_learning
gptkbp:uses bootstrapping
temporal difference error
gptkbp:bfsParent gptkb:Temporal_Difference_Learning
gptkbp:bfsLayer 7
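
Example (illustrative sketch)

The update rule listed above, V(s) ← V(s) + α [r + γ V(s') − V(s)], can be sketched in a few lines of Python. This is a minimal tabular version, not a reference implementation: the function names `td0_update` and `evaluate_random_walk`, the five-state random-walk environment, and the default values of α and γ are all assumptions chosen for illustration and do not come from the entity data.

```python
import random


def td0_update(V, s, r, s_next, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update to V[s] in place and return the TD error.

    V is a list of state-value estimates indexed by state (hypothetical layout).
    """
    # Bootstrapped target: reward plus discounted estimate of the next state.
    # The value of a terminal state is taken to be 0.
    target = r + (0.0 if terminal else gamma * V[s_next])
    delta = target - V[s]      # temporal difference error
    V[s] += alpha * delta      # move V[s] a step of size alpha toward the target
    return delta


def evaluate_random_walk(episodes=1000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values of a 5-state random walk with TD(0).

    Assumed toy environment: states 0..4, start in the middle, move left or
    right with equal probability; leaving on the right gives reward 1,
    leaving on the left gives reward 0.
    """
    rng = random.Random(seed)
    n_states = 5
    V = [0.0] * n_states
    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= n_states
            r = 1.0 if s_next >= n_states else 0.0
            td0_update(V, s, r, s_next, alpha, gamma, terminal=done)
            if done:
                break
            s = s_next
    return V


if __name__ == "__main__":
    print([round(v, 2) for v in evaluate_random_walk()])
```

For this toy walk the true values are 1/6, 2/6, ..., 5/6, so the printed estimates should increase from left to right. With a constant learning rate they fluctuate around those values rather than settling on them exactly.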