Statements (28)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | reinforcement learning algorithm |
| gptkbp:application | control systems |
| gptkbp:application | robotics |
| gptkbp:application | game playing |
| gptkbp:category | model-free methods |
| gptkbp:compatibleWith | model of environment |
| gptkbp:convergesTo | true value function under certain conditions |
| gptkbp:fullName | Temporal Difference Learning with one-step lookahead |
| gptkbp:hasSpecialCase | gptkb:TD(λ) |
| gptkbp:hasSpecialCase | temporal difference learning |
| https://www.w3.org/2000/01/rdf-schema#label | TD(0) |
| gptkbp:input | state transitions |
| gptkbp:input | reward signal |
| gptkbp:introduced | gptkb:Richard_S._Sutton |
| gptkbp:introducedIn | 1988 |
| gptkbp:learns | state value function |
| gptkbp:output | updated value function |
| gptkbp:parameter | discount factor (γ) |
| gptkbp:parameter | learning rate (α) |
| gptkbp:relatedTo | gptkb:Monte_Carlo_methods |
| gptkbp:relatedTo | gptkb:SARSA |
| gptkbp:relatedTo | gptkb:Q-learning |
| gptkbp:updateRule | V(s) ← V(s) + α [r + γ V(s') − V(s)] |
| gptkbp:usedIn | gptkb:reinforcement_learning |
| gptkbp:uses | bootstrapping |
| gptkbp:uses | temporal difference error |
| gptkbp:bfsParent | gptkb:Temporal_Difference_Learning |
| gptkbp:bfsLayer | 7 |
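The `gptkbp:updateRule` statement, V(s) ← V(s) + α [r + γ V(s') − V(s)], can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `td0_update`, the dictionary representation of V, and the toy three-state chain are assumptions made here for the example.

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    td_error = r + gamma * V[s_next] - V[s]  # temporal difference error
    V[s] += alpha * td_error                 # bootstrapped value update
    return td_error

# Hypothetical episode on a 3-state chain: s0 -> s1 -> s2 (terminal).
V = {0: 0.0, 1: 0.0, 2: 0.0}  # terminal state's value stays 0
td0_update(V, 1, 1.0, 2)      # reward 1.0 on reaching the terminal state
td0_update(V, 0, 0.0, 1)      # V(0) bootstraps from the updated V(1)
```

The second call shows the bootstrapping listed under `gptkbp:uses`: V(0) is updated toward γ·V(1) using the value estimate just revised in the first call, rather than waiting for a full Monte Carlo return.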