Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:category | gptkb:artificial_intelligence, gptkb:machine_learning, temporal difference learning |
gptkbp:citation | Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9-44. |
gptkbp:combines | gptkb:Monte_Carlo_methods, dynamic programming |
gptkbp:fullName | Temporal Difference Learning with Eligibility Traces |
https://www.w3.org/2000/01/rdf-schema#label | TD(λ) |
gptkbp:introduced | gptkb:Richard_S._Sutton |
gptkbp:introducedIn | 1988 |
gptkbp:parameter | discount factor, learning rate, lambda (λ) |
gptkbp:relatedTo | gptkb:Monte_Carlo_methods, gptkb:TD(0), Q(λ), SARSA(λ) |
gptkbp:usedFor | policy evaluation, value function estimation |
gptkbp:usedIn | gptkb:reinforcement_learning |
gptkbp:bfsParent | gptkb:Temporal_Difference_Learning |
gptkbp:bfsLayer | 7 |
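
The three parameters listed above (discount factor, learning rate, and λ) can be seen working together in a minimal sketch of tabular TD(λ) policy evaluation. Everything specific here is an assumption made for illustration and does not come from this entry: the function name `td_lambda_random_walk`, the small random-walk environment, the default parameter values, and the choice of accumulating eligibility traces (one of several trace variants).

```python
import random


def td_lambda_random_walk(num_episodes=5000, n_states=5, alpha=0.005,
                          gamma=1.0, lam=0.8, seed=0):
    """Estimate state values for a uniform random policy on a chain.

    Hypothetical environment: states 0..n_states-1 in a row, each episode
    starts in the middle, every step moves left or right with equal
    probability, stepping off the right end pays reward 1, and stepping
    off the left end pays reward 0.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states

    for _ in range(num_episodes):
        traces = [0.0] * n_states  # eligibility traces, reset each episode
        state = n_states // 2

        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # One-step TD error, bootstrapping from the current estimate.
            td_error = reward + gamma * next_value - values[state]

            # Accumulating trace: mark the state just visited.
            traces[state] += 1.0

            # Credit every recently visited state, then decay all traces
            # by gamma * lambda.
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    for index, value in enumerate(td_lambda_random_walk()):
        print(f"state {index}: {value:.3f}")
```

In this sketch, setting `lam=0.0` reduces the update to one-step TD(0), while values of `lam` near 1 with `gamma=1.0` spread each TD error over the whole visited trajectory and behave much like a Monte Carlo update, which is the sense in which the method sits between the two.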