Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:category | gptkb:artificial_intelligence; gptkb:machine_learning; temporal difference learning |
| gptkbp:citation | Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9-44. |
| gptkbp:combines | gptkb:Monte_Carlo_methods; dynamic programming |
| gptkbp:fullName | Temporal Difference Learning with Eligibility Traces |
| gptkbp:introduced | gptkb:Richard_S._Sutton |
| gptkbp:introducedIn | 1988 |
| gptkbp:parameter | discount factor; learning rate; lambda (λ) |
| gptkbp:relatedTo | gptkb:Monte_Carlo_methods; gptkb:TD(0); Q(λ); SARSA(λ) |
| gptkbp:usedFor | policy evaluation; value function estimation |
| gptkbp:usedIn | gptkb:reinforcement_learning |
| gptkbp:bfsParent | gptkb:Temporal_Difference_Learning |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | TD(λ) |
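
The statements above list three parameters (discount factor, learning rate, λ) and name policy evaluation as a use. As a rough illustration of how those pieces fit together, here is a minimal sketch of tabular TD(λ) value estimation. It assumes the accumulating-trace variant and an episode format of `(state, reward, next_state)` tuples; the function names `td_lambda` and `random_walk_episode`, and the random-walk toy task, are hypothetical and not taken from the entry itself.

```python
import random


def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Estimate state values with tabular TD(lambda).

    Assumptions (not from the source entry): accumulating eligibility
    traces, and episodes given as lists of (state, reward, next_state)
    transitions where next_state is None on termination.
    alpha = learning rate, gamma = discount factor, lam = lambda.
    """
    values = [0.0] * n_states
    for episode in episodes:
        traces = [0.0] * n_states  # traces are reset at the start of each episode
        for state, reward, next_state in episode:
            next_value = 0.0 if next_state is None else values[next_state]
            # One-step temporal-difference error.
            td_error = reward + gamma * next_value - values[state]
            # Mark the visited state as eligible for credit.
            traces[state] += 1.0
            for s in range(n_states):
                # Every state is updated in proportion to its trace,
                # then the trace decays by gamma * lambda.
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam
    return values


def random_walk_episode(n_states, rng):
    """Hypothetical toy task: start in the middle of a chain and step left
    or right uniformly at random; reward 1.0 only when exiting on the right."""
    state = n_states // 2
    transitions = []
    while True:
        nxt = state + rng.choice((-1, 1))
        if nxt < 0:
            transitions.append((state, 0.0, None))
            return transitions
        if nxt >= n_states:
            transitions.append((state, 1.0, None))
            return transitions
        transitions.append((state, 0.0, nxt))
        state = nxt


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [random_walk_episode(5, rng) for _ in range(2000)]
    print(td_lambda(episodes, n_states=5, alpha=0.05, gamma=1.0, lam=0.8))
```

In this sketch, setting `lam=0.0` makes the trace vanish after one step, so only the state just visited is updated, which corresponds to the related TD(0) entry. Setting `lam=1.0` with `gamma=1.0` passes the full return back to every earlier state in the episode, which is the Monte Carlo end of the spectrum.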