Temporal Difference Learning
GPTKB entity
Statements (27)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:Reinforcement_Learning_Algorithm |
| gptkbp:abbreviation | gptkb:TD_Learning |
| gptkbp:appliesTo | gptkb:robot, Control Systems, Game Playing |
| gptkbp:category | gptkb:Machine_Learning, gptkb:artificial_intelligence |
| gptkbp:combines | gptkb:Dynamic_Programming, Monte Carlo Methods |
| gptkbp:developedBy | gptkb:Richard_S._Sutton |
| gptkbp:example | gptkb:Model-Free_Learning |
| gptkbp:hasVariant | gptkb:SARSA, gptkb:Q-Learning, gptkb:TD(λ), gptkb:TD(0) |
| gptkbp:introducedIn | 1988 |
| gptkbp:learnsFrom | Experience |
| gptkbp:objective | Estimate Value Functions, Optimize Policies |
| gptkbp:relatedTo | gptkb:Policy_Evaluation, gptkb:Policy_Improvement |
| gptkbp:updated | Value Functions |
| gptkbp:usedIn | gptkb:Reinforcement_Learning |
| gptkbp:uses | Bootstrapping |
| gptkbp:bfsParent | gptkb:Peter_Dayan |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | Temporal Difference Learning |
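
The statements above describe a method that learns from experience, uses bootstrapping, and updates value functions, with TD(0) listed as a variant. As a rough illustration only, the sketch below runs tabular TD(0) policy evaluation on a small random-walk task. The environment (five non-terminal states, a reward of 1 on exiting to the right), the function name `td0_random_walk`, and the parameters `num_episodes`, `alpha`, `gamma`, and `seed` are all assumptions chosen for this example and are not part of the entity data.

```python
import random


def td0_random_walk(num_episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values of a 5-state random walk with tabular TD(0).

    Hypothetical toy environment: states 1..5 are non-terminal, states 0 and 6
    are terminal, and only the transition into state 6 yields a reward of 1.
    """
    rng = random.Random(seed)
    n_states = 5
    # Indices 0 and n_states + 1 are terminal; their value stays at 0.
    values = [0.0] * (n_states + 2)
    for i in range(1, n_states + 1):
        values[i] = 0.5  # neutral initial guess for non-terminal states

    for _ in range(num_episodes):
        state = (n_states + 1) // 2  # every episode starts in the middle state
        while 0 < state < n_states + 1:
            # Fixed policy under evaluation: step left or right with equal probability.
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # Bootstrapping: the target uses the current estimate of the next state.
            td_target = reward + gamma * values[next_state]
            td_error = td_target - values[state]
            # Learning from experience: update after each observed transition.
            values[state] += alpha * td_error
            state = next_state
    return values[1:n_states + 1]


estimates = td0_random_walk()
print([round(v, 2) for v in estimates])
```

For this particular walk the exact state values are 1/6, 2/6, 3/6, 4/6, and 5/6, so the printed estimates should land near those fractions. A constant step size keeps them fluctuating slightly rather than converging exactly.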