Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:appliesTo | control systems, robotics, game playing |
gptkbp:combines | gptkb:Monte_Carlo_methods, dynamic programming |
gptkbp:compatibleWith | model of environment |
gptkbp:example | gptkb:TD(λ), gptkb:TD(0) |
gptkbp:fullName | gptkb:Temporal_Difference_learning |
gptkbp:hasConcept | bootstrapping, temporal difference error |
https://www.w3.org/2000/01/rdf-schema#label | TD learning |
gptkbp:introduced | gptkb:Richard_S._Sutton |
gptkbp:introducedIn | 1988 |
gptkbp:learnsFrom | raw experience |
gptkbp:relatedTo | gptkb:SARSA, gptkb:Q-learning |
gptkbp:updated | after every step, value estimates |
gptkbp:usedIn | gptkb:reinforcement_learning |
gptkbp:bfsParent | gptkb:Temporal_Difference_learning |
gptkbp:bfsLayer | 8 |
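
The statements above name bootstrapping, the temporal difference error, and per-step updates of value estimates. As a rough illustration of how these fit together, here is a minimal sketch of tabular TD(0) in Python. The random-walk environment, the function name `td0_random_walk`, and all parameter names and defaults are assumptions made for this example; none of them come from the knowledge base entry.

```python
import random


def td0_random_walk(num_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values for a simple random walk with tabular TD(0).

    Hypothetical toy environment: states 1..num_states are non-terminal,
    states 0 and num_states + 1 are terminal, and only a step into the
    right-hand terminal state yields a reward of 1.
    """
    rng = random.Random(seed)
    # Value table; terminal entries stay at 0 because they are never updated.
    values = [0.0] * (num_states + 2)

    for _ in range(episodes):
        state = (num_states + 1) // 2  # start each episode in the middle
        while 0 < state < num_states + 1:
            # Learn from raw experience: sample one transition, no model needed.
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == num_states + 1 else 0.0

            # Temporal difference error: the bootstrapped target
            # (reward plus discounted estimate of the next state)
            # minus the current estimate.
            td_error = reward + gamma * values[next_state] - values[state]

            # Update the value estimate after every step,
            # rather than waiting for the episode to finish.
            values[state] += alpha * td_error
            state = next_state

    return values[1:num_states + 1]


if __name__ == "__main__":
    for index, value in enumerate(td0_random_walk(), start=1):
        print(f"state {index}: {value:.3f}")
```

For this five-state walk with `gamma=1.0`, the true value of state i is i/6, so the estimates should drift toward roughly 0.17, 0.33, 0.50, 0.67, and 0.83. With a constant step size they keep fluctuating around those values instead of settling exactly.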