Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:Reinforcement_Learning_Method |
| gptkbp:appliesTo | gptkb:robot, Control Systems, Game Playing |
| gptkbp:canBe | Off-Policy, On-Policy |
| gptkbp:compatibleWith | Explicit Model of Environment |
| gptkbp:contrastsWith | Model-Based Learning |
| gptkbp:dependsOn | Trial and Error |
| gptkbp:doesNotBuild | Reward Model, Transition Model |
| gptkbp:example | gptkb:SARSA, gptkb:Q-Learning, Policy Gradient Methods |
| gptkbp:focusesOn | Learning from Experience |
| gptkbp:learns | Policies, Value Functions |
| gptkbp:studiedIn | gptkb:Machine_Learning, gptkb:artificial_intelligence |
| gptkbp:usedIn | gptkb:Reinforcement_Learning |
| gptkbp:bfsParent | gptkb:Temporal_Difference_Learning |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Model-Free Learning |
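
The statements above describe a method that learns value functions by trial and error without building a transition or reward model. As a rough illustration only, the sketch below runs tabular Q-learning (one of the listed examples) on a hypothetical five-state chain. The environment, the function names (`step`, `train_q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made for this sketch, not part of the knowledge-base entry.

```python
import random

# Hypothetical toy chain: states 0..4, state 4 is terminal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment. The learner only ever sees sampled outcomes from it."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table from experience; no transition or reward model is stored."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode ends
            # Epsilon-greedy behaviour policy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train_q_learning()))
```

The agent calls `step` only to sample transitions and never inspects or estimates its dynamics, which is the sense in which the method is model-free. Swapping the `max(q[next_state])` bootstrap for the value of the action actually taken next would give an on-policy SARSA-style update instead.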