Statements (18)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:mathematical_concept |
| gptkbp:appliesTo | gptkb:Markov_chain |
| gptkbp:describes | value function recursion |
| gptkbp:enables | policy evaluation, policy improvement |
| gptkbp:field | dynamic programming |
| gptkbp:formedBy | 1950s |
| gptkbp:hasVariant | Bellman expectation equation, Bellman optimality equation |
| gptkbp:namedAfter | gptkb:Richard_Bellman |
| gptkbp:relatedTo | gptkb:Hamilton–Jacobi–Bellman_equation, policy iteration, value iteration |
| gptkbp:usedIn | gptkb:reinforcement_learning, gptkb:optimal_control_theory |
| gptkbp:bfsParent | gptkb:Q-learning |
| gptkbp:bfsLayer | 5 |
| https://www.w3.org/2000/01/rdf-schema#label | Bellman equation |
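
The statements above name two variants (the Bellman expectation equation and the Bellman optimality equation) and say the equation enables policy evaluation and relates to value iteration. As a rough illustration only, the sketch below applies both variants as fixed-point iterations on a made-up two-state, two-action problem. Everything in it is an assumption for illustration and not part of the knowledge-base entry: the toy tables `P` and `R`, the discount `gamma=0.9`, the stopping tolerance, and the helper names `q_values`, `policy_evaluation`, and `value_iteration`.

```python
def q_values(P, R, v, gamma):
    """One-step lookahead: q[s][a] = R[s][a] + gamma * sum over s' of P[s][a][s'] * v[s']."""
    return [
        [R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))
         for a in range(len(R[s]))]
        for s in range(len(R))
    ]


def policy_evaluation(P, R, policy, gamma=0.9, tol=1e-10):
    """Iterate the Bellman expectation equation for a fixed stochastic policy.

    policy[s][a] is the probability of choosing action a in state s.
    """
    v = [0.0] * len(R)
    while True:
        q = q_values(P, R, v, gamma)
        # Expectation backup: average the action values under the policy.
        v_new = [sum(pi * qa for pi, qa in zip(policy[s], q[s]))
                 for s in range(len(R))]
        if max(abs(x - y) for x, y in zip(v_new, v)) < tol:
            return v_new
        v = v_new


def value_iteration(P, R, gamma=0.9, tol=1e-10):
    """Iterate the Bellman optimality equation; return values and a greedy policy."""
    v = [0.0] * len(R)
    while True:
        q = q_values(P, R, v, gamma)
        # Optimality backup: take the best action value in each state.
        v_new = [max(q_s) for q_s in q]
        if max(abs(x - y) for x, y in zip(v_new, v)) < tol:
            greedy = [q_s.index(max(q_s)) for q_s in q]
            return v_new, greedy
        v = v_new


# Hypothetical toy problem: action 0 stays in place, action 1 switches state.
# P[s][a][s'] is a transition probability, R[s][a] an expected immediate reward.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay, switch
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay, switch
]
R = [
    [0.0, 0.0],  # state 0 pays nothing
    [1.0, 0.0],  # staying in state 1 pays 1
]

uniform_policy = [[0.5, 0.5], [0.5, 0.5]]
v_uniform = policy_evaluation(P, R, uniform_policy)
v_star, greedy_policy = value_iteration(P, R)
```

In this toy setup the optimal values should be at least as large as the values of the uniform random policy in every state, which is one quick sanity check on the two backups. Both loops rely on `gamma < 1` so that the updates contract toward a fixed point.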