Statements (18)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:mathematical_concept |
gptkbp:appliesTo | Markov chain |
gptkbp:describes | value function recursion |
gptkbp:enables | policy evaluation, policy improvement |
gptkbp:field | dynamic programming |
gptkbp:formedBy | 1950s |
gptkbp:hasVariant | Bellman expectation equation, Bellman optimality equation |
https://www.w3.org/2000/01/rdf-schema#label | Bellman equation |
gptkbp:namedAfter | gptkb:Richard_Bellman |
gptkbp:relatedTo | gptkb:Hamilton–Jacobi–Bellman_equation, policy iteration, value iteration |
gptkbp:usedIn | gptkb:reinforcement_learning, gptkb:optimal_control_theory |
gptkbp:bfsParent | gptkb:Q-learning |
gptkbp:bfsLayer | 5 |
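
The "value function recursion" and the related "value iteration" entry listed above can be illustrated with a small sketch. This is not part of the knowledge-base record: the two-state, two-action decision process, the array layout (`P` indexed as action, state, next state; `R` as action, state), the discount factor, and the function name `value_iteration` are all assumptions chosen for illustration. The loop repeatedly applies the Bellman optimality backup until the values stop changing.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Approximate the optimal values of a small, finite decision process.

    P: array of shape (A, S, S); P[a, s, t] is the probability of moving
       from state s to state t under action a (illustrative layout).
    R: array of shape (A, S); R[a, s] is the expected immediate reward.
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman optimality backup:
        # Q[a, s] = R[a, s] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)


# Hypothetical toy problem: action 0 stays put, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 1.0],  # action 0: reward 1 only when staying in state 1
    [0.0, 0.0],  # action 1: switching earns nothing
])

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and then stays in state 1. Replacing the `max` over actions with an average under a fixed policy would turn the same loop into policy evaluation, which corresponds to the Bellman expectation equation variant listed in the table.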