Statements (21)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:algorithmic_framework, gptkb:reinforcement_learning_method |
| gptkbp:appliesTo | stochastic control, Markov decision processes |
| gptkbp:describedBy | Neuro-Dynamic Programming (book) |
| gptkbp:describedYear | 1996 |
| gptkbp:developedBy | gptkb:John_Tsitsiklis, gptkb:Dimitri_Bertsekas |
| gptkbp:fieldOfStudy | gptkb:artificial_intelligence, control theory, operations research |
| gptkbp:relatedTo | gptkb:Q-learning, policy iteration, value iteration, temporal-difference learning |
| gptkbp:uses | gptkb:reinforcement_learning, neural networks, approximate dynamic programming |
| gptkbp:bfsParent | gptkb:John_Tsitsiklis |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Neuro-dynamic programming |