Policy iteration

GPTKB entity

Statements (22)
Predicate Object
gptkbp:instanceOf gptkb:algorithm
gptkbp:canBe approximate
exact
gptkbp:category dynamic programming
gptkbp:complexity polynomial time (for finite discounted MDPs with a fixed discount factor)
gptkbp:convergesTo optimal policy
gptkbp:goal find optimal policy
gptkbp:hasVariant generalized policy iteration
modified policy iteration
https://www.w3.org/2000/01/rdf-schema#label Policy iteration
gptkbp:input Markov decision process
gptkbp:introduced Ronald A. Howard
gptkbp:output optimal policy
gptkbp:relatedTo value iteration
gptkbp:requires reward function
transition probabilities
gptkbp:step policy evaluation
policy improvement
gptkbp:usedIn gptkb:reinforcement_learning
Markov decision processes
gptkbp:bfsParent gptkb:Partially_Observable_Markov_Decision_Process
gptkbp:bfsLayer 7
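
Example (illustrative sketch)

The two steps listed above, policy evaluation and policy improvement, can be sketched for a small finite MDP as follows. This is a minimal illustration of the exact variant, not a reference implementation: the function name `policy_iteration`, the array layout (`P` with shape actions x states x states, `R` with shape states x actions), the discount factor `gamma`, and the two-state example MDP are all assumptions made for this example.

```python
import numpy as np


def policy_iteration(P, R, gamma=0.9):
    """Exact policy iteration for a finite MDP (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array of shape (S, A), expected immediate reward for taking action a in state s.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # arbitrary initial policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi for the current policy.
        P_pi = P[policy, states]   # (S, S): row s is P[policy[s], s, :]
        r_pi = R[states, policy]   # (S,)
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to v.
        q = R + gamma * np.einsum("ast,t->sa", P, v)  # (S, A)
        new_policy = np.argmax(q, axis=1)
        if np.array_equal(new_policy, policy):
            return policy, v  # policy is stable, so it is optimal for this MDP
        policy = new_policy


# Hypothetical two-state MDP: action 0 stays in place, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 1.0],  # state 0: staying pays 0, switching pays 1
    [2.0, 0.0],  # state 1: staying pays 2, switching pays 0
])

policy, v = policy_iteration(P, R, gamma=0.9)
print(policy, v)
```

In this toy MDP the loop stops once the greedy policy no longer changes. The evaluation step here uses a direct linear solve, which is what makes the sketch "exact"; approximate and modified variants typically replace that solve with a limited number of iterative backups.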