Markov Decision Processes

GPTKB entity

Statements (52)
Predicate Object
gptkbp:instanceOf gptkb:logic
gptkb:stochastic_process
decision process
gptkbp:actionValueFunction Q(s, a)
gptkbp:assumes discrete time
full observability
gptkbp:canBe infinite
finite
gptkbp:component policy
states
actions
rewards
transition probabilities
gptkbp:discountFactor gamma
gptkbp:field gptkb:artificial_intelligence
gptkb:reinforcement_learning
control theory
economics
operations research
gptkbp:generalizes gptkb:Markov_chains
gptkbp:hasApplication autonomous vehicles
finance
healthcare
natural language processing
inventory management
queueing systems
resource allocation
https://www.w3.org/2000/01/rdf-schema#label Markov Decision Processes
gptkbp:introduced gptkb:Richard_Bellman
gptkbp:introducedIn 1957
gptkbp:namedAfter gptkb:Andrey_Markov
gptkbp:objective maximize expected reward
gptkbp:policyFunction π(a|s)
gptkbp:property gptkb:Markov_property
gptkbp:relatedTo gptkb:Bellman_equation
gptkb:Partially_Observable_Markov_Decision_Processes
gptkb:Semi-Markov_Decision_Processes
stochastic games
gptkbp:rewardFunction R(s, a)
gptkbp:solvedBy dynamic programming
policy iteration
value iteration
linear programming
gptkbp:transitionFunction P(s'|s, a)
gptkbp:usedIn gptkb:machine_learning
game theory
planning
robotics
gptkbp:valueFunction V(s)
gptkbp:bfsParent gptkb:SARSA
gptkb:Stochastic_Games
gptkbp:bfsLayer 6
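
Example (illustrative)

The statements above list value iteration among the solution methods and name the transition function P(s'|s, a), reward function R(s, a), discount factor gamma, value function V(s), and action-value function Q(s, a). The sketch below shows one way these pieces might fit together for a finite, fully observable, discrete-time MDP. The two-state model, its numbers, and the names STATES, ACTIONS, q_value, and value_iteration are assumptions made up for illustration; they are not part of the entity data. The policy returned is deterministic (one action per state), a special case of π(a|s).

```python
# Hypothetical two-state, two-action MDP; all numbers are made up for illustration.
STATES = ["low", "high"]
ACTIONS = ["wait", "work"]

# P[s][a] maps each next state s' to the probability P(s'|s, a); each row sums to 1.
P = {
    "low": {
        "wait": {"low": 1.0, "high": 0.0},
        "work": {"low": 0.3, "high": 0.7},
    },
    "high": {
        "wait": {"low": 0.4, "high": 0.6},
        "work": {"low": 0.0, "high": 1.0},
    },
}

# R[s][a] is the expected immediate reward R(s, a).
R = {
    "low": {"wait": 0.0, "work": -1.0},
    "high": {"wait": 2.0, "work": 1.0},
}

GAMMA = 0.9  # discount factor gamma, assumed to lie in [0, 1)


def q_value(V, s, a):
    """Q(s, a) = R(s, a) + gamma * sum over s' of P(s'|s, a) * V(s')."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a].items())


def value_iteration(tol=1e-8, max_iters=10_000):
    """Apply the Bellman optimality backup until V changes by less than tol."""
    V = {s: 0.0 for s in STATES}
    for _ in range(max_iters):
        new_V = {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}
        delta = max(abs(new_V[s] - V[s]) for s in STATES)
        V = new_V
        if delta < tol:
            break
    # Greedy policy: in each state, pick the action with the largest Q(s, a).
    policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
    return V, policy


V, policy = value_iteration()
for s in STATES:
    print(s, round(V[s], 3), policy[s])
```

With gamma below 1 the backup is a contraction, so the loop is expected to stop well before max_iters on a model this small. Policy iteration or linear programming, also listed above, could be applied to the same P and R tables.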