Q-Learning

GPTKB entity

Statements (33)
Predicate Object
gptkbp:instanceOf Reinforcement learning algorithm
gptkbp:application gptkb:robot
Game playing
Autonomous control
gptkbp:category Temporal difference learning
gptkbp:compatibleWith Model of environment (optional, not required)
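gptkbp:convergesTo Optimal policy (in finite Markov decision processes, if every state-action pair is visited infinitely often and the learning rate decays appropriately)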
gptkbp:explorationStrategy Epsilon-greedy
gptkbp:field gptkb:artificial_intelligence
Machine learning
gptkbp:form Markov decision process
gptkbp:goal Learn optimal action-selection policy
https://www.w3.org/2000/01/rdf-schema#label Q-Learning
gptkbp:influenced Deep Q-Learning
gptkbp:input gptkb:action
State
gptkbp:introduced gptkb:Christopher_Watkins
gptkbp:introducedIn 1989
gptkbp:output Q-value
gptkbp:relatedTo gptkb:Deep_Q-Network
gptkb:SARSA
gptkbp:rewardSignal Reinforcement signal
gptkbp:type Model-free algorithm
Off-policy algorithm
gptkbp:updateParameter Discount factor
Learning rate
gptkbp:updateRule gptkb:Bellman_equation
gptkbp:usedIn Resource management
Autonomous navigation
Atari game agents
gptkbp:uses Q-values
gptkbp:bfsParent gptkb:Temporal_Difference_Learning
gptkbp:bfsLayer 7
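
Illustrative sketch (not part of the GPTKB statements): the statements above name epsilon-greedy exploration, a learning rate, a discount factor, and a Bellman-style update over Q-values. A minimal tabular version might look like the following. The corridor environment, the function names (epsilon_greedy, q_update, step, train), and all hyperparameter values are assumptions made for this example only.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so the agent does not get stuck on all-zero rows.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def step(state, action, n_states):
    """Hypothetical corridor: action 1 moves right, action 0 moves left.

    Reaching the last state pays reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-learning on the corridor and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            state = next_state
    return q


q_table = train()
# Greedy action per non-terminal state, read off the learned Q-values.
greedy_policy = [max(range(2), key=lambda a: row[a]) for row in q_table[:-1]]
```

The update bootstraps from the maximum Q-value of the next state regardless of which action the epsilon-greedy behavior actually takes next, which is what makes the method off-policy. No transition model is consulted, only sampled transitions, which matches the model-free classification.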