gptkbp:instanceOf
|
Reinforcement learning algorithm
|
gptkbp:application
|
gptkb:robot
Game playing
Autonomous control
|
gptkbp:category
|
Temporal difference learning
|
gptkbp:compatibleWith
|
Model of environment (optional; not required, since Q-Learning is model-free)
|
gptkbp:convergesTo
|
Optimal policy (provided every state-action pair is visited infinitely often and the learning rate decays appropriately)
|
gptkbp:explorationStrategy
|
Epsilon-greedy
|
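A minimal sketch of the epsilon-greedy exploration strategy listed above, assuming a tabular Q-function stored as a NumPy array indexed by state and action; the array shape, state encoding, and epsilon value are illustrative assumptions, not part of this entry.

```python
import numpy as np

def epsilon_greedy(q_table: np.ndarray, state: int, epsilon: float = 0.1) -> int:
    """With probability epsilon pick a random action (explore),
    otherwise pick the action with the highest Q-value (exploit)."""
    n_actions = q_table.shape[1]
    if np.random.rand() < epsilon:
        return int(np.random.randint(n_actions))
    return int(np.argmax(q_table[state]))

# Illustrative use: 5 states, 3 actions, Q-values initialised to zero.
q = np.zeros((5, 3))
chosen = epsilon_greedy(q, state=2, epsilon=0.1)
```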
gptkbp:field
|
gptkb:artificial_intelligence
Machine learning
|
gptkbp:form
|
Markov decision process
|
gptkbp:goal
|
Learn optimal action-selection policy
|
https://www.w3.org/2000/01/rdf-schema#label
|
Q-Learning
|
gptkbp:influenced
|
Deep Q-Learning
|
gptkbp:input
|
gptkb:action
gptkb:state_order
|
gptkbp:introduced
|
gptkb:Christopher_Watkins
|
gptkbp:introducedIn
|
1989
|
gptkbp:output
|
Q-value
|
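A rough illustration of the input/output relationship recorded above (state and action in, Q-value out), assuming the Q-function is kept as a lookup table; the dictionary layout and example values are assumptions for illustration only.

```python
from collections import defaultdict

# Q-table: maps a (state, action) pair to its current Q-value estimate.
# Pairs that have never been updated default to 0.0.
Q = defaultdict(float)
Q[("s0", "left")] = 0.25    # illustrative values
Q[("s0", "right")] = 0.75

def q_value(state, action) -> float:
    """Input: a state and an action. Output: the stored Q-value."""
    return Q[(state, action)]

print(q_value("s0", "right"))  # -> 0.75
```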
gptkbp:relatedTo
|
gptkb:Deep_Q-Network
gptkb:SARSA
|
gptkbp:rewardSignal
|
Reinforcement signal
|
gptkbp:type
|
Model-free algorithm
Off-policy algorithm
|
gptkbp:updateParameter
|
Discount factor
Learning rate
|
gptkbp:updateRule
|
gptkb:Bellman_equation
|
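A hedged sketch of the tabular update rule derived from the Bellman equation, using the learning rate and discount factor listed under updateParameter: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The variable names, action set, and parameter values below are assumptions for illustration.

```python
from collections import defaultdict

Q = defaultdict(float)        # (state, action) -> Q-value estimate, default 0.0
ACTIONS = ["left", "right"]   # assumed action set
alpha, gamma = 0.1, 0.99      # learning rate and discount factor (example values)

def q_update(state, action, reward, next_state, done: bool) -> None:
    """Apply one Q-learning update for the observed transition."""
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])

# Illustrative transition: taking "right" in "s0" yields reward 1.0 and ends the episode.
q_update("s0", "right", reward=1.0, next_state="s1", done=True)
```

Because the bootstrapped target uses the maximum over next actions rather than the action actually taken next, this update is the off-policy behaviour noted under gptkbp:type.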
gptkbp:usedIn
|
Resource management
Autonomous navigation
Atari game agents
|
gptkbp:uses
|
Q-values
|
gptkbp:bfsParent
|
gptkb:Temporal_Difference_Learning
|
gptkbp:bfsLayer
|
7
|