Q-learning

URI: https://gptkb.org/entity/Q-learning

GPTKB entity

Predicate	Object
gptkbp:instanceOf	gptkb:Reinforcement_learning_algorithm
gptkbp:application	gptkb:Autonomous_vehicles, gptkb:robot, Game playing, Resource management
gptkbp:category	Off-policy learning
gptkbp:citation	gptkb:Watkins,_C.J.C.H._(1989)._Learning_from_Delayed_Rewards._PhD_thesis,_University_of_Cambridge.
gptkbp:compatibleWith	Model of environment
gptkbp:convergesTo	Optimal policy
gptkbp:explorationStrategy	Epsilon-greedy, Softmax
gptkbp:field	gptkb:artificial_intelligence, Machine learning, Reinforcement learning
gptkbp:influenced	Deep reinforcement learning
gptkbp:input	gptkb:action, gptkb:state_order
gptkbp:introduced	gptkb:Christopher_Watkins
gptkbp:introducedIn	1989
gptkbp:output	Q-value
gptkbp:relatedTo	gptkb:Deep_Q-Network, gptkb:SARSA
gptkbp:rewardSignal	Reinforcement signal
gptkbp:solvedBy	gptkb:Markov_chain
gptkbp:type	Model-free algorithm
gptkbp:updateRule	gptkb:Bellman_equation
gptkbp:uses	Q-value
gptkbp:bfsParent	gptkb:machine_learning
gptkbp:bfsLayer	4
http://www.w3.org/2000/01/rdf-schema#label	Q-learning
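
The table lists a Bellman-equation update rule, Q-values, and epsilon-greedy exploration without showing how they fit together. The sketch below is one minimal way to combine them in a tabular setting; it is illustrative, not a reference implementation. The five-state chain environment, the helper names (`epsilon_greedy`, `step`, `q_learning`), and the hyperparameter values (`alpha`, `gamma`, `epsilon`, `episodes`, `seed`) are all assumptions made for this example and do not come from the entity record.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so the untrained agent does not get stuck.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def step(state, action, n_states):
    """Hypothetical chain world: action 1 moves right, action 0 moves left.

    Reaching the last state yields reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns the learned table of Q-values."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action for each non-terminal state of the chain.
greedy_policy = [max(range(2), key=lambda a: row[a]) for row in q_table[:-1]]
```

Taking the maximum over next-state Q-values in the target is what places the method in the off-policy category: the agent explores with epsilon-greedy but learns about the greedy policy. In this toy chain the greedy policy read from `q_table` should prefer action 1 (move right) in every non-terminal state, though convergence in general depends on sufficient exploration and a suitable learning rate.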