Double DQN

GPTKB entity

Statements (67)
Predicate Object
gptkbp:instance_of gptkb:Artificial_Intelligence
gptkbp:bfsLayer 4
gptkbp:bfsParent gptkb:DQN
gptkbp:aims_to improve stability
gptkbp:applies_to gptkb:public_transportation_system
healthcare
financial modeling
energy management
Atari games
experience replay
gptkbp:based_on Q-learning algorithm
gptkbp:can_be_used_with prioritized experience replay
gptkbp:competes_with gptkb:DQN
gptkbp:developed_by gptkb:Google_Deep_Mind
gptkbp:form off-policy learning
https://www.w3.org/2000/01/rdf-schema#label Double DQN
gptkbp:improves Q-learning
gptkbp:introduced gptkb:2015
gptkbp:is_a_framework_for decision making
adaptive learning
gptkbp:is_a_solution_for control problems
gptkbp:is_a_tool_for automated decision making
gptkbp:is_cited_in academic papers
gptkbp:is_compared_to gptkb:A3_C
gptkb:DDPG
gptkbp:is_designed_to maximize cumulative reward
gptkbp:is_evaluated_by benchmark tasks
gptkbp:is_implemented_in gptkb:Graphics_Processing_Unit
gptkb:Py_Torch
gptkbp:is_influenced_by gptkb:DQN
Q-learning
gptkbp:is_known_for sample efficiency
gptkbp:is_part_of gptkb:software_framework
deep learning research
gptkbp:is_related_to policy gradient methods
gptkbp:is_used_in gptkb:robot
reinforcement learning
game AI
gptkbp:marketing_strategy policy improvement
balances exploration and exploitation
gptkbp:reduces overestimation bias
gptkbp:related_concept gptkb:Artificial_Intelligence
gptkbp:related_model temporal difference learning
learns from experience
adapts to changing environments
generalizes well
supports discrete action spaces
gptkbp:requires large amounts of data
gptkbp:subject research studies
gptkbp:technique dynamic programming
function approximation
enhances performance
uses neural networks
can be parallelized
can be scaled easily
facilitates learning from past experiences
improves convergence speed
improves learning efficiency
increases robustness
optimizes learning process
reduces variance in estimates
updates action values
value estimation
gptkbp:type_of deep reinforcement learning
gptkbp:uses two separate networks
gptkbp:utilizes target network
gptkbp:variant gptkb:DQN
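
Illustration (not part of the statement set): the statements "reduces overestimation bias", "uses two separate networks" and "utilizes target network" can be made concrete with a small sketch of how the learning target is formed. In the usual formulation, the online network selects the next action and the target network evaluates it, whereas standard DQN lets the target network do both. The function names, Q-value arrays, reward and discount factor below are hypothetical and chosen only for illustration; this is a minimal NumPy sketch, not a reference implementation.

```python
import numpy as np


def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network selects AND evaluates the next action."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(next_q_target))


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return float(reward)
    best_action = int(np.argmax(next_q_online))  # selection by the online network
    return float(reward + gamma * next_q_target[best_action])  # evaluation by the target network


# Hypothetical Q-values for a single next state with three discrete actions.
next_q_online = np.array([1.0, 2.0, 1.5])
next_q_target = np.array([1.2, 1.8, 2.5])

y_dqn = dqn_target(1.0, next_q_target, gamma=0.9)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.9)
print(f"DQN target: {y_dqn:.2f}, Double DQN target: {y_ddqn:.2f}")
```

Because the maximum over the target network's values is at least as large as the value of any single action, the Double DQN target computed this way is never larger than the DQN target for the same inputs. This is the mechanism behind the reduced overestimation.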