Double DQN

GPTKB entity

Statements (68)
Predicate Object
gptkbp:instance_of gptkb:Artificial_Intelligence
gptkbp:aims_to improve stability
gptkbp:applies_to experience replay
gptkbp:can_be_combined_with prioritized experience replay
gptkbp:competes_with gptkb:DQN
gptkbp:concept gptkb:Artificial_Intelligence
gptkbp:developed_by gptkb:Google_Deep_Mind
https://www.w3.org/2000/01/rdf-schema#label Double DQN
gptkbp:improves Q-learning
gptkbp:introduced_in gptkb:2015
gptkbp:is_a_form_of off-policy learning
gptkbp:is_a_framework_for decision making
adaptive learning
gptkbp:is_a_solution_for control problems
gptkbp:is_a_subject_of research studies
gptkbp:is_a_tool_for automated decision making
gptkbp:is_applicable_to gptkb:transportation
healthcare
financial modeling
energy management
gptkbp:is_applied_in Atari games
gptkbp:is_based_on Q-learning algorithm
gptkbp:is_cited_in academic papers
gptkbp:is_compared_to gptkb:A3_C
gptkb:DDPG
gptkbp:is_designed_to maximize cumulative reward
gptkbp:is_evaluated_by benchmark tasks
gptkbp:is_implemented_in gptkb:Tensor_Flow
gptkb:Py_Torch
gptkbp:is_influenced_by gptkb:DQN
Q-learning
gptkbp:is_known_for sample efficiency
gptkbp:is_part_of gptkb:machine_learning
deep learning research
gptkbp:is_related_to policy gradient methods
gptkbp:is_used_in gptkb:robotics
reinforcement learning
game AI
gptkbp:marketing_strategy policy improvement
balances exploration and exploitation
gptkbp:model temporal difference learning
learns from experience
adapts to changing environments
generalizes well
supports discrete action spaces
gptkbp:reduces overestimation bias
gptkbp:requires large amounts of data
gptkbp:technique dynamic programming
function approximation
enhances performance
uses neural networks
can be parallelized
can be scaled easily
facilitates learning from past experiences
improves convergence speed
improves learning efficiency
increases robustness
optimizes learning process
reduces variance in estimates
updates action values
value estimation
gptkbp:type_of deep reinforcement learning
gptkbp:uses two separate networks (an online network and a target network)
gptkbp:utilizes target network
gptkbp:variant gptkb:DQN
gptkbp:bfsParent gptkb:Dueling_DQN
gptkb:DQN
gptkbp:bfsLayer 6
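
The statements above say that Double DQN uses two separate networks, utilizes a target network, and reduces overestimation bias. A minimal sketch of how these fit together is given below, assuming NumPy arrays stand in for network outputs. The function names, the batch values, and the discount factor are hypothetical and chosen only for illustration; they are not taken from any particular library or from the original paper's code.

```python
import numpy as np

def dqn_target(reward, done, q_target_next, gamma):
    """Standard DQN target: the target network both selects and
    evaluates the next action (max over its own estimates)."""
    max_next = q_target_next.max(axis=1)
    return reward + gamma * (1.0 - done) * max_next

def double_dqn_target(reward, done, q_online_next, q_target_next, gamma):
    """Double DQN target: the online network selects the next action,
    the target network evaluates that selected action."""
    best_action = q_online_next.argmax(axis=1)
    rows = np.arange(len(best_action))
    evaluated = q_target_next[rows, best_action]
    return reward + gamma * (1.0 - done) * evaluated

# Hypothetical batch of 2 transitions with 3 discrete actions.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # second transition is terminal
q_online_next = np.array([[1.0, 2.0, 0.5],
                          [0.2, 0.1, 0.3]])
q_target_next = np.array([[1.5, 1.0, 3.0],
                          [0.4, 0.6, 0.1]])
gamma = 0.9  # assumed discount factor

y_dqn = dqn_target(reward, done, q_target_next, gamma)
y_double = double_dqn_target(reward, done, q_online_next, q_target_next, gamma)

print("DQN target:       ", y_dqn)
print("Double DQN target:", y_double)
```

For the same target-network values, the Double DQN target can never exceed the standard DQN target, because the evaluated action value is at most the maximum. This is one way to see why decoupling action selection from action evaluation tends to reduce overestimation.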