gptkbp:instance_of
|
gptkb:software_framework
|
gptkbp:bfsLayer
|
4
|
gptkbp:bfsParent
|
gptkb:Lunar_Lander-v2
gptkb:DQN
|
gptkbp:applies_to
|
advantage function
|
gptkbp:based_on
|
Bellman equation
|
gptkbp:competes_with
|
gptkb:Double_DQN
|
gptkbp:developed_by
|
gptkb:Google_Deep_Mind
|
gptkbp:enhances
|
value function estimation
|
gptkbp:has_achievements
|
better performance
|
https://www.w3.org/2000/01/rdf-schema#label
|
Dueling DQN
|
gptkbp:improves
|
Q-learning
|
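The `improves | Q-learning` and `based_on | Bellman equation` entries above refer to the tabular Q-learning update that all DQN variants build on. A minimal sketch (state/action indices, learning rate, and discount factor are illustrative assumptions, not from the source):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step derived from the Bellman equation:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Toy table: 2 states x 2 actions, all zeros initially.
Q = np.zeros((2, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
```

DQN replaces the table `Q` with a neural network trained on the same temporal-difference target.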
gptkbp:introduced
|
gptkb:2016
|
gptkbp:is_adopted_by
|
universities
government agencies
research institutions
startups
tech companies
|
gptkbp:is_cited_in
|
many research papers
|
gptkbp:is_compared_to
|
policy gradient methods
SARSA
traditional Q-learning
|
gptkbp:is_divided_into
|
value and advantage streams
|
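The two-stream split recorded above is Dueling DQN's defining feature: the network outputs a scalar state value V(s) and per-action advantages A(s, a), which are recombined into Q-values with the advantage mean subtracted for identifiability. A minimal NumPy sketch of that aggregation (array shapes are illustrative assumptions):

```python
import numpy as np

def dueling_aggregate(value, advantages):
    """Combine the value stream V(s) and advantage stream A(s, a) into
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage keeps the decomposition identifiable."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

# Toy example: one state, four actions.
v = np.array([1.0])                   # scalar state value V(s)
a = np.array([0.5, -0.5, 1.5, -1.5])  # per-action advantages A(s, a)
q = dueling_aggregate(v, a)
best = int(np.argmax(q))  # greedy action is unchanged by the mean subtraction
```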
gptkbp:is_evaluated_by
|
gptkb:A3_C
TRPO
PPO
benchmark tasks
|
gptkbp:is_explored_in
|
gptkb:academic_research
game development
industry applications
robotics research
simulation environments
|
gptkbp:is_implemented_in
|
gptkb:Graphics_Processing_Unit
gptkb:Py_Torch
|
gptkbp:is_influenced_by
|
gptkb:DQN
|
gptkbp:is_known_for
|
convergence speed
sample efficiency
stability in training
|
gptkbp:is_optimized_for
|
gradient descent
|
gptkbp:is_part_of
|
gptkb:DQN_family
AI algorithms
|
gptkbp:is_popular_in
|
AI research community
|
gptkbp:is_related_to
|
reinforcement learning
|
gptkbp:is_supported_by
|
gptkb:document
tutorials
community contributions
open-source projects
|
gptkbp:is_tested_for
|
various environments
|
gptkbp:is_used_for
|
gptkb:robot
decision making
game playing
|
gptkbp:is_used_in
|
gptkb:battle
healthcare
finance
natural language processing
self-driving cars
Atari games
|
gptkbp:reduces
|
overestimation bias
|
gptkbp:requires
|
experience replay
|
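The `requires | experience replay` entry above refers to the fixed-size buffer of past transitions that DQN-family agents sample minibatches from. A minimal sketch (class name, capacity, and transition fields are illustrative assumptions):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience replay buffer: stores (s, a, r, s', done)
    transitions and samples uniform random minibatches for training."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest entries evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(5):
    buf.push(t, 0, 1.0, t + 1, False)
batch = buf.sample(3)
```

Uniform sampling breaks the temporal correlation between consecutive transitions, which stabilizes gradient-based training.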
gptkbp:uses
|
gptkb:microprocessor
|
gptkbp:utilizes
|
target networks
|
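The `utilizes | target networks` entry above refers to a slowly updated copy of the online network used to compute stable temporal-difference targets. One common update rule is the Polyak (soft) average, sketched here over plain parameter lists (function name, `tau`, and the toy values are illustrative assumptions):

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak-average online parameters into the target network:
    theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]

# Toy example with tau = 0.5 to make the averaging visible.
target = [0.0, 0.0]
online = [1.0, 2.0]
target = soft_update(target, online, tau=0.5)
```

An alternative is a hard update, copying the online parameters into the target network every fixed number of steps.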