gptkbp:instance_of
|
gptkb:Reinforcement_Learning_Algorithm
|
gptkbp:bfsLayer
|
3
|
gptkbp:bfsParent
|
gptkb:philosopher
|
gptkbp:allows
|
Overfitting
Computational Cost
Sample Inefficiency
|
gptkbp:applies_to
|
gptkb:Atari_Games
gptkb:software_framework
|
gptkbp:based_on
|
Q-Learning
|
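The Q-Learning basis of DQN can be made concrete with the standard tabular update rule, which DQN approximates with a neural network. The sketch below is illustrative only and not part of the knowledge-base entry; the learning rate alpha and discount factor gamma are assumed example values.

    # Minimal tabular Q-Learning update (the rule DQN approximates with a deep network).
    # alpha (learning rate) and gamma (discount factor) are assumed example values.
    from collections import defaultdict

    alpha, gamma = 0.1, 0.99
    Q = defaultdict(float)  # maps (state, action) -> estimated return

    def q_learning_update(state, action, reward, next_state, actions):
        # TD target: immediate reward plus discounted value of the best next action
        best_next = max(Q[(next_state, a)] for a in actions)
        td_target = reward + gamma * best_next
        # Move the current estimate a small step toward the target
        Q[(state, action)] += alpha * (td_target - Q[(state, action)])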
gptkbp:can_be_used_with
|
Value Function Approximation
|
gptkbp:developed_by
|
gptkb:Deep_Mind
|
gptkbp:enhances
|
Sample Efficiency
|
gptkbp:has_achievements
|
Superhuman Performance
|
gptkbp:has_programs
|
Finance
Healthcare
Game Playing
|
https://www.w3.org/2000/01/rdf-schema#label
|
Deep Q-Networks
|
gptkbp:improves
|
Exploration Strategies
Action Selection
|
gptkbp:inspired_by
|
Human Learning
|
gptkbp:introduced
|
gptkb:2013
|
gptkbp:is_challenged_by
|
Adversarial Attacks
Non-Stationary Environments
Partial Observability
|
gptkbp:is_evaluated_by
|
Mean Squared Error
Performance Metrics
DQN Algorithm
Cumulative Reward
|
gptkbp:is_explored_in
|
gptkb:Workshops
Conferences
Academic Papers
|
gptkbp:is_implemented_in
|
gptkb:Graphics_Processing_Unit
gptkb:Py_Torch
|
gptkbp:is_influenced_by
|
Cognitive Science
Behavioral Psychology
Human Cognition
|
gptkbp:is_part_of
|
gptkb:Artificial_Intelligence
|
gptkbp:is_related_to
|
gptkb:Artificial_Neural_Networks
gptkb:Deep_Learning
Markov Decision Processes
Policy Gradient Methods
Temporal Difference Learning
|
gptkbp:is_supported_by
|
gptkb:Deep_Mind
gptkb:Google_AI
gptkb:Open_AI
|
gptkbp:is_used_for
|
Control Systems
Decision Making
Game AI
|
gptkbp:is_used_in
|
gptkb:film_production_company
gptkb:Educational_Institution
gptkb:robot
gptkb:Research_Institute
|
gptkbp:requires
|
Large Datasets
|
gptkbp:training
|
Stochastic Gradient Descent
Loss Functions
Reward Signals
|
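A hedged sketch of how these training components typically fit together: sampled transitions supply the reward signals, a mean-squared temporal-difference error serves as the loss function, and stochastic gradient descent minimizes it. This is a generic illustration assuming the PyTorch API; the layer sizes, learning rate, and gamma are placeholder values, not details from this entry.

    # Illustrative DQN training step: MSE loss on the temporal-difference error,
    # minimized with stochastic gradient descent. Layer sizes and gamma are assumed.
    import torch
    import torch.nn as nn

    gamma = 0.99
    q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
    target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-3)

    def train_step(states, actions, rewards, next_states, dones):
        # Q(s, a) for the actions actually taken
        q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # TD target computed from the periodically updated target network
            next_max = target_net(next_states).max(dim=1).values
            targets = rewards + gamma * next_max * (1.0 - dones)
        loss = nn.functional.mse_loss(q_values, targets)  # loss function on reward signal
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                   # stochastic gradient descent step
        return loss.item()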
gptkbp:uses
|
Deep Neural Networks
|
gptkbp:utilizes
|
Experience Replay
|
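Experience Replay can be illustrated with a minimal replay buffer that stores transitions and samples random minibatches, breaking the correlation between consecutive experiences. The sketch below is a generic example; the buffer capacity and batch size are assumed values, not taken from this entry.

    # Minimal experience replay buffer: store transitions, sample random minibatches.
    # Capacity and batch size are assumed example values.
    import random
    from collections import deque

    class ReplayBuffer:
        def __init__(self, capacity=100_000):
            self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first

        def push(self, state, action, reward, next_state, done):
            self.buffer.append((state, action, reward, next_state, done))

        def sample(self, batch_size=32):
            batch = random.sample(self.buffer, batch_size)
            states, actions, rewards, next_states, dones = zip(*batch)
            return states, actions, rewards, next_states, dones

        def __len__(self):
            return len(self.buffer)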