gptkbp:instance_of
|
gptkb:machine_learning
|
gptkbp:applies_to
|
gptkb:machine_learning
|
gptkbp:based_on
|
Q-Learning
|
gptkbp:can_be_combined_with
|
Value Function Approximation
|
gptkbp:can_be_used_in
|
gptkb:robotics
|
gptkbp:developed_by
|
gptkb:Deep_Mind
|
gptkbp:enhances
|
Sample Efficiency
|
gptkbp:has_achieved
|
Superhuman Performance
|
gptkbp:has_applications_in
|
Finance
Healthcare
Game Playing
|
gptkbp:has_limitations
|
Overfitting
Computational Cost
Sample Inefficiency
|
https://www.w3.org/2000/01/rdf-schema#label
|
Deep Q-Networks
|
gptkbp:improves
|
Exploration Strategies
Action Selection
|
gptkbp:inspired_by
|
Human Learning
|
gptkbp:introduced_in
|
gptkb:2013
|
gptkbp:is_applied_in
|
gptkb:Atari_Games
|
gptkbp:is_challenged_by
|
Adversarial Attacks
Non-Stationary Environments
Partial Observability
|
gptkbp:is_evaluated_by
|
Mean Squared Error
Performance Metrics
Cumulative Reward
|
gptkbp:is_explored_in
|
gptkb:Workshops
Conferences
Academic Papers
|
gptkbp:is_implemented_in
|
gptkb:Tensor_Flow
gptkb:Py_Torch
|
gptkbp:is_influenced_by
|
Cognitive Science
Behavioral Psychology
Human Cognition
|
gptkbp:is_part_of
|
gptkb:Artificial_Intelligence
|
gptkbp:is_related_to
|
gptkb:Artificial_Neural_Networks
gptkb:Deep_Learning
Markov Decision Processes
Policy Gradient Methods
Temporal Difference Learning
|
gptkbp:is_supported_by
|
gptkb:Google_AI
gptkb:Open_AI
gptkb:NVIDIA
|
gptkbp:is_trained_in
|
Stochastic Gradient Descent
|
gptkbp:is_used_for
|
Control Systems
Decision Making
Game AI
|
gptkbp:is_used_in
|
gptkb:Industry
gptkb:research
gptkb:Education
|
gptkbp:requires
|
Large Datasets
|
gptkbp:training
|
Loss Functions
Reward Signals
|
gptkbp:uses
|
gptkb:neural_networks
|
gptkbp:utilizes
|
Experience Replay
|
gptkbp:bfsParent
|
gptkb:Deep_Reinforcement_Learning
|
gptkbp:bfsLayer
|
4
|
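The entries above list Deep Q-Networks as based on Q-Learning, utilizing Experience Replay, and trained with loss functions on reward signals. The following is a minimal sketch of how those three pieces could fit together, not the implementation of any particular paper or library. The names `ReplayBuffer`, `td_target`, `mse_loss`, and the discount value `GAMMA = 0.99` are illustrative assumptions, and plain Python lists stand in for the Q-values a neural network would produce.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor; an assumed value for illustration


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Q-learning target: r + gamma * max_a' Q(s', a'), or r alone at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def mse_loss(predictions, targets):
    """Mean squared error between predicted Q-values and their targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```

In a full agent, `next_q_values` would come from a neural network evaluated on `next_state`, and the loss would be minimized with stochastic gradient descent in a framework such as TensorFlow or PyTorch; that part is omitted here.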