Statements (47)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:machine_learning |
gptkbp:benefits | Improved Decision Making, Learning from Interaction, Adaptability to Changing Environments, Automation of Complex Tasks, Handling Large State Spaces |
gptkbp:challenges | Sample Efficiency, High Dimensional State Spaces, Exploration vs Exploitation, Credit Assignment Problem, Stability and Convergence |
gptkbp:developed_by | gptkb:Google_Deep_Mind, gptkb:Facebook_AI_Research, gptkb:Open_AI, University Research Labs |
gptkbp:has_applications_in | gptkb:Natural_Language_Processing, gptkb:robotics, Game Playing |
gptkbp:has_limitations | Difficult to Interpret, Long Training Times, Requires Large Amounts of Data, Risk of Overfitting, Sensitive to Hyperparameters |
gptkbp:has_method | gptkb:Proximal_Policy_Optimization, gptkb:Trust_Region_Policy_Optimization, gptkb:Deep_Q-Networks, Actor-Critic Methods, Policy Gradients |
https://www.w3.org/2000/01/rdf-schema#label | Deep Reinforcement Learning |
gptkbp:is_evaluated_by | Learning Rate, Convergence Speed, Generalization Ability, Cumulative Reward, Stability of Policy |
gptkbp:is_related_to | gptkb:neural_networks, Markov Decision Processes, Monte Carlo Methods, Function Approximation, Temporal Difference Learning |
gptkbp:is_used_in | gptkb:Autonomous_Vehicles, Finance, Healthcare, Smart Grids, Energy Management |
gptkbp:bfsParent | gptkb:Jürgen_Schmidhuber, gptkb:Geoffrey_R._Hinton |
gptkbp:bfsLayer | 3 |