Statements (57)
Predicate | Object |
---|---|
gptkbp:instanceOf | algorithm |
gptkbp:employs | target networks |
gptkbp:hasContent | Q-learning |
gptkbp:hasDepartment | PPO |
gptkbp:hasVariants | gptkb:DDPG_with_Hindsight_Experience_Replay, DDPG_with_prioritized_experience_replay |
https://www.w3.org/2000/01/rdf-schema#label | DDPG |
gptkbp:isA | off-policy algorithm |
gptkbp:isAvenueFor | robotics, game playing, control tasks |
gptkbp:isBasedOn | actor-critic architecture |
gptkbp:isChallengedBy | instability during training, overestimation bias, sample inefficiency |
gptkbp:isConsidered | state-of-the-art in certain tasks |
gptkbp:isEvaluatedBy | mean squared error, average reward, continuous control benchmarks, robotic control tasks, Atari_games |
gptkbp:isInfluencedBy | Q-learning, DQN, SARSA |
gptkbp:isLocatedIn | gptkb:PyTorch, TensorFlow |
gptkbp:isNotableFor | continuous action spaces |
gptkbp:isPartOf | AI research, deep learning frameworks |
gptkbp:isRelatedTo | actor-critic methods, deep reinforcement learning, policy gradient methods |
gptkbp:isSimilarTo | TD3 |
gptkbp:isSupportedBy | research papers, tutorials, online courses |
gptkbp:isUsedFor | autonomous driving, resource management, financial trading, other algorithms, game AI development, policy learning, value function approximation |
gptkbp:isUsedIn | real-world applications, reinforcement learning, simulated environments |
gptkbp:isUtilizedIn | OpenAI Gym, Unity ML-Agents |
gptkbp:isVisitedBy | soft updates, ensemble methods, double Q-learning |
gptkbp:performance | continuous action spaces |
gptkbp:requires | hyperparameter tuning |
gptkbp:uses | deep neural networks |
gptkbp:utilizes | experience replay |
gptkbp:wasAffecting | Lillicrap_et_al. |
gptkbp:wasEstablishedIn | 2015 |
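
The statements above list target networks and soft updates without saying how they fit together. As a rough illustration, the sketch below shows the commonly used soft (Polyak-averaged) target update, in which each target parameter moves a small step toward its online counterpart: θ' ← τθ + (1 − τ)θ'. The function name `soft_update`, the flat-list representation of parameters, and the default τ = 0.005 are assumptions made for this example only; real implementations in PyTorch or TensorFlow operate on tensors and may differ in detail.

```python
from typing import List


def soft_update(target: List[float], online: List[float], tau: float = 0.005) -> List[float]:
    """Blend target parameters toward online parameters.

    Hypothetical helper: computes tau * online + (1 - tau) * target
    element by element and returns a new list.
    """
    if len(target) != len(online):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    return [tau * w + (1.0 - tau) * w_t for w_t, w in zip(target, online)]


if __name__ == "__main__":
    # Toy parameters; tau is exaggerated here so the drift is easy to see.
    target = [0.0, 0.0]
    online = [1.0, -2.0]
    for _ in range(3):
        target = soft_update(target, online, tau=0.5)
    print(target)  # [0.875, -1.75]
```

A small τ keeps the target network changing slowly, which is one of the usual ways to address the training instability listed under `gptkbp:isChallengedBy`.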