Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:abbreviation | gptkb:TD3 |
gptkbp:address | overestimation bias in actor-critic methods |
gptkbp:application | robotics; continuous control; simulated environments |
gptkbp:basedOn | gptkb:Deep_Deterministic_Policy_Gradient |
gptkbp:citation | Addressing Function Approximation Error in Actor-Critic Methods |
gptkbp:citesPaper | Lillicrap et al., 2015 |
gptkbp:developedBy | gptkb:David_Meger; gptkb:Herke_van_Hoof; gptkb:Scott_Fujimoto |
https://www.w3.org/2000/01/rdf-schema#label | Twin Delayed DDPG |
gptkbp:improves | gptkb:DDPG; other actor-critic algorithms |
gptkbp:openSource | gptkb:GitHub |
gptkbp:publicationYear | 2018 |
gptkbp:publishedIn | gptkb:arXiv |
gptkbp:technique | delayed policy updates; target policy smoothing; clipped double Q-learning |
gptkbp:bfsParent | gptkb:DDPG |
gptkbp:bfsLayer | 7 |
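
The three entries under gptkbp:technique can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`td3_target`, `should_update_actor`), the callables passed in, and the default values (`gamma=0.99`, `noise_std=0.2`, `noise_clip=0.5`, `policy_delay=2`) are assumptions chosen for illustration and do not come from this page.

```python
import numpy as np

def td3_target(reward, done, next_state, target_actor, target_q1, target_q2, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute a TD3-style critic target (illustrative sketch, assumed names)."""
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    next_action = target_actor(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)
    # Clipped double Q-learning: use the smaller of the two target critic estimates.
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))
    # Bootstrapped target; terminal transitions (done == 1) drop the future term.
    return reward + gamma * (1.0 - done) * q_min

def should_update_actor(step, policy_delay=2):
    # Delayed policy updates: refresh the actor once every `policy_delay` critic updates.
    return step % policy_delay == 0

# Toy usage with hypothetical stand-in networks (plain functions, not trained models).
rng = np.random.default_rng(0)
states = np.zeros((4, 3))
toy_actor = lambda s: np.zeros((s.shape[0], 1))
toy_q1 = lambda s, a: np.full(s.shape[0], 1.0)
toy_q2 = lambda s, a: np.full(s.shape[0], 2.0)
targets = td3_target(np.ones(4), np.zeros(4), states, toy_actor, toy_q1, toy_q2, rng)
```

In the toy usage the two critics disagree (1.0 versus 2.0), and the target is built from the lower estimate, which is how the minimum counters the overestimation bias listed under gptkbp:address.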