Twin Delayed Deep Deterministic Policy Gradient
GPTKB entity
Statements (25)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:abbreviation | gptkb:TD3 |
| gptkbp:address | overestimation bias in actor-critic methods |
| gptkbp:application | continuous control |
| gptkbp:basedOn | gptkb:Deep_Deterministic_Policy_Gradient |
| gptkbp:citation | high (hundreds to thousands) |
| gptkbp:notableFor | improved stability and performance over DDPG |
| gptkbp:openSource | gptkb:GitHub |
| gptkbp:proposedBy | gptkb:Scott_Fujimoto, gptkb:Herke_van_Hoof, gptkb:David_Meger |
| gptkbp:publicationYear | 2018 |
| gptkbp:publishedIn | gptkb:International_Conference_on_Machine_Learning_(ICML)_2018 |
| gptkbp:relatedTo | gptkb:Deep_Q-Network, gptkb:Soft_Actor-Critic, gptkb:Proximal_Policy_Optimization |
| gptkbp:usedIn | autonomous vehicles, robotics, simulated environments |
| gptkbp:uses | delayed policy updates, target policy smoothing, twin Q-networks (see the sketch after the table) |
| gptkbp:bfsParent | gptkb:TD3 |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Twin Delayed Deep Deterministic Policy Gradient |
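
The three mechanisms listed under gptkbp:uses can be illustrated with a minimal PyTorch sketch of a single TD3 training step. The network sizes, hyperparameter values, and the `td3_update` function below are illustrative assumptions, not the exact configuration from the Fujimoto et al. paper or any particular implementation.

```python
# Minimal TD3 update sketch (assumed dimensions and hyperparameters, not the paper's exact settings).
import copy
import torch
import torch.nn as nn

state_dim, action_dim, max_action = 8, 2, 1.0  # hypothetical task dimensions

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = nn.Sequential(mlp(state_dim, action_dim), nn.Tanh())
critic1 = mlp(state_dim + action_dim, 1)   # twin Q-networks
critic2 = mlp(state_dim + action_dim, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))

actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(
    list(critic1.parameters()) + list(critic2.parameters()), lr=3e-4)

gamma, tau, policy_noise, noise_clip, policy_delay = 0.99, 0.005, 0.2, 0.5, 2

def td3_update(step, state, action, reward, next_state, done):
    # Target policy smoothing: add clipped noise to the target action.
    with torch.no_grad():
        noise = (torch.randn_like(action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (actor_t(next_state) + noise).clamp(-max_action, max_action)
        # Clipped double-Q: take the minimum of the twin target critics.
        sa_next = torch.cat([next_state, next_action], dim=1)
        q_next = torch.min(critic1_t(sa_next), critic2_t(sa_next))
        target_q = reward + gamma * (1.0 - done) * q_next

    sa = torch.cat([state, action], dim=1)
    critic_loss = ((critic1(sa) - target_q) ** 2).mean() + ((critic2(sa) - target_q) ** 2).mean()
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Delayed policy updates: update the actor and target networks less often.
    if step % policy_delay == 0:
        actor_loss = -critic1(torch.cat([state, actor(state)], dim=1)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()
        # Soft (Polyak) update of all target networks.
        for net, net_t in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.data.mul_(1 - tau).add_(tau * p.data)

# Example call on a random mini-batch (in practice, sample from a replay buffer).
batch = 32
td3_update(step=0,
           state=torch.randn(batch, state_dim),
           action=torch.rand(batch, action_dim) * 2 - 1,
           reward=torch.randn(batch, 1),
           next_state=torch.randn(batch, state_dim),
           done=torch.zeros(batch, 1))
```

Taking the minimum over the twin critics counters the overestimation bias noted under gptkbp:address, target policy smoothing regularizes the value target, and the delayed actor and target updates reduce error accumulation relative to DDPG.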