Twin Delayed Deep Deterministic Policy Gradient

GPTKB entity

Predicate	Object
gptkbp:instanceOf	gptkb:reinforcement_learning_algorithm
gptkbp:abbreviation	gptkb:TD3
gptkbp:address	overestimation bias in actor-critic methods
gptkbp:application	continuous control
gptkbp:basedOn	gptkb:Deep_Deterministic_Policy_Gradient
gptkbp:citation	high (hundreds to thousands)
gptkbp:notableFor	improved stability and performance over DDPG
gptkbp:openSource	gptkb:GitHub
gptkbp:proposedBy	gptkb:David_Meger gptkb:Herke_van_Hoof gptkb:Scott_Fujimoto
gptkbp:publicationYear	2018
gptkbp:publishedIn	gptkb:International_Conference_on_Machine_Learning_(ICML)_2018
gptkbp:relatedTo	gptkb:Deep_Q-Network gptkb:Soft_Actor-Critic gptkb:Proximal_Policy_Optimization
gptkbp:usedIn	autonomous vehicles, robotics, simulated environments
gptkbp:uses	delayed policy updates, target policy smoothing, twin Q-networks
gptkbp:bfsParent	gptkb:TD3
gptkbp:bfsLayer	7
https://www.w3.org/2000/01/rdf-schema#label	Twin Delayed Deep Deterministic Policy Gradient
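
The three techniques listed under gptkbp:uses can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`td3_target`, `should_update_actor`) and the critic callables are hypothetical, the critics are assumed to take only the next action (with the next state already bound), and the default hyperparameters are the commonly cited ones, given here as assumptions.

```python
import numpy as np


def td3_target(reward, done, next_action, q1_target, q2_target, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute a TD3-style critic target for a batch of transitions.

    q1_target and q2_target are hypothetical callables that map a batch of
    actions with shape (batch, action_dim) to Q-values with shape (batch,).
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    smoothed = np.clip(next_action + noise, -max_action, max_action)

    # Twin Q-networks: use the element-wise minimum of the two target critics
    # to counteract overestimation bias.
    q_min = np.minimum(q1_target(smoothed), q2_target(smoothed))

    # Standard one-step bootstrap; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: update the actor once every policy_delay critic steps."""
    return step % policy_delay == 0
```

In a training loop under these assumptions, both critics would be regressed toward `td3_target(...)` on every step, while the actor and the target networks would be updated only on steps where `should_update_actor(step)` is true.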