Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:abbreviation | gptkb:SAC |
| gptkbp:advantage | exploration, stability, sample efficiency |
| gptkbp:appliesTo | continuous action spaces |
| gptkbp:basedOn | maximum entropy reinforcement learning |
| gptkbp:citation | gptkb:Soft_Actor-Critic:_Off-Policy_Maximum_Entropy_Deep_Reinforcement_Learning_with_a_Stochastic_Actor |
| gptkbp:introduced | gptkb:Pieter_Abbeel, gptkb:Sergey_Levine, Aurick Zhou, Tuomas Haarnoja |
| gptkbp:introducedIn | 2018 |
| gptkbp:optimizedFor | expected reward, policy entropy |
| gptkbp:publishedIn | arXiv:1801.01290 |
| gptkbp:relatedTo | gptkb:Deep_Deterministic_Policy_Gradient, gptkb:Twin_Delayed_DDPG, gptkb:Proximal_Policy_Optimization |
| gptkbp:type | off-policy, model-free |
| gptkbp:uses | value function, Q-function, replay buffer, actor-critic architecture, target networks, stochastic policy |
| gptkbp:bfsParent | gptkb:DDPG, gptkb:Denis_Yarats |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Soft Actor-Critic |
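
The gptkbp:basedOn and gptkbp:optimizedFor statements refer to the maximum entropy reinforcement learning objective, which augments the expected return with a policy entropy term weighted by a temperature coefficient. A standard form of that objective (notation is assumed here, not taken from this record) is:

$$
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[\, r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\right]
$$

where \(\alpha\) trades off the reward term against the entropy term, which is the source of the exploration and stability advantages listed above.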
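The gptkbp:uses row lists the algorithm's moving parts (stochastic actor, Q-functions, replay buffer, actor-critic architecture, target networks). The following is a minimal PyTorch sketch of how those parts fit together in one update step; it is not a reference implementation, and the network sizes, hyperparameters (`alpha`, `gamma`, `tau`), and helper names are assumptions for illustration only.

```python
# Sketch of SAC components: stochastic actor, twin Q-functions,
# target networks, replay buffer, soft (Polyak) target updates.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplayBuffer:
    """Replay buffer: stores off-policy transitions for minibatch sampling."""
    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)
    def add(self, transition):
        self.storage.append(transition)
    def sample(self, batch_size):
        batch = random.sample(self.storage, batch_size)
        return [torch.stack(x) for x in zip(*batch)]

class QNetwork(nn.Module):
    """Q-function: maps a (state, action) pair to a scalar value."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )
    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

class GaussianPolicy(nn.Module):
    """Stochastic actor: tanh-squashed Gaussian over continuous actions."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)
    def forward(self, obs):
        h = self.body(obs)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-20, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        u = dist.rsample()                      # reparameterized sample
        a = torch.tanh(u)                       # squash into (-1, 1)
        # change-of-variables correction for the tanh squashing
        log_prob = dist.log_prob(u) - torch.log(1 - a.pow(2) + 1e-6)
        return a, log_prob.sum(-1, keepdim=True)

def sac_update(batch, actor, q1, q2, q1_targ, q2_targ, actor_opt, q_opt,
               alpha=0.2, gamma=0.99, tau=0.005):
    obs, act, rew, next_obs, done = batch
    # Critic target: soft Bellman backup using target networks and entropy bonus.
    with torch.no_grad():
        next_a, next_logp = actor(next_obs)
        target_q = torch.min(q1_targ(next_obs, next_a), q2_targ(next_obs, next_a))
        backup = rew + gamma * (1 - done) * (target_q - alpha * next_logp)
    q_loss = F.mse_loss(q1(obs, act), backup) + F.mse_loss(q2(obs, act), backup)
    q_opt.zero_grad(); q_loss.backward(); q_opt.step()
    # Actor: maximize the entropy-regularized Q-value (minimize its negative).
    new_a, logp = actor(obs)
    actor_loss = (alpha * logp - torch.min(q1(obs, new_a), q2(obs, new_a))).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    # Polyak (soft) update of the target networks.
    for targ, src in ((q1_targ, q1), (q2_targ, q2)):
        for p_t, p in zip(targ.parameters(), src.parameters()):
            p_t.data.mul_(1 - tau).add_(tau * p.data)
```

Taking the minimum of two Q-networks to curb value overestimation is a design choice SAC shares with gptkb:Twin_Delayed_DDPG from the gptkbp:relatedTo row; the replay buffer and target networks reflect its off-policy, gptkb:Deep_Deterministic_Policy_Gradient-style lineage noted under gptkbp:bfsParent.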