Soft Actor-Critic

GPTKB entity

Statements (31)
Predicate Object
gptkbp:instanceOf reinforcement learning algorithm
gptkbp:abbreviation gptkb:SAC
gptkbp:advantage exploration
stability
sample efficiency
gptkbp:appliesTo continuous action spaces
gptkbp:basedOn maximum entropy reinforcement learning
gptkbp:citation gptkb:Soft_Actor-Critic:_Off-Policy_Maximum_Entropy_Deep_Reinforcement_Learning_with_a_Stochastic_Actor
https://www.w3.org/2000/01/rdf-schema#label Soft Actor-Critic
gptkbp:introduced gptkb:Pieter_Abbeel
gptkb:Sergey_Levine
Aurick Zhou
Tuomas Haarnoja
gptkbp:introducedIn 2018
gptkbp:optimizedFor expected reward
policy entropy
gptkbp:publishedIn arXiv:1801.01290
gptkbp:relatedTo gptkb:Deep_Deterministic_Policy_Gradient
gptkb:Twin_Delayed_DDPG
gptkb:Proximal_Policy_Optimization
gptkbp:type off-policy
model-free
gptkbp:uses value function
Q-function
replay buffer
actor-critic architecture
target networks
stochastic policy
gptkbp:bfsParent gptkb:DDPG
gptkb:Denis_Yarats
gptkbp:bfsLayer 7
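
The statements above identify SAC as an off-policy, model-free actor-critic method based on maximum entropy reinforcement learning, using Q-functions and target networks. A minimal sketch of the entropy-regularized (soft) Bellman target at the core of that objective is below; the function name and the use of twin Q-values (as in the related Twin Delayed DDPG) are illustrative assumptions, not taken from a specific library:

```python
import numpy as np

def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Illustrative SAC critic target:
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).

    The min over two Q-estimates reduces overestimation bias;
    the -alpha * log_prob term is the entropy bonus that makes the
    objective optimize expected reward plus policy entropy.
    """
    soft_value = np.minimum(q1_next, q2_next) - alpha * log_prob_next
    return reward + gamma * (1.0 - done) * soft_value

# Toy single-transition example with made-up numbers:
y = soft_q_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=2.5,
                  log_prob_next=-1.0)
print(round(float(y), 3))  # 1.0 + 0.99 * (2.0 + 0.2) = 3.178
```

In practice the next-action `a'` and its `log_prob_next` are sampled from the stochastic policy, and `q1_next`/`q2_next` come from the target networks listed among the algorithm's components.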