Statements (26)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:appliesTo | deep reinforcement learning |
| gptkbp:fullName | Actor Critic using Kronecker-Factored Trust Region |
| gptkbp:improves | sample efficiency, training stability |
| gptkbp:introduced | gptkb:Ilya_Sutskever, gptkb:Sergey_Levine, gptkb:Shixiang_Gu, gptkb:Timothy_Lillicrap, Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic |
| gptkbp:openSource | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines, rl-baselines3-zoo |
| gptkbp:optimizedFor | policy parameters, value function parameters |
| gptkbp:publicationYear | 2017 |
| gptkbp:relatedTo | gptkb:A2C, gptkb:A3C, gptkb:TRPO, K-FAC |
| gptkbp:type | actor-critic method, policy optimization algorithm |
| gptkbp:uses | Kronecker-factored approximation |
| gptkbp:bfsParent | gptkb:OpenAI_Baselines |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | ACKTR |
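The `gptkbp:uses` statement names the Kronecker-factored approximation that gives ACKTR its efficiency: each layer's Fisher information block is approximated as a Kronecker product F ≈ A ⊗ G, so the natural-gradient solve F⁻¹ vec(∇W) reduces to two small matrix solves via the identity (A ⊗ G)⁻¹ vec(∇W) = vec(G⁻¹ ∇W A⁻¹) for symmetric A. The NumPy sketch below is illustrative only (the matrix sizes and random factors are made up, not taken from ACKTR itself) and checks the factored solve against the naive full-Kronecker solve:

```python
import numpy as np

rng = np.random.default_rng(0)

def spd(n):
    """Random symmetric positive-definite matrix (stand-in for a covariance)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

# Hypothetical K-FAC factors for one layer:
# A ~ covariance of layer inputs, G ~ covariance of output gradients.
A, G = spd(3), spd(4)
W_grad = rng.standard_normal((4, 3))  # gradient of a 4x3 weight matrix

# Naive natural gradient: build the full Fisher block F = A kron G
# and solve against the column-stacked (Fortran-order) gradient.
F = np.kron(A, G)
naive = np.linalg.solve(F, W_grad.flatten(order="F")).reshape((4, 3), order="F")

# Kronecker-factored shortcut: (A kron G)^{-1} vec(W) = vec(G^{-1} W A^{-1}).
factored = np.linalg.solve(G, W_grad) @ np.linalg.inv(A)

assert np.allclose(naive, factored)
```

The factored solve touches only the small per-layer matrices A and G, which is why ACKTR can apply an approximate natural gradient at roughly the cost of ordinary SGD.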