Statements (25)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:appliesTo | gptkb:Atari_games, robotics continuous control tasks |
| gptkbp:category | on-policy algorithm |
| gptkbp:developedBy | gptkb:DeepMind |
| gptkbp:heldBy | synchronous version of A3C |
| gptkbp:implementedIn | gptkb:TensorFlow, gptkb:PyTorch |
| gptkbp:input | gptkb:state_order |
| gptkbp:introducedIn | 2017 |
| gptkbp:openSource | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines |
| gptkbp:output | gptkb:action, value estimate |
| gptkbp:relatedTo | gptkb:A3C |
| gptkbp:standsFor | gptkb:Advantage_Actor-Critic |
| gptkbp:usedIn | deep reinforcement learning |
| gptkbp:uses | policy gradient methods, actor-critic architecture |
| gptkbp:bfsParent | gptkb:Actor-Critic, gptkb:OpenAI_Baselines, gptkb:Stable_Baselines |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | A2C |
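The statements above describe A2C as an actor-critic, policy-gradient algorithm that maps a state to an action plus a value estimate. Below is a minimal illustrative sketch in PyTorch (one of the frameworks listed under gptkbp:implementedIn) of how those pieces typically fit together; the network sizes, loss coefficients, and helper names are assumptions for illustration, not part of the source data or of any specific library's API.

```python
# Illustrative A2C-style sketch (assumed structure, not the canonical implementation):
# a shared network with an actor head (action logits) and a critic head (state-value
# estimate); the advantage (return minus value estimate) weights the policy-gradient term.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.policy_head = nn.Linear(hidden, n_actions)  # actor: action logits
        self.value_head = nn.Linear(hidden, 1)           # critic: state-value estimate

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)


def a2c_loss(logits, values, actions, returns,
             value_coef: float = 0.5, entropy_coef: float = 0.01):
    # Advantage = empirical return minus the critic's (detached) value estimate.
    advantages = returns - values.detach()
    log_probs = F.log_softmax(logits, dim=-1)
    chosen_log_probs = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen_log_probs * advantages).mean()        # policy-gradient term
    value_loss = F.mse_loss(values, returns)                     # critic regression
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()  # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

In the synchronous setting (A2C as the "synchronous version of A3C" noted above), several environment copies are stepped in parallel and their rollouts are batched into a single update of this loss, rather than having asynchronous workers apply gradients independently.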