Statements (25)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:appliesTo | gptkb:Atari_games, robotics continuous control tasks |
gptkbp:category | on-policy algorithm |
gptkbp:developedBy | gptkb:DeepMind |
gptkbp:heldBy | synchronous version of A3C |
https://www.w3.org/2000/01/rdf-schema#label | A2C |
gptkbp:implementedIn | gptkb:TensorFlow, gptkb:PyTorch |
gptkbp:input | gptkb:state_order |
gptkbp:introducedIn | 2017 |
gptkbp:openSource | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines |
gptkbp:output | gptkb:action, value estimate |
gptkbp:relatedTo | gptkb:A3C |
gptkbp:standsFor | gptkb:Advantage_Actor-Critic |
gptkbp:usedIn | deep reinforcement learning |
gptkbp:uses | policy gradient methods, actor-critic architecture |
gptkbp:bfsParent | gptkb:Actor-Critic, gptkb:OpenAI_Baselines, gptkb:Stable_Baselines |
gptkbp:bfsLayer | 6 |
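
The statements above describe A2C as an on-policy actor-critic method whose outputs are an action and a value estimate, trained with policy gradients over synchronously collected experience. As a rough illustration only, the sketch below computes bootstrapped discounted returns and a combined actor-critic loss with NumPy. The function names (`discounted_returns`, `a2c_loss`), the `(T, N)` array layout (time steps by parallel environments), and the coefficient defaults are assumptions made for this example; they are not taken from this page or from any particular library.

```python
import numpy as np


def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped discounted returns for a synchronous rollout.

    rewards, dones: arrays of shape (T, N), T steps from N parallel envs.
    bootstrap_value: critic estimate for the state after the last step, shape (N,).
    """
    num_steps = rewards.shape[0]
    returns = np.zeros_like(rewards, dtype=np.float64)
    running = np.asarray(bootstrap_value, dtype=np.float64)
    for t in reversed(range(num_steps)):
        # A done flag at step t cuts off any reward from later steps.
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns


def a2c_loss(log_probs, values, returns, entropies,
             value_coef=0.5, entropy_coef=0.01):
    """Scalar actor-critic loss (hypothetical helper, assumed coefficients).

    In an autodiff framework the advantages would be treated as constants
    in the policy term; here everything is a plain NumPy value.
    """
    advantages = returns - values                    # advantage estimate
    policy_loss = -(log_probs * advantages).mean()   # actor (policy gradient) term
    value_loss = (advantages ** 2).mean()            # critic regression term
    entropy_bonus = entropies.mean()                 # encourages exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus
```

In a full training loop, all parallel environments would be stepped together for `T` steps, and a single gradient update would then be applied to the batch. That synchronous update is the main difference from the asynchronous workers in A3C.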