Proximal Policy Optimization (PPO)
GPTKB entity
Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:reinforcement_learning_algorithm |
| gptkbp:application | gptkb:game_AI, robotics, autonomous control |
| gptkbp:citation | high |
| gptkbp:developedBy | gptkb:OpenAI |
| gptkbp:feature | simplicity, robustness, clipped surrogate objective, improved sample efficiency |
| gptkbp:introducedIn | 2017 |
| gptkbp:openSource | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines, gptkb:Ray_RLlib |
| gptkbp:optimizedFor | policy gradient, on-policy |
| gptkbp:popularFor | widely used |
| gptkbp:publicationAuthors | gptkb:Filip_Wolski, gptkb:Oleg_Klimov, gptkb:Prafulla_Dhariwal, gptkb:John_Schulman, gptkb:Alec_Radford |
| gptkbp:publicationYear | 2017 |
| gptkbp:publishedIn | gptkb:Proximal_Policy_Optimization_Algorithms |
| gptkbp:relatedTo | gptkb:Trust_Region_Policy_Optimization_(TRPO), Actor-Critic methods |
| gptkbp:usedFor | gptkb:reinforcement_learning, policy optimization |
| gptkbp:bfsParent | gptkb:John_Schulman |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | Proximal Policy Optimization (PPO) |
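
The "clipped surrogate objective" listed under gptkbp:feature can be illustrated with a minimal sketch. It assumes per-sample log-probabilities under the new and old policies and precomputed advantage estimates. The function name `clipped_surrogate`, its parameter names, and the default `epsilon=0.2` are illustrative choices for this sketch, not taken from this entry or from any of the listed libraries.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    new_logp, old_logp: log-probabilities of the taken actions under the
        current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    epsilon: clipping range; 0.2 is an assumed default for this sketch.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy for this sample.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above `1 + epsilon` contributes no more than the clipped value, so the objective gives no incentive to move the policy further in that direction. With a negative advantage, the unclipped term is kept whenever it is the lower of the two.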