Proximal Policy Optimization (PPO)
GPTKB entity
Statements (31)
Predicate | Object |
---|---|
gptkbp:instanceOf | reinforcement learning algorithm |
gptkbp:application | gptkb:game_AI, robotics, autonomous control |
gptkbp:citation | high |
gptkbp:developedBy | gptkb:OpenAI |
gptkbp:feature | simplicity, robustness, clipped surrogate objective, improved sample efficiency |
https://www.w3.org/2000/01/rdf-schema#label | Proximal Policy Optimization (PPO) |
gptkbp:introducedIn | 2017 |
gptkbp:openSource | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines, gptkb:Ray_RLlib |
gptkbp:optimizedFor | policy gradient, on-policy |
gptkbp:popularFor | widely used |
gptkbp:publicationAuthors | gptkb:Filip_Wolski, gptkb:Oleg_Klimov, gptkb:Prafulla_Dhariwal, gptkb:John_Schulman, gptkb:Alec_Radford |
gptkbp:publicationYear | 2017 |
gptkbp:publishedIn | gptkb:Proximal_Policy_Optimization_Algorithms |
gptkbp:relatedTo | gptkb:Trust_Region_Policy_Optimization_(TRPO), Actor-Critic methods |
gptkbp:usedFor | gptkb:reinforcement_learning, policy optimization |
gptkbp:bfsParent | gptkb:John_Schulman |
gptkbp:bfsLayer | 6 |