Statements (11)
Predicate | Object |
---|---|
gptkbp:instanceOf | algorithm |
gptkbp:appliesTo | continuous action spaces |
gptkbp:basedOn | trust region methods |
gptkbp:developedBy | gptkb:John_Schulman |
https://www.w3.org/2000/01/rdf-schema#label | TRPO |
gptkbp:improves | policy optimization |
gptkbp:provides | theoretical guarantees |
gptkbp:relatedTo | PPO |
gptkbp:requires | second-order optimization |
gptkbp:usedIn | reinforcement learning |
gptkbp:yearEstablished | 2015 |