Trust Region Policy Optimization
GPTKB entity
Statements (26)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:Artificial_Intelligence |
gptkbp:applies_to | reinforcement learning |
gptkbp:developed_by | gptkb:John_Schulman |
https://www.w3.org/2000/01/rdf-schema#label | Trust Region Policy Optimization |
gptkbp:improves | policy gradient methods |
gptkbp:is_applied_in | gptkb:vehicles, healthcare, finance, game AI |
gptkbp:is_compared_to | vanilla policy gradient |
gptkbp:is_evaluated_by | Atari games, continuous control tasks |
gptkbp:is_implemented_in | gptkb:Tensor_Flow, gptkb:Py_Torch |
gptkbp:is_known_for | stability, sample efficiency |
gptkbp:is_optimized_for | policy parameters |
gptkbp:is_part_of | policy optimization techniques |
gptkbp:is_related_to | gptkb:Proximal_Policy_Optimization |
gptkbp:is_used_by | gptkb:Open_AI |
gptkbp:is_used_in | gptkb:robotics |
gptkbp:published_in | gptkb:2015 |
gptkbp:requires | second-order information |
gptkbp:uses | trust regions |
gptkbp:bfsParent | gptkb:Deep_Reinforcement_Learning |
gptkbp:bfsLayer | 4 |