Trust Region Policy Optimization

GPTKB entity

Statements (49)
Predicate Object
gptkbp:instanceOf algorithm
gptkbp:appliesTo reinforcement learning
gptkbp:developedBy gptkb:John_Schulman
https://www.w3.org/2000/01/rdf-schema#label Trust Region Policy Optimization
gptkbp:improves policy gradient methods
gptkbp:isApplicableTo continuous action spaces
discrete action spaces
gptkbp:isChallengedBy high-dimensional spaces
local optima
non-stationary environments
gptkbp:isEvaluatedBy real-world applications
stability
simulated environments
convergence rate
sample efficiency
robotic tasks
Atari_games
gptkbp:isExploredIn academic research
industry applications
AI conferences
machine learning workshops
reinforcement learning symposiums
gptkbp:isInfluencedBy trust region methods
natural gradient methods
gptkbp:isImplementedIn gptkb:PyTorch
TensorFlow
gptkbp:isPartOf policy optimization family
gptkbp:isRelatedTo gptkb:REINFORCE
gptkb:DDPG
TRPO_variants
actor-critic methods
policy optimization
gptkbp:isSupportedBy empirical results
theoretical guarantees
gptkbp:isUsedFor improving performance
exploration strategies
training agents
gptkbp:isUsedIn robotics
game playing
gptkbp:enables robust learning
sample-efficient learning
scalable learning
gptkbp:maximizes expected reward
gptkbp:optimizes policy parameters
gptkbp:publishedIn 2015
gptkbp:constrains KL divergence
gptkbp:isRelatedTo Proximal Policy Optimization
gptkbp:requires second-order information
gptkbp:uses trust region methods
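The statements above name TRPO's core mechanics: it maximizes expected reward by updating policy parameters inside a trust region, defined as a bound on the KL divergence between successive policies, using second-order (Fisher/natural-gradient) information. A minimal sketch of that update on a toy softmax policy is below; the 3-armed bandit, the advantage values, and the use of a pseudo-inverse in place of TRPO's conjugate-gradient solver are illustrative assumptions, not part of the original algorithm's implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    # KL divergence between two categorical distributions
    return float(np.sum(p * np.log(p / q)))

# Toy 3-armed bandit with hypothetical per-action advantages
adv = np.array([1.0, 0.0, -1.0])
theta = np.zeros(3)          # policy logits
delta = 0.01                 # trust-region radius on the KL divergence

pi_old = softmax(theta)

# Gradient of the surrogate objective sum_a pi(a) * adv(a) w.r.t. logits:
g = pi_old * (adv - pi_old @ adv)

# Fisher information matrix of a softmax policy: F = diag(pi) - pi pi^T
F = np.diag(pi_old) - np.outer(pi_old, pi_old)

# Natural gradient direction (pinv because F is singular in logit space;
# TRPO itself solves F x = g iteratively with conjugate gradients)
x = np.linalg.pinv(F) @ g

# Scale the step so the quadratic KL approximation 0.5 s^2 x^T F x hits delta
step = np.sqrt(2 * delta / (x @ F @ x)) * x

theta_new = theta + step
pi_new = softmax(theta_new)
print(kl(pi_old, pi_new))    # close to delta, by construction
print(pi_new @ adv)          # surrogate reward improves over pi_old @ adv
```

The scaling step is the defining trust-region move: rather than a fixed learning rate, the step size is derived from the constraint itself, which is what gives TRPO its monotonic-improvement guarantee in the original analysis.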