Statements (20)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:algorithm |
gptkbp:alternativeTo | vanilla policy gradient |
gptkbp:application | control systems, robotics, game playing |
gptkbp:category | gptkb:model |
gptkbp:citation | Kakade, S. (2001). A Natural Policy Gradient. NIPS. |
gptkbp:field | gptkb:reinforcement_learning |
gptkbp:form | gptkb:information_geometry |
https://www.w3.org/2000/01/rdf-schema#label | Natural Policy Gradient |
gptkbp:influenced | gptkb:Trust_Region_Policy_Optimization, gptkb:Proximal_Policy_Optimization |
gptkbp:introduced | gptkb:Shun-ichi_Amari |
gptkbp:introducedIn | 1998 |
gptkbp:purpose | improve policy optimization |
gptkbp:reduces | gptkb:KL_divergence |
gptkbp:relatedTo | policy gradient methods |
gptkbp:uses | gptkb:Fisher_information_matrix |
gptkbp:bfsParent | gptkb:Trust_Region_Policy_Optimization_(TRPO) |
gptkbp:bfsLayer | 7 |
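
Illustration (not part of the statement data): the entry lists the Fisher information matrix under gptkbp:uses and vanilla policy gradient under gptkbp:alternativeTo. A minimal sketch of how the two update directions differ is given here, assuming a softmax policy over a three-armed bandit with known rewards. The bandit setup, the damping term, the step size, and all function names are illustrative assumptions and do not come from the entry or from the cited paper.

```python
import math

def softmax(theta):
    # Numerically stable softmax over action preferences.
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def vanilla_gradient(theta, rewards):
    # Exact gradient of J = sum_a pi(a) r(a): dJ/dtheta_i = pi_i * (r_i - J).
    p = softmax(theta)
    j = sum(pi * r for pi, r in zip(p, rewards))
    return [pi * (r - j) for pi, r in zip(p, rewards)]

def fisher_matrix(theta, damping=1e-3):
    # For a softmax policy, F = diag(p) - p p^T. F is singular,
    # so a small damping term is added on the diagonal (an assumption).
    p = softmax(theta)
    n = len(p)
    return [[(p[i] if i == j else 0.0) - p[i] * p[j]
             + (damping if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a x = b.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def natural_gradient(theta, rewards, damping=1e-3):
    # Natural gradient direction: solve (F + damping * I) x = grad J.
    return solve(fisher_matrix(theta, damping),
                 vanilla_gradient(theta, rewards))

def train(rewards, steps=50, lr=0.5, natural=True):
    # Gradient ascent on expected reward; returns the final policy.
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        if natural:
            g = natural_gradient(theta, rewards)
        else:
            g = vanilla_gradient(theta, rewards)
        theta = [t + lr * gi for t, gi in zip(theta, g)]
    return softmax(theta)

if __name__ == "__main__":
    rewards = [1.0, 0.5, 0.0]
    print("natural:", train(rewards, natural=True))
    print("vanilla:", train(rewards, natural=False))
```

In this toy setting the vanilla gradient shrinks as the policy becomes more deterministic, while preconditioning with the inverse Fisher matrix keeps the step roughly proportional to the reward differences until the damping term dominates. Larger problems usually estimate the Fisher matrix from samples and solve the linear system iteratively rather than forming it explicitly.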