Statements (20)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:algorithm |
| gptkbp:alternativeTo | vanilla policy gradient |
| gptkbp:application | control systems, robotics, game playing |
| gptkbp:category | gptkb:model |
| gptkbp:citation | Kakade, S. (2001). A Natural Policy Gradient. NIPS. |
| gptkbp:field | gptkb:reinforcement_learning |
| gptkbp:form | gptkb:information_geometry |
| gptkbp:influenced | gptkb:Trust_Region_Policy_Optimization, gptkb:Proximal_Policy_Optimization |
| gptkbp:introduced | gptkb:Shun-ichi_Amari |
| gptkbp:introducedIn | 1998 |
| gptkbp:purpose | improve policy optimization |
| gptkbp:reduces | gptkb:KL_divergence |
| gptkbp:relatedTo | policy gradient methods |
| gptkbp:uses | gptkb:Fisher_information_matrix |
| gptkbp:bfsParent | gptkb:Trust_Region_Policy_Optimization_(TRPO) |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | Natural Policy Gradient |
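
The `gptkbp:uses` and `gptkbp:alternativeTo` statements can be made concrete with a small sketch. It is not taken from the cited paper: it assumes a single-state problem with a softmax policy over three actions and known rewards, and the names `softmax`, `natural_gradient_step`, `step_size`, and `damping` are hypothetical. It computes the vanilla policy gradient, builds the Fisher information matrix from the score vectors, and preconditions the gradient with the (damped) inverse of that matrix.

```python
import numpy as np

def softmax(theta):
    """Numerically stable softmax over a vector of logits."""
    z = np.exp(theta - np.max(theta))
    return z / z.sum()

def natural_gradient_step(theta, rewards, step_size=0.1, damping=1e-3):
    """One natural-gradient ascent step for a single-state softmax policy.

    Assumption: rewards[a] is the known reward of action a, so all
    expectations over actions can be computed exactly.
    """
    n = len(theta)
    pi = softmax(theta)
    # Row a holds the score vector grad_theta log pi(a) = e_a - pi.
    scores = np.eye(n) - pi
    # Vanilla policy gradient: E_a[ score(a) * r(a) ].
    g = scores.T @ (pi * rewards)
    # Fisher information matrix: E_a[ score(a) score(a)^T ].
    fisher = scores.T @ (pi[:, None] * scores)
    # The softmax Fisher matrix is singular (shifting all logits leaves
    # the policy unchanged), so a small damping term is added before solving.
    nat_grad = np.linalg.solve(fisher + damping * np.eye(n), g)
    return theta + step_size * nat_grad, g, nat_grad

# Usage: start from a uniform policy where only action 0 is rewarded.
theta0 = np.zeros(3)
rewards = np.array([1.0, 0.0, 0.0])
theta1, vanilla_grad, natural_grad = natural_gradient_step(theta0, rewards)
print("vanilla gradient:", vanilla_grad)
print("natural gradient:", natural_grad)
print("expected reward before/after:",
      softmax(theta0) @ rewards, softmax(theta1) @ rewards)
```

In this toy case the natural gradient points in the same direction as the vanilla gradient but is rescaled by the inverse Fisher matrix. In larger problems the two directions generally differ, and the Fisher matrix is estimated from sampled trajectories rather than computed exactly.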