Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:convolutional_neural_network |
gptkbp:canBe | gptkb:convolutional_neural_network, gptkb:Recurrent_Neural_Network, Feedforward Neural Network, Deterministic, Stochastic |
gptkbp:function | Approximates Policy Function |
https://www.w3.org/2000/01/rdf-schema#label | Policy Network |
gptkbp:input | gptkb:state_order |
gptkbp:learns | gptkb:public_policy |
gptkbp:output | gptkb:action |
gptkbp:relatedTo | Value Network |
gptkbp:trainer | Reinforcement Learning Algorithms |
gptkbp:usedIn | gptkb:Trust_Region_Policy_Optimization, gptkb:Reinforcement_Learning, gptkb:Proximal_Policy_Optimization, Actor-Critic Methods, Deep Reinforcement Learning, Policy Gradient Methods, Deterministic Policy Gradient, Stochastic Policy Gradient |
gptkbp:bfsParent | gptkb:Roger_Liddle |
gptkbp:bfsLayer | 7 |
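
The statements above describe a policy network as a function approximator that takes a state as input, outputs an action, and can be either stochastic or deterministic. A minimal sketch of that idea follows, assuming a small feedforward network with a discrete action space. The class name `PolicyNetwork`, the layer sizes, the tanh activation, and the random untrained weights are all illustrative assumptions, not something the entry specifies, and no training step is shown.

```python
import math
import random


class PolicyNetwork:
    """Tiny feedforward policy (hypothetical): state vector -> action probabilities."""

    def __init__(self, state_dim, hidden_dim, num_actions, seed=0):
        self.rng = random.Random(seed)
        # Random, untrained weights; a reinforcement learning algorithm would tune these.
        self.w1 = self._init(hidden_dim, state_dim)
        self.w2 = self._init(num_actions, hidden_dim)

    def _init(self, rows, cols):
        return [[self.rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    def action_probs(self, state):
        # Hidden layer with tanh activation.
        hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in self.w1]
        # Output layer produces one logit per action.
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]
        # Numerically stable softmax turns logits into a probability distribution.
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, state, deterministic=False):
        probs = self.action_probs(state)
        if deterministic:
            # Deterministic policy: always pick the most probable action.
            return max(range(len(probs)), key=probs.__getitem__)
        # Stochastic policy: sample an action according to its probability.
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]


policy = PolicyNetwork(state_dim=4, hidden_dim=8, num_actions=2)
state = [0.1, -0.2, 0.05, 0.3]
probs = policy.action_probs(state)
greedy_action = policy.act(state, deterministic=True)
sampled_action = policy.act(state)
```

In a policy gradient or actor-critic setting, the weights would be adjusted so that actions leading to higher return become more probable; here they stay at their random initial values.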