Statements (15)
Predicate | Object |
---|---|
gptkbp:instanceOf | concept |
gptkbp:canBe | positive, zero, negative |
https://www.w3.org/2000/01/rdf-schema#label | Reward (R) |
gptkbp:measures | desirability of an action |
gptkbp:relatedTo | reward function, value function, policy optimization |
gptkbp:represents | feedback signal |
gptkbp:symbol | R |
gptkbp:usedFor | guide agent behavior |
gptkbp:usedIn | gptkb:reinforcement_learning |
gptkbp:bfsParent | gptkb:iterated_Prisoner's_Dilemma |
gptkbp:bfsLayer | 6 |