Statements (16)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:algorithm |
gptkbp:alsoKnownAs | gptkb:R-N_algorithm |
gptkbp:field | gptkb:reinforcement_learning |
https://www.w3.org/2000/01/rdf-schema#label | Rummery and Niranjan |
gptkbp:proposedBy | gptkb:G.A._Rummery, gptkb:M._Niranjan |
gptkbp:publishedIn | On-line Q-learning using connectionist systems |
gptkbp:relatedTo | gptkb:SARSA, gptkb:Q-learning |
gptkbp:type | reinforcement learning algorithm, on-policy algorithm |
gptkbp:usedFor | policy evaluation, temporal difference learning |
gptkbp:yearProposed | 1994 |
gptkbp:bfsParent | gptkb:SARSA |
gptkbp:bfsLayer | 6 |
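
The statements above describe an on-policy temporal difference method related to SARSA and Q-learning. As a rough illustration only, the sketch below shows a tabular on-policy TD update of the kind usually called SARSA. The function name `sarsa_update`, the step size `alpha`, the discount `gamma`, and the toy state and action labels are all assumptions made for this example and do not come from this entry; the title of the cited report suggests that connectionist function approximation was used rather than a lookup table.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular on-policy TD update in place and return the TD error.

    q maps (state, action) pairs to value estimates. All names and default
    values here are illustrative assumptions, not part of the source entry.
    """
    # On-policy target: bootstrap from the action actually chosen next,
    # not from the greedy (max) action as Q-learning would.
    if terminal:
        target = reward
    else:
        target = reward + gamma * q[(next_state, next_action)]
    td_error = target - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Toy usage with hypothetical states "s0", "s1" and action "right".
q = defaultdict(float)
q[("s1", "right")] = 1.0
td = sarsa_update(q, "s0", "right", 0.5, "s1", "right", alpha=0.5, gamma=0.9)
```

The only difference from a Q-learning update in this sketch is the bootstrap term: it uses the value of the next action the agent actually takes, which is what makes the update on-policy.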