Statements (22)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:person |
gptkbp:almaMater | gptkb:University_of_Massachusetts_Amherst |
gptkbp:awardReceived | gptkb:IJCAI_Award_for_Research_Excellence |
gptkbp:birthYear | 1956 |
gptkbp:doctoralAdvisor | gptkb:Andrew_G._Barto |
gptkbp:employer | gptkb:DeepMind, gptkb:University_of_Alberta |
gptkbp:field | gptkb:artificial_intelligence, gptkb:reinforcement_learning |
https://www.w3.org/2000/01/rdf-schema#label | Richard S. Sutton |
gptkbp:knownFor | policy gradient methods, temporal difference learning, reinforcement learning theory |
gptkbp:memberOf | gptkb:Royal_Society_of_Canada |
gptkbp:nationality | gptkb:American, gptkb:Canadian |
gptkbp:notableWork | gptkb:Reinforcement_Learning:_An_Introduction |
gptkbp:occupation | gptkb:computer_scientist, gptkb:professor |
gptkbp:website | http://incompleteideas.net/ |
gptkbp:bfsParent | gptkb:reinforcement_learning |
gptkbp:bfsLayer | 5 |