Statements
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:algorithm |
gptkbp:citation | high (hundreds) |
gptkbp:field | gptkb:reinforcement_learning, imitation learning |
gptkbp:fullName | Aggregate Values to Imitate |
https://www.w3.org/2000/01/rdf-schema#label | AggreVaTe |
gptkbp:input | expert demonstrations |
gptkbp:introduced | Stéphane Ross, J. Andrew Bagnell |
gptkbp:introducedIn | 2014 |
gptkbp:openSource | gptkb:GitHub |
gptkbp:output | learned policy |
gptkbp:publishedIn | "Reinforcement and Imitation Learning via Interactive No-Regret Learning" (arXiv preprint, 2014) |
gptkbp:purpose | improve imitation learning by aggregating cost-to-go information |
gptkbp:relatedTo | gptkb:DAgger, imitation learning, policy optimization |
gptkbp:bfsParent | gptkb:DAgger |
gptkbp:bfsLayer | 8 |
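
The purpose statement above (aggregating cost-to-go information) can be illustrated with a minimal sketch. It is not the reference implementation: the chain environment, the always-move-right expert, the tabular learner, and every function name below are assumptions made for illustration. The sketch also simplifies the procedure by evaluating every action at each roll-in state (instead of sampling one exploratory action at a random time step) and by rolling in with the expert on the first iteration only.

```python
from collections import defaultdict

# Hypothetical toy chain: states 0..4, goal at state 4, every step costs 1.
GOAL, HORIZON = 4, 5
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Deterministic chain dynamics, clipped at both ends."""
    delta = 1 if action == 1 else -1
    return min(GOAL, max(0, state + delta))


def expert(state):
    """Assumed expert: always move right, toward the goal."""
    return 1


def expert_cost_to_go(state, action):
    """Cost of taking `action` in `state`, then following the expert to the goal."""
    cost, state = 1, step(state, action)
    while state != GOAL:
        cost, state = cost + 1, step(state, expert(state))
    return cost


def fit_policy(dataset):
    """Cost-sensitive fit: per state, choose the action with the lowest mean cost."""
    totals = defaultdict(lambda: [0.0, 0])  # (state, action) -> [cost sum, count]
    for state, action, cost in dataset:
        totals[(state, action)][0] += cost
        totals[(state, action)][1] += 1
    policy = {}
    for state in {s for s, _, _ in dataset}:
        means = {a: totals[(state, a)][0] / totals[(state, a)][1]
                 for a in ACTIONS if totals[(state, a)][1] > 0}
        policy[state] = min(means, key=means.get)
    return policy


def aggrevate(n_iters=3):
    dataset, learner = [], {}
    for i in range(n_iters):
        # Roll in with the expert on the first iteration, with the learner afterwards.
        # States the learner has never seen default to action 0 (left).
        roll_in = expert if i == 0 else (lambda s: learner.get(s, 0))
        for t in range(HORIZON):
            state = 0
            for _ in range(t):  # roll in for t steps from the start state
                state = step(state, roll_in(state))
            if state == GOAL:
                continue  # nothing left to decide at the goal
            for action in ACTIONS:  # query the expert cost-to-go of each action
                dataset.append((state, action, expert_cost_to_go(state, action)))
        learner = fit_policy(dataset)  # retrain on the aggregated dataset
    return learner, dataset


policy, data = aggrevate()
print(policy)
```

The dataset is never reset between iterations, so each refit sees cost-to-go samples from every earlier roll-in distribution. That is the aggregation step. A practical variant would replace the lookup table with a cost-sensitive classifier or regressor over state features.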