gptkbp:instanceOf
|
imitation learning algorithm
|
gptkbp:address
|
compounding errors in imitation learning
|
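A worked bound may make "compounding errors" concrete. The sketch below restates, from memory, the form of the guarantees in the cited Ross, Gordon, and Bagnell (2011) paper; the exact assumptions should be checked against the paper. Here $J$ is the expected total cost over a horizon of $T$ steps, $\pi^*$ is the expert, $\epsilon$ is the learner's expected imitation loss, and $u$ bounds how much one deviation from the expert can increase the expert's cost-to-go.

```latex
% Behavior cloning: \epsilon is measured on states visited by the expert.
% One early mistake can move the learner to unfamiliar states, so errors compound.
J(\hat{\pi}) \le J(\pi^*) + T^2 \epsilon

% On-policy training: \epsilon is measured on states visited by the learner itself.
% DAgger's guarantee has this linear-in-T form, up to additional lower-order terms.
J(\hat{\pi}) \le J(\pi^*) + u T \epsilon
```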
gptkbp:category
|
sequential decision making
learning from demonstration
|
gptkbp:citation
|
gptkb:Ross,_S.,_Gordon,_G.,_&_Bagnell,_D._(2011)._A_reduction_of_imitation_learning_and_structured_prediction_to_no-regret_online_learning._AISTATS.
|
gptkbp:field
|
gptkb:artificial_intelligence
gptkb:machine_learning
robotics
|
gptkbp:fullName
|
Dataset Aggregation
|
https://www.w3.org/2000/01/rdf-schema#label
|
DAgger
|
gptkbp:influenced
|
gptkb:AggreVaTe
gptkb:DAggerFM
gptkb:SafeDAgger
|
gptkbp:introduced
|
gptkb:Ross,_Gordon,_and_Bagnell
|
gptkbp:introducedIn
|
2011
|
gptkbp:key
|
aggregated dataset of states visited by the expert and the learner, each labeled with the expert's action
|
gptkbp:method
|
iterative algorithm
|
gptkbp:openSource
|
gptkb:OpenAI_Baselines
gptkb:Stable_Baselines
|
gptkbp:publishedIn
|
gptkb:AISTATS_2011
|
gptkbp:purpose
|
reduce distribution shift between the expert's demonstrations and the states the learned policy actually visits
|
gptkbp:relatedTo
|
gptkb:behavior_cloning
gptkb:reinforcement_learning
|
gptkbp:step
|
query the expert for action labels on states visited by the learner
|
gptkbp:usedIn
|
game playing
autonomous driving
robot manipulation
|
gptkbp:uses
|
expert demonstrations
dataset aggregation
|
gptkbp:bfsParent
|
gptkb:Imitation_Learning
|
gptkbp:bfsLayer
|
7
|
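The key, method, and step properties above (an aggregated dataset, an iterative algorithm, and expert queries on the learner's states) can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-dimensional toy task, the helper names `expert_action`, `step`, `fit`, `rollout`, and `dagger`, the tabular majority-vote learner, and the schedule that uses the expert only in the first iteration are all hypothetical choices made for the example.

```python
import random
from collections import Counter, defaultdict

N_STATES = 10   # positions 0..9 on a line (hypothetical toy task)
GOAL = 7        # an episode ends when the agent reaches this position
HORIZON = 20    # maximum number of steps per rollout


def expert_action(state):
    """Hypothetical expert: always step toward the goal."""
    return 1 if state < GOAL else -1


def step(state, action):
    """Deterministic transition, clipped to the ends of the line."""
    return max(0, min(N_STATES - 1, state + action))


def fit(dataset):
    """Supervised step: tabular majority vote over the expert labels."""
    votes = defaultdict(Counter)
    for state, label in dataset:
        votes[state][label] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # States never seen in training fall back to an arbitrary default (move left).
    return lambda state: table.get(state, -1)


def rollout(policy, start):
    """Run a policy from `start` and return the non-goal states it visits."""
    state, visited = start, []
    for _ in range(HORIZON):
        if state == GOAL:
            break
        visited.append(state)
        state = step(state, policy(state))
    return visited


def dagger(n_iters=5, rollouts_per_iter=10, seed=0):
    rng = random.Random(seed)
    dataset = []
    learner = expert_action  # placeholder; never executed because beta is 1 at first
    for i in range(n_iters):
        # Assumed schedule: expert rollouts in iteration 0, learner rollouts afterwards.
        beta = 1.0 if i == 0 else 0.0

        def mixed(state, learner=learner, beta=beta):
            # Execute the expert with probability beta, otherwise the current learner.
            return expert_action(state) if rng.random() < beta else learner(state)

        for _ in range(rollouts_per_iter):
            start = rng.randrange(N_STATES)
            for s in rollout(mixed, start):
                # Label every visited state with the expert's action and aggregate.
                dataset.append((s, expert_action(s)))
        # Retrain on the whole aggregated dataset, not just the newest rollouts.
        learner = fit(dataset)
    return learner, dataset


if __name__ == "__main__":
    learner, dataset = dagger()
    print(len(dataset), "labeled states collected")
```

The design point the sketch tries to show is that, after the first iteration, the states come from the learner's own rollouts while the labels always come from the expert, so the learner receives corrections for exactly the states its mistakes lead it into.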