Statements (22)
Predicate | Object |
---|---|
gptkbp:instanceOf | inverse reinforcement learning algorithm |
gptkbp:alsoKnownAs | MaxEnt IRL |
gptkbp:assumes | demonstrations are noisy or stochastic |
gptkbp:basedOn | gptkb:maximum_entropy_principle |
gptkbp:citation | Ziebart, B. D., Maas, A. L., Bagnell, J. A., & Dey, A. K. (2008). Maximum entropy inverse reinforcement learning. AAAI. |
gptkbp:extendsTo | classic IRL |
gptkbp:form | probability of trajectory proportional to exponential of reward |
gptkbp:goal | recover reward function from expert demonstrations |
https://www.w3.org/2000/01/rdf-schema#label | maximum entropy IRL |
gptkbp:influenced | deep maximum entropy IRL, generative adversarial imitation learning |
gptkbp:introduced | Brian Ziebart |
gptkbp:introducedIn | 2008 |
gptkbp:output | stochastic policy |
gptkbp:relatedTo | gptkb:reinforcement_learning, inverse reinforcement learning, apprenticeship learning |
gptkbp:usedIn | robotics, autonomous driving, imitation learning |
gptkbp:bfsParent | gptkb:Inverse_Reinforcement_Learning |
gptkbp:bfsLayer | 8 |
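
The gptkbp:form statement corresponds to the trajectory distribution in the cited Ziebart et al. (2008) paper. A minimal rendering, assuming a linear reward R(τ) = θᵀf_τ over trajectory features f_τ (the symbols θ, f_τ, and Z follow standard notation rather than the table itself):

```latex
% Trajectory distribution under MaxEnt IRL (Ziebart et al., 2008):
P(\tau \mid \theta) = \frac{\exp\!\left(\theta^{\top} f_{\tau}\right)}{Z(\theta)},
\qquad
Z(\theta) = \sum_{\tau'} \exp\!\left(\theta^{\top} f_{\tau'}\right)

% Gradient of the demonstration log-likelihood: empirical expert
% feature counts minus the model's expected feature counts.
\nabla_{\theta} \mathcal{L}(\theta) = \tilde{f} - \sum_{\tau} P(\tau \mid \theta)\, f_{\tau}
```

This gradient makes the gptkbp:output statement concrete: the learned θ induces a stochastic policy, since all trajectories retain nonzero probability under the exponential form.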
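For the gptkbp:goal statement (recovering a reward function from expert demonstrations), the sketch below shows the resulting gradient ascent under a simplifying assumption of a small finite set of candidate trajectories with precomputed feature vectors; the names `maxent_irl`, `traj_features`, and `expert_idx` are illustrative, not from the source.

```python
# Minimal MaxEnt IRL sketch: linear reward R_theta(tau) = theta . f_tau,
# candidate trajectories enumerated explicitly (an assumption made for brevity).
import numpy as np

def maxent_irl(traj_features, expert_idx, lr=0.1, iters=200):
    """Recover reward weights theta from expert demonstrations.

    traj_features: (T, d) array, one feature vector per candidate trajectory.
    expert_idx:    indices of the trajectories the expert demonstrated.
    """
    theta = np.zeros(traj_features.shape[1])
    # Empirical feature expectation of the expert demonstrations.
    f_expert = traj_features[expert_idx].mean(axis=0)
    for _ in range(iters):
        # P(tau | theta) proportional to exp(theta . f_tau): a softmax
        # over trajectories, stabilized by subtracting the max logit.
        logits = traj_features @ theta
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Gradient: expert feature counts minus expected feature counts.
        grad = f_expert - p @ traj_features
        theta += lr * grad
    return theta

# Toy usage: four candidate trajectories with two features each;
# the expert mostly demonstrated trajectory 0.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]])
theta = maxent_irl(feats, expert_idx=[0, 0, 2])
print(theta)  # weights that make expert-like trajectories high-reward
```

In practice (and in the cited paper) the expected feature counts are computed by dynamic programming over the MDP rather than by enumerating trajectories; the enumeration here only keeps the sketch self-contained.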