Statements (52)
Predicate | Object |
---|---|
gptkbp:instanceOf | algorithm |
gptkbp:appliesTo | reinforcement learning |
gptkbp:developedBy | Andrychowicz_et_al. |
gptkbp:enables | learning from failures |
gptkbp:hasRelatedPatent | healthcare, finance, autonomous driving |
https://www.w3.org/2000/01/rdf-schema#label | Hindsight Experience Replay |
gptkbp:improves | sample efficiency |
gptkbp:isAttendedBy | research community, industry practitioners |
gptkbp:isBasedOn | off-policy learning |
gptkbp:isConsidered | state-of-the-art |
gptkbp:isDocumentedIn | conference proceedings, journals, academic papers, technical reports |
gptkbp:isEvaluatedBy | simulated environments, robotic tasks, Atari_games |
gptkbp:isExaminedBy | tutorials, workshops, online courses, webinars, seminars |
gptkbp:isInfluencedBy | Q-learning, experience replay techniques, temporal difference learning |
gptkbp:isLocatedIn | gptkb:PyTorch, TensorFlow |
gptkbp:isPartOf | gptkb:DDPG, DQN, deep reinforcement learning, prioritized experience replay, standard experience replay |
gptkbp:isRelatedTo | goal-conditioned reinforcement learning |
gptkbp:isSupportedBy | theoretical analysis, empirical studies |
gptkbp:isUsedBy | improve exploration, optimize policies, train agents |
gptkbp:isUsedFor | transfer learning, multi-task learning, other_RL_algorithms |
gptkbp:isUsedIn | robotics, game playing |
gptkbp:isUtilizedIn | continuous action spaces, discrete action spaces |
gptkbp:mayHave | policy learning |
gptkbp:reduces | training time |
gptkbp:uses | experience replay |
gptkbp:wasAffecting | 2017 |
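The statements above describe Hindsight Experience Replay as building on off-policy learning, being related to goal-conditioned reinforcement learning, and enabling learning from failures via experience replay. Below is a minimal sketch of that goal-relabeling idea, assuming a hypothetical goal-conditioned transition layout and a user-supplied `compute_reward` callback; none of the names come from a specific library.

```python
import random
from collections import deque

class HindsightReplayBuffer:
    """Illustrative replay buffer with HER-style 'final' goal relabeling."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def store_episode(self, episode, compute_reward):
        # episode: list of dicts with keys
        # state, action, next_state, achieved_goal, goal (assumed layout).
        final_achieved = episode[-1]["achieved_goal"]
        for t in episode:
            # Original transition, evaluated against the intended goal.
            r = compute_reward(t["achieved_goal"], t["goal"])
            self.buffer.append(
                (t["state"], t["action"], r, t["next_state"], t["goal"])
            )
            # Hindsight transition: relabel with the goal the episode actually
            # reached, so even failed episodes yield informative samples.
            r_her = compute_reward(t["achieved_goal"], final_achieved)
            self.buffer.append(
                (t["state"], t["action"], r_her, t["next_state"], final_achieved)
            )

    def sample(self, batch_size):
        # Uniform sampling; an off-policy learner (e.g. DDPG or DQN, as listed
        # above) would train on these relabeled transitions.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Because the relabeled transitions are consumed exactly like ordinary replayed experience, this scheme slots into any off-policy learner, which is why the statements list DDPG and DQN alongside standard and prioritized experience replay.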