Statements (28)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:deep_reinforcement_learning_algorithm |
| gptkbp:application | gptkb:reinforcement_learning<br>gptkb:Atari_2600_games |
| gptkbp:basedOn | gptkb:Deep_Q-Network_(DQN) |
| gptkbp:citation | gptkb:Dueling_Network_Architectures_for_Deep_Reinforcement_Learning<br>https://arxiv.org/abs/1511.06581 |
| gptkbp:contribution | decoupling state value and action advantage estimation |
| gptkbp:developedBy | gptkb:Nando_de_Freitas<br>gptkb:Hado_van_Hasselt<br>gptkb:Marc_Lanctot<br>gptkb:Matteo_Hessel<br>gptkb:Tom_Schaul<br>gptkb:Ziyu_Wang |
| gptkbp:hasComponent | value function<br>advantage function |
| gptkbp:improves | gptkb:DQN |
| gptkbp:language | gptkb:Python |
| gptkbp:platform | gptkb:TensorFlow<br>gptkb:PyTorch |
| gptkbp:proposedBy | separate value and advantage streams |
| gptkbp:publicationDate | 2016 |
| gptkbp:publishedIn | gptkb:arXiv |
| gptkbp:usedIn | gptkb:OpenAI_Baselines<br>gptkb:Stable_Baselines |
| gptkbp:bfsParent | gptkb:Deep_Q-Network<br>gptkb:Deep_Q-Network_(DQN) |
| gptkbp:bfsLayer | 6 |
| https://www.w3.org/2000/01/rdf-schema#label | Dueling DQN |
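The `hasComponent` and `contribution` statements describe how Dueling DQN decouples a scalar state value V(s) from per-action advantages A(s, a), recombining them as Q(s, a) = V(s) + (A(s, a) − mean over actions of A), the identifiable aggregation from the cited paper (arXiv:1511.06581). A minimal NumPy sketch of just that aggregation step (function name and example numbers are illustrative, not from the source):

```python
import numpy as np

def dueling_aggregate(value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a')); subtracting the mean
    makes the V/A decomposition identifiable, per arXiv:1511.06581.
    """
    advantages = np.asarray(advantages, dtype=float)
    return value + (advantages - advantages.mean())

# Example: V(s) = 1.0 and advantages for three actions.
q = dueling_aggregate(1.0, [0.5, -0.5, 0.0])
# The advantages already have zero mean, so Q = [1.5, 0.5, 1.0].
```

In the full architecture, `value` and `advantages` come from two separate network streams sharing a common feature extractor; this sketch covers only their combination into Q-values.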