Statements (28)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | deep reinforcement learning algorithm |
| gptkbp:application | gptkb:reinforcement_learning, gptkb:Atari_2600_games |
| gptkbp:basedOn | gptkb:Deep_Q-Network_(DQN) |
| gptkbp:citation | gptkb:Dueling_Network_Architectures_for_Deep_Reinforcement_Learning, https://arxiv.org/abs/1511.06581 |
| gptkbp:contribution | decoupling state value and action advantage estimation |
| gptkbp:developedBy | gptkb:Nando_de_Freitas, gptkb:Hado_van_Hasselt, gptkb:Marc_Lanctot, gptkb:Matteo_Hessel, gptkb:Tom_Schaul, gptkb:Ziyu_Wang |
| gptkbp:hasComponent | value function, advantage function |
| https://www.w3.org/2000/01/rdf-schema#label | Dueling DQN |
| gptkbp:improves | gptkb:DQN |
| gptkbp:language | gptkb:Python |
| gptkbp:platform | gptkb:TensorFlow, gptkb:PyTorch |
| gptkbp:proposedBy | separate value and advantage streams |
| gptkbp:publicationDate | 2016 |
| gptkbp:publishedIn | gptkb:arXiv |
| gptkbp:usedIn | gptkb:OpenAI_Baselines, gptkb:Stable_Baselines |
| gptkbp:bfsParent | gptkb:Deep_Q-Network, gptkb:Deep_Q-Network_(DQN) |
| gptkbp:bfsLayer | 6 |
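The gptkbp:contribution and gptkbp:hasComponent statements describe the dueling architecture's core idea: a state-value stream V(s) and an advantage stream A(s, a) combined into Q-values with the mean-subtraction aggregation from the cited paper, Q(s, a) = V(s) + A(s, a) − mean_a A(s, a). A minimal NumPy sketch of that aggregation (function name, shapes, and toy values are illustrative, not from the source):

```python
import numpy as np

def dueling_q_values(value, advantages):
    """Combine a state-value estimate V(s) with per-action advantages A(s, a)
    into Q-values using the Dueling DQN mean-subtraction trick:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes the V/A decomposition identifiable."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

# Toy example: one state, three actions.
v = np.array([[2.0]])             # V(s), shape (batch, 1)
a = np.array([[1.0, -1.0, 0.0]])  # A(s, a), shape (batch, n_actions)
q = dueling_q_values(v, a)        # → [[3., 1., 2.]]
```

In a full agent, `value` and `advantages` would be the outputs of two separate network heads sharing a common feature extractor, which is what "separate value and advantage streams" refers to.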