Statements (63)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:Artificial_Intelligence |
gptkbp:architecture | gptkb:neural_networks |
gptkbp:career_games_played | gptkb:Go, gptkb:chess_match, shogi |
gptkbp:competitors | gptkb:Stockfish |
gptkbp:developed_by | gptkb:Deep_Mind_Technologies |
gptkbp:evaluates | gptkb:Monte_Carlo_Tree_Search, self-play |
gptkbp:field_of_study | gptkb:strategy |
gptkbp:goal | maximize winning chances |
gptkbp:has_programs | gptkb:Monte_Carlo_Tree_Search |
https://www.w3.org/2000/01/rdf-schema#label | Deep Mind Alpha Zero |
gptkbp:impact | gptkb:AI_technology |
gptkbp:influenced | gptkb:neural_networks, gptkb:machine_learning, AI game playing |
gptkbp:initialization | random |
gptkbp:input_output | game state, move probabilities |
gptkbp:key_feature | parallel processing, resource management, scalability, strategic planning, performance evaluation, real-time decision making, adaptability, long-term planning, high efficiency, model training, data efficiency, fast learning, tactical play, state representation, action selection, end-to-end learning, generalization across games, multi-game capability, policy improvement, dynamic evaluation, algorithmic innovation, exploration vs. exploitation, no prior knowledge, value estimation, learning efficiency, complexity handling, computational power utilization, high-level reasoning, reward optimization |
gptkbp:notable_achievement | defeated top Go players, defeated Elmo in shogi, defeated Stockfish |
gptkbp:performance | superhuman |
gptkbp:provides_information_on | no human data |
gptkbp:publication | gptkb:Nature |
gptkbp:release_year | gptkb:2017 |
gptkbp:successor | gptkb:Alpha_Go_Zero |
gptkbp:training | several hours, self-play |
gptkbp:training_programs | simulated games |
gptkbp:type | reinforcement learning |
gptkbp:bfsParent | gptkb:Richard_Rapport |
gptkbp:bfsLayer | 6 |
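
The `gptkbp:architecture` and `gptkbp:input_output` rows describe a network that maps a game state to move probabilities plus a position evaluation. The sketch below illustrates that two-headed policy/value structure in PyTorch; the class name, layer sizes, and MLP trunk are illustrative assumptions (the published system uses a deep residual convolutional network), not the actual DeepMind implementation.

```python
import torch
import torch.nn as nn


class PolicyValueNet(nn.Module):
    """Hypothetical two-headed network: a shared trunk feeding a policy head
    (move probabilities) and a value head (expected outcome in [-1, 1])."""

    def __init__(self, state_dim=64, num_moves=32, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, num_moves)
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, state):
        h = self.trunk(state)
        move_probs = torch.softmax(self.policy_head(h), dim=-1)  # move probabilities
        value = torch.tanh(self.value_head(h))                   # scalar state evaluation
        return move_probs, value


if __name__ == "__main__":
    net = PolicyValueNet()
    probs, value = net(torch.randn(1, 64))   # a dummy "game state" vector
    print(probs.shape, value.item())
```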
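The `gptkbp:evaluates`, `gptkbp:training`, and `gptkbp:type` rows point to Monte Carlo Tree Search guided by the network's priors and refined through self-play reinforcement learning. Below is a minimal sketch of a PUCT-style search on a toy counting game, with a uniform stub standing in for the trained network; the game, constants, and function names are assumptions for illustration only, not DeepMind's implementation.

```python
import math


class Node:
    """One edge in the search tree; stores PUCT statistics."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy prior
        self.visits = 0           # N(s, a)
        self.value_sum = 0.0      # W(s, a)
        self.children = {}        # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


# Toy game (an assumption, not chess/Go/shogi): players alternately add
# 1 or 2 to a running total; whoever reaches exactly TARGET wins.
TARGET = 21

def legal_moves(total):
    return [m for m in (1, 2) if total + m <= TARGET]

def terminal_value(total):
    # From the perspective of the player to move: if the opponent has
    # just reached TARGET, the game is lost.
    return -1.0 if total == TARGET else None

def policy_value_stub(total):
    """Stand-in for the trained network: uniform priors, value 0."""
    moves = legal_moves(total)
    return {m: 1.0 / len(moves) for m in moves}, 0.0

def expand(node, total):
    priors, value = policy_value_stub(total)
    for move, p in priors.items():
        node.children[move] = Node(prior=p)
    return value

def puct(parent_visits, child, c_puct):
    u = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.q() + u

def mcts(root_total, num_simulations=400, c_puct=1.5):
    root = Node(prior=1.0)
    expand(root, root_total)

    for _ in range(num_simulations):
        node, total, path = root, root_total, []

        # Selection: follow the PUCT rule until reaching a leaf.
        while node.children:
            parent_visits = max(1, sum(c.visits for c in node.children.values()))
            move, child = max(node.children.items(),
                              key=lambda kv: puct(parent_visits, kv[1], c_puct))
            total += move
            node = child
            path.append(node)

        # Evaluation: terminal result, or expand the leaf with the stub.
        outcome = terminal_value(total)
        value = outcome if outcome is not None else expand(node, total)

        # Backup: flip the sign each ply, since players alternate.
        for n in reversed(path):
            value = -value
            n.visits += 1
            n.value_sum += value

    # Self-play move probabilities are proportional to visit counts.
    total_visits = sum(c.visits for c in root.children.values()) or 1
    return {m: c.visits / total_visits for m, c in root.children.items()}


if __name__ == "__main__":
    print(mcts(root_total=0))   # visit-count move probabilities for moves 1 and 2
```

In the full self-play loop these visit-count probabilities serve as training targets for the policy head, and the final game result trains the value head, starting from randomly initialized weights as the `gptkbp:initialization` row indicates.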