Deep Learning,
a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data, sometimes characterized as a buzzword or a rebranding of neural networks. A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers of units between the input and output layers, which can be discriminatively trained with the standard backpropagation algorithm. Two common issues with naive training are overfitting and computation time. While deep learning techniques yielded another breakthrough in computer Go (after Monte-Carlo Tree Search), some trials in computer chess were promising as well but, until December 2017, less spectacular.
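As a rough illustration of a network with multiple hidden layers trained by backpropagation, here is a minimal sketch in plain Python. The layer sizes, the tanh activation, the squared-error loss, the omission of bias terms, and all function names are assumptions chosen for brevity; none of them are taken from the programs discussed on this page.

```python
import math
import random

def make_network(sizes, seed=0):
    """Random weights; weights[k][j][i] links unit i of layer k to unit j of layer k+1."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(weights, x):
    """Return the activations of every layer, input first and output last."""
    acts = [list(x)]
    for W in weights:
        prev = acts[-1]
        # Each unit applies tanh to the weighted sum of the previous layer (no bias).
        acts.append([math.tanh(sum(w * a for w, a in zip(row, prev))) for row in W])
    return acts

def loss(weights, x, target):
    """Squared error between the network output and the target."""
    out = forward(weights, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))

def backprop(weights, x, target):
    """Gradient of the loss with respect to every weight, same shape as weights."""
    acts = forward(weights, x)
    # Error signal at the output layer: (o - t) times the tanh derivative 1 - o^2.
    delta = [(o - t) * (1.0 - o * o) for o, t in zip(acts[-1], target)]
    grads = [None] * len(weights)
    for layer in range(len(weights) - 1, -1, -1):
        prev = acts[layer]
        grads[layer] = [[d * a for a in prev] for d in delta]
        if layer > 0:
            W = weights[layer]
            # Push the error signal back through the weights and the tanh of the layer below.
            delta = [sum(W[j][i] * delta[j] for j in range(len(W))) * (1.0 - prev[i] ** 2)
                     for i in range(len(prev))]
    return grads

def sgd_step(weights, x, target, lr=0.05):
    """One gradient-descent update; returns a new weight structure."""
    grads = backprop(weights, x, target)
    return [[[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, G)]
            for W, G in zip(weights, grads)]

if __name__ == "__main__":
    net = make_network([3, 4, 4, 1])          # two hidden layers of four units each
    x, target = [0.5, -0.2, 0.1], [0.3]
    before = loss(net, x, target)
    for _ in range(100):
        net = sgd_step(net, x, target)
    print("loss before:", before, "after:", loss(net, x, target))
```

Fitting a single example like this also hints at the overfitting issue mentioned above: the loss on that one input can be driven down without the network learning anything general.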
Go
Convolutional neural networks form a subclass of feedforward neural networks with special weight constraints: individual neurons are tiled in such a way that they respond to overlapping regions. Convolutional NNs are suited for deep learning and are highly suitable for parallelization on GPUs [2]. In 2014, two teams independently investigated whether deep convolutional neural networks could be used to directly represent and learn a move evaluation function for the game of Go. Christopher Clark and Amos Storkey trained an 8-layer convolutional neural network by supervised learning from a database of human professional games, which, without any search, defeated the traditional search program GNU Go in 86% of the games [3] [4] [5] [6]. In their paper Move Evaluation in Go Using Deep Convolutional Neural Networks [7], Chris J. Maddison, Aja Huang, Ilya Sutskever, and David Silver report that they trained a large 12-layer convolutional neural network in a similar way, which beat GNU Go in 97% of the games and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move [8].
In 2015, a team affiliated with Google DeepMind around David Silver and Aja Huang, supported by Google researchers John Nham and Ilya Sutskever, built a Go playing program dubbed AlphaGo [9], combining Monte-Carlo tree search with their 12-layer networks [10].
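The weight constraint described at the start of this section, where the same small set of weights is tiled across overlapping regions, can be sketched as a single convolution filter sliding over a board. This is only an illustration under assumptions: the function name `conv2d_same`, the single input channel, the zero padding at the board edge, and the example boards are all hypothetical and are not the architecture of any of the networks cited above.

```python
def conv2d_same(board, kernel):
    """Slide one square kernel over a 2D board, zero-padding the edges.

    Every output cell is computed with the same kernel weights, applied to the
    neighborhood centered on that cell, so neighboring outputs see overlapping regions.
    """
    rows, cols = len(board), len(board[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(k):
                for dc in range(k):
                    rr, cc = r + dr - pad, c + dc - pad
                    # Cells outside the board contribute nothing (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[dr][dc] * board[rr][cc]
            out[r][c] = total
    return out

def single_stone(size, row, col):
    """An empty size x size board with one stone, encoded as 1.0."""
    board = [[0.0] * size for _ in range(size)]
    board[row][col] = 1.0
    return board

if __name__ == "__main__":
    kernel = [[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6],
              [0.7, 0.8, 0.9]]
    for line in conv2d_same(single_stone(5, 2, 2), kernel):
        print(line)
```

Because the weights are shared, moving the stone by one point moves the filter response by one point as well, and the 3x3 filter has nine parameters regardless of board size. Each output cell is also independent of the others, which is one reason such layers map well onto GPUs.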
Chess
Giraffe & Zurichess
In 2015, Matthew Lai trained Giraffe's deep neural network with TD-Leaf [11]. Zurichess by Alexandru Moșoi uses the TensorFlow library for automated tuning: in a two-layer neural network, the second layer is responsible for a tapered eval that phases between endgame and middlegame scores [12].
DeepChess
In 2016, Omid E. David, Nathan S. Netanyahu, and Lior Wolf introduced DeepChess, which obtains grandmaster-level chess playing performance using a learning method that incorporates two deep neural networks, trained with a combination of unsupervised pretraining and supervised training. The unsupervised training extracts high-level features from a given chess position, and the supervised training learns to compare two chess positions and select the more favorable one. To use DeepChess inside a chess program, a novel version of alpha-beta is used that does not require numeric bounds but instead keeps positions αpos and βpos [13].
AlphaZero
In December 2017, the Google DeepMind team, with Matthew Lai involved, published a paper on their generalized AlphaZero algorithm, which combines deep learning with Monte-Carlo Tree Search. AlphaZero can achieve, tabula rasa, superhuman performance in many challenging domains with some training effort. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play in chess and shogi as well as Go, and convincingly defeated a world-champion program in each case [14].
See also
Selected Publications
1965 ...
1980 ...
1990 ...
2000 ...
2012 ...
- Yoshua Bengio (2012). Deep Learning of Representations for Unsupervised and Transfer Learning. JMLR: Workshop on Unsupervised and Transfer Learning, 2011, pdf
2013
- Grégoire Montavon (2013). On Layer-Wise Representations in Deep Neural Networks. Ph.D. Thesis, TU Berlin, advisor Klaus-Robert Müller
- Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 [20]
2014
2015 ...
- Peter H. Jin, Kurt Keutzer (2015). Convolutional Monte Carlo Rollouts in Go. arXiv:1512.03375 » Go, MCTS
- Michael Nielsen (2015). Neural networks and deep learning. Determination Press
- Matthew Lai (2015). Giraffe: Using Deep Reinforcement Learning to Play Chess. M.Sc. thesis, Imperial College London, arXiv:1509.01549v1 » Giraffe
- Nikolai Yakovenko, Liangliang Cao, Colin Raffel, James Fan (2015). Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games. arXiv:1509.06731
- Andrew Lavin, Scott Gray (2015). Fast Algorithms for Convolutional Neural Networks. arXiv:1509.09308 [23]
- Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis (2015). Human-level control through deep reinforcement learning. Nature, Vol. 518
- Yuandong Tian, Yan Zhu (2015). Better Computer Go Player with Neural Network and Long-term Prediction. arXiv:1511.06410, ICLR 2016 [24] [25] » Go
- Quoc V. Le (2015). A Tutorial on Deep Learning - Part 1: Nonlinear Classifiers and The Backpropagation Algorithm. Google Brain, pdf [26]
- Quoc V. Le (2015). A Tutorial on Deep Learning - Part 2: Autoencoders, Convolutional Neural Networks and Recurrent Neural Networks. Google Brain, pdf
- Jürgen Schmidhuber (2015). Deep Learning in Neural Networks: An Overview. Neural Networks, Vol. 61
- Zachary C. Lipton, John Berkowitz, Charles Elkan (2015). A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv:1506.00019v4
- Barak Oshri, Nishith Khandwala (2015). Predicting Moves in Chess using Convolutional Neural Networks. pdf [27] [28]
- Yann LeCun, Yoshua Bengio, Geoffrey E. Hinton (2015). Deep Learning. Nature, Vol. 521 [29]
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385
2016
- Tim Dettmers (2016). 8-Bit Approximations for Parallelism in Deep Learning. arXiv:1511.04561v4, ICLR 2016
- Ziyu Wang, Nando de Freitas, Marc Lanctot (2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581
- David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529 » AlphaGo
- Tobias Graf, Marco Platzner (2016). Using Deep Convolutional Neural Networks in Monte Carlo Tree Search. CG 2016
- Aja Huang (2016). AlphaGo: Combining Deep Neural Networks with Tree Search. CG 2016, Keynote Lecture
- Hung Guei, Tinghan Wei, Jin-Bo Huang, I-Chen Wu (2016). An Empirical Study on Applying Deep Reinforcement Learning to the Game 2048. CG 2016
- Omid E. David, Nathan S. Netanyahu, Lior Wolf (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint » DeepChess [30] [31]
- Dror Sholomon, Omid E. David, Nathan S. Netanyahu (2016). DNN-Buddies: A Deep Neural Network-Based Estimation Metric for the Jigsaw Puzzle Problem. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer [32]
- Ian Goodfellow, Yoshua Bengio, Aaron Courville (2016). Deep Learning. MIT Press
- Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu (2016). Asynchronous Methods for Deep Reinforcement Learning. arXiv:1602.01783v2
- Johannes Heinrich, David Silver (2016). Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv:1603.01121 [33]
- Dale Schuurmans, Martin Zinkevich (2016). Deep Learning Games. NIPS 2016
- Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, Raia Hadsell (2016). Progressive Neural Networks. arXiv:1606.04671
- Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine (2016). Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates. arXiv:1610.00633
- Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Rémi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick (2016). Learning to reinforcement learn. arXiv:1611.05763
- Jonathan Rosenthal (2016). Deep Learning for Go. B.Sc. thesis, ETH Zurich
2017
Forum Posts
2014
2015 ...
- Who introduced the term “deep learning” to the field of Machine Learning by Jürgen Schmidhuber, Google+, March 18, 2015 [36]
- *First release* Giraffe, a new engine based on deep learning by Matthew Lai, CCC, July 08, 2015 » Giraffe
2016
- Chess position evaluation with convolutional neural network in Julia by Kamil Czarnogorski, Machine learning with Julia and python, April 02, 2016 [37]
- Deep Learning Chess Engine ? by Eren Yavuz, CCC, July 21, 2016
  - Re: Deep Learning Chess Engine ? by Alexandru Mosoi, CCC, July 21, 2016 » Zurichess
  - Re: Deep Learning Chess Engine ? by Matthew Lai, CCC, August 04, 2016 » Giraffe [38]
- Neuronet plus conventional approach combined? by Rasmus Althoff, CCC, September 02, 2016
- DeepChess: Another deep-learning based chess program by Matthew Lai, CCC, October 17, 2016 » DeepChess
- The scaling of Deep Learning MCTS Go engines by Kai Laskos, CCC, October 23, 2016 » Deep Learning, Go, MCTS
2017
- No one really knows how the most advanced algorithms work by Daniel José Queraltó, CCC, April 11, 2017
- How far away are we from deep learning Stockfish, Komodo,... by John Margusen, CCC, May 19, 2017
- Is AlphaGo approach unsuitable to chess? by Mel Cooper, CCC, May 27, 2017 » AlphaGo, Giraffe
  - Re: Is AlphaGo approach unsuitable to chess? by Peter Österlund, CCC, May 31, 2017 » Texel
- Neural nets for Go - chain pooling? by David Wu, Computer Go Archive, August 18, 2017
- We are doomed - AlphaGo Zero, learning only from basic rules by Vincent Lejeune, CCC, October 18, 2017
- AlphaGo Zero by Alberto Sanjuan, CCC, October 19, 2017
- Zero performance by Gian-Carlo Pascutto, Computer Go Archive, October 20, 2017 » AlphaGo
- Neural networks for chess position evaluation- request by Kamil Czarnogorski, CCC, November 13, 2017
- Google's AlphaGo team has been working on chess by Peter Kappler, CCC, December 06, 2017 » AlphaZero
- Historic Milestone: AlphaZero by Miguel Castanuela, CCC, December 06, 2017
- An AlphaZero inspired project by Truls Edvard Stokke, CCC, December 14, 2017 » AlphaZero
- To TPU or not to TPU... by Srdja Matovic, CCC, December 16, 2017
  - Re: To TPU or not to TPU... by Rémi Coulom, CCC, December 16, 2017
2018
External Links
Networks
Software
Libraries
Chess
Games
Music Generation
Nvidia
Reports & Blogs
Texas Hold'em: AI is almost as good as humans at playing poker by Matt Burgess, Wired UK, March 30, 2016
GitHub - suragnair/alpha-zero-general: A clean and simple implementation of a self-play learning algorithm based on AlphaGo Zero (any game, any framework!)
Videos
References