Deep Learning,
a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data - characterized as a buzzword, or a rebranding of neural networks. A deep neural network (DNN) is an ANN with multiple hidden layers of units between the input and output layers, which can be discriminatively trained with the standard backpropagation algorithm. Two common issues when naively trained are overfitting and computation time. While deep learning techniques have yielded another breakthrough in computer Go (after Monte-Carlo Tree Search), some trials in computer chess were promising as well, but until December 2017, less spectacular^{[1]}.
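As a concrete illustration of the definition above, multiple hidden layers trained with plain backpropagation, here is a minimal NumPy sketch; the layer sizes, the XOR toy task, and the learning rate are illustrative choices, not taken from any program discussed on this page:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Two hidden layers of units between the input and output layers.
sizes = [2, 8, 8, 1]
W = [rng.normal(0.0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]

# Toy task: XOR, which a network without hidden layers cannot represent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def forward(inputs):
    acts = [inputs]
    for Wi, bi in zip(W, b):
        acts.append(sigmoid(acts[-1] @ Wi + bi))
    return acts

def mse(pred):
    return float(np.mean((pred - y) ** 2))

initial = mse(forward(X)[-1])
for _ in range(5000):                        # plain full-batch gradient descent
    acts = forward(X)
    delta = (acts[-1] - y) * acts[-1] * (1.0 - acts[-1])   # output-layer error
    for i in reversed(range(len(W))):
        grad_W = acts[i].T @ delta / len(X)
        grad_b = delta.mean(axis=0)
        if i > 0:                            # propagate error before updating W[i]
            delta = (delta @ W[i].T) * acts[i] * (1.0 - acts[i])
        W[i] -= 1.0 * grad_W
        b[i] -= 1.0 * grad_b
final = mse(forward(X)[-1])
```

The same backward pass scales to the much deeper networks discussed below; what changes in practice is the amount of data, the regularization against overfitting, and the hardware used to make training time tolerable.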

## Go

Convolutional neural networks form a subclass of feedforward neural networks with special weight constraints: individual neurons are tiled in such a way that they respond to overlapping regions. Convolutional NNs are well suited for deep learning and are highly amenable to parallelization on GPUs^{[2]}. In 2014, two teams independently investigated whether deep convolutional neural networks could be used to directly represent and learn a move-evaluation function for the game of Go. Christopher Clark and Amos Storkey trained an 8-layer convolutional neural network by supervised learning from a database of human professional games, which, without any search, defeated the traditional search program Gnu Go in 86% of the games^{[3]}^{[4]}^{[5]}^{[6]}. In their paper Move Evaluation in Go Using Deep Convolutional Neural Networks^{[7]}, Chris J. Maddison, Aja Huang, Ilya Sutskever, and David Silver report that they trained a large 12-layer convolutional neural network in a similar way to beat Gnu Go in 97% of the games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move^{[8]}.

In 2015, a team affiliated with Google DeepMind around David Silver and Aja Huang, supported by Google researchers John Nham and Ilya Sutskever, built a Go-playing program dubbed AlphaGo^{[9]}, combining Monte-Carlo tree search with their 12-layer networks^{[10]}.
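The weight sharing and overlapping receptive fields described above can be shown with a plain NumPy convolution; the 9x9 "board" and the averaging kernel are toy stand-ins, not part of any cited network:

```python
import numpy as np

def conv2d_valid(board, kernel):
    """Slide one shared kernel over every overlapping region of the board.

    The same weights are applied at every location (weight sharing), and
    neighbouring output cells see overlapping input regions.
    """
    kh, kw = kernel.shape
    H, W = board.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(board[i:i + kh, j:j + kw] * kernel)
    return out

# A 9x9 toy "board" with a single stone: every 3x3 window that covers the
# stone responds, so one input feature excites a whole patch of outputs.
board = np.zeros((9, 9))
board[4, 4] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)
response = conv2d_valid(board, kernel)   # shape (7, 7)
```

Because every output cell reuses the same kernel, each sliding window is an independent dot product, which is exactly the structure that maps so well onto GPU parallelism.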

## Chess

## Giraffe & Zurichess

In 2015, Matthew Lai trained Giraffe's deep neural network by TD-Leaf^{[11]}. Zurichess by Alexandru Moșoi uses the TensorFlow library for automated tuning - in a two-layer neural network, the second layer is responsible for the tapered eval, blending middlegame and endgame scores according to the game phase^{[12]}.
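As a rough sketch of the tapered-eval blend such a second layer computes; the phase scale of 24 and the integer arithmetic are common engine conventions assumed here, not taken from Zurichess:

```python
def tapered_eval(mg_score, eg_score, phase, max_phase=24):
    """Blend middlegame and endgame scores by game phase.

    phase == max_phase at the opening (full material on the board),
    phase == 0 when only kings and pawns remain.
    """
    return (mg_score * phase + eg_score * (max_phase - phase)) // max_phase

# Three-quarters "middlegame": the blend sits between the two scores.
score = tapered_eval(120, 40, 18)   # (120*18 + 40*6) // 24 == 100
```

Learning the blend instead of hard-coding it lets the tuner decide how quickly middlegame terms should fade as material comes off the board.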

## DeepChess

In 2016, Omid E. David, Nathan S. Netanyahu, and Lior Wolf introduced DeepChess, obtaining grandmaster-level chess-playing performance with a learning method that incorporates two deep neural networks, trained using a combination of unsupervised pretraining and supervised training. The unsupervised training extracts high-level features from a given chess position, and the supervised training learns to compare two chess positions to select the more favorable one. In order to use DeepChess inside a chess program, a novel version of alpha-beta is used that does not require bounds but positions αpos and βpos^{[13]}.
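The comparison-based selection can be sketched as follows; the material counter below is a hypothetical stand-in for the trained network, but it is enough to show that pairwise preferences alone, without ever mapping a position to a numeric score, suffice to pick a move:

```python
def material(pos):
    """Hypothetical stand-in evaluation: count White's material in a
    toy position encoding (a string of piece letters)."""
    values = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}
    return sum(values.get(piece, 0) for piece in pos)

def prefer(a, b):
    """Stand-in for the comparison network: return the position judged
    more favorable of the two."""
    return a if material(a) >= material(b) else b

def best_position(candidates):
    # A running "winner" of pairwise comparisons picks the best successor
    # position; no numeric bound is ever produced, mirroring the idea of
    # an alpha-beta over positions rather than scores.
    best = candidates[0]
    for pos in candidates[1:]:
        best = prefer(best, pos)
    return best
```

In the paper's alpha-beta variant the same trick replaces the numeric α and β bounds with reference positions αpos and βpos that are compared against directly.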

## AlphaZero

In December 2017, the Google DeepMind team, with Matthew Lai involved, published their generalized AlphaZero algorithm, combining deep learning with Monte-Carlo Tree Search. AlphaZero can achieve, tabula rasa, superhuman performance in many challenging domains with relatively little training effort. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play in the games of chess and Shogi as well as Go, and convincingly defeated a world-champion program in each case^{[14]}.
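The way such a search weighs the network's move priors against visit counts is commonly expressed as a PUCT selection rule; the sketch below uses illustrative field names and a generic exploration constant, not AlphaZero's exact formula or constants:

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U: Q is the mean value from earlier
    visits, U weights the network prior P by how unexplored the child is."""
    total_visits = sum(ch["N"] for ch in children)

    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] else 0.0          # mean value
        u = c_puct * ch["P"] * math.sqrt(total_visits + 1) / (1 + ch["N"])
        return q + u

    return max(children, key=score)

children = [
    {"P": 0.6, "N": 10, "W": 4.0},   # high prior, already well explored
    {"P": 0.3, "N": 0,  "W": 0.0},   # unvisited: the U term dominates
    {"P": 0.1, "N": 5,  "W": 3.0},
]
choice = puct_select(children)       # the unvisited child is expanded next
```

Repeating this selection down the tree, then backing up the network's value estimate from the leaf, is what lets a learned evaluation steer the search without any hand-written domain knowledge.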

## See also

## Selected Publications

## 1965 ...

- (1965). Cybernetic Predicting Devices. Naukova Dumka
- (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, Vol. 1, No. 4

## 1980 ...

- Kunihiko Fukushima (1980). Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position. Biological Cybernetics, Vol. 36, pdf^{[15]}
- (1985). A Learning Algorithm for Boltzmann Machines. Cognitive Science, Vol. 9, No. 1, pdf
- (1986). Learning While Searching in Constraint-Satisfaction-Problems. AAAI 86, pdf^{[16]}

## 1990 ...

- (1991). Untersuchungen zu dynamischen neuronalen Netzen (Investigations of dynamic neural networks). Diploma thesis, TU Munich, advisor Jürgen Schmidhuber, pdf (German)^{[17]}
- (1997). Long short-term memory. Neural Computation, Vol. 9, No. 8, pdf^{[18]}

## 2000 ...

- (2000). Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications. Springer^{[19]}
- (2005). Co-Evolving Recurrent Neurons Learn Deep Memory POMDPs. GECCO 2005, pdf
- (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, Vol. 18, No. 7, pdf
- (2008). Mimicking Go Experts with Convolutional Neural Networks. ICANN 2008, pdf

## 2012 ...

- (2012). Deep Learning of Representations for Unsupervised and Transfer Learning. JMLR: Workshop on Unsupervised and Transfer Learning, 2011, pdf

2013

- (2013). On Layer-Wise Representations in Deep Neural Networks. Ph.D. Thesis, TU Berlin, advisor Klaus-Robert Müller
- (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602^{[20]}

2014

- (2014). Generative Adversarial Networks. arXiv:1406.2661v1
- (2014). Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello. EvoApplications 2014, Springer, Volume 8602
- (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409^{[21]}^{[22]}
- (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1 » Go
- (2014). Deep Learning in Neural Networks: An Overview. arXiv:1404.7828

## 2015 ...

- (2015). Convolutional Monte Carlo Rollouts in Go. arXiv:1512.03375 » Go, MCTS
- (2015). Neural networks and deep learning. Determination Press
- (2015). Giraffe: Using Deep Reinforcement Learning to Play Chess. M.Sc. thesis, Imperial College London, arXiv:1509.01549v1 » Giraffe
- (2015). Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games. arXiv:1509.06731
- (2015). Fast Algorithms for Convolutional Neural Networks. arXiv:1509.09308^{[23]}
- (2015). Human-level control through deep reinforcement learning. Nature, Vol. 518
- (2015). Better Computer Go Player with Neural Network and Long-term Prediction. arXiv:1511.06410, ICLR 2016^{[24]}^{[25]} » Go
- (2015). A Tutorial on Deep Learning - Part 1: Nonlinear Classifiers and The Backpropagation Algorithm. Google Brain, pdf^{[26]}
- (2015). A Tutorial on Deep Learning - Part 2: Autoencoders, Convolutional Neural Networks and Recurrent Neural Networks. Google Brain, pdf
- (2015). Deep Learning in Neural Networks: An Overview. Neural Networks, Vol. 61
- (2015). A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv:1506.00019v4
- (2015). Predicting Moves in Chess using Convolutional Neural Networks. pdf^{[27]}^{[28]}
- (2015). Deep Learning. Nature, Vol. 521^{[29]}
- (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385

2016

- (2016). 8-Bit Approximations for Parallelism in Deep Learning. arXiv:1511.04561v4, ICLR 2016
- (2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581
- (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529 » AlphaGo
- (2016). Using Deep Convolutional Neural Networks in Monte Carlo Tree Search. CG 2016
- (2016). AlphaGo: Combining Deep Neural Networks with Tree Search. CG 2016, Keynote Lecture
- (2016). An Empirical Study on Applying Deep Reinforcement Learning to the Game 2048. CG 2016
- (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint » DeepChess^{[30]}^{[31]}
- (2016). DNN-Buddies: A Deep Neural Network-Based Estimation Metric for the Jigsaw Puzzle Problem. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer^{[32]}
- (2016). Deep Learning. MIT Press
- (2016). Asynchronous Methods for Deep Reinforcement Learning. arXiv:1602.01783v2
- (2016). Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv:1603.01121^{[33]}
- (2016). Deep Learning Games. NIPS 2016
- (2016). Progressive Neural Networks. arXiv:1606.04671
- (2016). Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates. arXiv:1610.00633
- (2016). Learning to reinforcement learn. arXiv:1611.05763
- (2016). Deep Learning for Go. B.Sc. thesis, ETH Zurich

2017

- (2017). Residual Networks for Computer Go. IEEE Transactions on Computational Intelligence and AI in Games, Vol. PP, No. 99, pdf
- (2017). Deep Learning and Block Go. IJCNN 2017
- (2017). Multi-Labelled Value Networks for Computer Go. arXiv:1705.10701
- (2017). DeepStack: Expert-level artificial intelligence in heads-up no-limit poker. Science, Vol. 356, No. 6337
- (2017). Improved Policy Networks for Computer Go. Advances in Computer Games 15, pdf
- (2017). Deep Reinforcement Learning with Hidden Layers on Future States. Computer Games Workshop at IJCAI 2017, pdf
- (2017). Neural Fictitious Self-Play in Imperfect Information Games with Many Players. Computer Games Workshop at IJCAI 2017, pdf
- (2017). A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. arXiv:1711.00832
- (2017). Mastering the game of Go without human knowledge. Nature, Vol. 550, pdf^{[34]}
- (2017). Deep Reinforcement Learning that Matters. arXiv:1709.06560
- (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815 » AlphaZero
- (2017). Learning to Play Othello Without Human Knowledge. Stanford University, pdf » AlphaZero, MCTS, Othello^{[35]}

## Forum Posts

## 2014

## 2015 ...

^{[36]}

## 2016

^{[37]}

Re: Deep Learning Chess Engine ? by Alexandru Mosoi, CCC, July 21, 2016 » Zurichess

Re: Deep Learning Chess Engine ? by Matthew Lai, CCC, August 04, 2016 » Giraffe

^{[38]}

## 2017

Re: Is AlphaGo approach unsuitable to chess? by Peter Österlund, CCC, May 31, 2017 » Texel

Re: To TPU or not to TPU... by Rémi Coulom, CCC, December 16, 2017

## 2018

## External Links

## Networks

## Software

## Libraries

## Chess

^{[39]}^{[40]}

## Games

^{[41]}

## Music Generation

## Nvidia

## Reports & Blogs

^{[42]}^{[43]}^{[44]}^{[45]}

Texas Hold'em: AI is almost as good as humans at playing poker by Matt Burgess, Wired UK, March 30, 2016

^{[46]}

GitHub - suragnair/alpha-zero-general: A clean and simple implementation of a self-play learning algorithm based on AlphaGo Zero (any game, any framework!)

## Videos

^{[47]}

## References

- (2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409
- (2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1
- (2016). Mastering the game of Go with deep neural networks and tree search. Nature, Vol. 529
- (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint
- (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv:1712.01815
- (2015). Critique of Paper by "Deep Learning Conspiracy" (Nature 521 p 436)
- (1986). Learning While Searching in Constraint-Satisfaction-Problems. AAAI 86, pdf
- (2015). Predicting Moves in Chess using Convolutional Neural Networks. pdf
- (2015). Better Computer Go Player with Neural Network and Long-term Prediction. arXiv:1511.06410
- (2016). Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. arXiv:1603.01121
