David Silver

[[image:Dave_Silver.jpg link="http://www.cs.ucl.ac.uk/staff/D.Silver/web/Home.html"]]

**David Silver**,
a British computer scientist at Google DeepMind and co-author of AlphaGo and AlphaZero. Previously he was a researcher at [|University College London] (since 2010), a postdoc at the Massachusetts Institute of Technology, a Ph.D. student and postdoc at the University of Alberta, and [|CTO] of [|Elixir Studios] and lead programmer on the PC [|strategy game] [|Republic: the Revolution]. His research interests cover simulation-based search, reinforcement learning, and cooperative [|pathfinding].

=Photos=
[[image:DaveSilverGerryTesauro.jpg link="http://www-all.cs.umass.edu/~gdk/msrl/index.html"]]
David Silver introducing Gerry Tesauro

=See also=
 * AlphaGo
 * AlphaZero
 * Reinforcement Learning Course

=Selected Publications=

==2006 ...==

 * David Silver (**2006**). //Cooperative Pathfinding//. In AI Game Programming Wisdom 3, pages 99–111. Charles River Media, [|pdf]
**2007**
 * David Silver, Richard Sutton, Martin Müller (**2007**). //Reinforcement learning of local shape in the game of Go//. [|20th IJCAI], [|pdf], [|pdf]
 * Sylvain Gelly, David Silver (**2007**). //Combining Online and Offline Knowledge in UCT.// [|pdf]
**2008**
 * David Silver, Richard Sutton and Martin Müller (**2008**). //Sample-Based Learning and Search with Permanent and Transient Memories//. In Proceedings of the 25th International Conference on Machine Learning, [|pdf]
 * Sylvain Gelly, David Silver (**2008**). //Achieving Master Level Play in 9 x 9 Computer Go.// [|pdf]
**2009**
 * Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (**2009**). //Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation//. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). [|pdf]
 * Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (**2009**). //Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation//. In Advances in Neural Information Processing Systems 22, Vancouver, BC, December 2009. MIT Press. [|pdf]
 * Joel Veness, David Silver, William Uther, Alan Blair (**2009**). //[|Bootstrapping from Game Tree Search]//. [|pdf]
 * David Silver (**2009**). //Reinforcement Learning and Simulation-Based Search//. Ph.D. thesis, University of Alberta, [|pdf]
 * Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver (**2009**). //A Monte Carlo AIXI Approximation//, [|pdf]
 * David Silver, Gerald Tesauro (**2009**). //Monte-Carlo Simulation Balancing//. [|ICML 2009], [|pdf]

==2010 ...==

 * Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver (**2010**). //Reinforcement Learning via AIXI Approximation//. Association for the Advancement of Artificial Intelligence (AAAI), [|pdf]
**2011**
 * Sylvain Gelly, David Silver (**2011**). //Monte-Carlo tree search and rapid action value estimation in computer Go//. [|Artificial Intelligence], Vol. 175, No. 11
**2012**
 * Sylvain Gelly, Marc Schoenauer, Michèle Sebag, Olivier Teytaud, Levente Kocsis, David Silver, Csaba Szepesvári (**2012**). //[|The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions]//. Communications of the ACM, Vol. 55, No. 3, [|pdf preprint]
 * Arthur Guez, David Silver, Peter Dayan (**2012**). //Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search//. [|NIPS 2012], [|pdf]
**2013**
 * Arthur Guez, David Silver, Peter Dayan (**2013**). //Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search//. [|Journal of Artificial Intelligence Research], Vol. 48, [|pdf]
 * David Silver, Richard Sutton, Martin Müller (**2013**). //Temporal-Difference Search in Computer Go//. Proceedings of the [|ICAPS-13 Workshop on Planning and Learning], [|pdf]
 * Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller (**2013**). //Playing Atari with Deep Reinforcement Learning//. [|arXiv:1312.5602]
**2014**
 * Tom Schaul, Ioannis Antonoglou, David Silver (**2014**). //Unit Tests for Stochastic Optimization//. [|arXiv:1312.6055v3]
 * Arthur Guez, David Silver, Peter Dayan (**2014**). //Better Optimism By Bayes: Adaptive Planning with Rich Models//. [|arXiv:1402.1958v1]
 * Arthur Guez, Nicolas Heess, David Silver, Peter Dayan (**2014**). //Bayes-Adaptive Simulation-based Search with Value Function Approximation//. [|NIPS 2014], [|pdf]
 * Chris J. Maddison, Aja Huang, Ilya Sutskever, David Silver (**2014**). //Move Evaluation in Go Using Deep Convolutional Neural Networks//. [|arXiv:1412.6564v1] » DCNN in Go
 * Johannes Heinrich, David Silver (**2014**). //[|Self-Play Monte-Carlo Tree Search in Computer Poker]//. AAAI-14 Workshop

==2015 ...==

 * Johannes Heinrich, Marc Lanctot, David Silver (**2015**). //Fictitious Self-Play in Extensive-Form Games//. [|JMLR: W&CP, Vol. 37], [|pdf]
 * Johannes Heinrich, David Silver (**2015**). //Smooth UCT Search in Computer Poker//. IJCAI 2015, [|pdf]
 * Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis (**2015**). //[|Human-level control through deep reinforcement learning]//. [|Nature], Vol. 518
 * Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Veda Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver (**2015**). //Massively Parallel Methods for Deep Reinforcement Learning//. [|arXiv:1507.04296]
 * Timothy Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra (**2015**). //Continuous Control with Deep Reinforcement Learning//. [|arXiv:1509.02971]
 * Hado van Hasselt, Arthur Guez, David Silver (**2015**). //Deep Reinforcement Learning with Double Q-learning//. [|arXiv:1509.06461]
 * Tom Schaul, John Quan, Ioannis Antonoglou, David Silver (**2015**). //Prioritized Experience Replay//. [|arXiv:1511.05952]
 * Nicolas Heess, Jonathan J. Hunt, Timothy Lillicrap, David Silver (**2015**). //Memory-based control with recurrent neural networks//. [|arXiv:1512.04455]
**2016**
 * David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (**2016**). //[|Mastering the game of Go with deep neural networks and tree search]//. [|Nature], Vol. 529 » AlphaGo
 * Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu (**2016**). //Asynchronous Methods for Deep Reinforcement Learning//. [|arXiv:1602.01783v2]
 * Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu (**2016**). //Reinforcement Learning with Unsupervised Auxiliary Tasks//. [|arXiv:1611.05397v1]
 * Hado van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver (**2016**). //Learning values across many orders of magnitude//. [|arXiv:1602.07714v2], [|NIPS 2016]
 * Johannes Heinrich, David Silver (**2016**). //Deep Reinforcement Learning from Self-Play in Imperfect-Information Games//. [|arXiv:1603.01121]
**2017**
 * David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis (**2017**). //[|Mastering the game of Go without human knowledge]//. [|Nature], Vol. 550
 * Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel (**2017**). //A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning//. [|arXiv:1711.00832]
 * David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis (**2017**). //Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm//. [|arXiv:1712.01815] » AlphaZero

=External Links=
 * [|AlphaGo Zero: Discovering new knowledge] by David Silver, [|YouTube] Video
 [[media type="youtube" key="WXHFqTvfFSw"]]
 * [|David Silver homepage]
 * [|David Silver - Google Scholar Citations]
 * [|David M. Silver] from Microsoft Academic Search
 * [|Advanced Topics: RL] by David Silver
 * [|Monte-Carlo Simulation Balancing - videolectures.net]
 * [|AlphaGo's next move] by Demis Hassabis and David Silver, DeepMind, May 27, 2017
 * [|AlphaGo Zero: Learning from scratch] by Demis Hassabis and David Silver, DeepMind, October 18, 2017
