Learning is the process of acquiring new knowledge, which involves synthesizing different types of information. Machine learning, as an aspect of computer chess programming, deals with algorithms that allow a program to change its behavior based on data. Such data arises, for instance, from games played against a variety of opponents, taking into account the final outcome and/or the game record, for example as a history score chart indexed by ply. Related to machine learning is evolutionary computation with its sub-areas of genetic algorithms and genetic programming, which mimic the process of natural evolution, as further mentioned in automated tuning. The process of learning often implies understanding, perception or reasoning. So-called rote learning avoids understanding and focuses on memorization. Inductive learning takes examples and generalizes rather than starting with existing knowledge. Deductive learning takes abstract concepts to make sense of examples ^{[1]} ^{[2]}.

Learning inside a Chess Program

Learning inside a chess program may address several disjoint issues. A persistent hash table remembers "important" positions from earlier games, together with their exact scores, for use inside the search ^{[3]}, so that worse positions may be avoided in advance. Opening book learning appends successful novelties to the book or modifies the probability of already stored moves based on the outcome of a game ^{[4]}. Another application is learning the evaluation weights of various features, for instance piece values ^{[5]}, piece-square values ^{[6]} or mobility. Programs may also learn to control search ^{[7]} or time usage ^{[8]}.
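As an illustration of the opening book case, the following is a minimal sketch rather than the scheme of any particular program. The `BookEntry` and `OpeningBook` classes, the multiplicative update and the `rate` and `floor` parameters are all assumptions made for this example. A move that was already in the book gains weight after a win and loses weight after a loss, and an unknown move is appended only if the game was won.

```python
from dataclasses import dataclass, field


@dataclass
class BookEntry:
    """One candidate move in a book position (hypothetical structure)."""
    move: str
    weight: float = 1.0  # relative probability of being chosen


@dataclass
class OpeningBook:
    # Maps a position key (for example a FEN string) to its candidate moves.
    positions: dict = field(default_factory=dict)

    def learn(self, line, result, rate=0.2, floor=0.05):
        """Adjust weights along one played line.

        line   -- list of (position_key, move) pairs played by the program
        result -- 1.0 win, 0.5 draw, 0.0 loss, from the program's point of view
        """
        for key, move in line:
            entries = self.positions.setdefault(key, [])
            entry = next((e for e in entries if e.move == move), None)
            if entry is None:
                # A novelty is appended only when it was successful.
                if result > 0.5:
                    entries.append(BookEntry(move))
                continue
            # Wins scale the weight up, losses scale it down, draws keep it.
            factor = 1.0 + rate * (2.0 * result - 1.0)
            entry.weight = max(floor, entry.weight * factor)

    def probabilities(self, key):
        """Normalize the weights of one position into selection probabilities."""
        entries = self.positions.get(key, [])
        total = sum(e.weight for e in entries)
        return {e.move: e.weight / total for e in entries} if total else {}


book = OpeningBook({"startpos": [BookEntry("e2e4"), BookEntry("d2d4")]})
book.learn([("startpos", "e2e4")], result=0.0)  # a loss after e2e4
book.learn([("startpos", "g1f3")], result=1.0)  # a winning novelty
probs = book.probabilities("startpos")
```

The `floor` keeps a move from vanishing after a single bad game, so that it can still be retried occasionally; a real book would typically also store game counts and scores.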

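Learning Paradigms

There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of neural network architecture can be employed in any of those tasks.

Supervised Learning

Supervised learning is learning from examples provided by a knowledgeable external supervisor. In machine learning, supervised learning is a technique for inferring a function from training data. The training data consist of pairs of input objects and desired outputs, for instance, in computer chess, a sequence of positions associated with the outcome of a game ^{[9]}.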
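To make the idea of input and output pairs concrete, here is a small sketch that fits evaluation weights to game outcomes with logistic regression trained by stochastic gradient descent. The technique, the three material features and the toy samples are assumptions chosen for illustration; the text above does not prescribe a particular method.

```python
import math


def predict(weights, features):
    """Expected score in (0, 1): a linear evaluation passed through a sigmoid."""
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def train(samples, n_features, rate=0.1, epochs=2000):
    """Fit weights by stochastic gradient descent on the logistic loss.

    samples -- list of (features, outcome) pairs; outcome is 1.0 for a win
               and 0.0 for a loss of the side being evaluated.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, outcome in samples:
            error = predict(weights, features) - outcome
            for i, f in enumerate(features):
                weights[i] -= rate * error * f
    return weights


# Hypothetical training set: (pawn, knight, rook) material differences
# paired with the game outcome that the supervisor provides as a label.
samples = [
    ((1, 0, 0), 1.0), ((-1, 0, 0), 0.0),
    ((0, 1, 0), 1.0), ((0, -1, 0), 0.0),
    ((0, 0, 1), 1.0), ((0, 0, -1), 0.0),
    ((2, -1, 0), 0.0), ((-2, 1, 0), 1.0),  # a knight beat two pawns
    ((0, 1, -1), 0.0), ((0, -1, 1), 1.0),  # a rook beat a knight
]
weights = train(samples, n_features=3)
```

On this toy data the learned weights end up ordered pawn, knight, rook, because that ordering is the only way to explain every labeled example. Real training sets are far larger and noisier, and usually include draws.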

Unsupervised Learning

Unsupervised machine learning seems much harder: the goal is to have the computer learn how to do something that we don't tell it how to do. The learner is given only unlabeled examples, for instance a sequence of positions from a game in progress whose final result is still unknown. A form of reinforcement learning can be used for unsupervised learning, where an agent bases its actions on previous rewards and punishments without necessarily learning anything about the exact ways in which its actions affect the world. Clustering is another method of unsupervised learning.
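Clustering can be sketched with plain k-means, which groups unlabeled feature vectors without ever seeing a game result. The two features, the data and the choice of seeding the centroids with the first `k` points are assumptions made to keep the example short and deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


# Hypothetical unlabeled data: (material balance in pawns, mobility difference).
positions = [(0.1, 2.0), (4.9, 11.0), (0.0, 1.0), (5.2, 9.0), (-0.2, 3.0), (5.0, 10.0)]
centroids, labels = kmeans(positions, k=2)
```

Here the balanced positions and the clearly better positions fall into separate clusters, although no label ever told the algorithm which is which.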

Reinforcement Learning

Reinforcement learning is defined not by characterizing learning methods, but by characterizing a learning problem. Reinforcement learning is learning what to do - how to map situations to actions - so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. The reinforcement learning problem is deeply indebted to the idea of Markov decision processes (MDPs) from the field of optimal control.
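A minimal sketch of this problem setting is tabular Q-learning on a tiny Markov decision process. The five-state line world, the reward of 1 at the goal, the uniformly random exploration and all parameter values are assumptions for illustration. The learner is never told that moving right is correct; it finds out by trying both actions and propagating the reward backwards.

```python
import random

N_STATES = 5        # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Deterministic toy environment: reward 1 only when the goal is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Learn action values from experience gathered by random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(2)  # explore uniformly; Q-learning is off-policy
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrap from the best known value of the successor state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            if done:
                break
            state = nxt
    return q


q = q_learning()
# Greedy policy for the non-terminal states: index 1 means "step right".
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

The discount factor `gamma` makes states closer to the goal more valuable, which is what lets the greedy policy prefer the shortest path.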



## Selected Publications

^{[10]}## 1940 ...

1942).Some observations on the simple neuron circuit. Bulletin of Mathematical Biology, Vol. 4, No. 31943).A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biology, Vol. 5, No. 11949).The Organization of Behavior. Wiley & Sons## 1950 ...

1951)Representation of Events in Nerve Nets and Finite Automata. RM-704, RAND paper, pdf, reprinted inClaude Shannon, John McCarthy (eds.) (

1956).Automata Studies. Annals of Mathematics Studies, No. 341951).Machines which can learn. American Scientist, 39:711-7161952).On Game Learning Machines. The Scientific Monthly, Vol. 74, No. 4, April 19521953).Chess. part of the collectionDigital Computers Applied to Gamesin Bertram Vivian Bowden (editor), Faster Than Thought, a symposium on digital computing machines, reprinted 1988 in Computer Chess Compendium, reprinted inAlan Turing, Jack Copeland (editor) (

2004).The Essential Turing, Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus The Secrets of Enigma. Oxford University Press, amazon, google books1954).Neural Nets and the Brain Model Problem. Ph.D. dissertation, Princeton University## 1955 ...

1956).Probabilistic Logic and the Synthesis of Reliable Organisms From Unreliable Components. inClaude Shannon, John McCarthy (eds.) (

1956).Automata Studies. Annals of Mathematics Studies, No. 34, pdf1957).The Perceptron - a Perceiving and Recognizing Automaton. Report 85-460-1, Cornell Aeronautical Laboratory^{[11]}1959).Imitation of Pattern Recognition and Trial-and-error Learning in a Conditional Probability Computer. Reviews of Modern Physics, Vol. 31, April 1959, pp. 546-548^{[12]}^{[13]}1959).Some Studies in Machine Learning Using the Game of Checkers. IBM Journal July 1959 » Checkers1959).An Information Processing Theory of Verbal Learning. RAND Paper## 1960 ...

1960).Information Theories of Human Verbal Learning. Ph.D. thesis, Carnegie Mellon University, advisor Herbert Simon1961).The Simulation of Verbal Learning Behavior. Proceedings Western Joint Conference, Vol. 191961).Performance of a Reading Task by an Elementary Perceiving and Memorizing Program. RAND Paper, pdf1961).Trial and Error. Penguin Science Survey, pdf1962).A Theory of the Serial Position Effect. British Journal of Psychology, Vol. 53, 307-32, pdf1962).Concept Learning: An Information Processing Problem. Wiley. google books1962).Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books1963).Learning, Generality and Problem Solving. Memorandum RM-3285-1-PR pdf1964).An Information-processing Theory of Some Effects of Similarity, Familiarization, and Meaningfulness in Verbal Learning. Journal of Verbal Learning and Verbal Behavior, Vol. 3, No. 5, pdf## 1965 ...

1965).A multipurpose Theorem Proving Heuristic Program that learns. IFIP Congress 65, Vol. 21966).Game Playing and Game Learning Automata.Advances in Programming and Non-Numerical Computation, Leslie Fox (ed.), pp. 183-200. Oxford, Pergamon. » Includes Appendix:Rules of SOMACby John Maynard Smith, introduces Expectiminimax tree^{[14]}1966).Thoughts on the Development of Computer Learning Programs. Defense Technical Information Center1966).A new Machine-Learning Technique applied to the Game of Checkers. MIT, Project MAC, MAC-M-2931967).Some Studies in Machine Learning. Using the Game of Checkers. II-Recent Progress. pdf1969).Perceptrons.^{[15]}^{[16]}## 1970 ...

1970).A Pattern Recognition Program which uses a Geometry-Preserving Representation of Features. Technical Report #85, pdf1971).On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. Theory of Probability and its Applications, Vol. 16, No. 21972).Brain Function and Adaptive Systems - A Heterostatic Theory. Air Force Cambridge Research Laboratories, Special Reports, No. 133, pdf1972).Perceptrons: An Introduction to Computational Geometry. The MIT Press, 2nd edition with corrections1973).A Simulation of Memory for Chess Positions. Cognitive Psychology, Vol. 5, pp. 29-46. pdf1974).A Comparison and Evaluation of Three Machine Learning Procedures as Applied to the Game of Checkers. Artificial Intelligence, Vol. 5, No. 2 » Checkers## 1975 ...

1976).A Program to Learn to Play Chess.Pattern Recognition and Artificial Intelligence, pp. 399-419. Academic Press Ltd. London, UK. ISBN 0-12-170950-7.1976).Realization of a Program Learning to Find Combinations at Chess.Computer Oriented Learning Processes (ed. J. Simon). Noordhoff, Groningen, The Netherlands.1977).Inductive Learning in a Hierarchical Model for Representing Knowledge in Chess End Games. pdf1977).An experiment on inductive learning in chess endgames. Machine Intelligence 8, pdf1977).The Computer Learns. in1976 US Computer Chess Championship, by David Levy, Computer Science Press, Woodland Hills, CA, pp. 83-901978).Single channel theory: A neuronal theory of learning. Brain Theory Newsletter 3, No. 3/4, pp. 72-75.1979).Discovering Rules by Induction from Large Collections of Examples. Expert Systems in the Micro-electronic Age, pp. 168-201. Edinburgh University Press (Introducing ID3)## 1980 ...

1981).Learning and Abstraction in Simulation. IJCAI 1981, pdf1982).Acquisition of Appropriate Bias for Inductive Concept Learning. AAAI 1982, pdf1982).The Hedonistic Neuron: A Theory of Memory, Learning, and Intelligence. Hemisphere Publishing Corporation, University of Michigan1982).Automatic Induction of Classification Rules for Chess End game.Advances in Computer Chess 31982).A Learning Chess Program.Advances in Computer Chess 31983).Machine Learning: An Artificial Intelligence Approach. Tioga Publishing Company, ISBN 0-935382-05-4. google books1983).Learning efficient classification procedures and their application to chess end games. In Machine Learning: An Artificial Intelligence Approach, pages 463–482. Tioga, Palo Alto1983).The Role of Structured Induction in Expert Systems. University of Edinburgh, Machine Intelligence Research Unit (Ph.D. thesis)1984).EPAMlike models of recognition and learning. Cognitive Science, Vol. 8, 305-336, pdf1984).Automated Acquisition on Concepts for the Description of Middle-game Positions in Chess. Turing Institute, Glasgow, Scotland, TIRM-84-0051984).Shift of Bias for Inductive Concept Learning. Ph.D. thesis, Rutgers University, New Brunswick## 1985 ...

1985).Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2, pdf1985).Validating Concepts from Automated Acquisition Systems. IJCAI 85, pdf1985).Goals, Plans, and Mechanisms: Non-symbolically in an Evaluation Surface.Presentation at Evolution, Games, and Learning, Center for Nonlinear Studies, Los Alamos National Laboratory, May 21.1985).Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, ISBN 0-934613-09-5. google books19861986).An Overview of Machine Learning in Chess.ICCA Journal, Vol. 9, No. 11986).A Unified Theory of Heuristic Evaluation functions and Its Applications to Learning.Proceedings of the AAAI-86, pp. 148-152, pdf.1986).Machine Learning: An Artificial Intelligence Approach, Volume II. Morgan Kaufmann, ISBN 0-934613-00-1. google books1986).Machine Learning: A Guide to Current Research. The Kluwer International Series in Engineering and Computer Science, Vol. 121986).Learning Rules from Incomplete and Noisy Data.Proceedings Unicom Seminar on the Scope of Artificial Intelligence in Statistics. Technical Press19871987).A Chess Program that uses its Transposition Table to Learn from Experience.ICCA Journal, Vol. 10, No. 21987).Learning Decision Lists. Machine Learning 2,3, pdf 20011987).A 'Neural' Network that Learns to Play Backgammon. NIPS 19871987).Structured Induction in Expert Systems. Turing Institute Press in association with Addison-Wesley Publishing Company, Workingham, UK1987).On the Operationality/Generality Trade-off in Explanation-based Learning. IJCAI 1987, pdf1987).Explanation-Based Learning of Generalized Robot Assembly Plans. Ph.D. thesis, University of Illinois at Urbana-Champaign, Advisor: Gerald Francis DeJong, II1987).Supervised Learning of Probability Distributions by Neural Networks. NIPS 198719881988).Learning Expected-Outcome Evaluators in Chess.Proceedings of the 1988 AAAI Spring Symposium Series: Computer Game Playing, 26-28.1988).Learning to Predict by the Methods of Temporal Differences. 
Machine Learning, Vol. 3, No. 1, pdf1988).Genetic Algorithms and Machine Learning. Machine Learning, Vol. 31988).Using Experience-Based Learning in Game Playing. Proceedings of the Fifth International Machine Learning Conference, CiteSeerX » Othello1988).A Pattern Classification Approach to Evaluation Function Learning. Artificial Intelligence, Vol. 36, No. 11988).ID5: An incremental ID3. ML 198819891989).A Self-Learning, Pattern-Oriented Chess Program. ICCA Journal, Vol. 12, No. 41989).On Learning and Testing Evaluation Functions.Proceedings of the Sixth Israeli Conference on Artificial Intelligence, 1989, 7-16.1989).Adaptive Learning of Decision-Theoretic Search Control Knowledge. In Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann1989).An Experimental Comparison of Human and Machine Learning Formalisms. 6. ML 1989, pdf1989).A Proposal for More Powerful Learning Algorithms. Neural Computation, Vol. 1, No. 21989).The Intelligent Novice - Learning to Play Better. Heuristic Programming in Artificial Intelligence 1## 1990 ...

1990).Time Derivative Models of Pavlovian Reinforcement. Learning and Computational Neuroscience: Foundations of Adaptive Networks: 497-537.1990).On Learning and Testing Evaluation Functions.Journal of Experimental and Theoretical Artificial Intelligence 2: 241-251.1990).Learning in Bebe.Computers, Chess, and Cognition » Mephisto Best-Publication Award1990).Machine Learning: An Artificial Intelligence Approach, Volume III. Morgan Kaufmann, ISBN 1-55860-119-8. google books1990).A symbolic-numerical approach for supervised learning from examples and rules. Ph.D. thesis, Paris Dauphine University19911991).The Design and Analysis of Efficient Learning Algorithms. Ph.D. thesis, Massachusetts Institute of Technology, supervisor Ronald L. Rivest, pdf1991).Feature Construction During Tree Learning. GWAI 1991: 50-61.1991).Neural Networks as a Guide to Optimization - The Chess Middle Game Explored. ICCA Journal, Vol. 14, No. 31991).Genetic Algorithms Optimizing Evaluation Functions. ICCA Journal, Vol. 14, No. 31991).Learning in Bebe.ICCA Journal, Vol. 14, No. 41991).Predicting Actions from Induction on Past Performance. Proceedings of the 8th International Workshop on Machine Learning , pp. 275-279. Morgan Kaufmann1991).Two Kinds of Training Information for Evaluation Function Learning. University of Massachusetts, Amherst, Proceedings of the AAAI 19911991).Neural networks that teach themselves through genetic discovery of novel examples. IEEE International Joint Conference on Neural Networks19921992).Introduction to Machine Learning. Advanced Topics in Artificial Intelligence 19921992).Learning optimal chess strategies.Proc. Intl. Workshop on Inductive Logic Programming (ed. Stephen Muggleton), Institute for New Generation Computer Technology, Tokyo, Japan.1992).First-Order Induction of Patterns in Chess. Ph.D. Thesis, The Turing Institute, University of Strathclyde, Glasgow1992).Learning Chess Patterns. Inductive Logic Programming (ed. 
Stephen Muggleton), Academic Press, The Apic Series, London, UK1992).Temporal Difference Learning of Backgammon Strategy. ML 19921992).Q-learning. Machine Learning, Vol. 8, No. 21992).Practical Issues in Temporal Difference Learning. Machine Learning, Vol. 8, No. 3-41992).Learning by Analogical Reasoning in General Purpose Problem Solving. Ph.D. thesis, Carnegie Mellon University, advisor Jaime Carbonell19931993).A Game Learning Machine. Ph.D. Thesis, University of California, San Diego, zipped ps1993).Learning of Resource Allocation Strategies for Game Playing, The proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France. pdf1993).Learning Models of Opponent's Strategy in Game Playing. AAAI Proceedings, CiteSeerX1993).Learning simple causal structures. International Journal of Intelligent Systems, 8, pp. 231-247.1993).Integrating Inductive Neural Network Learning and Explanation-Based Learning. IJCAI 1993, zipped ps1993).Bootstrap learning of α-β-evaluation functions. ICCI 1993, pdf19941994).Learning Patterns for Playing Strategies. ICCA Journal, Vol. 17, No. 11994).Towards a chess program based on a model of human memory.Advances in Computer Chess 7 » CHUMP1994).Learning Logical Exceptions in Chess. Ph.D. thesis, University of Strathclyde, CitySeerX1994).Learning Optimal Chess Strategies. Machine Intelligence 13 (eds. K. Furukawa and Donald Michie), pp. 291-309. Oxford University Press, Oxford, UK. ISBN 0198538502.1994).Machine Learning: A Multistrategy Approach, Volume IV. Morgan Kaufmann, ISBN 1-55860-251-8. google books1994).TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation Vol. 6, No. 21994).A High-Performance Explanation-Based Learning Algorithm. Artificial Intelligence, Vol. 68, Nos. 1-21994).Evolving Neural Networks to focus Minimax Search. AAAI-94, pdf1994).Temporal Difference Learning of Position Evaluation in the Game of Go. 
Advances in Neural Information Processing Systems 6## 1995 ...

1995).Feature Construction during Tree Learning. GOSLER Final Report 1995: 391-4031995).Tuning Evaluation Functions for Search. ps or pdf from CiteSeerX1995).Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. Machine Learning, Vol. 20, pdf1995).Learning and Problem Solving in Gogol, a Go playing program. pdf1995).Temporal Difference Learning and TD-Gammon. Communications of the ACM Vol. 38, No. 31995).Learning to Play the Game of Chess. in Gerald Tesauro, David S. Touretzky, Todd K. Leen (eds.) Advances in Neural Information Processing Systems 7, MIT Press1995).TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master's thesis, University of Amsterdam, pdf1995, 2002).The Handbook of Brain Theory and Neural Networks. The MIT Press1995).Optimization of Entropy with Neural Networks. Ph.D. thesis, University of California, San Diego1995).Learning and Memory: Major Ideas, Principles, Issues and Applications. Praeger, amazon.com19961996).Reinforcement Learning: An Alternative Approach to Machine Intelligence. pdf1996).Explanation-Based Neural Network Learning: A Lifelong Learning Approach. Kluwer Academic Publishers1996).Reinforcement Learning: A Survey. JAIR Vol. 4, pdf1996).Learning Playing Strategies in Chess. Computational Intelligence, Vol. 12, No. 1, CiteSeerX1996).Agnostic Learning and Single Hidden Layer Neural Networks.Ph.D. thesis, Australian National University, ps1996).Machine Learning in Computer Chess: The Next Generation.ICCA Journal, Vol. 19, No. 3, zipped ps1996).Perception and memory in chess. Heuristics of the professional eye.Assen: Van Gorcum, The Netherlands. ISBN 90-232-2949-5. Chapter 9; A discussion: Two authors, two different views? word1996).Machine Learning.Chapter 4 of M. A. Boden (Ed.), Artificial Intelligence, Academic Press. Part of the Handbook of Perception and Cognition, ps1996).Introduction to the special issue on games: Structure and Learning. Computational Intelligence, Vol. 
12, No. 1, pdf

(1996). General Game-Playing and Reinforcement Learning. Computational Intelligence, Vol. 12, No. 1

(1996). Learning to forecast by explaining the consequences of actions. pdf

(1996). Self fuzzy learning. pdf

(1996). Game Theory, On-line Prediction and Boosting. COLT 1996, pdf

1997

(1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, Vol. 55, No. 1, 1996 pdf » AdaBoost

(1997). Long short-term memory. Neural Computation, Vol. 9, No. 8, pdf^{[17]}

(1997). On Learning How to Play. Advances in Computer Chess 8, CiteSeerX

(1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3

(1997). Learning Search Heuristics from Examples: A Study in Computer Chess, Seventh Conference of the Spanish Association for Artificial Intelligence, CAEPIA’97, November, pp. 695-704.

(1997). Where is the Impact of Bayesian Networks in Learning? In Proc. Fifteenth International Joint Conference on Artificial Intelligence, Nagoya, Japan, ps

(1997). Reinforcement Learning with Hierarchies of Machines. In Advances in Neural Information Processing Systems 10, MIT Press, zipped ps

(1997). Gogol (an Analytical Learning Program). IJCAI'97, pdf

(1997). Machine Learning. McGraw Hill

(1997). Stochastic Heuristics for Machine Learning & Machine Learning for Stochastic Optimization. Habilitation, Paris-Sud 11 University

(1997). Adversarial Reinforcement Learning. Carnegie Mellon University, ps

(1997). Generalizing Adversarial Reinforcement Learning. Carnegie Mellon University, ps

(1997). HQ-learning. Adaptive Behavior, Vol. 6, No 2

1998

(1998). Knightcap: A chess program that learns by combining td(λ) with game-tree search, Proceedings of the 15th International Conference on Machine Learning, pdf via citeseerX

(1998). TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search. Australian Journal of Intelligent Information Processing Systems, Vol. 5 No. 1, arXiv:cs/9901001

(1998). Experiments in Parameter Learning Using Temporal Differences. ICCA Journal, Volume 21 No. 2, pdf

(1998). Learning to Play Chess Selectively by Acquiring Move Patterns. ICCA Journal, Vol. 21, No. 2, pdf

(1998). Reinforcement Learning: Theory and Practice. Proceedings of the 2nd Slovak Conference on Artificial Neural Networks, zipped ps

(1998). Reinforcement Learning: An Introduction. MIT Press

(1998). Machine Learning and Data Mining: Methods and Applications. John Wiley & Sons

Miroslav Kubat, Ivan Bratko, Ryszard Michalski (1998). A Review of Machine Learning Methods. pdf

(1998). A Neural Network Program of Tsume-Go. CG 1998^{[18]}

(1998). Machine Introspection for Machine Learning. Tucson 1998, pdf

(1998). Integration of Different Reasoning Modes in a Go Playing and Learning System. pdf

(1998). Learning with Fuzzy Definitions of Goals. pdf

(1998). Learnable Evolution: Combining Symbolic and Evolutionary Learning. Proceedings of the Fourth International Workshop on Multistrategy Learning (MSL'98)

(1998). Pedagogical Method for Extraction of Symbolic Knowledge from Neural Networks. Rough Sets and Current Trends in Computing 1998

(1998). Fast online Q (λ). Machine Learning, Vol. 33, No. 1

1999

(1999). Book Learning - a Methodology to Tune an Opening Book Automatically. ICCA Journal, Vol. 22, No. 1

(1999). A Pattern-Oriented Approach to Move Ordering: the Chessmaps Heuristic. ICCA Journal, Vol. 22, No. 1

(1999). Toward Opening Book Learning. ICCA Journal, Vol. 22, No. 2, pdf

(1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4

(1999). A tutorial on learning with Bayesian networks. pdf from CiteSeerX

(1999). Positive and Unlabeled Examples help Learning, The 10th International Conference on Algorithmic Learning Theory, ps

(1999). Convergence of reinforcement learning with general function approximators. In Proc. IJCAI-99, Stockholm, ps

(1999). Evolving Logic Programs to Classify Chess-Endgame Positions. Simulated Evolution and Learning, Canberra, Australia. Lecture Notes in Artificial Intelligence, No. 1585, Springer, pdf » Endgame

(1999). Explorations in Efficient Reinforcement Learning. Ph.D. thesis, University of Amsterdam, advisors Frans Groen and Jürgen Schmidhuber

(1999). Unsupervised Learning: Foundations of Neural Computation. MIT Press

## 2000 ...

(2000). Learning Middle Game Patterns in Chess: A Case Study. Lecture Notes in Computer Science, Vol. 1821, Springer

(2000). The nature of statistical learning theory. Springer

(2000). A Review of Reinforcement Learning. AI Magazine, Vol. 21, No. 1

(2000). Machine Learning in Games: A Survey. Austrian Research Institute for Artificial Intelligence, OEFAI-TR-2000-3, pdf

(2000). Learning to Use Operational Advice. ECAI-00, pdf

(2000). Learning from Perfection: A Data Mining Approach to Evaluation Function Learning in Awari. CG 2000, pdf

(2000). Chess Neighborhoods, Function Combination, and Reinforcement Learning. CG 2000

(2000). Learning a Go Heuristic with Tilde. CG 2000

(2000). Learning Time Allocation using Neural Networks. CG 2000, postscript

(2000). Toward Opening Book Learning. Games in AI Research (eds. Jaap van den Herik and Hiroyuki Iida), pp. 47-54. Universiteit Maastricht, Maastricht, The Netherlands. ISBN 90-621-6416-1.

(2000). Learning from Positive and Unlabeled Examples. ALT 2000: 71-85, ps

(2000). Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning, Stanford, California: Morgan Kaufmann, pdf

(2000). An Integrated Connectionist Approach to Reinforcement Learning for Robotic Control. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

(2000). LEARNABLE EVOLUTION MODEL: Evolutionary Processes Guided by Machine Learning. Machine Learning, Vol. 38^{[19]}

(2000). Learning to Play Chess Using Temporal Differences. Machine Learning, Vol 40, No. 3, pdf

(2000). Generalising Closed World Specialisation: A Chess End Game Application. CiteSeerX

2001

(2001). Learning to Evaluate Go Positions via Temporal Difference Methods. in Norio Baba, Lakhmi C. Jain (eds.) (2001). Computational Intelligence in Games, Studies in Fuzziness and Soft Computing. Physica-Verlag, revised version of 1994 paper

(2001). Temporal Difference Learning Applied to a High-Performance Game-Playing Program. IJCAI 2001

(2001). Rational and Convergent Learning in Stochastic Games. IJCAI 2001

(2001). Move Ordering using Neural Networks, IEA/AIE 2001, LNCS 2070, 45-50 ps

(2001). Machine Learning in MChess Professional. Advances in Computer Games 9

(2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pp. 157-174. pdf

(2001). Chess Neighborhoods, Function Combinations and Reinforcements Learning. In Computers and Games (eds. Tony Marsland and I. Frank). Lecture Notes in Computer Science, Springer, pdf

(2001). Machine Learning and Light Relief: A Review of Truth from Trash. AI Magazine Vol. 22 No. 4, pdf

(2001). Infused Evolutionary Learning. Proceedings of the Eleventh Belgian-Dutch Conference on Machine Learning, pdf, pdf

(2001). The Foundations of Cost-Sensitive Learning. IJCAI 2001

(2001). A learning architecture for the game of Go. Game-On 2001

(2001). Machines that Learn to Play Games. Advances in Computation: Theory and Practice, Vol. 8, NOVA Science Publishers

2002

(2002). Learning Control of Search Extensions. Proceedings of the 6th Joint Conference on Information Sciences (JCIS 2002), pp. 446-449. pdf

(2002). Improving Mini-max Search by Supervised Learning. Artificial Intelligence, Vol. 134, No. 1, pp. 85-99. ISSN 0004-3702. pdf

(2002). The Neural MoveMap Heuristic in Chess. CG 2002, ps

(2002). Local Move Prediction in Go. CG 2002

(2002). Learning a Game Strategy Using Pattern-Weights and Self-play. CG 2002, pdf

(2002). Temporal difference learning and the Neural MoveMap heuristic in the game of Lines of Action. In Mehdi, Q., Gouch, N., and Cavazza, M., editors, GAME-ON 2002 3rd International Conference on Intelligent Games and Simulation, pages 99-103. SCS Europe Bvba. pdf

(2002). Methods of Fuzzy Pattern Recognition. Riga Technical University, ps, covers Fuzzy Kora algorithm

(2003). Improved opponent intelligence through offline learning. International Journal of Intelligent Games & Simulation, Vol. 2

(2002). Genetic Programming-based Construction of Features for Machine Learning and Knowledge Discovery Tasks. Genetic Programming and Evolvable Machines, Vol. 3, No. 4

(2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2, pdf

(2002). Many-Layered Learning. Neural Computation, Vol. 14, No. 10, pdf

2003

(2003). Two Learning Algorithms for Forward Pruning. ICGA Journal, Vol 26, No. 3, ps

(2003). Learning Search Decisions, PhD thesis, Universiteit Maastricht, ps

(2003). Reinforcement Learning in der Schachprogrammierung, Studienarbeit, Freie Universität Berlin, Dozent: Prof. Dr. Raúl Rojas, pdf (German)

(2003). Evaluation Function Tuning via Ordinal Correlation. Advances in Computer Games 10, pdf

(2003). Artificial Intelligence: A Modern Approach. 2nd edition, 3rd edition 2009

(2003). Bayesian Networks. In Michael A. Arbib, Ed., The Handbook of Brain Theory and Neural Networks, 2nd edition, MIT Press, pdf

(2003). Information Theory, Inference, and Learning Algorithms.

(2003). Abalearn: a Program that Learns How to Play Abalone. ICGA Journal, Vol. 26, No. 4

(2003). Machine Learning in Computer Chess: Genetic Programming and KRK. Harvey Mudd College, pdf

(2003). Learning to play chess using reinforcement learning with database games. Master’s thesis, Cognitive Artificial Intelligence, Utrecht University

(2003). Learning Representative Patterns from Real Chess Positions: A Case Study. IICAI 2003

2004

(2004). Efficient Use of Reinforcement Learning in a Computer Game. In Computer Games: Artificial Intelligence, Design and Education (CGAIDE'04), pp. 379–383, 2004. pdf

(2004). Tuning Evaluation Functions by Maximizing Concordance. Master of Science Thesis, pdf

(2004). Genetic Algorithms and Evolutionary Computation from the TalkOrigins Archive

(2004). Genetic algorithms for optimising chess position scoring, Masters thesis, pdf

(2004). Some aspects of chess programming, Technical University of Łódź, Faculty of Electrical and Electronic Engineering, Department of Computer Science, zipped pdf

(2004). Reinforcement learning in board games. CSTR-04-004, Department of Computer Science, University of Bristol. pdf^{[20]}

(2004). Evaluation of Chess Position by Modular Neural network Generated by Genetic Algorithm. EuroGP 2004

(2004). PAC-Bayesian Statistical Learning Theory. Ph.D. thesis, Université Paris VI, pdf, slides as pdf

(2004). Efficient Exploration for Reinforcement Learning. MSc thesis, pdf

(2004). A Self-Learning Evolutionary Chess Program. Proceedings of the IEEE, Vol. 92 No. 12, pp. 1947-1954, CiteSeerX

(2004). Learning Through the KRKa2 Chess Ending. CIARP 2004

(2004). Comparison of TDLeaf and TD learning in Game Playing Domain. 11. ICONIP, pdf

(2004). Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces. pdf

(2004). Learning to play chess using TD(λ)-learning with database games. Cognitive Artificial Intelligence, Utrecht University, Benelearn’04

## 2005 ...

(2005). Tuning evaluation functions by maximizing concordance. Theoretical Computer Science, Volume 349, Issue 2, pp. 202-229, pdf

(2005). Further Evolution of a Self-Learning Chess Program. IEEE Symposium on Computational Intelligence & Games, CiteSeerX

(2005). Chess by Imitation. Department of Computer Science, University of Bath, pdf^{[21]}

(2005). Learning to Play Board Games using Temporal Difference Methods. Technical Report, Utrecht University, UU-CS-2005-048, pdf

(2005). Scalable learning in many layers. University of Massachusetts Amherst, TR-05-02, pdf

(2005). RSPSA: Enhanced Parameter Optimization in Games. Advances in Computer Games 11, pdf

(2005). Optimal strategies - Learning from examples - Boolean equations. in Klaus P. Jantke, Steffen Lange (eds.) (2005). Algorithmic Learning for Knowledge-Based Systems, Lecture Notes in Computer Science 961, Springer

2006

(2006). Universal Parameter Optimisation in Games Based on SPSA. Machine Learning, Special Issue on Machine Learning and Games, Vol. 63, No. 3

(2006). Value Back-Propagation vs. Backtracking in Real-Time Search. In Proceedings of the National Conference on Artificial Intelligence (AAAI), Workshop on Learning For Search, pp. 136–141, AAAI Press, Boston, Massachusetts, USA, July 2006. pdf

(2006). Universal Consistency and Bloat in GP. Some theoretical considerations about Genetic Programming from a Statistical Learning Theory viewpoint. pdf

(2006). Learning for stochastic dynamic programming. pdf

(2006). General lower bounds for evolutionary algorithms. pdf

(2006). Automatic Construction of Static Evaluation Functions for Computer Game Players. ALT ’06

(2006). The Discipline of Machine Learning. CMU-ML-06-108, Carnegie Mellon University, pdf

(2006). Human and Machine Learning. Carnegie Mellon University, slides as pdf

(2006). Playing Stronger by learning. AI Factory, Winter 2006

(2006). Temporal Difference Learning versus Co-Evolution for Acquiring Othello Position Evaluation. IEEE Symposium on Computational Intelligence and Games » Othello

(2006). Prediction, Learning, and Games. Cambridge University Press

(2006). Scalable Knowledge Acquisition through Cumulative Learning and Memory Organization. Ph.D. thesis, University of Massachusetts Amherst, advisor Paul E. Utgoff, pdf

2007

(2007). Active learning in regression, with application to stochastic dynamic programming. ICINCO and CAP, pdf

(2007). A Contribution to Reinforcement Learning; Application to Computer Go. Ph.D. thesis, pdf

(2007). Tuning Bandit Algorithms in Stochastic Environments. pdf

(2007). Automatic Generation of Evaluation Features for Computer Game Players. pdf

(2007). State Space Partition for Reinforcement Learning Based on Fuzzy Min-Max Neural Network. ISNN 2007

(2007). Reinforcement Learning of Evaluation Functions Using Temporal Difference-Monte Carlo learning method. 12th Game Programming Workshop

(2007). Machine Learning and Data Mining: Introduction to Principles and Algorithms.

(2007). Generative Learning of Visual Concepts using Multiobjective Genetic Programming. Pattern Recognition Letters, Vol. 28, No. 16

(2007). Learning to play Othello with N-tuple systems. Australian Journal of Intelligent Information Processing Systems, Special Issue on Game Technology, Vol. 9, No. 4 » Othello

(2007). Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights. IEEE Symposium on Computational Intelligence and AI in Games » Othello

(2007). Randomized Feature Selection. in Huan Liu, Hiroshi Motoda (eds.) Computational Methods of Feature Selection. CRC Press, pdf

(2007). Recent advances in machine learning and game playing. ÖGAI Journal, Vol. 26, No. 2, Computer Game Playing, pdf

2008

(2008). Using Reinforcement Learning in Chess Engines. CONCIBE SCIENCE 2008, Research in Computing Science: Special Issue in Electronics and Biomedical Engineering, Computer Science and Informatics, ISSN:1870-4069, Vol. 35, pp. 31-40, Guadalajara, Mexico, pdf

(2008). Learning of Piece Values for Chess Variants. Technical Report TUD–KE–2008-07, Knowledge Engineering Group, TU Darmstadt, pdf

(2008). Learning the Piece Values for three Chess Variants. ICGA Journal, Vol 31, No. 4

(2008). A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation, pdf (draft)

(2008). Learning Positional Features for Annotating Chess Games: A Case Study. CG 2008, pdf

(2008). Fighting Knowledge Acquisition Bottleneck with Argument Based Machine Learning. 18th European Conference on Artificial Intelligence (ECAI 2008), Patras, Greece. pdf

(2008). Grid Differentiated Services: a Reinforcement Learning Approach. In 8th IEEE Symposium on Cluster Computing and the Grid. Lyon, pdf

(2008). An Othello Evaluation Function Based on Temporal Difference Learning using Probability of Winning. CIG'08, pdf

(2008). BayesChess: A computer chess program based on Bayesian networks. Pattern Recognition Letters, Vol. 29, No. 8

(2008). Learning from the Past with Experiment Databases. PRICAI 2008, pdf

(2008). Mimicking Go Experts with Convolutional Neural Networks. ICANN 2008, pdf » Go

(2008). Chunk Learning and Move Prompting: Making Moves in Chess. Technical Report CSR-08-12, University of Birmingham

(2008). Hypernetworks: A molecular evolutionary architecture for cognitive learning and memory. IEEE Computational Intelligence Magazine, Vol. 3, No. 3, pdf

2009

(2009). Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC. December 2009. MIT Press. pdf

(2009). Bootstrapping from Game Tree Search. Neural Information Processing Systems (NIPS), 2009, pdf

(2009). Argument Based Machine Learning, PhD Thesis, pdf

(2009). Reinforcement Learning and Simulation-Based Search. Ph.D. thesis, University of Alberta. pdf

(2009). Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions. ACM Genetic and Evolutionary Computation Conference (GECCO '09), pp. 1483-1489, Montreal, Canada, pdf

(2009). Genetic Algorithms Based Learning for Evolving Intelligent Organisms. Ph.D. Thesis^{[22]}

(2009). A Statistical Learning Perspective of Genetic Programming. EuroGP 2009, pdf

(2009). Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). pdf

(2009). Feature Learning Using State Differences. pdf

(2009). Monte-Carlo Simulation Balancing. ICML 2009, pdf^{[23]}

(2009). Playing Chess with Matlab. M.Sc. thesis supervised by Nello Cristianini, pdf^{[24]}

(2009). Coevolutionary Temporal Difference Learning for Othello. IEEE Symposium on Computational Intelligence and Games, pdf » Othello

(2009). Reinforcement Learning and Simulation-Based Search. Ph.D. thesis, University of Alberta, pdf

(2009). A Methodology for Learning Players' Styles from Game Records. arXiv:0904.2595v1

(2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition, Springer

## 2010 ...

(2010). Knowledge-Free and Learning-Based Methods in Intelligent Game Playing. Studies in Computational Intelligence, Vol. 276, Springer

(2012). Sequence Learning with Artificial Recurrent Neural Networks. (Aiming to become the definitive textbook on RNN.) Invited by Cambridge University Press

(2010). Reinforcement Learning via AIXI Approximation. Association for the Advancement of Artificial Intelligence (AAAI), pdf

(2010). Expert-Driven Genetic Algorithms for Simulating Evaluation Functions. pdf

(2010). Genetic Algorithms for Automatic Classification of Moving Objects. ACM Genetic and Evolutionary Computation Conference (GECCO '10), Portland, OR, pdf

(2010). Genetic Algorithms for Automatic Search Tuning. ICGA Journal, Vol 33, No. 2

(2010). Feature Learning using State Differences. Master's thesis, Department of Computing Science, University of Alberta, pdf » General Game Playing

(2010). Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing. pdf

(2010). Multi-objective Reinforcement Learning for Responsive Grids. In The Journal of Grid Computing. pdf

(2010). PAC-Bayesian aggregation and multi-armed bandits. Habilitation thesis, Université Paris Est, pdf, slides as pdf

(2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence

(2010). The Layered Learning method and its Application to Generation of Evaluation Functions for the Game of Checkers. 11. PPSN, pdf » Checkers

(2010). Coevolutionary Temporal Difference Learning for small-board Go. IEEE Congress on Evolutionary Computation » Go

(2010). Using Resource-Limited Nash Memory to Improve an Othello Evaluation Function. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 2, No. 1 » Othello

(2010). Coevolution in a Large Search Space using Resource-limited Nash Memory. GECCO '10 » Othello

(2010). Self-play and using an expert to learn to play backgammon with temporal difference learning. Journal of Intelligent Learning Systems and Applications, Vol. 2, No. 2

2011

(2011). Approximate Universal Artificial Intelligence and Self-Play Learning for Games. Ph.D. thesis, University of New South Wales, supervisors: Kee Siong Ng, Marcus Hutter, Alan Blair, William Uther, John Lloyd; pdf

(2011). A GGP Feature Learning Algorithm. KI 25(1): 35-42, pdf » General Game Playing

(2011). Temporal Difference Learning for Connect6. Advances in Computer Games 13

(2011). Analysis of Evaluation-Function Learning by Comparison of Sibling Nodes. Advances in Computer Games 13

(2011). 4*4-Pattern and Bayesian Learning in Monte-Carlo Go. Advances in Computer Games 13

(2011). Reinforcement Learning with a Bilinear Q Function. EWRL 2011

(2011). Learning N-Tuple Networks for Othello by Coevolutionary Gradient Search. GECCO 2011, pdf

(2011). Evolving small-board Go players using Coevolutionary Temporal Difference Learning with Archives. Applied Mathematics and Computer Science, Vol. 21, No. 4

(2011). Learning Board Evaluation Function for Othello by Hybridizing Coevolution with Temporal Difference Learning. Control and Cybernetics, Vol. 40, No. 3, pdf » Othello

(2011). Gradient Temporal-Difference Learning Algorithms. Ph.D. thesis, University of Alberta, advisor Richard Sutton, pdf

2012

(2012). Reinforcement learning: State-of-the-art. Adaptation, Learning, and Optimization, Vol. 12, Springer

István Szita (2012). Reinforcement Learning in Games. Chapter 17

(2012). Neural-fitted TD-leaf learning for playing Othello with structured neural networks. IEEE Transactions on Neural Networks and Learning Systems, Vol. 23, No. 11

(2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram

(2012). Learning a Move-Generator for Upper Confidence Trees. ICS 2012, Hualien, Taiwan, December 2012 » UCT

(2012). Boosting: Foundations and Algorithms. MIT Press

(2012). Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search. NIPS 2012, pdf

(2012). How to set the switches on this thing. Current Opinion in Neurobiology, Vol. 22, pdf

2013

(2013). Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search. Journal of Artificial Intelligence Research, Vol. 48, pdf

(2013). Algorithmic Progress in Six Domains. Technical report 2013-3, Machine Intelligence Research Institute, Berkeley, CA, pdf, 5 Game Playing, 5.1 Chess, 5.2 Go, 9 Machine Learning

(2013). Shaping Fitness Function for Evolutionary Learning of Game Strategies. GECCO 2013, pdf

(2013). On Scalability, Generalization, and Hybridization of Coevolutionary Learning: a Case Study for Othello. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 5, No. 3 » Othello

(2013). Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play. ADPRL 2013

(2013). Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs. ADPRL 2013^{[25]}

(2013). Reinforcement Learning. Dagstuhl Reports, Vol. 3, No. 8, DOI: 10.4230/DagRep.3.8.1, URN: urn:nbn:de:0030-drops-43409

(2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602^{[26]}^{[27]}

2014

(2014). Genetic Algorithms for Evolving Computer Chess Programs. IEEE Transactions on Evolutionary Computation, pdf^{[28]}

(2014). Multi-Criteria Comparison of Coevolution and Temporal Difference Learning on Othello. EvoApplications 2014, Springer, volume 8602 » Othello

(2014). Temporal Difference Learning of N-Tuple Networks for the Game 2048. IEEE Conference on Computational Intelligence and Games, pdf^{[29]}

(2014). Coevolutionary Shaping for Reinforcement Learning. Ph.D. thesis, Poznań University of Technology, supervisor Krzysztof Krawiec, co-supervisor Wojciech Jaśkowski, pdf

(2014). Systematic n-Tuple Networks for Othello Position Evaluation. ICGA Journal, Vol. 37, No. 2, preprint as pdf » Othello

(2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409 » Neural Networks^{[30]}^{[31]}^{[32]}

(2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1

(2014). Multi-Stage Temporal Difference Learning for 2048. TAAI 2014, best paper award^{[33]}

(2014). Regret bounds for restless Markov bandits. Theoretical Computer Science 558, pdf

## 2015 ...

(2015). Human-level control through deep reinforcement learning. Nature, Vol. 518

(2015). Adaptive Playouts in Monte Carlo Tree Search with Policy Gradient Reinforcement Learning. Advances in Computer Games 14

(2015). Transfer Learning by Inductive Logic Programming. Advances in Computer Games 14

(2015). Machine-Learning of Shape Names for the Game of Go. Advances in Computer Games 14

(2015). Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296

(2015). Giraffe: Using Deep Reinforcement Learning to Play Chess. M.Sc. thesis, Imperial College London, arXiv:1509.01549v1 » Giraffe

(2015). Deep Reinforcement Learning with Double Q-learning. arXiv:1509.06461

(2015). Prioritized Experience Replay. arXiv:1511.05952

(2015). An Introduction to Machine Learning. Springer

2016

(2016). Dueling Network Architectures for Deep Reinforcement Learning. arXiv:1511.06581

(2016). Fast seed-learning algorithms for games. CG 2016

(2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. ICANN 2016, Lecture Notes in Computer Science, Vol. 9887, Springer, pdf preprint » DeepChess^{[34]}^{[35]}

(2016). Deep Learning. MIT Press

(2016). Reinforcement Learning with Unsupervised Auxiliary Tasks. arXiv:1611.05397v1

2017

(2017). Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently. arXiv:1702.06762v1^{[36]} » Neural Networks

## Forum Posts

## 1998 ...

## 2000 ...

## 2005 ...

^{[37]}

## 2010 ...

^{[38]}^{[39]}^{[40]}

## 2015 ...

## External Links

## Machine Learning

## AI

Learning I

Learning II

## Chess

## Supervised Learning

AdaBoost from Wikipedia

## Unsupervised Learning

## Reinforcement Learning

## TD Learning

## Statistics

Naive Bayes classifier from Wikipedia

Probabilistic classification from Wikipedia

Outline of regression analysis from Wikipedia

Linear regression from Wikipedia

Logistic regression from Wikipedia

Normal distribution from Wikipedia

Pseudorandom number generator from Wikipedia

Pseudo-random number sampling from Wikipedia

Statistical randomness from Wikipedia

## Markov Models

## NNs

## ANNs

Topics

RNNs

Blogs

The Single Layer Perceptron

Hidden Neurons and Feature Space

Training Neural Networks Using Back Propagation in C#

Data Mining with Artificial Neural Networks (ANN)

## Courses

## References

(1987). A Chess Program that uses its Transposition Table to Learn from Experience. ICCA Journal, Vol. 10, No. 2

(1999). Book Learning - a Methodology to Tune an Opening Book Automatically. ICCA Journal, Vol. 22, No. 1

(1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3

(1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4

(2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pdf

(2000). Learning Time Allocation using Neural Networks. CG 2000, postscript

(1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books

(2010). Mimicking the Black Box - Genetically evolving evaluation functions and search algorithms. Review on Omid David's Ph.D. Thesis, ICGA Journal, Vol 33, No. 1

(2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram

(2014). Teaching Deep Convolutional Neural Networks to Play Go. arXiv:1412.3409

(2014). Move Evaluation in Go Using Deep Convolutional Neural Networks. arXiv:1412.6564v1