Automated Tuning, an automated adjustment of evaluation parameters or weights and, less commonly, search parameters^{[1]}, with the aim of improving the playing strength of a chess engine or game playing program. Evaluation tuning can be approached through mathematical optimization or machine learning, two fields with a large overlap. Learning approaches are subdivided into supervised learning, which uses labeled data, and reinforcement learning, which learns by trial and faces the dilemma between exploration (of uncharted territory) and exploitation (of current knowledge). Johannes Fürnkranz gives a comprehensive overview in Machine Learning in Games: A Survey, published in 2000^{[2]}, covering evaluation tuning in chapter 4.^{[3]}

## Playing Strength

A central difficulty in tuning engine parameters, whether by hand or automatically, is measuring playing strength. Small sets of test positions, once commonly used to estimate the relative strength of chess programs, lack the diversity needed for a reliable strength prediction. In particular, solving test positions does not necessarily correlate with practical playing strength in matches against other opponents. Measuring strength therefore requires playing many games against a reference opponent to determine the win rate with a certain confidence. The closer two opponents are in strength, the more games are needed to determine whether changed parameters or weights in one of them are an improvement, up to several tens of thousands. Playing many games at ultra-short time controls has become the de facto standard with today's strong programs, as applied for instance in Stockfish's Fishtest, which uses the sequential probability ratio test (SPRT) to possibly terminate a match early^{[4]}.

## Parameter

Quote by Ingo Althöfer^{[5]}^{[6]}:

It is one of the best arts to find the right SMALL set of parameters and to tune them.

Some 12 years ago I had a technical article on this ("On telescoping linear evaluation functions") in the ICCA Journal, Vol. 16, No. 2, pp. 91-94, describing a theorem (of existence) which says that in case of linear evaluation functions with lots of terms there is always a small subset of the terms such that this set with the right parameters is almost as good as the full evaluation function.

## Mathematical Optimization

Mathematical optimization methods in tuning consider the engine as a black box.

## Methods
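Because a black-box method only sees the engine through a measured outcome, all it needs is a function that maps a parameter vector to a (possibly noisy) score. The following is a minimal sketch of that idea, assuming SPSA as the method (it appears among the publications and links of this page). `toy_loss` is a hypothetical stand-in for "play a match with these parameters and return the negated score", the gain exponents 0.602 and 0.101 are commonly recommended defaults, and `a`, `c` and the iteration count are illustrative values that are not taken from any engine.

```python
import random


def spsa_minimize(loss, theta, iterations=2000, a=0.2, c=0.1, seed=0):
    """Minimize a black-box loss with simultaneous perturbation (SPSA)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        ak = a / k ** 0.602  # decaying step size
        ck = c / k ** 0.101  # decaying perturbation size
        # Perturb every parameter at once by +ck or -ck.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + ck * d for t, d in zip(theta, delta)]
        minus = [t - ck * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)  # two measurements per iteration
        # Gradient estimate per parameter, followed by a descent step.
        theta = [t - ak * diff / (2.0 * ck * d) for t, d in zip(theta, delta)]
    return theta


def toy_loss(params):
    # Hypothetical stand-in for a match result; minimum at (3.0, -1.0).
    target = (3.0, -1.0)
    return sum((p - t) ** 2 for p, t in zip(params, target))


tuned = spsa_minimize(toy_loss, [0.0, 0.0])
```

Each iteration needs only two loss measurements no matter how many parameters are tuned, although more parameters generally require more iterations, and with real games every measurement is a noisy and expensive match.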

## Instances

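## Advantages

Works with all engine parameters, including search

Takes search-eval interaction into account

## Disadvantages

Time complexity issues with increasing number of weights to tune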

## Reinforcement Learning

Reinforcement learning, in particular temporal difference learning, has a long history in tuning evaluation weights in game programming, first seen in the late 1950s in Arthur Samuel's Checkers player^{[7]}. In self play against a stable copy of itself, after each move, the weights of the evaluation function were adjusted so that the score of the root position after a quiescence search became closer to the score of the full search. This TD method was generalized and formalized by Richard Sutton in 1988^{[8]}, who introduced the decay parameter λ, which blends in a proportion of the outcome of Monte Carlo simulated games and so tapers between bootstrapping (λ = 0) and Monte Carlo (λ = 1). TD-λ was famously applied by Gerald Tesauro in his Backgammon program TD-Gammon^{[9]}^{[10]}. Its minimax adaption TD-Leaf was successfully used in eval tuning of chess programs^{[11]}, with KnightCap^{[12]} and CilkChess^{[13]} as prominent examples.

## Instances
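As an illustration of the temporal difference idea, the following is a minimal sketch of a TD(λ) update for a linear evaluation, applied to the positions of one finished game. It is plain online TD(λ) on static feature vectors without discounting, not TD-Leaf, and the names `td_lambda_update`, `feature_seq`, `alpha` and `lam` as well as the toy game are hypothetical choices made for this example.

```python
def td_lambda_update(weights, feature_seq, final_result, alpha=0.1, lam=0.7):
    """Return new weights after one online TD(lambda) pass over a finished game.

    weights      -- weights of a linear evaluation (value = sum of w_i * f_i)
    feature_seq  -- one feature vector per position, in game order
    final_result -- game outcome from the learner's view, e.g. 1.0, 0.5 or 0.0
    """
    w = list(weights)

    def value(features):
        return sum(wi * fi for wi, fi in zip(w, features))

    trace = [0.0] * len(w)  # eligibility trace, decayed by lam at every step
    for t, features in enumerate(feature_seq):
        # For a linear evaluation the gradient w.r.t. the weights is the feature vector.
        trace = [lam * e + f for e, f in zip(trace, features)]
        is_last = t + 1 == len(feature_seq)
        # Bootstrap from the next position, or use the real outcome at the end.
        target = final_result if is_last else value(feature_seq[t + 1])
        delta = target - value(features)  # temporal difference error
        for i, e in enumerate(trace):
            w[i] += alpha * delta * e
    return w


# Toy usage: a single constant feature and a won game, replayed 50 times.
weights = [0.0]
for _ in range(50):
    weights = td_lambda_update(weights, [[1.0], [1.0], [1.0]], final_result=1.0)
```

With `lam=0.0` only the last position receives credit for the final result, while `lam=1.0` lets the trace carry the full outcome back to every earlier position, which corresponds to the bootstrapping and Monte Carlo ends of the range.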

## Engines

^{[14]}

## Supervised Learning

## Move Adaption

One supervised learning method considers desired moves from a set of positions, likely from grandmaster games, and tries to adjust the evaluation weights so that, for instance, a one-ply search agrees with the desired move. Arthur Samuel, who had pioneered reinforcement learning some years earlier, described move adaption in 1967 as used in the second version of his checkers player^{[15]}, where a structure of stacked linear evaluation functions was trained by computing a correlation measure based on the number of times a feature rated an alternative move higher than the desired move played by an expert^{[16]}. In chess, move adaption was first described by Thomas Nitsche in 1982^{[17]}, and with some extensions by Tony Marsland in 1985^{[18]}. Eval tuning in Deep Thought, as mentioned by Feng-hsiung Hsu et al. in 1990^{[19]} and later published by Andreas Nowatzyk, is also based on an extended form of move adaption^{[20]}. Jonathan Schaeffer's and Paul Lu's efforts to make Deep Thought's approach work for Chinook in 1990 failed^{[21]}: nothing seemed to produce results that were as good as their hand-tuned effort^{[22]}.

## Value Adaption

A second supervised learning approach used to tune evaluation weights is based on regression of the desired value, e.g. the final outcome from huge sets of positions from quality games, or other information supplied by a supervisor, e.g. annotations in the form of position evaluation symbols. Often, value adaption is reinforced by determining an expected outcome by self play^{[23]}.

## Advantages

Can modify any number of weights simultaneously - constant time complexity

## Disadvantages

Requires a source for the labeled data

Can only be used for evaluation weights or anything else that can be labeled

## Regression

Regression analysis is a statistical process, with a substantial overlap with machine learning, that predicts the value of a Y variable (output) given known value pairs of the X and Y variables. Parameter estimation in regression analysis can be formulated as the minimization of a cost or loss function over a training set^{[24]}, such as the mean squared error or, for binary classification, the cross-entropy error function^{[25]}. The minimization is implemented by iterative optimization algorithms or metaheuristics such as iterated local search, the Gauss–Newton algorithm, or the conjugate gradient method.

## Linear Regression
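Parameter estimation as loss minimization can be sketched for the linear case with the mean squared error as the loss. The example below is only an illustration under stated assumptions: it uses plain batch gradient descent rather than any of the algorithms named on this page, and the function name, the learning rate and the tiny made-up data set (material differences in pawns and knights with a target score in pawn units) are hypothetical.

```python
def fit_linear_mse(samples, targets, learning_rate=0.1, epochs=500):
    """Fit the weights of a linear model by gradient descent on the mean squared error."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, target in zip(samples, targets):
            error = sum(w * f for w, f in zip(weights, features)) - target
            for i, f in enumerate(features):
                # Derivative of the mean of error**2 with respect to weight i.
                gradient[i] += 2.0 * error * f / n
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Made-up training set: (pawn difference, knight difference) -> target score.
samples = [(1, 0), (0, 1), (2, -1), (-1, 1), (1, 1)]
targets = [1.0, 3.0, -1.0, 2.0, 4.0]
piece_values = fit_linear_mse(samples, targets)
```

The targets here were generated from a pawn value of 1 and a knight value of 3, so the fit should recover weights close to those two numbers. Real training data is noisy, and the learning rate would have to be chosen to suit the feature scale.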

^{[26]}^{[27]}

Jens Christensen applied linear regression to chess in 1986 to learn point values in the domain of temporal difference learning^{[28]}.^{[29]}

## Logistic Regression

^{[30]}

Logistic regression in evaluation tuning was first elaborated by Michael Buro in 1995^{[31]}. It proved successful in the game of Othello in comparison with Fisher's linear discriminant and the quadratic discriminant function for normally distributed features, and gave its name to his Othello program Logistello^{[32]}. In computer chess, logistic regression was proposed by Miguel A. Ballicora in a 2009 CCC post, as applied to Gaviota^{[33]}, was independently described by Amir Ban in 2012 for Junior's evaluation learning^{[34]}, and was explicitly mentioned by Álvaro Begué in a January 2014 CCC discussion^{[35]}, when Peter Österlund explained Texel's Tuning Method^{[36]}, which subsequently popularized logistic regression tuning in computer chess. Vladimir Medvedev's Point Value by Regression Analysis^{[37]}^{[38]} experiments showed why the logistic function is appropriate, and further used cross-entropy and regularization.^{[39]}

## Instances
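Logistic regression tuning in the style of Texel's Tuning Method is commonly described as minimizing the mean squared error between game results and a logistic function of the evaluation score, using a simple local search over the weights. The following is a sketch under that assumption. The base-10 logistic with a divisor of 400, the scaling constant `k`, the one-feature toy positions and all function names are illustrative choices that are not taken from this page or from Texel's source.

```python
def win_probability(score_cp, k=1.0):
    """Map a centipawn score to an expected game result between 0 and 1."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mean_squared_error(weights, positions, k=1.0):
    """Average squared gap between game results and predicted results."""
    total = 0.0
    for features, result in positions:
        score = sum(w * f for w, f in zip(weights, features))
        total += (result - win_probability(score, k)) ** 2
    return total / len(positions)


def tune_local_search(weights, positions, step=1):
    """Change one weight at a time by +step or -step while the error improves."""
    best = list(weights)
    best_error = mean_squared_error(best, positions)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                error = mean_squared_error(trial, positions)
                if error < best_error:
                    best, best_error = trial, error
                    improved = True
                    break
    return best, best_error


# Toy data: one feature (pawn advantage of +1 or -1) and the game result.
positions = [
    ((1,), 1.0), ((1,), 1.0), ((1,), 0.5), ((1,), 0.0),
    ((-1,), 0.0), ((-1,), 0.0), ((-1,), 0.5), ((-1,), 1.0),
]
tuned, tuned_error = tune_local_search([0], positions)
```

The side with the extra pawn scores 62.5% in this toy data, so the search settles on the weight whose predicted result is closest to that figure. A real tuner would compute the score with the engine's own evaluation or quiescence search on a large set of positions.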

## See also

## Publications

## 1959

Arthur Samuel (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal, July 1959

## 1960 ...

(1966). A new Machine-Learning Technique applied to the Game of Checkers. MIT, Project MAC, MAC-M-293

Arthur Samuel (1967). Some Studies in Machine Learning Using the Game of Checkers. II-Recent Progress.

## 1970 ...

(1974). A Comparison and Evaluation of Three Machine Learning Procedures as Applied to the Game of Checkers. Artificial Intelligence, Vol. 5, No. 2

## 1980 ...

Thomas Nitsche (1982). A Learning Chess Program. Advances in Computer Chess 3

Donald H. Mitchell (1984). Using Features to Evaluate Positions in Experts' and Novices' Othello Games. Masters thesis, Department of Psychology, Northwestern University, Evanston, IL

## 1985 ...

Tony Marsland (1985). Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2

Jens Christensen, Richard Korf (1986). A Unified Theory of Heuristic Evaluation functions and Its Applications to Learning. Proceedings of the AAAI-86, pp. 148-152

Jens Christensen (1986). Learning Static Evaluation Functions by Linear Regression. in Tom Mitchell, Jaime Carbonell, Ryszard Michalski (1986). Machine Learning: A Guide to Current Research. The Kluwer International Series in Engineering and Computer Science, Vol. 12

Dap Hartmann (1987). How to Extract Relevant Knowledge from Grandmaster Games. Part 1: Grandmasters have Insights - the Problem is what to Incorporate into Practical Problems. ICCA Journal, Vol. 10, No. 1

Dap Hartmann (1987). How to Extract Relevant Knowledge from Grandmaster Games. Part 2: the Notion of Mobility, and the Work of De Groot and Slater. ICCA Journal, Vol. 10, No. 2

(1987). A Model of Two-Player Evaluation Functions. AAAI-87

Bruce Abramson (1988). Learning Expected-Outcome Evaluators in Chess. Proceedings of the 1988 AAAI Spring Symposium Series: Computer Game Playing, 26-28.

(1988). A Pattern Classification Approach to Evaluation Function Learning. Artificial Intelligence, Vol. 36, No. 1

Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1

Bruce Abramson (1989). On Learning and Testing Evaluation Functions. Proceedings of the Sixth Israeli Conference on Artificial Intelligence, 1989, 7-16.

(1989). Weight Assessment in Evaluation Functions. Advances in Computer Chess 5

## 1990 ...

(1990). Expected-Outcome: A General Model of Static Evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 2

(1990). An Analysis of Expected-Outcome. Journal of Experimental and Theoretical Artificial Intelligence 2: 55-73.

(1990). On Learning and Testing Evaluation Functions. Journal of Experimental and Theoretical Artificial Intelligence, Vol. 2

Feng-hsiung Hsu et al. (1990). A Grandmaster Chess Machine. Scientific American, Vol. 263, No. 4, pp. 44-50. ISSN 0036-8733.

Bruce Abramson (1991). The Expected-Outcome Model of Two-Player Games. Part of the series, Research Notes in Artificial Intelligence (San Mateo: Morgan Kaufmann, 1991).

(1991). Neural Networks as a Guide to Optimization - The Chess Middle Game Explored. ICCA Journal, Vol. 14, No. 3

(1991). Genetic Algorithms Optimizing Evaluation Functions. ICCA Journal, Vol. 14, No. 3

(1991). Two Kinds of Training Information for Evaluation Function Learning. University of Massachusetts, Amherst, Proceedings of the AAAI 1991

Gerald Tesauro (1992). Temporal Difference Learning of Backgammon Strategy. ML 1992

Ingo Althöfer (1993). On Telescoping Linear Evaluation Functions. ICCA Journal, Vol. 16, No. 2, pp. 91-94

(1994). Konstruktion und Optimierung von Bewertungsfunktionen beim Schach. Ph.D. thesis (German)

## 1995 ...

Michael Buro (1995). Statistical Feature Combination for the Evaluation of Game Positions. JAIR, Vol. 3

(1995). Tuning Evaluation Functions for Search.

(1995). Tuning Evaluation Functions for Search (Talk)

(1996). Machine Learning in Computer Chess: The Next Generation. ICCA Journal, Vol. 19, No. 3

(1997). Learning Piece Values Using Temporal Differences. ICCA Journal, Vol. 20, No. 3

(1997). Evaluation Tuning for Computer Chess: Linear Discriminant Methods. ICCA Journal, Vol. 20, No. 4

(1998). Experiments in Parameter Learning Using Temporal Differences. ICCA Journal, Vol. 21, No. 2

(1998). From Simple Features to Sophisticated Evaluation Functions. CG 1998

(1998). Implementation of the Simultaneous Perturbation Algorithm for Stochastic Optimization. IEEE Transactions on Aerospace and Electronic Systems^{[40]}

(1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4

## 2000 ...

Johannes Fürnkranz (2000). Machine Learning in Games: A Survey. Austrian Research Institute for Artificial Intelligence, OEFAI-TR-2000-3

(2000). Chess Neighborhoods, Function Combination, and Reinforcement Learning. CG 2000

(2001). Machines that Learn to Play Games. Advances in Computation: Theory and Practice, Vol. 8. NOVA Science Publishers

Gerald Tesauro (2001). Comparison Training of Chess Evaluation Functions. » SCP, Deep Blue

(2001). An Evolutionary Approach for the Tuning of a Chess Evaluation Function using Population Dynamics. Proceedings of the 2001 Congress on Evolutionary Computation, Vol. 2

(2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pp. 157-174.

(2002). Improving Mini-max Search by Supervised Learning. Artificial Intelligence, Vol. 134, No. 1

(2003). Evaluation Function Tuning via Ordinal Correlation. Advances in Computer Games 10

(2004). Tuning Evaluation Functions by Maximizing Concordance. M.Sc. Thesis, University of Alberta

(2004). Genetic Algorithms and Evolutionary Computation from the TalkOrigins Archive

(2004). Genetic algorithms for optimising chess position scoring, Master's thesis

(2004). Evaluation of Chess Position by Modular Neural network Generated by Genetic Algorithm. EuroGP 2004

(2004). Learning to play chess using TD(λ)-learning with database games. Cognitive Artificial Intelligence, Utrecht University, Benelearn’04

## 2005 ...

(2005). Tuning Evaluation Functions by Maximizing Concordance. Theoretical Computer Science, Vol. 349, No. 2

(2005). Evaluation by Hill-climbing: Getting the right move by solving micro-problems. AI Factory, Autumn 2005

(2005). RSPSA: Enhanced Parameter Optimization in Games. Advances in Computer Games 11

2006

(2006). Universal Parameter Optimisation in Games Based on SPSA. Machine Learning, Special Issue on Machine Learning and Games, Vol. 63, No. 3

(2006). Using an Evolutionary Algorithm for the Tuning of a Chess Evaluation Function Based on a Dynamic Boundary Strategy. Proceedings of the 2006 IEEE Conference on Cybernetics and Intelligent Systems

(2006). Automatic Construction of Static Evaluation Functions for Computer Game Players. ALT ’06

(2006). A Differential Evolution for the Tuning of a Chess Evaluation Function. IEEE Congress on Evolutionary Computation, 2006.

(2006). Optimal control of minimax search result to learn positional evaluation. 11th Game Programming Workshop (Japanese)

2007

(2007). Visualization and Adjustment of Evaluation Functions Based on Evaluation Values and Win Probability. AAAI 2007

(2007). Automatic Generation of Evaluation Features for Computer Game Players.

Johannes Fürnkranz (2007). Recent advances in machine learning and game playing. ÖGAI Journal, Vol. 26, No. 2, Computer Game Playing

2008

Omid David, Moshe Koppel, Nathan S. Netanyahu (2008). Genetic Algorithms for Mentor-Assisted Evaluation Function Optimization. ACM Genetic and Evolutionary Computation Conference (GECCO '08), pp. 1469-1475, Atlanta, GA, July 2008.

(2008). An Adaptive Differential Evolution Algorithm with Opposition-Based Mechanisms, Applied to the Tuning of a Chess Program. Advances in Differential Evolution, Studies in Computational Intelligence, ISBN: 978-3-540-68827-3

2009

(2009). Bootstrapping from Game Tree Search. Neural Information Processing Systems (NIPS), 2009

(2009). Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions. ACM Genetic and Evolutionary Computation Conference (GECCO '09), pp. 1483-1489, Montreal, Canada, July 2009.

(2009). Genetic Algorithms Based Learning for Evolving Intelligent Organisms. Ph.D. Thesis.

(2009). Playing Chess with Matlab. M.Sc. thesis supervised by Nello Cristianini^{[41]}

(2009). A Methodology for Learning Players' Styles from Game Records. arXiv:0904.2595v1

(2009). The Automatically Tuning System of Evaluation Function for Computer Chinese Chess. Master thesis, National Chiao Tung University (Chinese)

## 2010 ...

(2010). Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing.

(2010). Genetic Algorithms for Automatic Search Tuning. ICGA Journal, Vol. 33, No. 2

(2010). Differential evolution for the Tuning of a Chess Evaluation Function. Ph.D. thesis, University of Maribor

2011

(2011). Expert-Driven Genetic Algorithms for Simulating Evaluation Functions. Genetic Programming and Evolvable Machines, Vol. 12, No. 1

(2011). Tuning Chess Evaluation Function Parameters using Differential Evolution Algorithm. Informatica, 35, No. 2

(2011). History mechanism supported differential evolution for chess evaluation function tuning. Soft Computing, Vol. 15, No. 4

(2011). An Evolutionary Algorithm for Tuning a Chess Evaluation Function. CEC 2011

(2011). An Adaptive Evolutionary Algorithm Based on Typical Chess Problems for Tuning a Chess Evaluation Function. GECCO 2011

(2011). CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning. Advances in Computer Games 13^{[42]}^{[43]}

(2011). The Global Landscape of Objective Functions for the Optimization of Shogi Piece Values with a Game-Tree Search. Advances in Computer Games 13 » Shogi

2012

Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram

(2012). Design and Implementation of Bonanza Method for the Evaluation in the Game of Arimaa. IPSJ SIG Technical Report, Vol. 2012-GI-27, No. 4 » Arimaa

2013

(2013). A Supervised Learning Method for Chinese Chess Programs. JSAI2013

(2013). Comparison Training of Shogi Evaluation Functions with Self-Generated Training Positions and Moves. CG 2013

(2013). Optimizing Objective Function Parameters for Strength in Computer Game-Playing. AAAI 2013

(2013). Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods. Lecture Notes in Control and Information Sciences, Vol. 434, Springer » SPSA

(2013). Arimaa challenge - Static Evaluation Function. Master Thesis, Charles University in Prague » Arimaa^{[44]}

2014

Kunihito Hoki, Tomoyuki Kaneko (2014). Large-Scale Optimization for Evaluation Functions with Minimax Search. JAIR Vol. 49 » Shogi^{[45]}

(2014). RES: Regularized Stochastic BFGS Algorithm. arXiv:1401.7625^{[46]}

(2014). ROCK∗ - Efficient black-box optimization for policy learning. Humanoids, 2014 » Rockstar

(2014, 2017). New insights and perspectives on the natural gradient method. arXiv:1412.1193

## 2015 ...

(2015). Adam: A Method for Stochastic Optimization. arXiv:1412.6980v8, ICLR 2015^{[47]}

(2015). Global Convergence of Online Limited Memory BFGS. Journal of Machine Learning Research, Vol. 16^{[48]}^{[49]}

2017

(2017). An overview of gradient descent optimization algorithms. arXiv:1609.04747v2^{[50]}

## Forum Posts

## 1997 ...

^{[51]}

## 2000 ...

## 2005 ...

Re: Insanity... or Tal style? by Miguel A. Ballicora, CCC, April 02, 2009

^{[52]}

## 2010 ...

^{[53]}^{[54]}

2014

Re: How Do You Automatically Tune Your Evaluation Tables by Álvaro Begué, CCC, January 08, 2014

The texel evaluation function optimization algorithm by Peter Österlund, CCC, January 31, 2014 » Texel's Tuning Method

Re: The texel evaluation function optimization algorithm by Álvaro Begué, CCC, January 31, 2014 » Cross-entropy

## 2015 ...

^{[55]}^{[56]}

Re: txt: automated chess engine tuning by Sergei S. Markoff, CCC, February 15, 2016 » SmarThink

Re: Piece weights with regression analysis (in Russian) by Fabien Letouzey, CCC, May 04, 2015

Re: Genetical tuning by Ferdinand Mosca, CCC, August 20, 2015

^{[57]}

2016

Re: CLOP: when to stop? by Álvaro Begué, CCC, November 08, 2016

^{[58]}^{[59]}

2017

Re: Texel tuning method question by Peter Österlund, CCC, June 07, 2017

Re: Texel tuning method question by Ferdinand Mosca, CCC, July 20, 2017 » Python

Re: Texel tuning method question by Jon Dart, CCC, July 23, 2017

## External Links

Engine tuning from Wikipedia

Self-tuning from Wikipedia

## Optimization

optimize - Wiktionary

^{[60]}^{[61]}

Entropy maximization from Wikipedia

Linear programming from Wikipedia

Simplex algorithm from Wikipedia

^{[62]}^{[63]}^{[64]}^{[65]}

Simultaneous perturbation stochastic approximation (SPSA) - Wikipedia

SPSA Algorithm

Stochastic approximation from Wikipedia

Stochastic gradient descent from Wikipedia

## Machine Learning

reinforcement - Wiktionary

reinforce - Wiktionary

supervisor - Wiktionary

temporal - Wiktionary

## Regression

regression - Wiktionary

regress - Wiktionary

## Code

^{[66]}

## Misc

## References

(2001). Learning Search Control in Adversary Games. Advances in Computer Games 9, pp. 157-174.

Johannes Fürnkranz (2000). Machine Learning in Games: A Survey. Austrian Research Institute for Artificial Intelligence, OEFAI-TR-2000-3 - Chapter 4, Evaluation Function Tuning

Ingo Althöfer (1993). On Telescoping Linear Evaluation Functions. ICCA Journal, Vol. 16, No. 2, pp. 91-94

Arthur Samuel (1959). Some Studies in Machine Learning Using the Game of Checkers. IBM Journal, July 1959

Richard Sutton (1988). Learning to Predict by the Methods of Temporal Differences. Machine Learning, Vol. 3, No. 1

Gerald Tesauro (1992). Temporal Difference Learning of Backgammon Strategy. ML 1992

Gerald Tesauro (1994). TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation Vol. 6, No. 2

(1999). Learning Piece-Square Values using Temporal Differences. ICCA Journal, Vol. 22, No. 4

(1998). Experiments in Parameter Learning Using Temporal Differences. ICCA Journal, Vol. 21, No. 2

Arthur Samuel (1967). Some Studies in Machine Learning Using the Game of Checkers. II-Recent Progress.

Johannes Fürnkranz (2000). Machine Learning in Games: A Survey. Austrian Research Institute for Artificial Intelligence, OEFAI-TR-2000-3

Thomas Nitsche (1982). A Learning Chess Program. Advances in Computer Chess 3

Tony Marsland (1985). Evaluation-Function Factors. ICCA Journal, Vol. 8, No. 2

Feng-hsiung Hsu et al. (1990). A Grandmaster Chess Machine. Scientific American, Vol. 263, No. 4, pp. 44-50. ISSN 0036-8733.

2.1 Learning from Desired Moves in Chess in Kunihito Hoki, Tomoyuki Kaneko (2014). Large-Scale Optimization for Evaluation Functions with Minimax Search. JAIR Vol. 49

(1992). A World Championship Caliber Checkers Program. Artificial Intelligence, Vol. 53, Nos. 2-3

(1997, 2009). One Jump Ahead. 7. The Case for the Prosecution, pp. 111-114

(1990). Expected-Outcome: A General Model of Static Evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 2

Michael Buro (1995). Statistical Feature Combination for the Evaluation of Game Positions. JAIR, Vol. 3

Donald H. Mitchell (1984). Using Features to Evaluate Positions in Experts' and Novices' Othello Games. Masters thesis, Department of Psychology, Northwestern University, Evanston, IL

Jens Christensen (1986). Learning Static Evaluation Functions by Linear Regression. in Tom Mitchell, Jaime Carbonell, Ryszard Michalski (1986). Machine Learning: A Guide to Current Research. The Kluwer International Series in Engineering and Computer Science, Vol. 12

Michael Buro (1995). Statistical Feature Combination for the Evaluation of Game Positions. JAIR, Vol. 3

Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram

(1997). Evaluation Tuning for Computer Chess: Linear Discriminant Methods. ICCA Journal, Vol. 20, No. 4

(2011). CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning. Advances in Computer Games 13

Amir Ban (2012). Automatic Learning of Evaluation, with Applications to Computer Chess. Discussion Paper 613, The Hebrew University of Jerusalem - Center for the Study of Rationality, Givat Ram

Kunihito Hoki, Tomoyuki Kaneko (2014). Large-Scale Optimization for Evaluation Functions with Minimax Search. JAIR Vol. 49