UCT

toc
 * Home * Search * Monte-Carlo Tree Search * UCT**

a popular algorithm that deals with the flaw of Monte-Carlo Tree Search, when a program may favor a losing move with only one or a few forced refutations, but due to the vast majority of other moves provides a better random playout score than other, better moves. UCT was introduced by Levente Kocsis and Csaba Szepesvári in 2006, which accelerated the Monte-Carlo revolution in computer Go and games difficult to evaluate statically. If given infinite time and memory, UCT theoretically converges to Minimax.
 * UCT** (**U**pper **C**onfidence bounds applied to **T**rees),

=Selection= In UCT, upper [|confidence bounds] (UCB1) guide the selection of a node, treating selection as a [|multi-armed bandit problem], where the crucial tradeoff the [|gambler] faces at each trial is between [|exploration and exploitation] - [|exploitation] of the [|slot machine] that has the highest expected payoff and [|exploration] to get more information about the expected payoffs of the other machines. Child node j is selected which maximizes the UCT Evaluation:

> math \displaystyle UCT_j = X_j + C * \sqrt{\frac{ln(n)}{n_j}} math

where:
 * Xj is the win ratio of the child
 * n is the number of times the parent has been visited
 * nj is the number of times the child has been visited
 * C is a constant to adjust the amount of exploration and incorporates the sqrt(2) from the UCB1 formula

The first component of the UCB1 formula above corresponds to exploitation, as it is high for moves with high average win ratio. The second component corresponds to exploration, since it is high for moves with few simulations. Most contemporary implementations of MCTS are based on some variant of UCT. Modifications have been proposed, with the aim of shortening the time to find good moves. They can be divided into improvements based on expert knowledge and into domain-independent improvements in the playouts, and in building the tree in modifying the exploitation part of the UCB1 formula, for instance in [|Rapid Action Value Estimation] (**RAVE**) considering transpositions.

=Quotes=

Gian-Carlo Pascutto
Quote by Gian-Carlo Pascutto in 2010 :

||

Raghuram Ramanujan et al.
Quote by Raghuram Ramanujan, Ashish Sabharwal, and Bart Selman from their abstract //On Adversarial Search Spaces and Sampling-Based Planning// : =See also=
 * Match Statistics
 * MC and UCT poster by Jakob Erdmann
 * MCαβ
 * Reinforcement Learning

=Publications=

2000 ...

 * Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (**2002**). //[|Finite-time Analysis of the Multiarmed Bandit Problem]//. [|Machine Learning], Vol. 47, No. 2, [|pdf]

2005 ...

 * 2006**
 * Levente Kocsis, Csaba Szepesvári (**2006**). //[|Bandit based Monte-Carlo Planning]//. ECML-06, LNCS/LNAI 4212, [|pdf]
 * Levente Kocsis, Csaba Szepesvári, Jan Willemson (**2006**). //Improved Monte-Carlo Search//. [|pdf]
 * Sylvain Gelly, Yizao Wang (**2006**). //Exploration exploitation in Go: UCT for Monte-Carlo Go//. [|pdf]
 * Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud (**2006**). //Modiﬁcation of UCT with Patterns in Monte-Carlo Go//. [|INRIA]
 * 2007**
 * Yizao Wang, Sylvain Gelly (**2007**). //Modifications of UCT and Sequence-Like Simulations for Monte-Carlo Go//. IEEE Symposium on Computational Intelligence and Games, Honolulu, USA, 2007, [|pdf]
 * Shugo Nakamura, Makoto Miwa, Takashi Chikayama (**2007**). //Improvement of UCT using evaluation function//. [|12th Game Programming Workshop 2007]
 * Tristan Cazenave, Nicolas Jouandeau (**2007**). //On the Parallelization of UCT//. CGW 2007, [|pdf] » Parallel Search
 * Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (**2007**). //Tuning Bandit Algorithms in Stochastic Environments//. [|pdf]
 * 2008**
 * Nathan Sturtevant (**2008**). //[|An Analysis of UCT in Multi-player Games]//. CG 2008
 * Guillaume Chaslot, Mark Winands, Jaap van den Herik (**2008**). //[|Parallel Monte-Carlo Tree Search]//. CG 2008, [|pdf]
 * 2009**
 * Rémi Coulom (**2009**). //The Monte-Carlo Revolution in Go//. JFFoS'2008: Japanese-French Frontiers of Science Symposium, [|slides as pdf]
 * Fabien Teytaud, Olivier Teytaud (**2009**). //Creating an Upper-Confidence-Tree program for Havannah//. Advances in Computer Games 12, [|inria-00380539 as pdf] » Havannah

2010 ...

 * Jean Méhat, Tristan Cazenave (**2010**). //Combining UCT and Nested Monte-Carlo Search for Single-Player General Game Playing//. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 2, No. 4, [|pdf 2009] » General Game Playing
 * Raghuram Ramanujan, Ashish Sabharwal, Bart Selman (**2010**). //[|On Adversarial Search Spaces and Sampling-Based Planning]//. [|ICAPS 2010]
 * Damien Pellier, Bruno Bouzy, Marc Métivier (**2010**). //An UCT Approach for Anytime Agent-based Planning//. [|PAAMS'10], [|pdf]
 * Thomas J. Walsh, Sergiu Goschin, Michael L. Littman (**2010**). //Integrating sample-based planning and model-based reinforcement learning.// AAAI, [|pdf] » Reinforcement Learning
 * 2011**
 * Junichi Hashimoto, Akihiro Kishimoto, Kazuki Yoshizoe, Kokolo Ikeda (**2011**). //[|Accelerated UCT and Its Application to Two-Player Games]//. Advances in Computer Games 13
 * Jeff Rollason (**2011**). //[|Mixing MCTS with Conventional Static Evaluation: Experiments and shortcuts en-route to full UCT]//. AI Factory, Winter 2011 » Evaluation
 * Sylvain Gelly, David Silver (**2011**). //Monte-Carlo tree search and rapid action value estimation in computer Go//. [|Artificial Intelligence], Vol. 175, No. 11, [|preprint as pdf]
 * Lars Schaefers, Marco Platzner, Ulf Lorenz (**2011**). //UCT-Treesplit - Parallel MCTS on Distributed Memory//. [|ICAPS 2011], [|pdf] » Parallel Search
 * Raghuram Ramanujan, Bart Selman (**2011**). //[|Trade-Offs in Sampling-Based Adversarial Planning]//. [|ICAPS 2011]
 * 2012**
 * Oleg Arenz (**2012**). //Monte Carlo Chess//. B.Sc. thesis, Darmstadt University of Technology, advisor Johannes Fürnkranz, [|pdf] » Stockfish
 * Cameron Browne, Simon Lucas, et al. (**2012**). //[|A Survey of Monte Carlo Tree Search Methods]//. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 4, [|pdf]
 * Sylvain Gelly, Marc Schoenauer, Michèle Sebag, Olivier Teytaud, Levente Kocsis, David Silver, Csaba Szepesvári (**2012**). //[|The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions]//. Communications of the ACM, Vol. 55, No. 3, [|pdf preprint]
 * Adrien Couetoux, Olivier Teytaud, Hassen Doghmen (**2012**). //Learning a Move-Generator for Upper Confidence Trees//. [|ICS 2012], [|Hualien], [|Taiwan], December 2012 » Learning, Move Generation
 * Ashish Sabharwal, Horst Samulowitz, Chandra Reddy (**2012**). //[|Guiding Combinatorial Optimization with UCT]//. [|CPAIOR 2012], [|abstract as pdf], [|draft as pdf]
 * Raghuram Ramanujan, Ashish Sabharwal, Bart Selman (**2012**). //Understanding Sampling Style Adversarial Search Methods//. [|arXiv:1203.4011]
 * 2013**
 * Cheng-Wei Chou, Ping-Chiang Chou, Chang-Shing Lee, David L. Saint-Pierre, Olivier Teytaud, Mei-Hui Wang, Li-Wen Wu, Shi-Jim Yen (**2013**). //Strategic Choices: Small Budgets and Simple Regret//. TAAI 2012, [|Excellent Paper Award], [|pdf]
 * Simon Viennot, Kokolo Ikeda (**2013**). //[|Efficiency of Static Knowledge Bias in Monte-Carlo Tree Search]//. CG 2013
 * Rui Li, Yueqiu Wu, Andi Zhang, Chen Ma, Bo Chen, Shuliang Wang (**2013**). //[|Technique Analysis and Designing of Program with UCT Algorithm for NoGo]//. [|CCDC2013]
 * Ari Weinstein, Michael L. Littman, Sergiu Goschin (**2013**). //[|Rollout-based Game-tree Search Outprunes Traditional Alpha-beta]//. [|PMLR], Vol. 24 » MCTS
 * 2014**
 * Rémi Munos (**2014**). //From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning//. [|Foundations and Trends in Machine Learning, Vol. 7, No 1], [|hal-00747575v5], [|slides as pdf]
 * David W. King (**2014**). //Complexity, Heuristic, and Search Analysis for the Games of Crossings and Epaminondas//. Masters thesis, [|Air Force Institute of Technology], [|pdf]
 * David W. King, Gilbert L. Peterson (**2014**). //Epaminondas: Exploring Combat Tactics//. ICGA Journal, Vol. 37, No. 3

2015 ...

 * Xi Liang, Ting-Han Wei, I-Chen Wu (**2015**). //Job-level UCT search for solving Hex//. [|CIG 2015]
 * Yusaku Mandai, Tomoyuki Kaneko (**2015**). //LinUCB Applied to Monte Carlo Tree Search//. Advances in Computer Games 14
 * Yun-Ching Liu, Yoshimasa Tsuruoka (**2015**). //Adapting Improved Upper Confidence Bounds for Monte-Carlo Tree Search//. Advances in Computer Games 14
 * S. Ali Mirsoleimani, Aske Plaat, Jaap van den Herik (**2015**). //Ensemble UCT Needs High Exploitation//. [|CoRR abs/1509.08434]
 * Johannes Heinrich, David Silver (**2015**). //Smooth UCT Search in Computer Poker//. IJCAI 2015, [|pdf]
 * Xi Liang, Ting-Han Wei, I-Chen Wu (**2015**). //Solving Hex Openings Using Job-Level UCT Search//. ICGA Journal, Vol. 38, No. 3
 * Naoki Mizukami, Jun Suzuki, Hirotaka Kameko, Yoshimasa Tsuruoka (**2017**). //Exploration Bonuses Based on Upper Confidence Bounds for Sparse Reward Games//. Advances in Computer Games 15

=Forum Posts=
 * [|Re: Chess vs Go // AI vs IA] by Gian-Carlo Pascutto, June 02, 2010
 * [|Monte Carlo (upper confidence bounds applied to trees)] by ChessGO, [|Computer Go], October 22, 2010
 * [|UCT surprise for checkers !] by Daniel Shawul, CCC, March 25, 2011
 * [|uct on gpu] by Daniel Shawul, CCC, February 24, 2012 » GPU
 * [|My new book] by Daniel Shawul, CCC, January 02, 2014 » Opening Book
 * [|Nebiyu-MCTS vs TSCP 1-0] by Daniel Shawul, CCC, December 10, 2017 » Nebiyu
 * [|Search traps in MCTS and chess] by Daniel Shawul, CCC, December 25, 2017 » Sampling-Based Planning
 * [|MCTS weakness wrt AB (via Daniel Shawul)] by Chris Whittington, Rybka Forum, December 25, 2017
 * [|Announcing lczero] by Gary, CCC, January 09, 2018 » LCZero

=External Links= > [|UCT Monte Carlo Tree Search algorithm] in Python 2.7 by Peter Cowling, Ed Powley, and Daniel Whitehouse > [|UCT MCTS implementation] in Java by Simon Lucas
 * [|Exploration and exploitation - in Monte Carlo tree search from Wikipedia]
 * [|Confidence interval from Wikipedia]
 * [|Multi-armed bandit from Wikipedia]
 * [|Monte Carlo Tree Search] by Cameron Browne et al.
 * [|Sensei's Library: UCT]
 * [|Lange Nacht der Wissenschaften - Long Night of Sciences Jena - 2007] by Ingo Althöfer, MC and UCT poster by Jakob Erdmann

=References= =What links here?= include page="UCT" component="backlinks" limit="100"
 * Up one level**