
Nicolò Cesa-Bianchi,
an Italian computer scientist and professor at the Department of Computer Science, University of Milan. His research covers a wide range of topics in machine learning and computational learning theory, including reinforcement learning, game-theoretic learning, statistical learning theory, prediction with expert advice, and bandit problems. Along with Gábor Lugosi, he authored Prediction, Learning, and Games in 2006 [1].
Nicolò Cesa-Bianchi [2]

Bandit Problems

In probability theory, the multi-armed bandit problem models the tradeoff between exploiting the slot machine with the highest expected payoff observed so far and exploring the other machines to gather more information about their expected payoffs. The same trade-off between exploration and exploitation is a central topic in reinforcement learning [3]. At each time step t = 1, 2, ... the gambler decides which of the finitely many available arms to pull. Each arm produces a reward in a stochastic manner, and the goal is to maximize the reward accumulated over time. In 2002, along with Peter Auer and Paul Fischer, Nicolò Cesa-Bianchi introduced the UCB1 (Upper Confidence Bound) bandit algorithm [4], which Levente Kocsis and Csaba Szepesvári applied in 2006 as the selection policy UCT for Monte-Carlo Tree Search [5].
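The UCB1 rule from the 2002 paper is simple to state: after pulling each arm once, always pull the arm maximizing the empirical mean reward plus the confidence bonus sqrt(2 ln t / n_j), where t is the current time step and n_j the number of pulls of arm j. A minimal Python sketch follows; the three Bernoulli arms and their payoff probabilities are illustrative, not from the paper.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """UCB1 (Auer, Cesa-Bianchi, Fischer 2002): play each arm once,
    then pull the arm maximizing mean reward + sqrt(2 ln t / n_j)."""
    counts = [0] * n_arms   # number of pulls per arm (n_j)
    sums = [0.0] * n_arms   # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # initialization: try every arm once
        else:
            arm = max(range(n_arms),
                      key=lambda j: sums[j] / counts[j]
                                    + math.sqrt(2 * math.log(t) / counts[j]))
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
        total += r
    return total, counts

# Toy example: three slot machines with hypothetical Bernoulli payoff
# probabilities, unknown to the algorithm.
random.seed(0)
probs = [0.3, 0.5, 0.8]
total, counts = ucb1(lambda j: 1.0 if random.random() < probs[j] else 0.0,
                     n_arms=3, horizon=2000)
```

Over time the confidence bonus of rarely pulled arms grows, forcing occasional exploration, while the best arm (here the one with probability 0.8) accumulates the vast majority of pulls.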

Selected Publications

[6] [7]

1990 ...

2000 ...

2010 ...


External Links


References

  1. ^ Nicolò Cesa-Bianchi, Gábor Lugosi (2006). Prediction, Learning, and Games. Cambridge University Press
    "...beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell". St. Augustine, De Genesi ad Litteram libri duodecim. Liber Secundus, 17, 37.
  2. ^ Nicolò Cesa-Bianchi - Google Scholar Citations
  3. ^ Multi-armed bandit from Wikipedia
  4. ^ Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2, pdf
  5. ^ Levente Kocsis, Csaba Szepesvári (2006). Bandit based Monte-Carlo Planning. ECML-06, LNCS/LNAI 4212, pdf
  6. ^ dblp: Nicolò Cesa-Bianchi
  7. ^ Nicolò Cesa-Bianchi - Selected papers
