Nicolò Cesa-Bianchi
Nicolò Cesa-Bianchi, an Italian computer scientist and professor at the Department of Computer Science, University of Milan. His research interests span machine learning and computational learning theory, including reinforcement learning, game-theoretic learning, statistical learning theory, prediction with expert advice, and bandit problems. Along with Gábor Lugosi, he authored Prediction, Learning, and Games in 2006 [1].
Nicolò Cesa-Bianchi [2]
Table of Contents
Bandit Problems
Selected Publications
1990 ...
2000 ...
2010 ...
External Links
References
Bandit Problems
In probability theory, the multi-armed bandit problem models the trade-off between exploitation of the slot machine that has the highest expected payoff and exploration to gain more information about the expected payoffs of the other machines. This trade-off between exploration and exploitation is also a topic in reinforcement learning [3]. At each time step t = 1, 2, ..., the gambler has to decide which of the finitely many available arms to pull. Each arm produces a reward in a stochastic manner, and the goal is to maximize the reward accumulated over time. In 2002, along with Peter Auer and Paul Fischer, Nicolò Cesa-Bianchi introduced the UCB1 (Upper Confidence Bounds) bandit algorithm [4], which was applied as the selection algorithm UCT in Monte-Carlo Tree Search, as elaborated by Levente Kocsis and Csaba Szepesvári in 2006 [5].
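As a rough illustration of the UCB1 rule, the following Python sketch pulls each arm once and afterwards always picks the arm that maximizes the empirical mean reward plus the exploration bonus sqrt(2 ln n / n_j), where n is the total number of pulls so far and n_j is the number of pulls of arm j, as given in the 2002 paper [4]. The function names ucb1_select and run_ucb1, the Bernoulli reward model, and the arm probabilities are assumptions made for this example only.

```python
import math
import random


def ucb1_select(counts, sums, total_pulls):
    """Return the index of the arm to pull next under the UCB1 rule.

    counts[j] is the number of pulls of arm j, sums[j] its accumulated
    reward, and total_pulls the number of pulls made so far.
    """
    # Initialization: pull every arm once so each empirical mean is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def index(arm):
        mean = sums[arm] / counts[arm]
        # Exploration bonus shrinks as an arm is pulled more often.
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=index)


def run_ucb1(arm_probs, steps, seed=0):
    """Simulate UCB1 on hypothetical Bernoulli arms (an assumed reward model)."""
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    sums = [0.0] * len(arm_probs)
    for t in range(steps):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums


if __name__ == "__main__":
    counts, sums = run_ucb1([0.2, 0.8], steps=1000)
    print("pulls per arm:", counts)
```

With rewards in [0, 1], the arm with the higher expected payoff should end up receiving most of the pulls, while the bonus term keeps the other arm from being abandoned entirely. UCT uses the same kind of index at each tree node to choose which child move to explore.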
Selected Publications [6] [7]
1990 ...
Nicolò Cesa-Bianchi (1990). Learning the Distribution in the Extended PAC Model. ALT 1990
Peter Auer, Nicolò Cesa-Bianchi (1994). On-line learning with malicious noise and the closure algorithm. Algorithmic Learning Theory, LNAI, Springer
Nicolò Cesa-Bianchi, Yoav Freund, David P. Helmbold, Manfred K. Warmuth (1996). On-line Prediction and Conversion Strategies. Machine Learning, Vol. 25, No. 1, pdf
Nicolò Cesa-Bianchi, Yoav Freund, David P. Helmbold, David Haussler, Robert Schapire, Manfred K. Warmuth (1997). How to Use Expert Advice. Journal of the ACM, Vol. 44, No. 3, pdf
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (1998). Gambling in a rigged casino: The adversarial multi-arm bandit problem. NeuroCOLT2, pdf
Nicolò Cesa-Bianchi, Paul Fischer (1998). Finite-Time Regret Bounds for the Multiarmed Bandit Problem. ICML 1998, CiteSeerX
2000 ...
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (2002). The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing, Vol. 32, No. 1, 2001 pdf
Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2, pdf
Nicolò Cesa-Bianchi, Gábor Lugosi (2003). Potential-based Algorithms in On-line Prediction and Game Theory. Machine Learning, Vol. 51, pdf
Nicolò Cesa-Bianchi, Gábor Lugosi (2006). Prediction, Learning, and Games. Cambridge University Press
Nicolò Cesa-Bianchi (2009). Online discriminative learning: theory and applications. ASRU 2009
2010 ...
Fabio Vitale, Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella (2011). See the Tree Through the Lines: The Shazoo Algorithm. NIPS 2011, pdf
Nicolò Cesa-Bianchi (2011). The Game-Theoretic Approach to Machine Learning and Adaptation. ICAIS 2011
Nicolò Cesa-Bianchi (2011). Ensembles and Multiple Classifiers: A Game-Theoretic View. MCS 2011
Nicolò Cesa-Bianchi, Gábor Lugosi (2012). Combinatorial Bandits. Journal of Computer and System Sciences, Vol. 78, preprint as pdf
Sébastien Bubeck, Nicolò Cesa-Bianchi (2012). Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Foundations and Trends in Machine Learning, Vol. 5, No. 1, pdf
Sébastien Bubeck, Nicolò Cesa-Bianchi, Gábor Lugosi (2013). Bandits With Heavy Tail. IEEE Transactions on Information Theory, Vol. 59, No. 11, arXiv:1209.1727v1
Nicolò Cesa-Bianchi (2015). Multi-armed Bandit Problem. Encyclopedia of Algorithms 2015
External Links
Nicolò Cesa-Bianchi
Nicolò Cesa-Bianchi from Wikipedia
Nicolò Cesa-Bianchi - Google Scholar Citations
Nicolò Cesa-Bianchi - University of Milan - VideoLectures.NET
References
[1] Nicolò Cesa-Bianchi, Gábor Lugosi (2006). Prediction, Learning, and Games. Cambridge University Press. "...beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell". St. Augustine, De Genesi ad Litteram libri duodecim. Liber Secundus, 17, 37.
[2] Nicolò Cesa-Bianchi - Google Scholar Citations
[3] Multi-armed bandit from Wikipedia
[4] Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2, pdf
[5] Levente Kocsis, Csaba Szepesvári (2006). Bandit based Monte-Carlo Planning. ECML-06, LNCS/LNAI 4212, pdf
[6] dblp: Nicolò Cesa-Bianchi
[7] Nicolò Cesa-Bianchi - Selected papers: