Peter Auer
Peter Auer, an Austrian mathematician and computer scientist, is full professor and chair for information technology at the University of Leoben, and is also associated with the Institute for Theoretical Computer Science at Graz University of Technology ^{[1]}. He received a Ph.D. in technical mathematics from Vienna University of Technology in 1992, and a Habilitation from Graz University of Technology in 1997 on the topic of information processing and probability theory ^{[2]}. His research interests include machine learning, neural networks, symbolic computation, and computational complexity theory.
Peter Auer ^{[3]}
Table of Contents
Bandit Problems
Selected Publications
1990 ...
2000 ...
2010 ...
External Links
References
Bandit Problems
In probability theory, the multi-armed bandit problem poses a tradeoff between exploitation of the slot machine that has the highest expected payoff and exploration to gain more information about the expected payoffs of the other machines. The tradeoff between exploration and exploitation is also a topic in reinforcement learning ^{[4]}. At each time step t = 1, 2, ... the gambler has to decide which of the finitely many available arms to pull. Each arm produces a reward in a stochastic manner, and the goal is to maximize the reward accumulated over time. In 2002, along with Nicolò Cesa-Bianchi and Paul Fischer, Peter Auer introduced the UCB1 (Upper Confidence Bounds) bandit algorithm ^{[5]}, which was applied as the selection algorithm UCT in Monte-Carlo Tree Search, as elaborated by Levente Kocsis and Csaba Szepesvári in 2006 ^{[6]}.
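To make the exploration and exploitation tradeoff concrete, the following is a minimal Python sketch of the UCB1 selection rule as it is commonly stated: after pulling every arm once, pull the arm that maximizes its average reward plus the bonus sqrt(2 ln n / n_j), where n is the total number of pulls so far and n_j the number of pulls of arm j. The function names, the Bernoulli reward simulation, and the assumption that rewards lie in [0, 1] are illustrative choices for this sketch and are not taken from the page.

```python
import math
import random


def ucb1_select(counts, reward_sums):
    """Pick the next arm by the UCB1 rule (rewards assumed to lie in [0, 1])."""
    # Initialization: pull every arm once before the index is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)  # number of pulls made so far

    def index(arm):
        mean = reward_sums[arm] / counts[arm]                    # exploitation term
        bonus = math.sqrt(2.0 * math.log(total) / counts[arm])  # exploration term
        return mean + bonus

    return max(range(len(counts)), key=index)


def run_bernoulli_bandit(probs, steps, seed=0):
    """Simulate UCB1 on hypothetical Bernoulli arms with success rates `probs`."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    reward_sums = [0.0] * len(probs)
    for _ in range(steps):
        arm = ucb1_select(counts, reward_sums)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward
    return counts, reward_sums
```

A rarely pulled arm receives a large bonus and is therefore revisited even when its observed average is poor, while a frequently pulled arm is judged almost entirely by its average. Calling `run_bernoulli_bandit([0.2, 0.8], 2000)` should concentrate most pulls on the second arm.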
Selected Publications ^{[7]} ^{[8]}
1990 ...
Peter Auer (1991). Solving String Equations with Constant Restrictions. IWWERT 1991
Peter Auer (1991). Unification in the Combination of Disjoint Theories. IWWERT 1991
Peter Auer (1993). Online learning of rectangles in noisy environments. COLT 1993, ACM Press
Peter Auer, Nicolò Cesa-Bianchi (1994). Online learning with malicious noise and the closure algorithm. Algorithmic Learning Theory, LNAI, Springer
Peter Auer (1997). Learning Nested Differences in the Presence of Malicious Noise. Theoretical Computer Science, Vol. 185, No. 1
Peter Auer, Manfred K. Warmuth (1998). Tracking the best Disjunction. Machine Learning, Vol. 32
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (1998). Gambling in a rigged casino: The adversarial multi-arm bandit problem. NeuroCOLT2
2000 ...
Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth (2000). Learning of Depth Two Neural Networks with Constant Fan-in at the Hidden Nodes. Electronic Colloquium on Computational Complexity, Vol. 7
Peter Auer (2000). Using Upper Confidence Bounds for Online Learning. FOCS 2000
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (2002). The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing, Vol. 32, No. 1
Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2
Peter Auer (2002). Using Confidence Bounds for Exploitation-Exploration Trade-offs. Journal of Machine Learning Research, Vol. 3, Special Issue on Computational Learning Theory
Peter Auer, Ronald Ortner (2006). Logarithmic online regret bounds for undiscounted reinforcement learning. NIPS 2006
Peter Auer (2008). Learning with Malicious Noise. Encyclopedia of Algorithms
2010 ...
Peter Auer, Ronald Ortner (2010). UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem. Periodica Mathematica Hungarica, Vol. 61, Nos. 1-2
Peter Auer (2011). Exploration and Exploitation in Online Learning. ICAIS 2011
Peter Auer (2011). UCRL and Autonomous Exploration. EWRL 2011
Peter Auer, Marcus Hutter, Laurent Orseau (2013). Reinforcement Learning. Dagstuhl Reports, Vol. 3, No. 8, DOI: 10.4230/DagRep.3.8.1, URN: urn:nbn:de:0030-drops-43409
Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (2014). Regret bounds for restless Markov bandits. Theoretical Computer Science, Vol. 558
External Links
Homepage Peter Auer
CiT - Chair of Information Technology at the MUL
The Mathematics Genealogy Project - Peter Auer
Peter Auer: Informationstechnologie, Montanuniversität Leoben - 21.02.2003 - APAOTS (German)
References
[1] Homepage Peter Auer
[2] Peter Auer: Informationstechnologie, Montanuniversität Leoben - 21.02.2003 - APAOTS (German)
[3] Homepage Peter Auer
[4] Multi-armed bandit from Wikipedia
[5] Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, Vol. 47, No. 2
[6] Levente Kocsis, Csaba Szepesvári (2006). Bandit based Monte-Carlo Planning. ECML-06, LNCS/LNAI 4212
[7] Publikationen von Prof. Dr. Peter Auer
[8] dblp: Peter Auer