Peter Auer

[[image:auer-9784-klein.jpg width="160" link="http://personal.unileoben.ac.at/auer/"]]

**Peter Auer**,
an Austrian mathematician and computer scientist, full professor and chair for information technology at the [|University of Leoben], also associated with the Institute for Theoretical Computer Science, [|Graz University of Technology]. He received a Ph.D. in technical mathematics from Vienna University of Technology in 1992, and a [|Habilitation] from Graz University of Technology in 1997 on the topic of [|information processing] and [|probability theory]. His research interests include machine learning, neural networks, [|symbolic computation], and [|computational complexity theory].

=Bandit Problems=

In [|probability theory], the [|multi-armed bandit problem] poses a trade-off between [|exploitation] of the [|slot machine] that has the highest expected payoff and [|exploration] to get more information about the expected payoffs of the other machines. The trade-off between exploration and exploitation is also a topic in reinforcement learning. The [|gambler] has to decide at time steps t = 1, 2, ... which of the finitely many available arms to pull. Each arm produces a [|reward] in a [|stochastic] manner. The goal is to maximize the reward accumulated over time. In 2002, along with Nicolò Cesa-Bianchi and Paul Fischer, Peter Auer introduced the **UCB1** (Upper Confidence Bounds) bandit algorithm, which Levente Kocsis and Csaba Szepesvári applied in 2006 as the selection algorithm UCT for Monte-Carlo Tree Search.
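
The UCB1 rule named in the Bandit Problems section can be illustrated with a short Python sketch. It assumes the commonly stated form of the algorithm: pull every arm once, then always pull the arm with the largest value of average reward plus the bonus sqrt(2 ln n / n_j), where n is the number of pulls so far and n_j the number of pulls of arm j. The function names `ucb1_select` and `run_ucb1`, the Bernoulli reward model, and the payoff probabilities are hypothetical choices made for illustration only and do not come from this page.

```python
import math
import random


def ucb1_select(counts, sums, total_pulls):
    """Return the index of the arm that the UCB1 rule would pull next."""
    # Initialisation phase: every arm is pulled once before bounds are used.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def upper_bound(arm):
        # Average observed reward plus the exploration bonus sqrt(2 ln n / n_j).
        mean = sums[arm] / counts[arm]
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=upper_bound)


def run_ucb1(payoff_probs, steps, seed=0):
    """Simulate UCB1 on hypothetical Bernoulli arms and return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(payoff_probs)   # pulls per arm
    sums = [0.0] * len(payoff_probs)   # accumulated reward per arm
    for t in range(steps):
        arm = ucb1_select(counts, sums, t)  # t = number of pulls so far
        reward = 1.0 if rng.random() < payoff_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Calling `run_ucb1([0.2, 0.8], 2000)` returns the number of pulls per arm. The exploration bonus shrinks for arms that are pulled often and grows slowly for neglected ones, so the arm with the higher payoff probability should receive most of the pulls while the other arm is still sampled occasionally.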

=Selected Publications=

==1990 ...==

 * Peter Auer (**1991**). //Solving String Equations with Constant Restrictions//. [|IWWERT 1991]
 * Peter Auer (**1991**). //Unification in the Combination of Disjoint Theories//. [|IWWERT 1991]
 * Peter Auer (**1993**). //On-line learning of rectangles in noisy environments//. COLT 1993, ACM Press
 * Peter Auer, Nicolò Cesa-Bianchi (**1994**). //On-line learning with malicious noise and the closure algorithm//. [|Algorithmic Learning Theory], [|LNAI], [|Springer]
 * Peter Auer (**1997**). //Learning Nested Differences in the Presence of Malicious Noise//. [|Theoretical Computer Science], Vol. 185, No. 1
 * Peter Auer, Manfred K. Warmuth (**1998**). //Tracking the best Disjunction//. [|Machine Learning], Vol. 32, [|pdf]
 * Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (**1998**). //Gambling in a rigged casino: The adversarial multi-arm bandit problem//. [|NeuroCOLT2], [|pdf]

==2000 ...==

 * Peter Auer, Stephen Kwek, Wolfgang Maass, Manfred K. Warmuth (**2000**). //[|Learning of Depth Two Neural Networks with Constant Fan-in at the Hidden Nodes]//. [|Electronic Colloquium on Computational Complexity, Vol. 7]
 * Peter Auer (**2000**). //Using Upper Confidence Bounds for Online Learning//. [|FOCS 2000]
 * Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, Robert Schapire (**2002**). //The Nonstochastic Multiarmed Bandit Problem//. [|SIAM Journal on Computing], Vol. 32, No. 1, [|2001 pdf]
 * Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer (**2002**). //[|Finite-time Analysis of the Multiarmed Bandit Problem]//. [|Machine Learning], Vol. 47, No. 2, [|pdf]
 * Peter Auer (**2002**). //Using Confidence Bounds for Exploitation-Exploration Trade-offs//. [|Journal of Machine Learning Research], Vol. 3, Special Issue on Computational Learning Theory, [|pdf]
 * Peter Auer, Ronald Ortner (**2006**). //Logarithmic online regret bounds for undiscounted reinforcement learning//. [|NIPS 2006], [|pdf]
 * Peter Auer (**2008**). //Learning with Malicious Noise//. [|Encyclopedia of Algorithms]

==2010 ...==

 * Peter Auer, Ronald Ortner (**2010**). //[|UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem]//. [|Periodica Mathematica Hungarica], Vol. 61, Nos. 1-2, [|2011 pdf]
 * Peter Auer (**2011**). //Exploration and Exploitation in Online Learning//. [|ICAIS 2011]
 * Peter Auer (**2011**). //UCRL and Autonomous Exploration//. [|EWRL 2011]
 * Peter Auer, Marcus Hutter, Laurent Orseau (**2013**). //[|Reinforcement Learning]//. [|Dagstuhl Reports, Vol. 3, No. 8], DOI: [|10.4230/DagRep.3.8.1], URN: [|urn:nbn:de:0030-drops-43409]
 * Ronald Ortner, Daniil Ryabko, Peter Auer, Rémi Munos (**2014**). //Regret bounds for restless Markov bandits//. [|Theoretical Computer Science] 558, [|pdf]

=External Links=
 * [|Homepage Peter Auer]
 * [|CiT - Chair of Information Technology at the MUL]
 * [|The Mathematics Genealogy Project - Peter Auer]
 * [|Peter Auer: Informationstechnologie, Montanuniversität Leoben | 21.02.2003 | APA-OTS] (German)
