Csaba Szepesvári

[[image:CsabaSzepesvari.jpg link="http://www.ualberta.ca/~szepesva/"]]

**Csaba Szepesvári**,
a Hungarian computer scientist with research interests in applications of statistical techniques in AI and in reinforcement learning.

Csaba Szepesvári worked at the [|Computer and Automation Research Institute] of the [|Hungarian Academy of Sciences]. He is currently an Associate Professor at the [|Department of Computing Science], University of Alberta, and principal investigator of the RLAI group.

In 2006, together with Levente Kocsis, Csaba Szepesvári introduced UCT (Upper Confidence bounds applied to Trees), a new algorithm that applies [|bandit] ideas to guide Monte-Carlo planning.

=Selected Publications=

1994 ...

 * Csaba Szepesvári, [|László Balázs], [|András Lőrincz] (**1994**). //Topology learning solved by extended objects: a neural network model//. [|pdf]
 * Csaba Szepesvári (**1998**). //Reinforcement Learning: Theory and Practice//. In Proceedings of the 2nd Slovak Conference on Artificial Neural Networks, [|zipped ps]

2005 ...

 * Levente Kocsis, Csaba Szepesvári, Mark Winands (**2005**). //[|RSPSA: Enhanced Parameter Optimization in Games]//. Advances in Computer Games 11, [|pdf]
 * Levente Kocsis, Csaba Szepesvári (**2006**). //[|Universal Parameter Optimisation in Games Based on SPSA]//. [|Machine Learning], Special Issue on Machine Learning and Games, Vol. 63, No. 3
 * Levente Kocsis, Csaba Szepesvári (**2006**). //[|Bandit based Monte-Carlo Planning]//. ECML-06, LNCS/LNAI 4212, pp. 282-293, introducing UCT. [|pdf]
 * Levente Kocsis, Csaba Szepesvári, Jan Willemson (**2006**). //Improved Monte-Carlo Search//. [|pdf]
 * [|András György], Levente Kocsis, [|Ivett Szabó], Csaba Szepesvári (**2007**). //Continuous Time Associative Bandit Problems//. IJCAI-07, 830-835. [|pdf]
 * Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (**2007**). //Tuning Bandit Algorithms in Stochastic Environments//. [|pdf]
 * Richard Sutton, Csaba Szepesvári, Hamid Reza Maei (**2008**). //A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning with Linear Function Approximation//, [|pdf] (draft)
 * Rémi Munos, Csaba Szepesvári (**2008**). //Finite time bounds for sampling based fitted value iteration//. Journal of Machine Learning Research, 9:815-857, 2008. [|pdf], [|pdf]
 * Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard Sutton (**2009**). //Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation//. Accepted in Advances in Neural Information Processing Systems 22, Vancouver, BC, December 2009. MIT Press. [|pdf]
 * Richard Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora (**2009**). //Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation//. In Proceedings of the 26th International Conference on Machine Learning (ICML-09). [|pdf]
 * Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári (**2009**). //Exploration-exploitation trade-off using variance estimates in multi-armed bandits//. Theoretical Computer Science, 410:1876-1902, 2009, [|pdf]

2010 ...

 * István Szita, Csaba Szepesvári (**2010**). //Model-based reinforcement learning with nearly tight exploration complexity bounds//. [|ICML 2010]
 * István Szita, Csaba Szepesvári (**2011**). //Agnostic KWIK learning and efficient approximate reinforcement learning//. [|Journal of Machine Learning Research - Proceedings Track 19]
 * Sylvain Gelly, Marc Schoenauer, Michèle Sebag, Olivier Teytaud, Levente Kocsis, David Silver, Csaba Szepesvári (**2012**). //[|The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions]//. Communications of the ACM, Vol. 55, No. 3, [|pdf preprint]

=External Links=
 * [|Homepage of Csaba Szepesvári] from University of Alberta
 * [|Csaba, Szepesvári, PhD. Senior Research Scientist] from [|Hungarian Academy of Sciences]
 * [|Introduction to Reinforcement Learning], [|videolecture] by Csaba Szepesvári, 2008
