Andrew G. Barto,
an American computer scientist, AI researcher and Professor of Computer Science at the University of Massachusetts, Amherst. His research centers on learning in natural and artificial systems. He has studied machine learning algorithms since 1977, contributing to the development of the computational theory and practice of reinforcement learning [1].

Richard Sutton, Andrew Barto (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, Vol. 88, pp. 135-170
Andrew Barto, Richard Sutton (1990). Time-Derivative Models of Pavlovian Reinforcement. In Michael Gabriel, John Moore (eds.) (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press
Richard Yee, Sharad Saxena, Paul Utgoff, Andrew Barto (1990). Explaining Temporal Differences to Create Useful Concepts for Evaluating States. AAAI 1990
Steven Bradtke, Andrew Barto (1996). Linear Least-Squares Algorithms for Temporal Difference Learning. Machine Learning, Vol. 22, Nos. 1/2/3
Richard Sutton, Andrew Barto (1998). Reinforcement Learning: An Introduction. MIT Press

