site:www.cs.utexas.edu

www.cs.utexas.edu 2d

The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications

In reinforcement learning (RL), a reward function that aligns exactly with a task's true performance metric is often sparse. For example, a true task metric might encode a reward of 1 upon success and ...

www.cs.utexas.edu 2d

Generative Adversarial Imitation from Observation

Imitation from observation (IfO) is the problem of learning directly from state-only demonstrations without having access to the demonstrator's actions. The lack of action information both ...

www.cs.utexas.edu 7d

TAMER: Training an Agent Manually via Evaluative Reinforcement

Though computers have surpassed humans at many tasks, especially computationally intensive ones, there are many tasks for which human expertise remains necessary and/or useful. For such tasks, it is ...

www.cs.utexas.edu 10d

William D. (Bill) Young

If you are considering taking a CS370 course with me, please take a look at this page: CS370 Syllabus. I will accept only a very limited number of CS370 students each semester.

www.cs.utexas.edu 10d

Artificial Intelligence and Life in 2030

Artificial Intelligence and Life in 2030. Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram ...

www.cs.utexas.edu 10d

Overlapping Layered Learning

Patrick MacAlpine and Peter Stone.

www.cs.utexas.edu 7d

Reinforcement Learning from Simultaneous Human and MDP Reward

Reinforcement Learning from Simultaneous Human and MDP Reward. W. Bradley Knox and Peter Stone. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), ...

www.cs.utexas.edu 10d

Transfer Learning for Reinforcement Learning Domains: A Survey

Transfer Learning for Reinforcement Learning Domains: A Survey. Matthew E. Taylor and Peter Stone. Journal of Machine Learning Research, 10(1):1633–1685, 2009.

www.cs.utexas.edu 10d

Multiagent Traffic Management: A Reservation-Based Intersection Control Mechanism

Multiagent Traffic Management: A Reservation-Based Intersection Control Mechanism. Kurt Dresner and Peter Stone. In The Third International Joint Conference on Autonomous Agents and Multiagent Systems ...

www.cs.utexas.edu 7d

The right music at the right time: adaptive personalized playlists based on sequence modeling

The right music at the right time: adaptive personalized playlists based on sequence modeling. Elad Liebman, Maytal Saar-Tsechansky, and Peter Stone. Management Information Systems ...

www.cs.utexas.edu 10d

This is page C

This page has 2 outgoing links. Its pagerank should be 0.373.

www.cs.utexas.edu 10d

Characterizing Reinforcement Learning Methods through Parameterized Learning Problems

Characterizing Reinforcement Learning Methods through Parameterized Learning Problems. Shivaram Kalyanakrishnan and Peter Stone. Machine Learning (MLJ), 84(1–2):205–247, July 2011.
