ACL HLT 2011

The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Held at the Portland Marriott Downtown Waterfront in
Portland, Oregon, USA, June 19-24, 2011
Beyond Structured Prediction: Inverse Reinforcement LearningPRESENTER: Hal Daume III


ABSTRACT:

Machine learning is all about making predictions; language is full of
complex rich structure.  Structured prediction marries these two.
However, structured prediction isn't always enough: sometimes the
world throws even more complex data at us, and we need reinforcement
learning techniques.  This tutorial is all about the *how* and the
*why* of structured prediction and inverse reinforcement learning (aka
inverse optimal control): participants should walk away comfortable
that they could implement many structured prediction and IRL
algorithms, and have a sense of which ones might work for which
problems.

The first half of the tutorial will cover the "basics" of structured
prediction the structured perceptron and Magerman's incremental
parsing algorithm.  It will then build up to more advanced algorithms
that are shockingly reminiscent of these simple approaches: maximum
margin techniques and search-based structured prediction.

The second half of the tutorial will ask the question: what happens
when our standard assumptions about our data are violated?  This is
what leads us into the world of reinforcement learning (the basics of
which we'll cover) and then to inverse reinforcement learning and
inverse optimal control.

Throughout the tutorial, we will see examples ranging from simple
(part of speech tagging, named entity recognition, etc.) through
complex (parsing, machine translation).

The tutorial does not assume attendees know anything about structured
prediction or reinforcement learning (though it will hopefully be
interesting even to those who know some!), but *does* assume some
knowledge of simple machine learning (eg., binary classification).


OUTLINE:


 PART I (90 minutes):

   (10 mins)  What is structure?  What is structured prediction?

   (20 mins)  Refresher on binary classification
               - What does it mean to learn?
               - Linear models for classification
               - Batch versus stochastic optimization

   (30 mins)  From perceptron to structured perceptron
               - Linear models for structured prediction
               - The "argmax" problem
               - From perceptron to margins

   (30 mins)  Search-based structured prediction
               - Stacking
               - Training classifiers to make parsing decisions
               - Searn and generalizations


 PART II (90 minutes):

   (25 mins)  Refersher on reinforcement learning
               - Markov decision processes
               - Q learning

   (30 mins)  Inverse optimal control and A* search
               - Maximum margin planning
               - Learning to search
               - Dagger

   (25 mins)  Apprenticeship learning

   (10 mins)  Open problems



INSTRUCTOR:

Hal Daume III
	Department of Computer Science
	University of Maryland
	A.V. Williams Building #3227
	College Park, MD 20742
	me@hal3.name

Hal Daume III is an Assistant Professor in Computer Science at the
University of Maryland, with a joint appointment in Linguistics.  He
is primarily interested in the interface between natural language
processing, computational linguistics and machine learning.  His work
in statistical modeling spans multiple aspects of language processing,
including structured prediction, Bayesian methods, domain adaptation,
and linguistic typology.  He associates himself most with conferences
like ACL, ICML, NIPS and EMNLP. He earned his PhD at the University of
Southern Californian with a thesis on structured prediction for
language (his advisor was Daniel Marcu). He spent the summer of 2003
working with Eric Brill in the machine learning and applied statistics
group at Microsoft Research. Prior to that, he studied math (mostly
logic) at Carnegie Mellon University.


PREVIOUS PRESENTATIONS

A previous version of this tutorial was given at ACL 2010 in
Stockholm, Sweden.  The attendance was over 100, making it by far the
largest tutorial that year.

The first half of these two tutorials is 90\% the same; the second
half is only 50\% the same.  The current version will differ in
several ways from the version that was presented at ACL.  Several new
techniques (such as "Dagger") will be discussed that make the SP/IRL
connection even more obvious, and some new work on unsupervised
structured prediction will be added.
acl2011.conference@gmail.com ♦ Oregon Health & Science University