Accepted Tutorials
Sentiment Analysis of Social Media Texts
Saif M. Mohammad and Xiaodan Zhu
October 25, 2014 - Morning
Abstract:
Automatically detecting the sentiment of product reviews, blogs, tweets,
and SMS messages has attracted extensive interest from both
academia and industry. It has a number of applications, including:
tracking sentiment towards products, movies, politicians, etc.;
improving customer relation models; detecting happiness and
well-being; and improving automatic dialogue systems. In this
tutorial, we will describe how you can create a state-of-the-art
sentiment analysis system, with a focus on social media posts.
We begin with an introduction to sentiment analysis and its various
forms: term level, message level, document level, and aspect level.
We will describe how sentiment analysis systems are evaluated,
especially through recent SemEval shared tasks: Sentiment Analysis of
Twitter (SemEval-2013 Task 2, SemEval-2014 Task 9) and Aspect Based
Sentiment Analysis (SemEval-2014 Task 4).
We will give an overview of the current best sentiment analysis
systems, including both conventional statistical systems and those
using deep learning approaches. We will describe in detail the
NRC-Canada systems, which were the overall best-performing systems in
all three SemEval competitions listed above. These systems are based
on simple lexical and sentiment-lexicon features, which makes them
relatively easy to re-implement.
We will discuss features that had the most impact (those derived from
sentiment lexicons and negation handling). We will present how large
tweet-specific sentiment lexicons can be automatically generated and
evaluated. We will also show how negation impacts sentiment
differently depending on whether the scope of the negation is positive
or negative. Finally, we will lay out the limitations of current
approaches and promising future directions.
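To make these features concrete, here is a minimal Python sketch of lexicon features with negation handling. The toy lexicon, the scores, and the mark-tokens-until-punctuation negation heuristic are illustrative simplifications, not the exact NRC-Canada implementation.

    NEGATORS = {"not", "no", "never", "cannot", "don't", "won't"}
    CLAUSE_PUNCT = {".", ",", ";", ":", "!", "?"}
    # Hypothetical lexicon: token -> sentiment score (positive > 0, negative < 0).
    LEXICON = {"good": 0.9, "great": 1.2, "bad": -1.0, "terrible": -1.6}

    def mark_negation(tokens):
        """Append '_NEG' to tokens from a negator up to the next clause punctuation."""
        negated, out = False, []
        for tok in tokens:
            if tok.lower() in NEGATORS:
                negated = True
                out.append(tok)
            elif tok in CLAUSE_PUNCT:
                negated = False
                out.append(tok)
            else:
                out.append(tok + "_NEG" if negated else tok)
        return out

    def lexicon_features(tokens):
        """Aggregate features over lexicon hits in affirmative and negated scope."""
        feats = {"pos_count": 0, "neg_count": 0, "score_sum": 0.0, "negated_hits": 0}
        for tok in tokens:
            base, in_neg = (tok[:-4], True) if tok.endswith("_NEG") else (tok, False)
            score = LEXICON.get(base.lower())
            if score is None:
                continue
            if in_neg:
                # Flipping and dampening the score in negated scope is one common
                # heuristic; the tutorial discusses why a plain flip is too crude.
                score = -0.5 * score
                feats["negated_hits"] += 1
            feats["pos_count" if score > 0 else "neg_count"] += 1
            feats["score_sum"] += score
        return feats

    tokens = mark_negation("this movie is not good , it is terrible".split())
    print(lexicon_features(tokens))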
Instructors:
- Saif M. Mohammad, Researcher, National Research Council Canada
Saif Mohammad is a Research Officer at the National Research Council
Canada. His research interests are in Computational Linguistics,
especially Lexical Semantics. He develops computational models for
sentiment analysis, emotion detection, semantic distance, and
lexical-semantic relations such as word-pair antonymy.
- Xiaodan Zhu, Researcher, National Research Council Canada
Xiaodan Zhu is a Research Officer at the National Research Council
Canada. His research interests are in Natural Language Processing,
Spoken Language Understanding, and Machine Learning. His recent work
focuses on sentiment analysis, emotion detection, speech
summarization, and deep learning.
The instructors, along with Svetlana Kiritchenko, developed the
NRC-Canada Sentiment Analysis System, which was the top-performing
system in recent SemEval shared-task competitions (SemEval-2013 Task
2, SemEval-2014 Task 9, and SemEval-2014 Task 4).
Spectral Learning Techniques for Weighted Automata, Transducers, and Grammars
Borja Balle, Ariadna Quattoni and Xavier Carreras
October 25, 2014 - Morning
Abstract:
In recent years we have seen the development of efficient and provably
correct algorithms for learning weighted automata and closely related
function classes such as weighted transducers and weighted
context-free grammars. The common denominator of all these algorithms
is the so-called spectral method, which gives an efficient and robust
way to estimate recursively defined functions from empirical
estimations of observable statistics. These algorithms are appealing
because of the existence of theoretical guarantees (e.g., they are
not susceptible to local minima) and because of their
efficiency. However, despite their simplicity and wide applicability
to real problems, their impact in NLP applications is still
moderate. One of the goals of this tutorial is to remedy this
situation.
The content of this tutorial offers a perspective complementary to
that of previous tutorials on spectral methods presented at ICML
2012, ICML 2013, and NAACL 2013. Rather than using the language of
graphical models and
signal processing, we tell the story from the perspective of formal
languages and automata theory (without assuming a background in formal
algebraic methods). Our presentation highlights the common intuitions
lying behind different spectral algorithms by presenting them in a
unified framework based on the concepts of low-rank factorizations and
completions of Hankel matrices. In addition, we provide an
interpretation of the method in terms of forward and backward
recursions for automata and grammars. This provides extra intuitions
about the method and stresses the importance of matrix factorization
for learning automata and grammars. We believe that this complementary
perspective might be appealing for an NLP audience and serve to put
spectral learning in a wider and, perhaps for some, more familiar
context. Our hope is that this will broaden the understanding of these
methods by the NLP community and empower many researchers to apply
these techniques to novel problems.
The content of the tutorial will be divided into four blocks of 45
minutes each, as follows. The first block will introduce the basic
definitions of weighted automata and Hankel matrices, and present a
key connection between the fundamental theorem of weighted automata
and learning. In the second block we will discuss the case of
probabilistic automata in detail, touching upon all aspects from the
underlying theory to the tricks required to achieve accurate and
scalable learning algorithms. The third block will present extensions
to related models, including sequence tagging models, finite-state
transducers and weighted context-free grammars. The last block will
describe a general framework for using spectral techniques in more
general situations where a matrix completion pre-processing step is
required; several applications of this approach will be described.
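As a concrete illustration of the Hankel-matrix view underlying the tutorial, here is a minimal numpy sketch of the spectral method for a weighted automaton. The toy probabilities, the prefix/suffix basis, and the chosen rank are assumptions made purely for the example.

    import numpy as np

    alphabet = ["a", "b"]
    prefixes = ["", "a", "b"]      # rows of the Hankel matrix
    suffixes = ["", "a", "b"]      # columns of the Hankel matrix

    # Toy "empirical" string probabilities; unseen strings default to 0.
    probs = {"a": 0.20, "b": 0.10, "aa": 0.12, "ab": 0.08, "ba": 0.06,
             "bb": 0.04, "aaa": 0.07, "aab": 0.05, "aba": 0.04, "abb": 0.03,
             "baa": 0.03, "bab": 0.02, "bba": 0.02, "bbb": 0.01}
    f = lambda x: probs.get(x, 0.0)

    def hankel(middle=""):
        """Hankel block H_sigma[p, s] = f(p + sigma + s)."""
        return np.array([[f(p + middle + s) for s in suffixes] for p in prefixes])

    H = hankel()                   # the basic Hankel matrix
    n = 2                          # number of states = chosen rank

    U, S, Vt = np.linalg.svd(H)
    V = Vt[:n].T                   # top-n right singular vectors
    pinv_HV = np.linalg.pinv(H @ V)

    # Recovered WFA: one operator per symbol, plus initial/final weight vectors.
    A = {sig: pinv_HV @ hankel(sig) @ V for sig in alphabet}
    alpha0 = np.array([f(s) for s in suffixes]) @ V            # h_{lambda,S} V
    alpha_inf = pinv_HV @ np.array([f(p) for p in prefixes])   # (HV)^+ h_{P,lambda}

    def score(x):
        """f_hat(x) = alpha0^T A_{x_1} ... A_{x_k} alpha_inf."""
        v = alpha0
        for sig in x:
            v = v @ A[sig]
        return float(v @ alpha_inf)

    print(score("ab"), "vs empirical", f("ab"))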
Instructors:
- Borja Balle, postdoctoral fellow, McGill University
Borja Balle is currently a postdoctoral fellow at McGill University,
and prior to that he obtained his PhD from Universitat Politecnica de
Catalunya (UPC) in July 2013. His research interests lie at the
intersection of automata theory and machine learning, in
particular in applications of spectral learning techniques to natural
language processing, grammatical inference, and reinforcement
learning. He is an area chair for NIPS 2014, a program committee
member for ICGI 2014, and has recently organized three workshops (at ICML 2013,
NIPS 2013 and ICML 2014) on methods of moments and spectral learning.
- Ariadna Quattoni, Researcher, Xerox Research Centre Europe (XRCE)
Ariadna Quattoni is currently a researcher at Xerox Research Centre
Europe (XRCE); prior to that, she was a researcher at the Universitat
Politecnica de Catalunya (UPC). She obtained her PhD from MIT in
2009. Her main research focuses on latent variable models for
structured prediction with applications to natural language processing
and computer vision. In recent years her work has centered on
spectral learning techniques for structured prediction problems with
applications to sequence tagging, learning general transductions, and
parsing.
- Xavier Carreras, senior researcher, Xerox Research Centre Europe
Xavier Carreras's research is in natural language processing and machine
learning. He is interested in grammatical induction and parsing
methods for syntactic-semantic analysis and translation of natural
languages. In 2005 he completed his PhD at the Universitat Politecnica
de Catalunya (UPC). From 2006 to 2009 he was a postdoctoral researcher
at MIT/CSAIL. From 2009 to 2014 he was a researcher at UPC, and since
June 2014 he has been a senior researcher at Xerox Research Centre Europe.
Semantic Parsing with Combinatory Categorial Grammars
Yoav Artzi, Nicholas FitzGerald, and Luke Zettlemoyer
October 25, 2014 - Afternoon
Abstract:
Semantic parsers map natural language sentences to formal
representations of their underlying meaning. Building accurate
semantic parsers without prohibitive engineering costs is a
long-standing, open research problem.
The tutorial will describe general principles for building semantic
parsers. The presentation will be divided into two main parts:
learning and modeling. In the learning part, we will describe a
unified approach for learning Combinatory Categorial Grammar (CCG)
semantic parsers that induces both a CCG lexicon and the parameters
of a parsing model. The approach learns from data with labeled meaning
representations, as well as from more easily gathered weak
supervision. It also enables grounded learning where the semantic
parser is used in an interactive environment, for example to read and
execute instructions. The modeling section will include best practices
for grammar design and choice of semantic representation. We will
motivate our use of lambda calculus as a language for building and
representing meaning with examples from several domains.
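For a flavor of how lambda-calculus semantics attaches to CCG-style lexical entries, here is a minimal Python sketch. The tiny lexicon, the categories, and the tuple encoding of logical forms are illustrative and do not reflect UW SPF's actual formats.

    # Each entry: word -> (syntactic category, semantics). Python lambdas stand
    # in for lambda-calculus terms; logical forms are built as nested tuples.
    LEXICON = {
        "Texas":    ("NP",         "texas"),
        "borders":  (r"(S\NP)/NP", lambda y: lambda x: ("borders", x, y)),
        "Oklahoma": ("NP",         "oklahoma"),
    }

    def parse_transitive(subj, verb, obj):
        """Forward application to the object, then backward application to the subject."""
        vp = LEXICON[verb][1](LEXICON[obj][1])   # (S\NP)/NP applied to NP -> S\NP
        return vp(LEXICON[subj][1])              # NP combined with S\NP -> S

    print(parse_transitive("Texas", "borders", "Oklahoma"))
    # ('borders', 'texas', 'oklahoma')  ~  borders(texas, oklahoma)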
The ideas we will discuss are widely applicable. The semantic modeling
approach, while implemented in lambda calculus, could be applied to
many other formal languages. Similarly, the algorithms for inducing
CCGs focus on tasks that are formalism-independent: learning the
meaning of words and estimating parsing parameters. No prior knowledge
of CCG is required. The tutorial will be backed by implementation and
experiments in the University of Washington Semantic Parsing Framework
(UW SPF, http://yoavartzi.com/spf).
Instructors:
- Yoav Artzi, Ph.D. candidate, University of Washington
Yoav Artzi is a Ph.D. candidate in the Computer Science & Engineering
department at the University of Washington. His research studies
semantic meaning and machine learning methods for natural language
understanding within situated interactive systems. He is currently
supported by the 2014 Microsoft Research PhD Fellowship, and is a
recipient of the 2012 Yahoo KSC award.
- Nicholas FitzGerald, Ph.D. student, University of Washington
Nicholas FitzGerald is a Ph.D. student at the University of
Washington. His research interests are grounded natural language
understanding and generation. He is a recipient of an Intel Science
and Technology Center Fellowship and an NSERC Postgraduate
Scholarship.
- Luke Zettlemoyer, assistant professor, University of Washington
Luke Zettlemoyer is an Assistant Professor in the Computer Science &
Engineering department at the University of Washington. His research
interests are in the intersections of natural language processing,
machine learning and decision making under uncertainty. Honors include
best paper awards at UAI 2005 and ACL 2009, selection to the DARPA
CSSG, and an NSF CAREER Award.
Linear Programming Decoders in Natural Language Processing: From Integer Programming to Message Passing and Dual Decomposition
André F. T. Martins
October 25, 2014 - Afternoon
Abstract:
This tutorial will cover the theory and practice of linear programming
decoders. This class of decoders encompasses a variety of techniques
that have enjoyed great success in devising structured models for
natural language processing (NLP). Throughout the tutorial, we provide a
unified view of different algorithms and modeling techniques,
including belief propagation, dual decomposition, integer linear
programming, Markov logic, and constrained conditional models. Various
applications in NLP will serve as a motivation.
There is a long line of work using integer linear programming (ILP)
formulations in NLP, for example in semantic role labeling, machine
translation, summarization, dependency parsing, coreference
resolution, and opinion mining, to name just a few. At the heart of
these approaches is the ability to encode logic and budget constraints
(common in NLP and information retrieval) as linear
inequalities. Thanks to general-purpose solvers (such as Gurobi,
CPLEX, or GLPK), the practitioner can abstract away from the decoding
algorithm and focus on developing a powerful model. A disadvantage,
however, is that general solvers do not scale well to large problem
instances, since they fail to exploit the structure of the problem.
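As a concrete instance of such a formulation, here is a minimal sketch of a budget-constrained ILP for extractive summarization. PuLP is used as the modeling layer purely for illustration (the tutorial itself names Gurobi, CPLEX, and GLPK); the scores, lengths, and constraints are hypothetical.

    import pulp

    scores = [3.0, 1.5, 2.2, 0.9]   # hypothetical sentence relevance scores
    lengths = [14, 8, 11, 5]        # sentence lengths in words
    budget = 20                     # summary length budget

    prob = pulp.LpProblem("summarization", pulp.LpMaximize)
    x = [pulp.LpVariable("x%d" % i, cat="Binary") for i in range(len(scores))]

    prob += pulp.lpSum(s * xi for s, xi in zip(scores, x))             # objective
    prob += pulp.lpSum(l * xi for l, xi in zip(lengths, x)) <= budget  # budget
    # A logic constraint in linear form: sentence 2 may be selected only if
    # sentence 0 is (x2 <= x0 encodes the implication x2 -> x0).
    prob += x[2] <= x[0]

    prob.solve()
    print([int(xi.value()) for xi in x])   # e.g. [1, 0, 0, 1]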
This is where graphical models come into play. In this tutorial, we
show that most logic and budget constraints that arise in NLP can be
cast in this framework. This opens the door for the use of
message-passing algorithms, such as belief propagation and variants
thereof. An alternative is the family of algorithms based on dual
decomposition, such as the subgradient method or AD3. These algorithms have achieved
great success in a variety of applications, such as parsing,
corpus-wide tagging, machine translation, summarization, joint
coreference resolution and quotation attribution, and semantic role
labeling. Interestingly, most decoders used in these works can be
regarded as structure-aware solvers for addressing relaxations of
integer linear programs. All these algorithms have a similar
consensus-based architecture: they repeatedly perform certain "local"
operations in the graph, until some form of local agreement is
achieved. The local operations are performed at each factor, and they
range from computing marginals, max-marginals, or an optimal
configuration to solving a small quadratic problem, all of which are
commonly tractable and efficient in a wide range of problems.
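To make this consensus architecture concrete, here is a minimal numpy sketch of dual decomposition with a subgradient update, forcing two toy subproblems to agree on a binary vector. The per-variable scores and the step size are illustrative.

    import numpy as np

    theta_f = np.array([1.0, -0.5, 0.3])   # subproblem 1: per-variable scores
    theta_g = np.array([-0.8, 0.9, 0.2])   # subproblem 2: per-variable scores
    lam = np.zeros(3)                      # dual variables (Lagrange multipliers)

    for t in range(1, 101):
        # "Local" operations: each factor independently computes its argmax.
        y = (theta_f + lam > 0).astype(float)   # argmax_y  f(y) + lam . y
        z = (theta_g - lam > 0).astype(float)   # argmax_z  g(z) - lam . z
        if np.array_equal(y, z):                # local agreement reached
            break
        lam -= (1.0 / t) * (y - z)              # subgradient step on the dual

    print(y, z)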
As a companion to this tutorial, we provide an open-source
implementation of some of the algorithms described above, available at
http://www.ark.cs.cmu.edu/AD3.
Instructors:
- André F. T. Martins, research scientist, Instituto de Telecomunicações, Instituto Superior Técnico, and Priberam Informática
A. Martins is a research scientist at Priberam Labs. He received his
dual-degree PhD in Language Technologies in 2012 from Carnegie Mellon
University and Instituto Superior Técnico. His PhD dissertation was
awarded Honorable Mention in CMU’s SCS Dissertation Award
competition. Martins' research interests include natural language
processing, machine learning, structured prediction, sparse modeling,
and optimization. His paper "Concise Integer Linear Programming
Formulations for Dependency Parsing" received a best paper award at
ACL 2009.
Syntax-Based Statistical Machine Translation
Philip Williams and Philipp Koehn
October 29, 2014 - Morning
Abstract:
The tutorial explains in detail syntax-based statistical machine
translation with synchronous context free grammars (SCFG). It is
aimed at researchers who have little background in this area, and
gives a comprehensive overview of the main models and methods.
While syntax-based models in statistical machine translation have a
long history, stretching back almost 20 years, they have only recently
shown superior translation quality over the more commonly used
phrase-based models, and are now considered state of the art for some
language pairs, such as Chinese-English (since ISI's submission to
NIST 2006), and English-German (since Edinburgh's submission to WMT
2012).
While the field is very dynamic, there is a core set of methods that
have become dominant. Such SCFG models are implemented in the open
source machine translation toolkit Moses, and the tutors draw from the
practical experience of its development.
The tutorial focuses on explaining core established concepts in
SCFG-based approaches, which are the most popular in this area. The
main goal of the tutorial is for the audience to understand how these
systems work end-to-end. We review as much relevant literature as
necessary, but the tutorial is not primarily a research survey.
The tutorial is rounded off with open problems and advanced topics,
such as computational challenges, different formalisms for
syntax-based models and inclusion of semantics.
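For readers new to SCFGs, here is a minimal Python sketch of synchronous rules and a derivation in the style of hierarchical phrase-based grammars. The toy rules and the romanized source-language strings are purely illustrative.

    # Each rule: (lhs, source_rhs, target_rhs); integers are linked nonterminal
    # slots, strings are terminals.
    RULES = {
        "R1": ("X", [1, "de", 2], ["the", 2, "of", 1]),  # reordering rule
        "R2": ("X", ["faguo"], ["France"]),
        "R3": ("X", ["shoudu"], ["capital"]),
    }

    def derive(rule_id, children=()):
        """Apply a rule; children are (source, target) token-list pairs."""
        _, src_rhs, tgt_rhs = RULES[rule_id]
        def fill(rhs, side):
            out = []
            for sym in rhs:
                out.extend(children[sym - 1][side] if isinstance(sym, int) else [sym])
            return out
        return fill(src_rhs, 0), fill(tgt_rhs, 1)

    france, capital = derive("R2"), derive("R3")
    src, tgt = derive("R1", (france, capital))
    print(" ".join(src))   # faguo de shoudu
    print(" ".join(tgt))   # the capital of France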
Instructors:
- Philip Williams, University of Edinburgh
Philip Williams completed his Ph.D. at the University of Edinburgh in
2014, with a thesis on unification-based formalisms applied to
syntax-based statistical machine translation. He has recently been the
main contributor to the syntax-based MT model in the Moses decoder.
- Philipp Koehn, Johns Hopkins University/University of Edinburgh
Philipp Koehn has worked on statistical machine translation since
1997, is leading the development of the open source machine
translation toolkit Moses, and wrote the textbook for the field
(published in 2010). He has worked on many aspects of machine
translation, including syntax-based approaches.
Embedding Methods for Natural Language Processing
Antoine Bordes and Jason Weston
October 29, 2014 - Morning
Abstract:
Embedding-based models are popular tools in Natural Language
Processing these days. In this tutorial, our goal is to provide an
overview of the main advances in this domain. These methods learn
latent representations of words, as well as of database entries, that
can then be used for semantic search, automatic knowledge base
construction, natural language understanding, etc. The tutorial is
split into two 90-minute sessions, with a 30-minute coffee break in
the middle: the first session covers the basics of learning
embeddings, and the second covers more advanced models. The two parts
are detailed below.
Part 1: Unsupervised and Supervised Embeddings
We introduce models that embed tokens (words, database entries) by
representing them as low-dimensional embedding vectors. Unsupervised
and supervised methods will be discussed, including SVD, Word2Vec,
Paragraph Vectors, SSI, Wsabie and others. A comparison between
methods will be made in terms of applicability, type of loss function
(ranking loss, reconstruction loss, classification loss),
regularization, etc. The use of these models in several NLP tasks will
be discussed, including question answering, frame identification,
knowledge extraction and document retrieval.
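As a concrete example of the simplest of the methods above, here is a minimal numpy sketch of SVD-based embeddings learned from a word co-occurrence matrix. The tiny corpus, the window size, and the dimensionality are illustrative.

    import numpy as np

    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}

    # Symmetric co-occurrence counts within a +/-1 word window.
    C = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - 1), min(len(sent), i + 2)):
                if j != i:
                    C[idx[w], idx[sent[j]]] += 1.0

    # Rank-k truncated SVD; rows of U * S are the word embeddings.
    k = 2
    U, S, Vt = np.linalg.svd(C)
    emb = U[:, :k] * S[:k]

    def similarity(w1, w2):
        """Cosine similarity between two word embeddings."""
        a, b = emb[idx[w1]], emb[idx[w2]]
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    print(similarity("cat", "dog"))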
Part 2: Embeddings for Multi-relational Data
This second part will focus mostly on the construction of embeddings
for multi-relational data, that is, data in which tokens can be
interconnected in different ways, as in knowledge
bases. Several methods based on tensor factorization, collective
matrix factorization, stochastic block models or energy-based learning
will be presented. The task of link prediction in a knowledge base
will be used as an application example. Multiple empirical results on
the use of embedding models to align textual information to knowledge
bases will also be presented, together with some demos if time
permits.
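To make the energy-based approach concrete, here is a simplified, TransE-style numpy sketch of link prediction: the energy of a triple (h, r, t) is the distance ||e_h + e_r - e_t||, trained with a margin ranking loss against corrupted triples. All sizes, hyperparameters, and facts are toy values.

    import numpy as np

    rng = np.random.default_rng(0)
    n_ent, n_rel, dim, margin, lr = 5, 2, 8, 1.0, 0.1
    E = rng.normal(size=(n_ent, dim))             # entity embeddings
    R = rng.normal(size=(n_rel, dim))             # relation embeddings
    triples = [(0, 0, 1), (1, 1, 2), (3, 0, 4)]   # (head, relation, tail) facts

    def energy(h, r, t):
        """Energy of a triple: distance between translated head and tail."""
        return np.linalg.norm(E[h] + R[r] - E[t])

    for epoch in range(200):
        for h, r, t in triples:
            t_bad = int(rng.integers(n_ent))      # corrupt the tail at random
            if energy(h, r, t) + margin > energy(h, r, t_bad):
                # Subgradients of the margin ranking loss w.r.t. the embeddings.
                g_pos = E[h] + R[r] - E[t]
                g_pos /= np.linalg.norm(g_pos) + 1e-12
                g_neg = E[h] + R[r] - E[t_bad]
                g_neg /= np.linalg.norm(g_neg) + 1e-12
                E[h] -= lr * (g_pos - g_neg)
                R[r] -= lr * (g_pos - g_neg)
                E[t] += lr * g_pos
                E[t_bad] -= lr * g_neg

    print(energy(0, 0, 1), "vs a corrupted triple:", energy(0, 0, 3))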
Instructors:
- Antoine Bordes, Facebook
- Jason Weston, Facebook
Natural Language Processing of Arabic and its Dialects
Mona Diab and Nizar Habash
October 29, 2014 - Afternoon
Abstract:
This tutorial introduces the different challenges and current
solutions to the automatic processing of Arabic and its dialects. The
tutorial has two parts: First, we present a discussion of generic
issues relevant to Arabic NLP and detail dialectal linguistic issues
and the challenges they pose for NLP. In the second part, we review
the state-of-the-art in Arabic processing covering several enabling
technologies and applications, e.g., dialect identification,
morphological processing (analysis, disambiguation, tokenization, POS
tagging), parsing, and machine translation.
Instructors:
- Mona Diab, Associate Professor, George Washington University
- Nizar Habash, New York University Abu Dhabi
Text Quantification
Fabrizio Sebastiani
October 29, 2014 - Afternoon
Abstract:
In recent years it has been pointed out that, in a number of applications involving (text) classification, the final goal is not determining which class (or classes) individual unlabelled data items belong to, but determining the prevalence (or “relative frequency”) of each class in the unlabelled data. The latter task is known as quantification.
Assume a market research agency runs a poll in which they ask the question “What do you think of the recent ad campaign for product X?” Once the poll is complete, they may want to classify the resulting textual answers according to whether or not they belong to the class LovedTheCampaign. The agency is likely not interested in whether a specific individual belongs to the class LovedTheCampaign, but in knowing how many respondents belong to it, i.e., in knowing the prevalence of the class. In other words, the agency is interested not in classification, but in quantification. Essentially, quantification is classification tackled at the aggregate (rather than at the individual) level.
The research community has recently shown a growing interest in tackling quantification as a task in its own right. One of the reasons is that, since the goal of quantification is different from that of classification, quantification requires different evaluation measures than classification does. A second, related reason is that using a method optimized for classification accuracy is suboptimal when quantification accuracy is the real goal. A third reason is the growing awareness that quantification is going to become more and more important: with the advent of big data, more and more application contexts are going to spring up in which we will simply be content with analyzing data at the aggregate (rather than at the individual) level.
The goal of this tutorial is to introduce the audience to the problem of quantification, to the techniques that have been proposed for solving it, to the
metrics used to evaluate them, and to the problems that are still open in the area.
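To make the distinction concrete, here is a minimal Python sketch of two standard quantification baselines: "classify and count" (CC) and its adjusted variant (ACC), which corrects CC using the classifier's true and false positive rates estimated on held-out data. The prediction array and the rates are hypothetical.

    import numpy as np

    def classify_and_count(preds):
        """CC: estimated prevalence = fraction of items predicted positive."""
        return float(np.mean(preds))

    def adjusted_classify_and_count(preds, tpr, fpr):
        """ACC: invert the expected confusion of the classifier, clip to [0, 1]."""
        cc = classify_and_count(preds)
        if tpr == fpr:                     # degenerate classifier; fall back to CC
            return cc
        return float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))

    # Hypothetical: a classifier with 80% TPR and 30% FPR labels 52% of the
    # unlabelled items positive.
    preds = np.array([1] * 52 + [0] * 48)
    print(classify_and_count(preds))                     # 0.52
    print(adjusted_classify_and_count(preds, 0.8, 0.3))  # (0.52 - 0.3) / 0.5 = 0.44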
Instructor:
- Fabrizio Sebastiani, Principal Scientist, Qatar Computing Research Institute
Fabrizio Sebastiani is a Principal Scientist at QCRI; he was (until June 2014) a Senior Researcher at the Italian National Council of Research (from which he is currently on leave), and (until February 2006) an Associate Professor at the Department of Pure and Applied Mathematics of the University of Padova, Italy. His main current research interests are at the intersection of information retrieval, machine learning, and human language technologies, with particular emphasis on text classification, information extraction, opinion mining, and their applications.