EMNLP 2014 - Tutorials

EMNLP 2014: Conference on Empirical Methods in Natural Language Processing — October 25–29, 2014 — Doha, Qatar.

Sentiment Analysis of Social Media Texts

Saif M. Mohammad and Xiaodan Zhu

October 25, 2014 - Morning

Abstract:

Automatically detecting the sentiment of product reviews, blogs, tweets, and SMS messages has attracted extensive interest from both academia and industry. It has a number of applications, including tracking sentiment towards products, movies, politicians, etc.; improving customer relation models; detecting happiness and well-being; and improving automatic dialogue systems. In this tutorial, we will describe how you can create a state-of-the-art sentiment analysis system, with a focus on social media posts.

We begin with an introduction to sentiment analysis and its various forms: term level, message level, document level, and aspect level. We will describe how sentiment analysis systems are evaluated, especially through recent SemEval shared tasks: Sentiment Analysis of Twitter (SemEval-2013 Task 2, SemEval-2014 Task 9) and Aspect Based Sentiment Analysis (SemEval-2014 Task 4).

We will give an overview of the best sentiment analysis systems at this point in time, including conventional statistical systems as well as those using deep learning approaches. We will describe in detail the NRC-Canada systems, which were the overall best performing systems in all three SemEval competitions listed above. These are simple systems based on lexical and sentiment-lexicon features, and they are relatively easy to re-implement.

We will discuss features that had the most impact (those derived from sentiment lexicons and negation handling). We will present how large tweet-specific sentiment lexicons can be automatically generated and evaluated. We will also show how negation impacts sentiment differently depending on whether the scope of the negation is positive or negative. Finally, we will flesh out limitations of current approaches and promising future directions.
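As a rough illustration of the two feature families named in this abstract (sentiment-lexicon scores and negation handling), the sketch below marks tokens inside a negated context and scores them against a tiny lexicon. Everything in it is an assumption made for illustration: the lexicon entries, the negator list, the `_NEG` marking convention, and the two damping factors are hypothetical and are not taken from the NRC-Canada systems.

```python
# Toy sentiment lexicon; words and scores are invented for this example.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "no", "never"}
CLAUSE_END = {".", ",", "!", "?", ";"}


def mark_negation(tokens):
    """Append _NEG to every token between a negator and the next punctuation mark."""
    marked, in_scope = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            in_scope = False          # punctuation closes the negated context
            marked.append(tok)
        elif tok in NEGATORS:
            in_scope = True           # the negator itself is left unmarked
            marked.append(tok)
        else:
            marked.append(tok + "_NEG" if in_scope else tok)
    return marked


def lexicon_score(tokens, negated_pos_factor=-0.5, negated_neg_factor=0.25):
    """Sum lexicon scores, treating negated positive and negative words differently.

    Assumed behaviour (hypothetical factors): a negated positive word flips to a
    weaker negative score, while a negated negative word stays negative but is
    dampened rather than flipped.
    """
    total = 0.0
    for tok in mark_negation(tokens):
        negated = tok.endswith("_NEG")
        word = tok[:-4] if negated else tok
        score = TOY_LEXICON.get(word, 0.0)
        if negated:
            score *= negated_pos_factor if score > 0 else negated_neg_factor
        total += score
    return total
```

For example, `lexicon_score("this is not good".split())` yields a negative score, while `lexicon_score("this is not bad".split())` yields a score that is still negative but closer to zero, which is one simple way to model the asymmetry described above.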

Instructors:

Saif M. Mohammad, Researcher, National Research Council Canada
Saif Mohammad is a Research Officer at the National Research Council Canada. His research interests are in Computational Linguistics, especially Lexical Semantics. He develops computational models for sentiment analysis, emotion detection, semantic distance, and lexical-semantic relations such as word-pair antonymy.
Xiaodan Zhu, Researcher, National Research Council Canada
Xiaodan Zhu is a Research Officer at the National Research Council Canada. His research interests are in Natural Language Processing, Spoken Language Understanding, and Machine Learning. His recent work focuses on sentiment analysis, emotion detection, speech summarization, and deep learning.
The instructors, along with Svetlana Kiritchenko, developed the NRC-Canada Sentiment Analysis System, which was the top-performing system in recent SemEval shared-task competitions (SemEval-2013 Task 2, SemEval-2014 Task 9, and SemEval-2014 Task 4).

Spectral Learning Techniques for Weighted Automata, Transducers, and Grammars

Borja Balle, Ariadna Quattoni and Xavier Carreras

October 25, 2014 - Morning

Tutorial notes

Video: part 1; part 2

Dedicated page

Abstract:

In recent years we have seen the development of efficient and provably correct algorithms for learning weighted automata and closely related function classes such as weighted transducers and weighted context-free grammars. The common denominator of all these algorithms is the so-called spectral method, which gives an efficient and robust way to estimate recursively defined functions from empirical estimates of observable statistics. These algorithms are appealing because of the existence of theoretical guarantees (e.g. they are not susceptible to local minima) and because of their efficiency. However, despite their simplicity and wide applicability to real problems, their impact in NLP applications is still moderate. One of the goals of this tutorial is to remedy this situation.

The contents that will be presented in this tutorial will offer a complementary perspective with respect to previous tutorials on spectral methods presented at ICML-2012, ICML-2013 and NAACL-2013. Rather than using the language of graphical models and signal processing, we tell the story from the perspective of formal languages and automata theory (without assuming a background in formal algebraic methods). Our presentation highlights the common intuitions lying behind different spectral algorithms by presenting them in a unified framework based on the concepts of low-rank factorizations and completions of Hankel matrices. In addition, we provide an interpretation of the method in terms of forward and backward recursions for automata and grammars. This provides extra intuitions about the method and stresses the importance of matrix factorization for learning automata and grammars. We believe that this complementary perspective might be appealing for an NLP audience and serve to put spectral learning in a wider and, perhaps for some, more familiar context. Our hope is that this will broaden the understanding of these methods by the NLP community and empower many researchers to apply these techniques to novel problems.

The content of the tutorial will be divided into four blocks of 45 minutes each, as follows. The first block will introduce the basic definitions of weighted automata and Hankel matrices, and present a key connection between the fundamental theorem of weighted automata and learning. In the second block we will discuss the case of probabilistic automata in detail, touching upon all aspects from the underlying theory to the tricks required to achieve accurate and scalable learning algorithms. The third block will present extensions to related models, including sequence tagging models, finite-state transducers and weighted context-free grammars. The last block will describe a general framework for using spectral techniques in more general situations where a matrix completion pre-processing step is required; several applications of this approach will be described.
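The connection between weighted automata and Hankel matrices mentioned above can be made concrete with a small sketch. It builds a finite block of the Hankel matrix of a function computed by a two-state weighted automaton and checks that the block's rank does not exceed the number of states. The automaton weights are arbitrary values chosen for this example, and the code only illustrates the rank property; it is not the spectral learning algorithm itself.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical 2-state weighted automaton over the alphabet {a, b}.
ALPHA = [F(1), F(0)]                      # initial weights
BETA = [F(1, 4), F(1, 2)]                 # final weights
TRANS = {                                 # one transition matrix per symbol
    "a": [[F(1, 4), F(1, 4)], [F(0), F(1, 4)]],
    "b": [[F(1, 4), F(0)], [F(1, 4), F(0)]],
}


def wfa_value(string):
    """Compute f(x) = alpha^T A_{x1} ... A_{xn} beta with exact arithmetic."""
    vec = ALPHA[:]
    for sym in string:
        m = TRANS[sym]
        vec = [sum(vec[i] * m[i][j] for i in range(2)) for j in range(2)]
    return sum(v * b for v, b in zip(vec, BETA))


def all_strings(max_len):
    """All strings over {a, b} of length at most max_len."""
    return ["".join(p) for n in range(max_len + 1) for p in product("ab", repeat=n)]


def hankel(prefixes, suffixes):
    """Hankel block: entry (p, s) holds f(p + s)."""
    return [[wfa_value(p + s) for s in suffixes] for p in prefixes]


def rank(matrix):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [r[:] for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk
```

Calling `rank(hankel(all_strings(2), all_strings(2)))` on this 7 x 7 block gives a rank no larger than 2, the number of states, because every entry factors into a prefix (forward) vector times a suffix (backward) vector.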

Instructors:

Borja Balle, postdoctoral fellow, McGill University
Borja Balle is currently a postdoctoral fellow at McGill University, and prior to that he obtained his PhD from Universitat Politecnica de Catalunya (UPC) in July 2013. His research interests lie at the intersection of automata theory and machine learning, in particular in applications of spectral learning techniques to natural language processing, grammatical inference, and reinforcement learning. He is area chair for NIPS 2014, program committee member for ICGI 2014, and has recently organized three workshops (at ICML 2013, NIPS 2013 and ICML 2014) on methods of moments and spectral learning.
Ariadna Quattoni, Researcher, Xerox Research Centre Europe (XRCE)
Ariadna Quattoni is currently a researcher at Xerox Research Centre Europe (XRCE); prior to that she was a researcher at the Universitat Politecnica de Catalunya (UPC). She obtained her PhD from MIT in 2009. Her main research focuses on latent variable models for structured prediction with applications to natural language processing and computer vision. In recent years her work has centered on spectral learning techniques for structured prediction problems with applications to sequence tagging, learning general transductions, and parsing.
Xavier Carreras, senior researcher, Xerox Research Centre Europe
Xavier Carreras's research is in natural language processing and machine learning. He is interested in grammatical induction and parsing methods for syntactic-semantic analysis and translation of natural languages. In 2005 he completed his PhD at the Universitat Politecnica de Catalunya (UPC). From 2006 to 2009 he was a postdoctoral researcher at MIT/CSAIL. From 2009 to 2014 he was a researcher at UPC, and since June 2014 he has been a senior researcher at Xerox Research Centre Europe.

Semantic Parsing with Combinatory Categorial Grammars

Yoav Artzi, Nicholas FitzGerald, and Luke Zettlemoyer

October 25, 2014 - Afternoon

Tutorial notes

Dedicated page

Abstract:

Semantic parsers map natural language sentences to formal representations of their underlying meaning. Building accurate semantic parsers without prohibitive engineering costs is a long-standing, open research problem.

The tutorial will describe general principles for building semantic parsers. The presentation will be divided into two main parts: learning and modeling. In the learning part, we will describe a unified approach for learning Combinatory Categorial Grammar (CCG) semantic parsers that induces both a CCG lexicon and the parameters of a parsing model. The approach learns from data with labeled meaning representations, as well as from more easily gathered weak supervision. It also enables grounded learning where the semantic parser is used in an interactive environment, for example to read and execute instructions. The modeling part will include best practices for grammar design and choice of semantic representation. We will motivate our use of lambda calculus as a language for building and representing meaning with examples from several domains.

The ideas we will discuss are widely applicable. The semantic modeling approach, while implemented in lambda calculus, could be applied to many other formal languages. Similarly, the algorithms for inducing CCG focus on tasks that are formalism-independent: learning the meaning of words and estimating parsing parameters. No prior knowledge of CCG is required. The tutorial will be backed by implementation and experiments in the University of Washington Semantic Parsing Framework (UW SPF, http://yoavartzi.com/spf).

Instructors:

Yoav Artzi, Ph.D. candidate, University of Washington
Yoav Artzi is a Ph.D. candidate in the Computer Science & Engineering department at the University of Washington. His research studies semantic meaning and machine learning methods for natural language understanding within situated interactive systems. He is currently supported by the 2014 Microsoft Research PhD Fellowship, and is a recipient of the 2012 Yahoo KSC award.
Nicholas FitzGerald, Ph.D. student, University of Washington
Nicholas FitzGerald is a Ph.D. student at the University of Washington. His research interests are grounded natural language understanding and generation. He is a recipient of an Intel Science and Technology Center Fellowship and an NSERC Postgraduate Scholarship.
Luke Zettlemoyer, assistant professor, University of Washington
Luke Zettlemoyer is an Assistant Professor in the Computer Science & Engineering department at the University of Washington. His research interests are in the intersections of natural language processing, machine learning and decision making under uncertainty. Honors include best paper awards at UAI 2005 and ACL 2009, selection to the DARPA CSSG, and an NSF CAREER Award.

Linear Programming Decoders in Natural Language Processing: From Integer Programming to Message Passing and Dual Decomposition

André F. T. Martins

October 25, 2014 - Afternoon

Tutorial notes

Video: part 1; part 2

Abstract:

This tutorial will cover the theory and practice of linear programming decoders. This class of decoders encompasses a variety of techniques that have enjoyed great success in devising structured models for natural language processing (NLP). Throughout the tutorial, we provide a unified view of different algorithms and modeling techniques, including belief propagation, dual decomposition, integer linear programming, Markov logic, and constrained conditional models. Various applications in NLP will serve as a motivation.

There is a long string of work using integer linear programming (ILP) formulations in NLP, for example in semantic role labeling, machine translation, summarization, dependency parsing, coreference resolution, and opinion mining, to name just a few. At the heart of these approaches is the ability to encode logic and budget constraints (common in NLP and information retrieval) as linear inequalities. Thanks to general purpose solvers (such as Gurobi, CPLEX, or GLPK), the practitioner can abstract away from the decoding algorithm and focus on developing a powerful model. A disadvantage, however, is that general solvers do not scale well to large problem instances, since they fail to exploit the structure of the problem.

This is where graphical models come into play. In this tutorial, we show that most logic and budget constraints that arise in NLP can be cast in this framework. This opens the door for the use of message-passing algorithms, such as belief propagation and variants thereof. An alternative is the family of algorithms based on dual decomposition, such as the subgradient method or AD3. These algorithms have achieved great success in a variety of applications, such as parsing, corpus-wide tagging, machine translation, summarization, joint coreference resolution and quotation attribution, and semantic role labeling. Interestingly, most decoders used in these works can be regarded as structure-aware solvers for addressing relaxations of integer linear programs. All these algorithms have a similar consensus-based architecture: they repeatedly perform certain "local" operations in the graph, until some form of local agreement is achieved. The local operations are performed at each factor, and they range from computing marginals, max-marginals, or an optimal configuration to solving a small quadratic problem, all of which are commonly tractable and efficient in a wide range of problems.

As a companion of this tutorial, we provide an open-source implementation of some of the algorithms described above, available at http://www.ark.cs.cmu.edu/AD3.
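To illustrate what the abstract means by encoding logic and budget constraints as linear inequalities, here is a minimal sketch on a toy extractive summarization problem. It is not related to the AD3 implementation mentioned above. The scores, lengths, budget, and the two logical constraints are invented for the example, and the exhaustive enumeration merely stands in for a real integer linear programming solver, which would be needed for anything beyond a handful of variables.

```python
from itertools import product

# Hypothetical problem data: one binary variable x[i] per candidate sentence.
SCORES = [3.0, 2.0, 4.0, 2.5]     # relevance score of each sentence
LENGTHS = [10, 8, 12, 7]          # length of each sentence in words
BUDGET = 25                       # maximum summary length in words


def feasible(x):
    """Check the constraints, each written as a linear inequality over 0/1 variables."""
    budget_ok = sum(l * xi for l, xi in zip(LENGTHS, x)) <= BUDGET
    implication_ok = x[2] <= x[0]        # logic: selecting sentence 2 requires sentence 0
    exclusion_ok = x[1] + x[3] <= 1      # logic: sentences 1 and 3 are mutually exclusive
    return budget_ok and implication_ok and exclusion_ok


def objective(x):
    """Linear objective: total score of the selected sentences."""
    return sum(s * xi for s, xi in zip(SCORES, x))


def solve():
    """Brute-force search over all 0/1 assignments (stand-in for an ILP solver)."""
    candidates = (x for x in product((0, 1), repeat=len(SCORES)) if feasible(x))
    best = max(candidates, key=objective)
    return best, objective(best)
```

Here `solve()` selects sentences 0 and 2: together they fit the 25-word budget and satisfy the implication, whereas adding either remaining sentence would exceed the budget.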

Instructors:

André F. T. Martins, research scientist, Instituto de Telecomunicações, Instituto Superior Técnico, and Priberam Informática
A. Martins is a research scientist at Priberam Labs. He received his dual-degree PhD in Language Technologies in 2012 from Carnegie Mellon University and Instituto Superior Técnico. His PhD dissertation was awarded Honorable Mention in CMU’s SCS Dissertation Award competition. Martins' research interests include natural language processing, machine learning, structured prediction, sparse modeling, and optimization. His paper "Concise Integer Linear Programming Formulations for Dependency Parsing" received a best paper award at ACL 2009.

Syntax-Based Statistical Machine Translation

Philip Williams and Philipp Koehn

October 29, 2014 - Morning

Tutorial notes

Abstract:

The tutorial explains in detail syntax-based statistical machine translation with synchronous context-free grammars (SCFG). It is aimed at researchers who have little background in this area, and gives a comprehensive overview of the main models and methods.

While syntax-based models in statistical machine translation have a long history, going back almost 20 years, they have only recently shown superior translation quality over the more commonly used phrase-based models, and are now considered state of the art for some language pairs, such as Chinese-English (since ISI's submission to NIST 2006) and English-German (since Edinburgh's submission to WMT 2012).

While the field is very dynamic, there is a core set of methods that have become dominant. Such SCFG models are implemented in the open source machine translation toolkit Moses, and the tutors draw from the practical experience of its development.

The tutorial focuses on explaining core established concepts in SCFG-based approaches, which are the most popular in this area. The main goal of the tutorial is for the audience to understand how these systems work end-to-end. We review as much relevant literature as necessary, but the tutorial is not primarily a research survey.

The tutorial is rounded off with open problems and advanced topics, such as computational challenges, different formalisms for syntax-based models, and the inclusion of semantics.

Instructors:

Philip Williams, University of Edinburgh
Philip Williams completed his Ph.D. at the University of Edinburgh in 2014, with a thesis on unification-based formalisms applied to syntax-based statistical machine translation. He has recently been the main contributor to the syntax-based MT model in the Moses decoder.
Philipp Koehn, Johns Hopkins University/University of Edinburgh
Philipp Koehn has worked on statistical machine translation since 1997, is leading the development of the open source machine translation toolkit Moses, and wrote the textbook for the field (published in 2010). He has worked on many aspects of machine translation, including syntax-based approaches.

Embedding Methods for Natural Language Processing

Antoine Bordes and Jason Weston

October 29, 2014 - Morning

Tutorial notes

Abstract:

Embedding-based models are popular tools in Natural Language Processing these days. In this tutorial, our goal is to provide an overview of the main advances in this domain. These methods learn latent representations of words, as well as of database entries, that can then be used for semantic search, automatic knowledge base construction, natural language understanding, etc. Our current plan is to split the tutorial into two sessions of 90 minutes, with a 30-minute coffee break in the middle, so that we can cover the basics of learning embeddings in the first session and advanced models in the second. The two parts are detailed below.

Part 1: Unsupervised and Supervised Embeddings

We introduce models that embed tokens (words, database entries) by representing them as low dimensional embedding vectors. Unsupervised and supervised methods will be discussed, including SVD, Word2Vec, Paragraph Vectors, SSI, Wsabie and others. A comparison between methods will be made in terms of applicability, type of loss function (ranking loss, reconstruction loss, classification loss), regularization, etc. The use of these models in several NLP tasks will be discussed, including question answering, frame identification, knowledge extraction and document retrieval.

Part 2: Embeddings for Multi-relational Data

This second part will focus mostly on the construction of embeddings for multi-relational data, that is, data in which tokens can be interconnected in different ways, as in knowledge bases. Several methods based on tensor factorization, collective matrix factorization, stochastic block models or energy-based learning will be presented. The task of link prediction in a knowledge base will be used as an application example. Multiple empirical results on the use of embedding models to align textual information to knowledge bases will also be presented, together with some demos if time permits.

Instructors:

Antoine Bordes, Facebook
Jason Weston, Facebook

Natural Language Processing of Arabic and its Dialects

Mona Diab and Nizar Habash

October 29, 2014 - Afternoon

Tutorial notes

watch video

Abstract:

This tutorial introduces the different challenges and current solutions to the automatic processing of Arabic and its dialects. The tutorial has two parts. In the first part, we present a discussion of generic issues relevant to Arabic NLP and detail dialectal linguistic issues and the challenges they pose for NLP. In the second part, we review the state of the art in Arabic processing, covering several enabling technologies and applications, e.g., dialect identification, morphological processing (analysis, disambiguation, tokenization, POS tagging), parsing, and machine translation.

Instructors:

Mona Diab, Associate Professor, George Washington University
Nizar Habash, New York University Abu Dhabi

Text Quantification

Fabrizio Sebastiani

October 29, 2014 - Afternoon

Tutorial notes

watch video

Abstract:

In recent years it has been pointed out that, in a number of applications involving (text) classification, the final goal is not determining which class (or classes) individual unlabelled data items belong to, but determining the prevalence (or “relative frequency”) of each class in the unlabelled data. The latter task is known as quantification. Assume a market research agency runs a poll in which they ask the question “What do you think of the recent ad campaign for product X?” Once the poll is complete, they may want to classify the resulting textual answers according to whether or not they belong to the class LovedTheCampaign. The agency is likely not interested in whether a specific individual belongs to the class LovedTheCampaign, but in knowing how many respondents belong to it, i.e., in knowing the prevalence of the class. In other words, the agency is interested not in classification, but in quantification. Essentially, quantification is classification tackled at the aggregate (rather than at the individual) level. The research community has recently shown a growing interest in tackling quantification as a task in its own right. One of the reasons is that, since the goal of quantification differs from that of classification, quantification requires evaluation measures different from those used for classification. A second, related reason is that using a method optimized for classification accuracy is suboptimal when quantification accuracy is the real goal. A third reason is the growing awareness that quantification is going to be more and more important; with the advent of big data, more and more application contexts are going to spring up in which we will simply be happy with analyzing data at the aggregate (rather than at the individual) level.
The goal of this tutorial is to introduce the audience to the problem of quantification, to the techniques that have been proposed for solving it, to the metrics used to evaluate them, and to the problems that are still open in the area.
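To make the difference between classification and quantification concrete, the sketch below contrasts two simple prevalence estimators: counting the classifier's positive predictions directly, and correcting that count with the classifier's true-positive and false-positive rates. These are standard baselines from the quantification literature rather than methods confirmed by this abstract, and the rates and predictions in the usage note are invented numbers assumed to come from held-out data.

```python
def classify_and_count(predictions):
    """Estimate prevalence as the fraction of items predicted positive (1)."""
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """Correct the raw count using the classifier's error rates.

    The observed positive rate satisfies
        observed = tpr * p + fpr * (1 - p),
    where p is the true prevalence, so p = (observed - fpr) / (tpr - fpr).
    tpr and fpr are assumed to have been estimated on held-out labelled data.
    """
    observed = classify_and_count(predictions)
    if tpr == fpr:
        return observed               # correction undefined; fall back to the raw count
    estimate = (observed - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))   # clip to a valid proportion
```

Suppose 300 poll answers contain 100 true members of LovedTheCampaign and the classifier has a true-positive rate of 0.8 and a false-positive rate of 0.2. It then predicts 120 positives, so `classify_and_count` reports 0.4, while `adjusted_classify_and_count(predictions, 0.8, 0.2)` recovers a value close to the true prevalence of one third.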

Instructor:

Fabrizio Sebastiani, Principal Scientist, Qatar Computing Research Institute
Fabrizio Sebastiani is a Principal Scientist at QCRI; he was (until June 2014) a Senior Researcher at the Italian National Council of Research (from which he is currently on leave), and (until February 2006) an Associate Professor at the Department of Pure and Applied Mathematics of the University of Padova, Italy. His main current research interests are at the intersection of information retrieval, machine learning, and human language technologies, with particular emphasis on text classification, information extraction, opinion mining, and their applications.

