Tutorial: Recent advances in dependency parsing

Sunday 27 April, morning

Presenters: Ryan McDonald, Joakim Nivre

Tutorial contents

Syntactic parsing is a fundamental problem in natural language processing that has been tackled using a wide variety of approaches. In recent years, there has been a surge of interest in parsers that make use of dependency structures, which offer a simple and transparent encoding of predicate-argument structure and can be derived accurately and efficiently using parsers trained on annotated corpora. Thanks to their simplicity, transparency, and efficiency, dependency parsers are in widespread use for applications such as information extraction, question answering, machine translation, language modeling, semantic role labeling, and textual entailment.

This tutorial will focus on advances in dependency parsing that are not covered in textbooks or previous tutorials, which means roughly work from 2008 onwards. However, in order to make the material accessible to participants without a background in dependency parsing, we will spend roughly the first quarter of the tutorial going over basic concepts and techniques in the field, including the theoretical foundations of dependency grammar and basic definitions of representations, tasks, and evaluation metrics. After reviewing the basic concepts, we will introduce the two dominant paradigms in early work on data-driven dependency parsing -- global, exhaustive, graph-based parsing and local, greedy, transition-based parsing -- and review the contrastive error analysis presented in McDonald and Nivre (2007), which highlighted the strengths and weaknesses of the two models and posed the challenge of improving both graph-based and transition-based methods. This provides a basis for understanding many of the later developments covered in the tutorial. The rest of the tutorial will be divided into two main parts, covering advances in graph-based parsing and related approaches, on the one hand, and advances in transition-based parsing, on the other. We will finish with a synthesizing conclusion and an outlook for the future.

Research on graph-based dependency parsing in recent years has to a large extent been driven by the desire to make efficient use of higher-order models, thereby overcoming the limitations of the strictly local feature representations found in early models. As a consequence, there have been developments towards specialized exact inference methods as well as approximate inference methods, the latter especially for non-projective parsing. In addition, there has been work on finding exact dynamic programming solutions for restricted subsets of non-projective structures, often referred to as mildly non-projective dependency trees.
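To make the graph-based starting point concrete, the following is a minimal sketch of first-order (arc-factored) projective parsing with Eisner's dynamic programming algorithm, the exact-inference baseline that the higher-order work extends. The score matrix here is arbitrary and the function name eisner is our own; in practice a trained model would supply the arc scores.

    import numpy as np

    def eisner(scores):
        """First-order (arc-factored) projective parsing, Eisner (1996).
        scores[h, m] is the score of the arc h -> m; token 0 is the
        artificial root. Returns heads[m] for every token (heads[0] = -1).
        """
        N = scores.shape[0]
        # complete/incomplete[s, t, d]: best score of a span from s to t;
        # d = 0 means the span is headed at t, d = 1 means headed at s.
        complete = np.full((N, N, 2), -np.inf)
        incomplete = np.full((N, N, 2), -np.inf)
        cbp = np.zeros((N, N, 2), dtype=int)   # backpointers
        ibp = np.zeros((N, N, 2), dtype=int)
        for i in range(N):
            complete[i, i, :] = 0.0
        for k in range(1, N):                  # span length
            for s in range(N - k):
                t = s + k
                # Incomplete spans: add an arc between the endpoints s and t.
                base = complete[s, s:t, 1] + complete[s + 1:t + 1, t, 0]
                r = int(np.argmax(base))
                incomplete[s, t, 0] = base[r] + scores[t, s]   # arc t -> s
                incomplete[s, t, 1] = base[r] + scores[s, t]   # arc s -> t
                ibp[s, t, :] = s + r
                # Complete spans: combine an incomplete span with a complete one.
                left = complete[s, s:t, 0] + incomplete[s:t, t, 0]
                r = int(np.argmax(left))
                complete[s, t, 0], cbp[s, t, 0] = left[r], s + r
                right = incomplete[s, s + 1:t + 1, 1] + complete[s + 1:t + 1, t, 1]
                r = int(np.argmax(right))
                complete[s, t, 1], cbp[s, t, 1] = right[r], s + 1 + r
        heads = [-1] * N
        def backtrack(s, t, d, is_complete):
            if s == t:
                return
            if is_complete:
                r = cbp[s, t, d]
                if d == 0:
                    backtrack(s, r, 0, True); backtrack(r, t, 0, False)
                else:
                    backtrack(s, r, 1, False); backtrack(r, t, 1, True)
            else:
                heads[s if d == 0 else t] = t if d == 0 else s
                r = ibp[s, t, d]
                backtrack(s, r, 1, True); backtrack(r + 1, t, 0, True)
        backtrack(0, N - 1, 1, True)   # goal: complete span over the sentence, headed at the root
        return heads

    # Example: random arc scores for one root plus four tokens.
    print(eisner(np.random.default_rng(0).normal(size=(5, 5))))

Higher-order models replace the single-arc scores with scores over larger substructures such as sibling or grandparent pairs, which enlarges the chart and motivates the specialized exact and approximate inference methods discussed above.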

Recent work on transition-based dependency parsing has focused on two lines of research, often in combination. The first line has been concerned with improving search through the use of beam search, dynamic programming, and easy-first inference, thereby overcoming the limitations of greedy left-to-right search. The second line has been concerned with improving learning by moving to global structured learning and/or imitation learning with exploration, thereby countering the negative effects of local classifier learning. In addition, there has been work on joint morphological and syntactic analysis.
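On the transition-based side, the following is a minimal sketch of greedy arc-standard parsing. The score_fn argument is a stand-in we introduce for illustration: it plays the role of a trained classifier choosing among the legal transitions in each configuration. Beam search would instead keep the k highest-scoring configurations at each step, and imitation learning with exploration would train the classifier on configurations the parser itself reaches rather than only on gold-standard ones.

    def arc_standard(n_tokens, score_fn):
        """Greedy arc-standard transition-based parsing.
        Token 0 is the artificial root; score_fn(stack, buf, heads, legal)
        stands in for a trained classifier that picks a legal transition.
        """
        stack, buf = [0], list(range(1, n_tokens))
        heads = [-1] * n_tokens
        while buf or len(stack) > 1:
            # Determine which transitions are legal in this configuration.
            legal = []
            if buf:
                legal.append("SHIFT")
            if len(stack) >= 2:
                legal.append("RIGHT-ARC")       # arc stack[-2] -> stack[-1]
                if stack[-2] != 0:              # the root never gets a head
                    legal.append("LEFT-ARC")    # arc stack[-2] <- stack[-1]
            action = score_fn(stack, buf, heads, legal)
            if action == "SHIFT":
                stack.append(buf.pop(0))
            elif action == "LEFT-ARC":
                dep = stack.pop(-2)
                heads[dep] = stack[-1]
            else:                               # RIGHT-ARC
                dep = stack.pop()
                heads[dep] = stack[-1]
        return heads

    # Example with a trivial stand-in policy: always take the first legal action.
    print(arc_standard(5, lambda stack, buf, heads, legal: legal[0]))

Each transition takes constant time, so a greedy parse is linear in sentence length, which is what makes the learning signal, rather than the search, the usual bottleneck for these parsers.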

Biography

Ryan McDonald is a Research Scientist at Google Inc. His research interests include parsing techniques and complexity for various linguistic formalisms, information extraction, machine learning for structured and large-scale natural language processing, domain adaptation and cross-lingual projection of linguistic structure, and universal syntactic annotation. He is one of the leading experts on dependency parsing and has done ground-breaking work on graph-based dependency parsing, non-projective parsing techniques, and cross-lingual transfer parsing. He is also a co-author of the standard textbook on dependency parsing.

Joakim Nivre is Professor of Computational Linguistics at Uppsala University. His research interests include robust and efficient techniques for syntactic parsing, search-based learning for natural language processing, joint morphological and syntactic processing, and universal syntactic annotation. He is one of the leading experts on dependency parsing and has done ground-breaking work on transition-based dependency parsing, non-projective parsing techniques, and joint morphological and syntactic analysis. He is also a co-author of the standard textbook on dependency parsing.