Using Supertags and Encoded Annotation Principles for Improved Dependency to Phrase Structure Conversion

Seth Kulick, Ann Bies, Justin Mott
Linguistic Data Consortium, University of Pennsylvania


Abstract

We investigate the problem of automatically converting a dependency representation into a phrase structure representation, a key aspect of understanding the relationship between these two representations for NLP work. We implement a new approach to this problem, based on a small number of supertags along with an encoding of some of the underlying principles of the Penn Treebank guidelines. The resulting system significantly outperforms previous work on such automatic conversion. We also achieve results comparable to those of a system that uses a phrase-structure parser for the conversion. Comparing that system with ours, using either part-of-speech tags or supertags, gives some indication of what the parser contributes.