Morphological Segmentation Inside-Out

Ryan Cotterell¹, Arun Kumar², Hinrich Schütze³
¹Johns Hopkins University, ²Universitat Oberta Catalonia, UPC, Barcelona, ³Center for Information and Language Processing, University of Munich


Abstract

Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output. In many cases, however, proper morphological analysis requires hierarchical structure, especially in the case of derivational morphology. In this work, we introduce a discriminative joint model of morphological segmentation along with the orthographic changes that occur during word formation. To the best of our knowledge, this is the first attempt to approach discriminative segmentation with a context-free model. Additionally, we release an annotated treebank of 7454 English words with constituency parses, encouraging future research in this area.