Neural Morphological Analysis: Encoding-Decoding Canonical Segments

Katharina Kann1, Ryan Cotterell2, Hinrich Schütze3
1LMU Munich, 2Johns Hopkins University, 3Center for Information and Language Processing, University of Munich


Abstract

Canonical morphological segmentation aims to divide words into a sequence of standardized segments. In this work, we propose a character-based neural encoder-decoder model for this task. Additionally, we extend our model to include morpheme-level and lexical information through a neural reranker. We set a new state of the art for the task, improving on previous results by up to 21% in accuracy. Our experiments cover three languages: English, German and Indonesian.
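
To make the task format concrete, the following minimal sketch shows how a word and its canonical segmentation might be turned into a character-level source/target pair for an encoder-decoder. The example word, the `+` boundary symbol, and the helper names `to_char_pair` and `from_chars` are illustrative assumptions, not details taken from the model described here.

```python
from typing import List, Tuple

# Hypothetical boundary symbol placed between canonical segments.
SEP = "+"


def to_char_pair(word: str, segments: List[str]) -> Tuple[List[str], List[str]]:
    """Build character sequences for one training example.

    The source is the surface word; the target is the canonical
    segmentation with SEP marking segment boundaries.
    """
    source = list(word)
    target = list(SEP.join(segments))
    return source, target


def from_chars(target: List[str]) -> List[str]:
    """Recover the list of canonical segments from a decoded character sequence."""
    return "".join(target).split(SEP)


# Illustrative example: the canonical segments restore characters
# ("achieve", "able") that are altered in the surface form.
src, tgt = to_char_pair("achievability", ["achieve", "able", "ity"])
print("".join(src), "->", "".join(tgt))
```

Because canonical segments can differ from the surface string, the target sequence is not simply the input with boundaries inserted, which is why a sequence-to-sequence formulation is a natural fit.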