A Neural Network Architecture for Multilingual Punctuation Generation

Miguel Ballesteros¹ and Leo Wanner²
¹Pompeu Fabra University, ²ICREA and Pompeu Fabra University


Abstract

Even syntactically correct sentences are perceived as awkward if they are not punctuated correctly. Still, the problem of automatically generating punctuation marks has long been largely neglected. We present a novel model that introduces punctuation marks into raw text material using a transition-based stack LSTM model. Unlike state-of-the-art approaches, our model is language-independent and neutral with respect to the intended use of the punctuation. Multilingual experiments show that it achieves high accuracy on the full range of punctuation marks across languages.