Abstract: A Beam-Search Decoder for Normalization of Social Media Text with Application to Machine Translation
Pidong Wang and Hwee Tou Ng
Social media texts are written in an informal style, which hinders
other natural language processing (NLP) applications such as machine
translation. Text normalization is thus important for processing
social media text. Previous work mostly focused on normalizing words
by replacing an informal word with its formal form. In this paper, we
argue that, to further improve downstream NLP applications, other
normalization operations should also be performed, e.g., missing word
recovery and punctuation correction. A novel beam-search decoder is
proposed to effectively integrate various normalization
operations. Empirical results show that our system obtains
statistically significant improvements over two strong baselines in
both normalization and translation tasks, for both Chinese and
English.