Simultaneous Word-Morpheme Alignment for Statistical Machine Translation
Elif Eyigoz, Daniel Gildea and Kemal Oflazer
Current word alignment models for statistical machine
translation do not address morphology beyond
merely splitting words. We present a two-level
alignment model that distinguishes between
words and morphemes, in which we embed an
IBM Model 1 inside an HMM-based word alignment
model. The model jointly induces word and
morpheme alignments using an EM algorithm. We
evaluated our model on Turkish-English parallel
data. We obtained a significant improvement in BLEU
scores over IBM Model 4. Our results indicate that
utilizing information from morphology improves the
quality of word alignments.
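
As a rough illustration of the inner component named above, the following is a minimal sketch of plain IBM Model 1 trained with EM. It is not the authors' two-level model: there is no HMM layer, no NULL token, and no joint word-morpheme structure. The function name `ibm_model1`, the toy data, and the morpheme segmentation shown are assumptions made only for this example.

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """Estimate t(f | e) with EM. `pairs` is a list of (source_tokens, target_tokens)."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    # Uniform initialization of the translation probabilities t(f | e).
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # expected counts c(e)
        for fs, es in pairs:
            for f in fs:
                # E-step: posterior over which target token generated f.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize the expected counts per target token.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy, hand-segmented data (hypothetical): Turkish "ev" / "ev+ler" vs. English "house" / "house+s".
pairs = [
    (["ev"], ["house"]),
    (["ev", "ler"], ["house", "s"]),
]
t = ibm_model1(pairs)
```

On this toy input, EM shifts probability mass so that the stem pairs with the stem and the plural suffix with the plural suffix, which is the intuition behind aligning at the morpheme level. In the model described in the abstract, a component of this kind sits inside an HMM-based word alignment model rather than running on its own.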