Adaptation of Reordering Models for Statistical Machine Translation
Boxing Chen, George Foster and Roland Kuhn
Previous research on domain adaptation (DA) for statistical machine translation
(SMT) has mainly focused on the translation model (TM) and the language model
(LM). To the best of our knowledge, there is no previous work on reordering
model (RM) adaptation for phrase-based SMT. In this paper, we demonstrate that
mixture model adaptation of a lexicalized RM can significantly improve SMT
performance, even when the system already contains a domain-adapted TM and LM.
We find that, surprisingly, different training corpora can vary widely in
their reordering characteristics for particular phrase pairs. Furthermore,
particular training corpora may be highly suitable for training the TM or the
LM, but unsuitable for training the RM, or vice versa, so mixture weights for
these models should be estimated separately. An additional contribution of the
paper is to propose two improvements to mixture model adaptation: smoothing the
in-domain sample, and weighting instances by document frequency. Applied to
mixture RMs in our experiments, these techniques (especially smoothing) yield
significant performance improvements.
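
As a rough illustration of what mixture model adaptation of a lexicalized RM might look like, the sketch below linearly interpolates per-corpus orientation distributions and fits the mixture weights to an in-domain sample with EM. This is a minimal sketch under stated assumptions, not the procedure from the paper: the abstract does not say how the weights are estimated, so the use of EM, the function names, the probability floor, and the toy orientation labels are all hypothetical. The two proposed improvements (smoothing the in-domain sample and weighting instances by document frequency) are not modeled here.

```python
from typing import Dict, List, Tuple

# An event is a (phrase pair, orientation) tuple, e.g. ("a ||| b", "mono").
# The orientation labels are illustrative assumptions.
Event = Tuple[str, str]


def em_mixture_weights(
    components: List[Dict[Event, float]],
    dev_counts: Dict[Event, float],
    iterations: int = 50,
    floor: float = 1e-9,
) -> List[float]:
    """Estimate linear mixture weights that maximize the likelihood of
    the in-domain event counts under the interpolated component models."""
    k = len(components)
    weights = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(iterations):
        expected = [0.0] * k
        for event, count in dev_counts.items():
            # E-step: posterior responsibility of each component for this event.
            # Unseen events get a small floor to avoid division by zero.
            joint = [
                w * max(model.get(event, 0.0), floor)
                for w, model in zip(weights, components)
            ]
            total = sum(joint)
            for i, value in enumerate(joint):
                expected[i] += count * value / total
        # M-step: renormalize expected counts into new weights.
        norm = sum(expected)
        weights = [e / norm for e in expected]
    return weights


def mixture_prob(
    components: List[Dict[Event, float]],
    weights: List[float],
    event: Event,
) -> float:
    """Interpolated orientation probability for one event."""
    return sum(w * model.get(event, 0.0) for w, model in zip(weights, components))


if __name__ == "__main__":
    # Two toy corpora with opposite reordering behavior for one phrase pair.
    corpus_a = {("a ||| b", "mono"): 0.9, ("a ||| b", "swap"): 0.1}
    corpus_b = {("a ||| b", "mono"): 0.2, ("a ||| b", "swap"): 0.8}
    in_domain = {("a ||| b", "mono"): 9.0, ("a ||| b", "swap"): 1.0}
    w = em_mixture_weights([corpus_a, corpus_b], in_domain)
    print(w, mixture_prob([corpus_a, corpus_b], w, ("a ||| b", "mono")))
```

Because the weights are fitted to reordering events only, running the same routine separately on TM or LM statistics would generally give different weights, which is consistent with the abstract's point that mixture weights for these models should be estimated separately.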