Discriminative Training of 150 Million Translation Parameters and Its Application to Pruning
Hendra Setiawan and Bowen Zhou
Until recently, the application of discriminative training to log-linear-based statistical machine translation has been limited to tuning the weights of a handful of features or to training features with a limited number of parameters. In this paper, we propose to scale up discriminative training to train features with 150 million parameters, one order of magnitude more than previously published systems, and to apply discriminative training to redistribute the probability mass that is lost due to model pruning. Experimental results on the NIST MT06 test set confirm the effectiveness of our proposals over a strong hierarchical phrase-based baseline.
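To make the pruning setting concrete, the sketch below shows one simple, generic way probability mass lost to pruning can be redistributed over the surviving translation options by proportional renormalization. This is only an illustrative baseline under assumed data structures (a per-source dictionary of translation probabilities and a hypothetical `prune_and_renormalize` helper); the paper itself learns the redistribution discriminatively rather than renormalizing.

```python
# Minimal sketch (not the authors' discriminative method): after pruning
# low-probability translation options, redistribute the removed mass over
# the surviving options by renormalization. The table layout and names
# are illustrative assumptions, not taken from the paper.

def prune_and_renormalize(options, threshold=0.05):
    """options: dict mapping a target phrase to its translation probability."""
    kept = {t: p for t, p in options.items() if p >= threshold}
    if not kept:
        # If everything falls below the threshold, keep the single best option.
        best = max(options, key=options.get)
        kept = {best: options[best]}
    total = sum(kept.values())  # mass remaining after pruning (< 1 if anything was dropped)
    # Scale surviving probabilities so they sum to 1 again,
    # i.e. the pruned mass is redistributed proportionally.
    return {t: p / total for t, p in kept.items()}

if __name__ == "__main__":
    table = {"the house": 0.60, "a house": 0.30, "the home": 0.07, "house of": 0.03}
    print(prune_and_renormalize(table))
```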