Unsupervised Word Alignment by Agreement Under ITG Constraint

Hidetaka Kamigaito¹, Akihiro Tamura², Hiroya Takamura¹, Manabu Okumura¹, Eiichiro Sumita³
¹Tokyo Institute of Technology, ²National Institute of Information and Communications Technology, ³National Institute of Information and Communications Technology

Abstract

We propose a novel unsupervised word alignment method that uses a constraint based on Inversion Transduction Grammar (ITG) parse trees to jointly unify two directional models. Previous agreement methods are of little help in finding long-distance alignments because they do not use any syntactic structure. In contrast, the proposed method symmetrizes alignments while taking their structural coherence into account, applying the ITG constraint softly within the posterior regularization framework (Ganchev et al., 2010). Because the constraint is soft, it remains compatible with word alignments that are not covered by ITG parse trees. Hence, the proposed method is more robust to ITG parse errors than other alignment methods that directly use an ITG model. Experimental results show that, compared to the baseline agreement method (Ganchev et al., 2010), the proposed method significantly improves alignment performance on the Japanese-English KFTT and BTEC corpora and machine translation performance on the Japanese-English IWSLT 2007 corpus.
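
To make concrete what it means for an alignment to be "covered by ITG parse trees", the following minimal sketch checks whether a one-to-one alignment, written as a permutation of target positions, can be derived using only the straight and inverted rules of a binary ITG. It is an illustration only and is not the proposed method: it applies the constraint as a hard test, whereas the abstract describes a soft constraint inside posterior regularization, which is not implemented here. The function name `is_itg_compatible` and the permutation encoding are assumptions made for this example.

```python
def is_itg_compatible(perm):
    """Return True if a one-to-one alignment can be built by a binary ITG.

    perm[i] is the target position aligned to source position i
    (hypothetical encoding; perm must be a permutation of 0..n-1).
    """
    stack = []  # each entry is a contiguous (low, high) range of target positions
    for t in perm:
        lo, hi = t, t
        # Greedily merge with the top of the stack while the ranges are adjacent.
        while stack:
            prev_lo, prev_hi = stack[-1]
            if prev_hi + 1 == lo:      # straight rule: same order on both sides
                lo = prev_lo
            elif hi + 1 == prev_lo:    # inverted rule: order is swapped
                hi = prev_hi
            else:
                break
            stack.pop()
        stack.append((lo, hi))
    # The alignment is ITG-compatible iff everything reduced to one span.
    return len(stack) == 1


if __name__ == "__main__":
    print(is_itg_compatible([1, 0, 3, 2]))  # two local swaps
    print(is_itg_compatible([1, 3, 0, 2]))  # the "inside-out" pattern 2-4-1-3
```

The second call uses one of the two classic four-word reorderings (2-4-1-3 and 3-1-4-2) that no binary ITG derivation can produce. Alignments of this kind are what a hard ITG model would reject outright and what a soft constraint can still admit.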