Supersense Tagging for Arabic: the MT-in-the-Middle Attack
Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer and Noah A. Smith
We consider the task of tagging Arabic nouns with WordNet supersenses. Three
approaches are evaluated. The first uses an expert-crafted but limited-coverage
lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence
modeling. The third and most successful approach machine-translates the Arabic
text into English, tags the English output automatically with supersenses, and
projects the resulting tags back onto the Arabic. Analysis
shows gains and remaining obstacles in four Wikipedia topical domains.
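
The projection step of the third approach can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the translation system exposes word alignments as (Arabic index, English index) pairs and that the English tagger returns one supersense label (or None) per English token. The function name, the toy sentence, and the majority-vote rule for Arabic tokens aligned to several English tokens are all hypothetical choices made for illustration.

```python
from collections import Counter


def project_supersenses(num_arabic_tokens, english_tags, alignment):
    """Project English supersense tags onto Arabic tokens via word alignment.

    num_arabic_tokens: length of the Arabic sentence.
    english_tags: one tag string (or None for untagged) per English token.
    alignment: iterable of (arabic_index, english_index) pairs (assumed format).
    Returns a list with one tag (or None) per Arabic token.
    """
    # Collect the tags that reach each Arabic token through the alignment.
    votes = [Counter() for _ in range(num_arabic_tokens)]
    for ar_i, en_j in alignment:
        tag = english_tags[en_j]
        if tag is not None:
            votes[ar_i][tag] += 1
    # Hypothetical tie-breaking rule: keep the most frequent projected tag.
    return [v.most_common(1)[0][0] if v else None for v in votes]


# Toy example: English output "The president visited Paris" for a
# three-token Arabic sentence (placeholder tokens ar0, ar1, ar2).
english_tags = [None, "noun.person", None, "noun.location"]
alignment = [(0, 2), (1, 0), (1, 1), (2, 3)]
projected = project_supersenses(3, english_tags, alignment)
print(projected)
```

Arabic tokens that are unaligned, or aligned only to untagged English tokens, receive no tag in this sketch; a real system would need some policy for such cases.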