Minimally Supervised Method for Multilingual Paraphrase Extraction from Definition Sentences on the Web
Yulan Yan, Chikara Hashimoto, Kentaro Torisawa, Takao Kawai, Jun’ichi Kazama and Stijn De Saeger
We propose a minimally supervised method for multilingual paraphrase extraction
from definition sentences on the Web. Hashimoto et al. (2011) extracted
paraphrases from Japanese definition sentences on the Web, assuming that
definition sentences defining the same concept tend to contain paraphrases.
However, their method requires manually annotated data and is
language-dependent. We extend their framework and develop a minimally supervised method
applicable to multiple languages. Our experiments show that our method is
comparable to Hashimoto et al.’s for Japanese and outperforms previous
unsupervised methods for English, Japanese, and Chinese. It extracts 10,000
paraphrases with 92% precision for English, 82.5% precision for Japanese, and
82% precision for Chinese.