Bilingually-constrained Synthetic Data for Implicit Discourse Relation Recognition

Changxing Wu1, Xiaodong Shi1, Yidong Chen2, Yanzhou Huang1, Jinsong Su1
1Xiamen University, 2Department of Cognitive Science, School of Information Science and Technology, Xiamen University


Abstract

To alleviate the shortage of labeled data, we propose to use bilingually-constrained synthetic implicit data for implicit discourse relation recognition. These data are extracted from a sentence-aligned bilingual corpus by exploiting implicit/explicit mismatches between the two languages, that is, cases where a discourse relation is signaled by an explicit connective in one language but left implicit in the other. Incorporating these data via a multi-task neural network model achieves significant improvements over baselines on both the English PDTB and Chinese CDTB data sets.
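As a rough illustration of the extraction idea only, the sketch below filters sentence-aligned Chinese-English pairs for cases where the English side contains a connective and the Chinese side does not, then removes the connective to form a synthetic implicit instance. The connective lexicons, the coarse relation labels attached to them, the string-matching heuristics, and the function names are all assumptions made for this example; the abstract does not specify the actual procedure.

```python
# Toy lexicons (hypothetical, for illustration only; not taken from the paper).
EN_CONNECTIVES = {
    "but": "Comparison",
    "because": "Contingency",
    "so": "Contingency",
    "then": "Temporal",
}
ZH_CONNECTIVES = {"但是", "因为", "所以", "然后"}


def find_en_connective(tokens):
    """Return the index of the first English connective token, or None."""
    for i, tok in enumerate(tokens):
        if tok.lower() in EN_CONNECTIVES:
            return i
    return None


def has_zh_connective(sentence):
    """Crude substring check for an explicit Chinese connective."""
    return any(conn in sentence for conn in ZH_CONNECTIVES)


def extract_synthetic(pairs):
    """Keep pairs that are explicit in English but implicit in Chinese.

    Each input pair is (chinese_sentence, tokenized_english_sentence).
    Each output item is (arg1, arg2, relation), with the English
    connective removed so the instance looks implicit.
    """
    instances = []
    for zh, en in pairs:
        tokens = en.split()
        idx = find_en_connective(tokens)
        # Require an explicit English connective and no Chinese connective.
        if idx is None or has_zh_connective(zh):
            continue
        # The connective must sit between two non-empty arguments.
        if idx == 0 or idx == len(tokens) - 1:
            continue
        arg1 = " ".join(tokens[:idx]).rstrip(" ,")
        arg2 = " ".join(tokens[idx + 1:])
        instances.append((arg1, arg2, EN_CONNECTIVES[tokens[idx].lower()]))
    return instances


sample_pairs = [
    ("他病了，没来上课。", "He was ill , so he missed class ."),
    ("因为下雨，比赛取消了。", "Because it rained , the match was cancelled ."),
    ("我喜欢茶。", "I like tea ."),
]
synthetic = extract_synthetic(sample_pairs)
```

Only the first sample pair survives the filter here, since the second is explicit on both sides and the third has no connective at all. A realistic pipeline would need connective disambiguation and proper argument identification rather than the token matching assumed above.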