Compound Embedding Features for Semi-supervised Learning
Mo Yu, Tiejun Zhao, Daxiang Dong, Hao Tian and Dianhai Yu
To address the data sparsity problem, discriminative NLP methods have recently trended toward using representations of lexical items learned from unlabeled data as features. In this paper, we investigated the use of word representations learned by neural language models, i.e., word embeddings. Using embeddings directly as features has disadvantages: a large amount of computation, difficulty in handling word ambiguity and rare words, and linear non-separability. To overcome these problems, we instead built compound features from continuous word embeddings based on clustering. Experiments showed that the compound features not only improved performance on several NLP tasks but also ran faster, suggesting the potential of embeddings.
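The general idea of deriving discrete features from continuous embeddings via clustering can be sketched as follows. This is a minimal illustration, not the authors' actual method: the embedding vectors, vocabulary, and the use of plain k-means here are all hypothetical, chosen only to show how cluster IDs can replace raw embedding values as features.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Simple k-means over embedding vectors; returns a cluster ID per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each embedding to its nearest center (squared Euclidean)
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned embeddings
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy, hand-made 2-d "embeddings" with two well-separated groups
vocab = ["cat", "dog", "paris", "london"]
emb = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0],
                [0.1, 0.9]])

labels = kmeans(emb, k=2)
# Discrete cluster-ID features, usable in a linear model in place of
# the continuous embedding values
cluster_feature = {w: f"C{labels[i]}" for i, w in enumerate(vocab)}
```

Downstream, a sparse linear model can then consume indicator features like `C0`/`C1` (or conjunctions of them, giving compound features) instead of dense real-valued inputs.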