Minimum-Risk Training of Approximate CRF-Based NLP Systems

Veselin Stoyanov and Jason Eisner
Johns Hopkins University


Conditional Random Fields (CRFs) are a popular formalism for structured prediction in NLP. It is well known how to train low-treewidth CRFs, such as linear-chain CRFs. Some NLP phenomena, however, suggest CRFs with more complex topologies. Should such models be used, considering that they make exact inference intractable? Recent work has argued for training parameters to minimize the task-specific loss of whatever approximate inference and decoding methods will be used at test time. We apply that method to three NLP problems, showing that (i) using more complex CRFs leads to improved performance, and that (ii) minimum-risk training learns more accurate models.