START Conference Manager    

The Effects of Lexical Resource Quality on Preference Violation Detection

Jesse Dunietz, Lori Levin and Jaime Carbonell

The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013


Abstract

Lexical resources such as WordNet and VerbNet are widely used in a multitude of NLP tasks, as are annotated corpora such as treebanks. Often, the resources are used as-is, without question or examination. This practice risks missing significant performance gains and even entire techniques.

This paper addresses the importance of resource quality through the lens of a challenging NLP task: detecting selectional preference violations. We present DAVID, a simple, lexical resource-based preference violation detector. With as-is lexical resources, DAVID achieves an F1-measure of just 28.27%. When the resource entries and parser outputs for a small sample are corrected, however, the F1-measure on that sample jumps from 40% to 61.54%, and performance on other examples rises, suggesting that the algorithm becomes practical given cleaned-up resources. More broadly, this paper shows that resource quality matters tremendously, sometimes even more than algorithmic improvements.

