ACL Short Papers 2013 - The Effects of Lexical Resource Quality on Preference Violation Detection

The Effects of Lexical Resource Quality on Preference Violation Detection

Jesse Dunietz, Lori Levin and Jaime Carbonell

The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013

Abstract

Lexical resources such as WordNet and VerbNet are widely used in a multitude of NLP tasks, as are annotated corpora such as treebanks. Often, the resources are used as-is, without question or examination. This practice risks missing significant performance gains and even entire techniques.

This paper addresses the importance of resource quality through the lens of a challenging NLP task: detecting selectional preference violations. We present DAVID, a simple, lexical resource-based preference violation detector. With as-is lexical resources, DAVID achieves an F1-measure of just 28.27%. When the resource entries and parser outputs for a small sample are corrected, however, the F1-measure on that sample jumps from 40% to 61.54%, and performance on other examples rises, suggesting that the algorithm becomes practical given cleaned-up resources. More broadly, this paper shows that resource quality matters tremendously, sometimes even more than algorithmic improvements.
