Robust Systems for Preposition Error Correction Using Wikipedia Revisions

Aoife Cahill, Nitin Madnani, Joel Tetreault and Diane Napolitano

We show that existing methods for training preposition error correction systems, whether using well-edited text or error-annotated corpora, do not generalize across very different test sets. We present a new, large error-annotated corpus and use it to train systems that generalize across three different test sets, each from a different domain and with different error characteristics. This new corpus is automatically extracted from Wikipedia revisions and contains over one million instances of preposition corrections.
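
The abstract does not describe how corrections are extracted from revisions. As a rough illustration only, one plausible approach is to align an old and a new version of a sentence and keep the single-token substitutions in which both words are prepositions. The sketch below assumes whitespace tokenization, a small hand-picked preposition list, and a hypothetical helper named `preposition_corrections`; none of these come from the paper.

```python
from difflib import SequenceMatcher

# Illustrative preposition list (an assumption, not the paper's inventory).
PREPOSITIONS = {"in", "on", "at", "for", "to", "of", "with", "by", "from", "about"}


def preposition_corrections(old_sentence, new_sentence):
    """Return (old_prep, new_prep, token_index) for one-token preposition swaps.

    Hypothetical helper: compares two versions of a sentence, as might be
    taken from adjacent revisions, using naive whitespace tokenization.
    """
    old_tokens = old_sentence.split()
    new_tokens = new_sentence.split()
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)

    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only substitutions of exactly one token by exactly one token.
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            old_word = old_tokens[i1].lower()
            new_word = new_tokens[j1].lower()
            if old_word in PREPOSITIONS and new_word in PREPOSITIONS:
                found.append((old_word, new_word, j1))
    return found


if __name__ == "__main__":
    before = "She arrived to the station on time"
    after = "She arrived at the station on time"
    print(preposition_corrections(before, after))
```

A real pipeline would also need sentence alignment across revisions and filtering of vandalism or content edits, which this sketch omits.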
