Distant Supervision for Relation Extraction with an Incomplete Knowledge Base
Bonan Min, Ralph Grishman, Li Wan, Chang Wang and David Gondek
Distant supervision, heuristically labeling a corpus using a knowledge base,
has emerged as a popular choice for training relation extractors. In this
paper, we show that a significant number of “negative” examples generated
by the labeling process are false negatives because the knowledge base is
incomplete. Therefore, the heuristic for generating negative examples has a
serious flaw. Building on a state-of-the-art distantly-supervised extraction
algorithm, we propose an algorithm that learns from only positive and
unlabeled examples at the pair-of-entity level. Experimental results demonstrate
its advantage over existing algorithms.
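
The labeling heuristic and its flaw can be illustrated with a minimal sketch. This is a toy example under stated assumptions, not the authors' algorithm: the entity pairs, the relation name `born_in`, the `true_facts` dictionary, and the function `label_pairs` are all hypothetical and invented for illustration.

```python
# Toy illustration only: all entities, relations, and names below are
# invented for this sketch and do not come from the paper.

def label_pairs(entity_pairs, kb):
    """Heuristically label entity pairs using a knowledge base.

    A pair found in the KB receives that relation as a positive label.
    Pairs not found in the KB are returned separately; the conventional
    heuristic would treat all of them as negatives.
    """
    positives = {}
    unmatched = []
    for pair in entity_pairs:
        if pair in kb:
            positives[pair] = kb[pair]
        else:
            unmatched.append(pair)
    return positives, unmatched


# Hypothetical ground truth, which is never fully observed in practice.
true_facts = {("Alice", "Paris"): "born_in", ("Bob", "Rome"): "born_in"}

# Hypothetical knowledge base that is incomplete: the fact about Bob is missing.
kb = {("Alice", "Paris"): "born_in"}

# Hypothetical entity pairs that co-occur in corpus sentences.
pairs = [("Alice", "Paris"), ("Bob", "Rome"), ("Carol", "Oslo")]

positives, unmatched = label_pairs(pairs, kb)

# Pairs the conventional heuristic would wrongly mark as negative.
false_negatives = [p for p in unmatched if p in true_facts]
```

Here the pair `("Bob", "Rome")` expresses a true relation but is absent from the knowledge base, so marking every unmatched pair as negative would mislabel it. Keeping the unmatched pairs as unlabeled rather than negative corresponds to the positive-and-unlabeled setting at the pair-of-entity level.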