Relation Extraction with Matrix Factorization and Universal Schemas
Sebastian Riedel, Limin Yao and Andrew McCallum
Traditional relation extraction predicts relations within some fixed and finite
target schema. Machine learning approaches to this task require either manual
annotation or, in the case of distant supervision, existing structured sources
of the same schema. The need for existing datasets can be avoided by using a
universal schema: the union of all involved schemas (surface form predicates as
in OpenIE, and relations in the schemas of pre-existing databases). This schema
has an almost unlimited set of relations (due to surface forms), and supports
integration with existing structured data (through the relation types of
existing databases). To populate a database with such a schema, we present a
family of matrix factorization models that predict the affinity between database
tuples and relations, treating tuples like the users and relations like the items
of a collaborative filtering problem. We show that this achieves substantially
higher accuracy than the traditional classification approach. More importantly,
by operating simultaneously on relations observed in text and in pre-existing
structured DBs such as Freebase, we are able to reason about unstructured and
structured data in mutually supporting ways. In doing so, our approach
outperforms state-of-the-art distant supervision.
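
The idea of scoring tuple/relation cells by matrix factorization can be illustrated with a small sketch. It is not the authors' model: the abstract does not state the training objective, so this example assumes a plain logistic factorization in which every unobserved cell is treated as a negative, trained by full-batch gradient descent. The entity pairs, relation names, the function name factorize, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def factorize(Y, dim=2, epochs=1000, lr=0.1, reg=0.01, seed=0):
    """Fit Y[t, r] ~ sigmoid(A[t] . V[r]) by gradient descent on logistic loss.

    Assumption: unobserved cells (zeros in Y) are used as negatives.
    """
    rng = np.random.default_rng(seed)
    n_tuples, n_relations = Y.shape
    A = rng.normal(scale=0.1, size=(n_tuples, dim))     # one embedding per tuple
    V = rng.normal(scale=0.1, size=(n_relations, dim))  # one embedding per relation
    for _ in range(epochs):
        G = sigmoid(A @ V.T) - Y          # d(loss)/d(score) for every cell
        grad_A = G @ V + reg * A          # L2-regularized gradients
        grad_V = G.T @ A + reg * V
        A -= lr * grad_A
        V -= lr * grad_V
    return A, V

# Rows: entity-pair tuples. Columns: relations of the universal schema, mixing
# surface patterns with a relation type from a structured database.
# All names below are hypothetical placeholders.
tuples = ["(person_1, org_1)", "(person_2, org_2)",
          "(person_3, org_3)", "(person_4, org_4)"]
relations = ["X-professor-at-Y", "X-teaches-at-Y", "db:employee"]
Y = np.array([[1, 1, 1],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 0]], dtype=float)

A, V = factorize(Y)
P = sigmoid(A @ V.T)  # predicted affinity for every (tuple, relation) cell

for name, row in zip(tuples, P):
    print(name, np.round(row, 2))
```

Each row of P can be read as a ranking of relations for one tuple. Because surface patterns and database relations share the same embedding space, evidence observed for one column can raise the predicted score in another, which is the mutually supporting behavior the abstract describes.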