Exploring Semi-Supervised Coreference Resolution of Medical Concepts using Semantic and Temporal Features

Preethi Raghavan (1), Eric Fosler-Lussier (1), Albert Lai (2)
(1) Department of Computer Science and Engineering, The Ohio State University
(2) Department of Biomedical Informatics, The Ohio State University


Abstract

We investigate medical concept coreference resolution in clinical text using two semi-supervised methods: co-training and multi-view learning with posterior regularization. By extracting semantic and temporal features of medical concepts found in clinical text, we create conditionally independent data views; co-training MaxEnt classifiers on these views performs almost as well as supervised learning on pairwise coreference resolution of medical concepts. We also train MaxEnt models with expectation constraints, using posterior regularization, and find that posterior regularization performs comparably to or slightly better than co-training. We describe the process of semantic and temporal feature extraction and demonstrate our methods on a corpus of case reports from the New England Journal of Medicine and a corpus of patient narratives obtained from The Ohio State University Wexner Medical Center.
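As an illustration of the co-training idea summarized above, the following is a minimal sketch and not the implementation used in this work. It assumes each mention pair is described by two views, here a single hypothetical "semantic" score and a single hypothetical "temporal" score, and that a binary MaxEnt model can be stood in for by logistic regression fit with plain gradient ascent. All function names, feature values, and hyperparameters are invented for the example.

```python
import math

def predict_proba(w, x):
    """P(coreferent | x) under a binary MaxEnt (logistic) model; w[-1] is the bias."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(X, y, epochs=300, lr=0.5):
    """Fit weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, label in zip(X, y):
            err = label - predict_proba(w, x)
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def co_train(labeled, pool, rounds=3, per_round=1):
    """labeled: list of (view_a, view_b, label); pool: list of (view_a, view_b).

    Each round trains one classifier per view, then each classifier moves its
    most confident unlabeled pairs into the shared labeled set.
    """
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        models = [train_maxent([ex[v] for ex in labeled], [ex[2] for ex in labeled])
                  for v in (0, 1)]
        for v, w in enumerate(models):
            if not pool:
                break
            probs = [predict_proba(w, pair[v]) for pair in pool]
            # Confidence = distance of the predicted probability from 0.5.
            order = sorted(range(len(pool)), key=lambda i: abs(probs[i] - 0.5),
                           reverse=True)
            picked = set(order[:per_round])
            for i in picked:
                pseudo_label = 1 if probs[i] >= 0.5 else 0
                labeled.append((pool[i][0], pool[i][1], pseudo_label))
            pool = [pair for i, pair in enumerate(pool) if i not in picked]
    # Retrain both views on the final (seed + pseudo-labeled) set.
    w_a = train_maxent([ex[0] for ex in labeled], [ex[2] for ex in labeled])
    w_b = train_maxent([ex[1] for ex in labeled], [ex[2] for ex in labeled])
    return w_a, w_b, labeled

# Toy data (made up): view_a = semantic score, view_b = temporal score.
SEED = [([1.0], [0.9], 1), ([0.9], [1.0], 1), ([0.0], [0.1], 0), ([0.1], [0.0], 0)]
POOL = [([0.95], [0.85]), ([0.05], [0.15]), ([0.85], [0.95]),
        ([0.15], [0.05]), ([0.8], [0.9]), ([0.2], [0.1])]

w_a, w_b, grown = co_train(SEED, POOL)
print(len(grown), predict_proba(w_a, [1.0]) > 0.5, predict_proba(w_b, [0.0]) < 0.5)
```

The sketch keeps a single shared labeled set that both view-specific classifiers extend, which is one common formulation of co-training; the number of rounds and the number of examples promoted per round are free parameters chosen here only to keep the toy run short.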