DISCRIMINATIVE JOINT MODELING OF LEXICAL VARIATION AND ACOUSTIC CONFUSION FOR AUTOMATED NARRATIVE RETELLING ASSESSMENT
Maider Lehr, Izhak Shafran, Emily Prud'hommeaux and Brian Roark
Automatically assessing the fidelity of a retelling to the original
narrative -- a task of growing clinical importance -- is challenging,
given extensive paraphrasing during retelling along with cascading
automatic speech recognition (ASR) errors. We present a word tagging
approach using conditional random fields (CRFs) that allows a
diversity of features to be considered during inference, including
some capturing acoustic confusions encoded in word confusion networks.
We evaluate the approach under several scenarios, including both
supervised and unsupervised training, the latter achieved by training
on the output of a baseline automatic word-alignment model. We also
adapt the ASR models to the domain, and evaluate the impact of error
rate on performance. We find strong robustness to ASR errors, even
using just the 1-best system output. A hybrid approach making use of
both automatic alignment and CRF-trained tagging models achieves the
best performance, yielding strong improvements over using either
approach alone.
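
As a rough illustration of how acoustic confusions in a word confusion
network might be exposed to a CRF word tagger, the sketch below turns
each slot of a network into a feature dictionary. It is not the
authors' feature set: the slot representation, the function names
(slot_features, network_features), the feature names, and the
story_words argument are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a confusion network is a list of slots,
# and each slot lists competing word hypotheses with their posteriors.
Slot = List[Tuple[str, float]]


def slot_features(network: List[Slot], i: int,
                  story_words: Set[str]) -> Dict[str, float]:
    """Build illustrative features for slot i of a confusion network."""
    # Rank the competing hypotheses in this slot by posterior probability.
    slot = sorted(network[i], key=lambda wp: wp[1], reverse=True)
    best_word, best_post = slot[0]
    feats = {
        "bias": 1.0,
        "top=" + best_word: 1.0,                 # identity of the 1-best word
        "top_posterior": best_post,              # confidence in the 1-best word
        "num_alternatives": float(len(slot)),    # how confusable the slot is
        # Posterior mass on hypotheses that occur in the source narrative
        # (an assumed feature, not taken from the paper).
        "story_mass": sum(p for w, p in slot if w in story_words),
    }
    if i > 0:
        # Context feature: 1-best word of the preceding slot.
        prev_word = max(network[i - 1], key=lambda wp: wp[1])[0]
        feats["prev_top=" + prev_word] = 1.0
    return feats


def network_features(network: List[Slot],
                     story_words: Set[str]) -> List[Dict[str, float]]:
    """One feature dictionary per slot, in sequence order."""
    return [slot_features(network, i, story_words)
            for i in range(len(network))]
```

A sequence of per-slot dictionaries like this is the kind of input that
CRF toolkits accepting feature dictionaries can train on; restricting
each slot to its top hypothesis would correspond to the 1-best setting
mentioned above.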