Three Knowledge-Free Methods for Automatic Lexical Chain Extraction
Steffen Remus and Chris Biemann
We present three approaches to lexical chaining based on the LDA topic model
and evaluate them intrinsically on a manually annotated set of German
documents. After motivating the choice of statistical methods for lexical
chaining with their adaptability to different languages and subject domains, we
describe our new two-level chain annotation scheme, which is rooted in the concept
of cohesive harmony. We also propose a new measure for the direct evaluation of
lexical chains. Our three LDA-based approaches outperform two
knowledge-based state-of-the-art methods for lexical chaining by a large margin,
which can be attributed to the insufficient coverage of the knowledge resource.
Subsequent analysis shows that the three methods exhibit different chaining
behavior, which could be exploited in tasks that use lexical chaining as a
component within NLP applications.