Measuring the Structural Importance through Rhetorical Structure Index
Narine Kokhlikyan, Yuqi Zhang, Alex Waibel and Joy Zhang
In this paper, we propose a novel Rhetorical
Structure Index (RSI) to measure the structural importance of a word or a
phrase.
Unlike TF-IDF and other content-driven measurements, RSI identifies words or
phrases that are structural cues in an unstructured document.
We show that structurally motivated features with high RSI values are more
useful than content-driven features for applications such as segmenting
unstructured lecture transcripts into meaningful segments. Experiments show
that using RSI significantly improves segmentation accuracy compared to
traditional content-based feature weighting schemes such as TF-IDF.