Topic Segmentation with a Structured Topic Model
Lan Du and Wray Buntine
We present a new hierarchical Bayesian model for unsupervised topic
segmentation. This new model integrates a point-wise boundary sampling
algorithm used in Bayesian segmentation into a structured topic model that can
capture a simple hierarchical topic structure latent in documents. We develop
an MCMC inference algorithm that splits and merges segments. Experimental
results show that our model outperforms previous unsupervised segmentation
methods using only lexical information on Choi's datasets and two meeting
transcripts, and performs comparably to those methods on two written datasets.