Grouping Language Model Boundary Words to Speed K-Best Extraction from Hypergraphs
Kenneth Heafield, Philipp Koehn and Alon Lavie
We propose a new algorithm to approximately extract top-scoring hypotheses from a hypergraph when the score includes an N-gram language model. In the popular cube pruning algorithm, every hypothesis is annotated with boundary words and permitted to recombine only if all boundary words are equal. However, many hypotheses share some, but not all, boundary words. We use these common boundary words to group hypotheses and do so recursively, resulting in a tree of hypotheses. This tree forms the basis for our new search algorithm that iteratively refines groups of boundary words on demand. Machine translation experiments show our algorithm makes translation 1.50 to 3.51 times as fast as with cube pruning in common cases.
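
As a rough illustration of the grouping idea (a minimal Python sketch; the flat boundary-word tuples, data layout, and all names here are assumptions for exposition, not the paper's implementation), hypotheses sharing boundary-word prefixes can be collected into a trie, and a best-first loop can refine the highest-bound group one boundary word at a time:

```python
import heapq
from collections import defaultdict

def make_node():
    # A tree node: the hypotheses grouped here, plus children keyed by the
    # next boundary word on which a subgroup agrees.
    return {"children": defaultdict(make_node), "hyps": []}

# Hypothetical hypotheses: (boundary words, model score). The single flat
# tuple of words is a simplification purely for illustration.
hypotheses = [
    (("the", "cat"), -1.2),
    (("the", "dog"), -1.5),
    (("a", "cat"), -2.0),
]

root = make_node()
for words, score in hypotheses:
    node = root
    node["hyps"].append((words, score))
    for w in words:
        node = node["children"][w]   # descend one shared boundary word
        node["hyps"].append((words, score))

def bound(node):
    # Optimistic score for a whole group: its best member. The search ranks
    # groups by this bound and only splits ("refines") the current leader.
    return max(s for _, s in node["hyps"])

def refine(node):
    # Splitting a group reveals subgroups that agree on one more boundary word.
    return list(node["children"].values())

# Best-first refinement: pop the highest-bound group, split it, and repeat
# until a fully refined group (a leaf with all boundary words fixed) surfaces.
frontier = [(-bound(root), 0, root)]
tiebreak = 1
while frontier:
    _, _, node = heapq.heappop(frontier)
    children = refine(node)
    if not children:                 # leaf: boundary words fully specified
        print("best hypothesis:", node["hyps"][0])
        break
    for child in children:
        heapq.heappush(frontier, (-bound(child), tiebreak, child))
        tiebreak += 1
```

In this toy run the group sharing the prefix "the" is refined before the group under "a" is ever split, which is the point of the tree: hypotheses differing only in later boundary words are scored and pruned together rather than one at a time.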