Minibatch and Parallelization for Online Large-Margin Structured Learning
Kai Zhao and Liang Huang
Online learning algorithms such as perceptron and MIRA have become popular for
many NLP tasks thanks to their simpler architecture and faster convergence
over batch learning methods. However, while batch learning such as CRF is
easily parallelizable, online learning is much harder to parallelize: previous
efforts often witness a decrease in the converged accuracy, and the speedup is
typically very small (∼3) even with many (10+) processors. We instead present
a much simpler architecture based on “mini-batches”, which is trivially
parallelizable. We show that, unlike previous methods, minibatch learning (in
serial mode) actually improves the converged accuracy for both perceptron and
MIRA learning, and when combined with simple parallelization, minibatch leads
to very significant speedups (up to 9x on 12 processors) on state-of-the-art
parsing and tagging systems.
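As a rough illustration of the idea described above, the following is a minimal sketch of minibatch structured perceptron training, assuming a toy word/tag feature set and a hypothetical per-token decoder (not the authors' parser or tagger): each minibatch is decoded in parallel under fixed weights, then the accumulated perceptron updates are applied serially.

```python
# Minimal sketch of minibatch parallel structured perceptron training.
# The decoder and feature set below are toy placeholders, not the
# systems used in the paper.
from multiprocessing import Pool
from collections import defaultdict

TAGS = ["N", "V"]  # hypothetical tag set for the toy example

def decode(args):
    """Hypothetical decoder: per-token argmax under the current weights."""
    weights, (words, gold_tags) = args
    pred = []
    for w in words:
        scores = {t: weights.get((w, t), 0.0) for t in TAGS}
        pred.append(max(scores, key=scores.get))
    return words, gold_tags, pred

def train(data, epochs=5, batch_size=4, workers=2):
    weights = defaultdict(float)
    with Pool(workers) as pool:
        for _ in range(epochs):
            for i in range(0, len(data), batch_size):
                batch = data[i:i + batch_size]
                # Decode the whole minibatch in parallel with weights
                # frozen at the start of the minibatch.
                results = pool.map(
                    decode, [(dict(weights), ex) for ex in batch])
                # Apply the accumulated perceptron updates serially.
                for words, gold, pred in results:
                    for w, g, p in zip(words, gold, pred):
                        if g != p:
                            weights[(w, g)] += 1.0
                            weights[(w, p)] -= 1.0
    return weights

if __name__ == "__main__":
    toy = [(["flies", "fly"], ["N", "V"]), (["fly", "flies"], ["V", "N"])]
    print(sorted(train(toy).items()))
```

Because decoding dominates training time in structured prediction, parallelizing only the decoding step within each minibatch (while keeping the updates serial) is what makes this scheme trivially parallelizable.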