Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

Nicholas Locascio¹, Karthik Narasimhan², Eduardo De Leon¹, Nate Kushman³, Regina Barzilay¹
¹MIT, ²CSAIL, MIT, ³MSR


Abstract

This paper explores the task of translating natural language queries into regular expressions that embody their meaning. In contrast to prior work, the proposed neural model does not rely on domain-specific crafting; it learns to translate directly from a parallel corpus. To fully explore the potential of neural models, we propose a methodology for collecting a large corpus of (regular expression, natural language) pairs. Our resulting model achieves a performance gain of 19.6% over previous state-of-the-art models.
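
To make the task concrete, the sketch below pairs natural language queries with regular expressions that embody their meaning, and checks a few lines against them. The two pairs and the helper `matches` are hypothetical illustrations written for this example; they are not taken from the corpus described here, and Python's `re` syntax is assumed only for convenience.

```python
import re

# Hypothetical (description, regex) pairs, invented for illustration only;
# they are not drawn from the paper's corpus.
PAIRS = [
    ("lines that contain a digit", r".*[0-9].*"),
    ("lines that start with a capital letter", r"[A-Z].*"),
]


def matches(regex: str, line: str) -> bool:
    """Return True if the regex matches the entire line."""
    return re.fullmatch(regex, line) is not None


if __name__ == "__main__":
    for description, regex in PAIRS:
        print(f"{description!r} -> {regex}")
```

A translation model for this task would take the description on the left as input and be expected to produce a regular expression equivalent to the one on the right.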