ParaBank: paraphrase generation from bilingual data

Paraphrases are useful for understanding semantics, yet most existing resources are lexical. We combined back-translation (as in ParaNMT) with constrained decoding in machine translation to produce a large-scale sentence-level paraphrase resource. Human evaluation showed that the new resource is of high quality, and we trained a generation model that produces paraphrases for arbitrary English sentences.
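To give a rough feel for how lexical constraints can steer a translation toward a paraphrase, here is a minimal Python sketch. It is not the ParaBank pipeline: real constrained decoding enforces constraints inside the beam search of an NMT model, whereas this toy simply filters a fixed list of candidate translations after the fact. The function names (satisfies, pick_paraphrase), the candidate sentences, and the scores are all made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def satisfies(tokens: Sequence[str],
              required: Sequence[str] = (),
              forbidden: Sequence[str] = ()) -> bool:
    """True if every required token appears and no forbidden token does."""
    present = set(tokens)
    has_required = all(word in present for word in required)
    has_forbidden = any(word in present for word in forbidden)
    return has_required and not has_forbidden


def pick_paraphrase(candidates: List[Tuple[str, float]],
                    required: Sequence[str] = (),
                    forbidden: Sequence[str] = ()) -> Optional[str]:
    """Return the highest-scoring candidate that meets the constraints, or None."""
    allowed = [(text, score) for text, score in candidates
               if satisfies(text.lower().split(), required, forbidden)]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]


# Hypothetical candidate translations of one foreign sentence,
# paired with hypothetical model scores (higher is better).
candidates = [
    ("he bought a car yesterday", -0.8),
    ("he purchased a car yesterday", -1.1),
    ("yesterday he purchased an automobile", -1.9),
]

# Suppose the reference translation uses "bought". Forbidding that word
# (a negative constraint) pushes the selection toward a paraphrase.
print(pick_paraphrase(candidates, forbidden=["bought"]))

# A positive constraint instead requires a particular word to appear.
print(pick_paraphrase(candidates, required=["automobile"]))
```

With a negative constraint, the best-scoring candidate is skipped because it repeats the reference wording, and the next best one is selected instead. Pairing that output with the original reference sentence yields a sentence-level paraphrase pair.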

My first paper ever. Many thanks to my co-authors (Rachel, Matt, and Ben) for helping me pull this off.

The paper is titled “ParaBank: Monolingual Bitext Generation and Sentential Paraphrasing via Lexically-constrained Neural Machine Translation,” which I presented at AAAI 2019.

Please check it out on arXiv or here.