ACL 2018: A Stochastic Decoder for Neural Machine Translation

Philip Schulz, Wilker Aziz, Trevor Cohn

The process of translation is ambiguous, in that there are typically many valid translations for a given sentence. This gives rise to significant variation in parallel corpora; however, most current models of machine translation do not account for this variation, instead treating the problem as a deterministic process. To address this, we present a deep generative model of machine translation that incorporates a chain of latent variables in order to account for local lexical and syntactic variation in parallel corpora. We provide an in-depth analysis of the pitfalls encountered in variational inference for training deep generative models. Experiments on several different language pairs demonstrate that the model consistently improves over strong baselines.
