NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing

Dinghan Shen, Qinliang Su, Paidamoyo Chapfuwa, Wenlin Wang, Guoyin Wang, Ricardo Henao, Lawrence Carin

Semantic hashing has become a powerful paradigm for fast similarity search in many information retrieval systems. While fairly successful, previous techniques generally require two-stage training, and the binary constraints are handled in an \emph{ad-hoc} manner. In this paper, we present an \emph{end-to-end} Neural Architecture for Semantic Hashing (NASH), in which the binary hashing codes are treated as \emph{Bernoulli} latent variables. A neural variational inference framework is proposed for training, in which gradients are backpropagated directly through the discrete latent variables to optimize the hash function. We also draw connections between the proposed method and \emph{rate-distortion theory}, which provides a theoretical foundation for the effectiveness of our framework. Experimental results on three public datasets demonstrate that our method significantly outperforms several state-of-the-art models in both \emph{unsupervised} and \emph{supervised} scenarios.
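
To make the idea of Bernoulli hashing codes concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single linear layer followed by a sigmoid as the encoder, and it uses the straight-through estimator as one common way to pass gradients through a binarization step; the abstract itself does not specify either choice. The names `encode`, `straight_through_grad`, `hamming_distance`, `W`, and `b` are hypothetical and introduced only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(x, W, b, rng=None):
    """Map inputs to Bernoulli probabilities and binary codes.

    Assumption: a single linear layer plus sigmoid stands in for the encoder.
    With rng=None the codes are thresholded at 0.5 (deterministic);
    otherwise each bit is sampled from Bernoulli(probs).
    """
    probs = sigmoid(x @ W + b)
    if rng is None:
        codes = (probs > 0.5).astype(np.int64)
    else:
        codes = (rng.random(probs.shape) < probs).astype(np.int64)
    return probs, codes


def straight_through_grad(grad_codes, probs):
    """Straight-through estimator (an assumed choice, see lead-in).

    The backward pass treats binarization as the identity, so the gradient
    with respect to the pre-sigmoid logits is grad_codes * sigmoid'(logits).
    """
    return grad_codes * probs * (1.0 - probs)


def hamming_distance(query_code, db_codes):
    """Number of differing bits between one code and each row of db_codes."""
    return np.count_nonzero(db_codes != query_code, axis=1)


# Toy data: 5 "documents" with 8 features each, hashed to 4-bit codes.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
b = np.zeros(4)
docs = rng.normal(size=(5, 8))

probs, codes = encode(docs, W, b)
dists = hamming_distance(codes[0], codes)  # distances from document 0
ranking = np.argsort(dists, kind="stable")  # nearest codes first
grad_logits = straight_through_grad(np.ones_like(probs), probs)
```

At retrieval time only the binary `codes` are needed, so similarity search reduces to counting differing bits, which is what makes hashing-based search fast. The gradient helper only illustrates how a training signal could reach the encoder despite the discrete codes.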