Embedding Learning Through Multilingual Concept Induction

Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze

We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual embedding learning performs better than previous approaches.
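As a rough illustration of the evaluation setting (not the paper's implementation), once words from many languages share one embedding space, crosslingual word similarity can be scored directly with cosine similarity between vectors. The vectors and vocabulary below are toy data chosen for this sketch.

```python
# Illustrative sketch only: scoring crosslingual word similarity in a shared
# embedding space. The vectors are made-up toy values, not learned embeddings.
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy shared space: (language, word) -> vector. In the paper's setting,
# words from 1259 languages would live in one such common space.
embeddings = {
    ("eng", "house"): np.array([0.8, 0.1, 0.3]),
    ("deu", "haus"):  np.array([0.7, 0.2, 0.3]),
    ("eng", "dog"):   np.array([0.1, 0.9, 0.2]),
}

# Translation pairs should score higher than unrelated pairs.
print(cosine_similarity(embeddings[("eng", "house")], embeddings[("deu", "haus")]))
print(cosine_similarity(embeddings[("eng", "house")], embeddings[("eng", "dog")]))
```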