Multilingual Word Vector

Languages: Chinese, English, French, Russian, German

A word vector is a word representation commonly used in deep learning. It encodes both the word itself and its semantic associations with other words.
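As a rough illustration of how vectors can express semantic association, the sketch below compares words by cosine similarity. The three-dimensional vectors and the `toy_vectors` name are hypothetical, chosen by hand for this example; real word vectors are learned from corpora and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical hand-made vectors, for illustration only (not trained values).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

sim_related = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_unrelated = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# Related words should score higher than unrelated ones.
print(sim_related > sim_unrelated)
```

Cosine similarity is one common choice here because it ignores vector length and compares direction only; other distance measures could be used instead.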


Word vectors are an important means of representing natural-language vocabulary quantitatively and efficiently. Using large-scale parallel corpora, we train a neural network model to build a multilingual word vector library, with Chinese or English as the core bridging language and with monolingual corpora and sentence-aligned corpora as the training data. The library can be applied to a variety of cross-language tasks, including multilingual text classification, multilingual text clustering, multilingual sentiment analysis, and cross-language search.
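To make the idea of a shared multilingual vector space concrete, here is a minimal sketch of cross-language lookup, assuming that words from every language have already been embedded in one common space. The `shared_space` table, its hand-made values, and the `nearest_in_language` helper are all hypothetical and are not part of the library described above.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical shared-space vectors keyed by (language, word).
# The values are invented for illustration, not taken from a trained model.
shared_space = {
    ("en", "cat"): [0.90, 0.10, 0.00],
    ("en", "car"): [0.10, 0.90, 0.10],
    ("zh", "猫"): [0.88, 0.15, 0.05],
    ("zh", "汽车"): [0.12, 0.85, 0.20],
    ("fr", "chat"): [0.87, 0.12, 0.02],
}


def nearest_in_language(query, target_lang, space):
    """Return the word in target_lang whose vector is closest to the query word."""
    query_vec = space[query]
    candidates = [
        (key, vec) for key, vec in space.items()
        if key[0] == target_lang and key != query
    ]
    best_key, _ = max(
        candidates, key=lambda item: cosine_similarity(query_vec, item[1])
    )
    return best_key[1]


# A Chinese query retrieves its closest English neighbour, and a French
# query retrieves its closest Chinese neighbour, with no dictionary lookup.
print(nearest_in_language(("zh", "猫"), "en", shared_space))
print(nearest_in_language(("fr", "chat"), "zh", shared_space))
```

The same nearest-neighbour comparison underlies tasks such as cross-language search: once queries and documents share a space, a query in one language can be matched against text in another.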