I. Purpose of the 2020 AI Lab
The “2020 AI Lab” was established to pursue continuous technological innovation, develop core technologies, carry out scientific research cooperation, attract outstanding talent, maintain a leading position in technology and build competitive advantage. It aims to create a cognitive intelligence innovation platform built on multilingual, big data and intelligent capabilities; to carry out basic technical research in fields such as multilingual natural language processing, multilingual machine translation, multilingual speech recognition, image recognition, video analysis and big data analysis; and to promote the transfer of new technological achievements into industry applications.
II. Main Research Orientations
1. Research on Multilingual Natural Language Processing Technology
The lab researches and develops multilingual word segmentation, POS tagging, syntactic analysis, semantic role labeling and semantic disambiguation to provide technical support for cross-language text data mining; optimizes statistical machine learning algorithms for NLP to improve the accuracy and efficiency of natural language understanding; researches ontology-based semantic search technology to achieve more intelligent information discovery; combines machine translation, ambiguity resolution and other technologies to develop cross-language search algorithms that broaden the range of information users can access; and develops innovative technologies and products in various fields through the combined use of knowledge graphs, statistical machine learning and neural networks, to raise the system’s cognitive level and self-service capabilities.
2. Research on Multilingual Machine Translation Technology
The lab optimizes statistical machine translation technology, covering phrase-based, hierarchical phrase-based, tree-to-string and tree-to-tree models, to further improve the translation performance of statistical models trained on large-scale corpora; researches domain adaptation technology, including data selection, data extraction, data mining, phrase table fusion and other techniques; and researches neural machine translation technology, including encoding and decoding, parallel computing, unknown word translation and long sentence translation, to make neural network training and decoding more efficient while improving accuracy.
3. Research on Multilingual Speech Recognition Technology
The lab has built a large-scale multilingual speech corpus, completing the collection of voice data in six languages with no less than 10,000 hours for each language; on the basis of this corpus, it researches and develops deep learning-based voice wake-up, endpoint detection, speech recognition, semantic understanding, speech synthesis and other technologies to enhance human-computer interaction in the era of artificial intelligence; researches voice noise reduction technology to improve the robustness of mobile voice applications in complex environments; and combines speech recognition, machine translation, natural language processing and other technologies to develop a cross-language artificial intelligence voice assistant and remove cross-language communication barriers.
4. Research on Big Data Application of Image and Video
Based on deep learning technology, the lab researches and develops a content identification and target detection platform that can accurately identify persons, objects, scenes, text and other important targets in images and videos, providing technical support for intelligent image and video applications; researches and develops extraction, storage, indexing, retrieval and collaborative recommendation technologies for massive structured visual data, and develops a storage, analysis and recommendation system for large-scale visual data, providing governments and enterprises with a basic application platform for image and video big data; and, based on structured video data, researches big data applications across images, videos, languages, texts and other fields, so as to enhance the commercial and social value of data through multi-dimensional integration.
III. Joint Laboratories
1. Joint Laboratory for Chinese-Portuguese-English Machine Translation (with Macao Polytechnic Institute and Guangdong University of Foreign Studies)
2. Joint Laboratory of Industrial Big Data (with Haier)
3. Joint Laboratory of News Big Data (with Guangdong University of Foreign Studies)
4. Joint Laboratory of Law Big Data (with China University of Political Science and Law)
5. The “Belt and Road” Big Data Laboratory (with Xi’an International Studies University)