News Source: GTCOM | Date: 20 February 2024
Recently, GTCOM GeWu has delivered a new breakthrough. The GeWuMT-18B model, built for machine translation tasks, supports 81 languages. Professional evaluation shows it outperforming Meta's NLLB-54B model: GeWuMT-18B has one-third the parameter count of the Meta model, yet its average BLEU score is 27% higher in automatic evaluation of translation from 80 languages into Chinese. This marks new progress in the fluency and accuracy of the translation results delivered by this big model.
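The comparison above relies on BLEU, the standard automatic metric for machine translation, which scores a system's output against a reference translation by n-gram overlap. As a rough illustration (not GTCOM's actual evaluation pipeline, which is not described in the article), here is a minimal, simplified BLEU computation using only the Python standard library:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty.
    Real evaluations use corpus-level BLEU with smoothing
    (e.g. the sacrebleu toolkit); this is only a sketch."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((h & r).values())      # clipped n-gram matches
        total = max(sum(h.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return 100.0 * bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # identical → 100.0
```

A "27% higher average BLEU" claim means this score, averaged over the 80 evaluated language directions, was 27% higher for GeWuMT-18B than for NLLB-54B.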
Here is a comparison of translation results between GeWuMT-18B and other online translation engines:
GTCOM began machine-translation research and development in 2014 and has since been dedicated to building independent, controllable AI-based machine translation engines, moving through the phases of statistical machine translation, neural machine translation, and now big-model machine translation. In 2020, it undertook the Ministry of Science and Technology's 2030 major project "Research on Multilingual Automatic Translation with Chinese as the Core". In 2021, it undertook the Ministry of Industry and Information Technology's task "Ultra-large-scale Multilingual Universal Machine Translation System", as well as the Yunnan Provincial Department of Science and Technology's project "Research and Industrial Application of Ultra-Large-Scale Neural Machine Translation Model with Chinese as the Core". This experience has given the company a solid technological foundation in ultra-large-scale multilingual machine translation. In WMT 2022 and 2023, GTCOM's large-scale multilingual models topped the automatic evaluation rankings in up to seven language directions, among the best results in the industry.
GTCOM launched research and development of cross-language multi-modal big-model technology in 2021 and released the GeWu big model in November 2022. It covers four types of models: a multilingual pre-training model, an ultra-large multilingual machine translation model, a multi-modal pre-training model, and a multilingual generative dialogue big model. GTCOM's project "Research and Demonstrative Application of Controlled Content Generation Big Model Technology Based on Multi-modal Chain-of-Thought Reasoning" mainly tackles problems such as the lack of aligned Chinese corpora across the four modalities of text, image, audio, and video, the limited capability of single-model architectures, and poor responsiveness in application. Breakthroughs in content understanding and controllable generation based on multi-modal chain-of-thought reasoning have enabled the company to develop a four-modality big model that supports input and output in any of the four modalities. It is already in use in national defense and government affairs, and demonstrative applications will be carried out in fields such as science and technology and finance. GTCOM also participated in drafting standards such as "Large-Scale Pre-training Model Technology and Application Assessment Methods Part 1: Model Development", "Large-Scale Pre-training Model Technology and Application Assessment Methods Part 2: Model Capabilities", and "Large-Scale Pre-training Model Technology and Application Assessment Methods Part 4: Model Application".