Google has announced a new project to build an AI model that can support the world’s 1,000 most spoken languages. As a first step, the company has presented an AI model trained on over 400 languages, which it describes as the “largest language coverage seen in a speech model today.” The project underscores Google’s commitment to language and AI.
Google has announced the development of a “giant” AI language model that can handle more than 1,000 global languages. The company has been working on the project for a while now, and it’s already made some progress. With the help of machine learning, Google has been able to translate between languages with “zero human intervention.” Now, with the new AI language model, the company is hoping to take things to the next level. The goal is to make it easier for people to communicate with each other, regardless of the language they speak.
As a senior Google executive noted, language is central to how people communicate and make sense of the world. Around 7,000 languages are spoken globally, but only a few are well represented online.
Google acknowledges that this is an ambitious project that will take a few years to complete, but it is confident of reaching the goal. The company has built a new Universal Speech Model trained on about 400 languages, which gives it broad coverage for a speech model. The tech giant is also partnering with communities to source speech data.
It’s clear that Google is committed to expanding its language offerings. Recently, the company added 24 new languages to Google Translate and 9 new African languages to Gboard.
Similarly, the search engine giant is working with NGOs and academic institutions to collect audio samples of regional dialects.
Other large tech companies are also building their own mega language models. In July, Meta unveiled a new AI model called “No Language Left Behind” that can translate between over 200 languages.
Meta has also stepped up its efforts to serve communities that are not well represented on the web. Its model currently supports translation for around 55 African languages.
For comparison, fewer than 25 African languages are supported by existing translation tools today, so covering 55 is a major achievement.
Google has already started integrating these language models into some of its products, such as Google Search, while fending off criticism about the systems’ functionality. Language models exhibit a number of flaws, including a tendency to regurgitate harmful societal biases like racism and xenophobia, and an inability to parse language with human sensitivity. These models are capable of performing many tasks, though, from language generation (like OpenAI’s GPT-3) to translation (see Meta’s No Language Left Behind work). Google’s “1,000 Languages Initiative” is not targeting any particular functionality; instead, it focuses on building a single system with a huge breadth of knowledge across the world’s languages.
Although the company did not give any specific examples, it said it expects the model to have a range of uses across Google’s products. Potential candidates include Google Translate and YouTube captions.