To date, most artificial intelligence translation has focused on written languages, because text is the easiest material for training machine learning systems to understand and translate languages. However, over 40% of the world’s 7,000 living languages have no standard writing system.
Hokkien is a language widely spoken in southeastern mainland China and Taiwan, and within the Chinese diaspora in Singapore, Indonesia, Malaysia, the Philippines and other parts of Southeast Asia. It has no standard writing system and is primarily spoken. As a result, it made a perfect subject for a new real-time translation project that would open up new communication channels.
The effort is part of Meta AI’s Universal Speech Translator project, which aims to build a machine learning model that will eventually translate numerous spoken and written languages in real time, allowing anyone to communicate easily.
For Peng-Jen Chen, a Meta AI researcher, the project hit particularly close to home: he grew up in Taiwan speaking Mandarin Chinese, while his father spoke Taiwanese Hokkien. Although speakers of the two languages could understand each other, the languages were different enough that communication could get difficult when discussing complex subjects.
“I have always wished my father could communicate with everyone in Taiwanese Hokkien, which is the language he’s most comfortable speaking,” said Chen. “He understands Mandarin well but speaks more slowly when communicating about complex topics.”
The challenge in building the new model was that most real-time translation AI technologies rely on written language as the basis for collecting and annotating the data used in speech encoding. For example, English, Spanish, Mandarin Chinese and other written languages with numerous speakers make it relatively simple to mine data and build large models, because well-trained AI translation models built on accurately annotated training data already exist.
However, with a language such as Hokkien, there is no standard writing system and few speakers to work with. This makes it extremely difficult to gather the large volume of data needed to build a model. As a result, the researchers needed an intermediate language to bridge English and Hokkien, so they used Mandarin, because of its similarities to Hokkien, to help build the initial model.
“Our team first translated English or Hokkien speech to Mandarin text, and then translated it to Hokkien or English — both with human annotators and automatically,” said Meta researcher Juan Pino. “They then added the paired sentences to the data used to train the AI model.”
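The bridging step Pino describes can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Meta AI's code: the names PairedSentence, pair_via_pivot, toy_to_mandarin and toy_to_hokkien are hypothetical, plain strings stand in for speech audio, and the two toy functions merely tag their input rather than translate anything.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PairedSentence:
    """One training example produced through a pivot language (hypothetical type)."""
    source: str  # stand-in for the original utterance, e.g. English speech
    pivot: str   # Mandarin text used as the bridge
    target: str  # stand-in for the output in the other language, e.g. Hokkien


def pair_via_pivot(
    sources: List[str],
    to_pivot: Callable[[str], str],
    from_pivot: Callable[[str], str],
) -> List[PairedSentence]:
    """Translate each source into the pivot, then the pivot into the target.

    The translator callables are placeholders for whatever does the real
    work: human annotators, an automatic system, or both.
    """
    pairs = []
    for utterance in sources:
        pivot_text = to_pivot(utterance)
        pairs.append(PairedSentence(utterance, pivot_text, from_pivot(pivot_text)))
    return pairs


# Toy stand-ins (assumptions for illustration only): they tag the input
# instead of translating it, so no real Mandarin or Hokkien is invented here.
def toy_to_mandarin(utterance: str) -> str:
    return f"zh[{utterance}]"


def toy_to_hokkien(mandarin_text: str) -> str:
    return f"nan[{mandarin_text}]"


if __name__ == "__main__":
    training_data = pair_via_pivot(
        ["good morning", "thank you"], toy_to_mandarin, toy_to_hokkien
    )
    for pair in training_data:
        print(pair)
```

Each resulting record keeps the Mandarin text alongside the source and target, which is one plausible way to let native speakers check the bridge step later; the article does not say how the researchers actually stored their pairs.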
The researchers also actively worked with native Hokkien speakers to make certain that the AI translation models were accurate.
The model itself is still a work in progress and will only work with languages that have a “bridge” language that allows for text-to-speech translation. Languages without a closely related language to serve as a bridge won’t easily be able to take advantage of this new model, but it still opens up a large number of new languages for universal translation.
The researchers will be making their model, code and benchmark data freely available so that others can build their own AI real-time translation capabilities. The model can currently translate only one sentence at a time, but it is a step toward simultaneous translation.
Opening up these resources for the translation of unwritten languages can also have a profound effect on bringing people into an increasingly digital world. Most interfaces for interacting with computers and devices require text, so speakers of unwritten languages get left out and, as a result, are less comfortable with technology.
Some languages without standardized writing systems are also at risk of dying out. Linguists are attempting to preserve them as the number of speakers declines, but that’s difficult when there’s no written system. This could become a real problem with the rise of digital technologies, which, as mentioned, leave behind speakers of unwritten languages.
Meta AI’s engineers hope this first-step real-time translation model for Hokkien can pave the way for helping speakers of other unwritten languages bridge the digital divide and speak more comfortably across language barriers, and for helping linguists preserve unwritten languages.
“I just want my father to be able to speak to whomever he wants,” Chen said.