Meta claims it has broken new ground with its Universal Speech Translator, which it says is now capable of translating Hokkien. This particular Chinese dialect, for those who are unaware, is primarily spoken rather than written, making it one of the more difficult languages for artificial intelligence (AI) systems to learn and translate into other languages.
For this purpose, as well as for other similar languages that lack a written form, the company has decided to train its Universal Speech Translator AI through a speech-to-speech approach. The company’s press release gets down to the nitty-gritty of how this is achieved, so feel free to head on over to its website for a more in-depth explanation. But if you just want the gist of how it works, then stick around, as we’ll try to give you the crash course version of Meta’s AI wizardry.
— Meta Newsroom (@MetaNewsroom) October 19, 2022
According to said press release, the Universal Speech Translator AI first translates Hokkien speech into a sequence of acoustic units, which are then used to generate waveforms, with each spoken word featuring its own unique waveform. These words are then matched with their respective Mandarin counterparts (which have been designated beforehand by the system’s developers), and those matches serve as the basis for translating Hokkien into other languages and back.
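For the curious, here is a toy Python sketch of the "speech to units to waveform" idea described above. It is not Meta's code: the `CODEBOOK`, the function names, and the sine-wave "vocoder" are all made-up stand-ins, and a real system would use trained neural networks at every step. It only illustrates what it means to turn audio features into a sequence of discrete units and then turn those units back into sound.

```python
import math

# Hypothetical 1-D "codebook": each entry stands in for one acoustic unit.
# Real systems learn such units from large amounts of speech data.
CODEBOOK = [0.0, 0.5, 1.0]

def speech_to_units(frames):
    """Map each feature frame to the index of its nearest codebook entry."""
    return [
        min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - f))
        for f in frames
    ]

def collapse_repeats(units):
    """Drop consecutive duplicate units so each one appears once per run."""
    out = []
    for u in units:
        if not out or out[-1] != u:
            out.append(u)
    return out

def units_to_waveform(units, samples_per_unit=4, sample_rate=8000):
    """Toy stand-in for a vocoder: one short sine burst per unit."""
    wave = []
    for u in units:
        freq = 200.0 * (u + 1)  # arbitrary pitch chosen per unit
        for n in range(samples_per_unit):
            wave.append(math.sin(2 * math.pi * freq * n / sample_rate))
    return wave

# Made-up feature values standing in for frames of input speech.
frames = [0.1, 0.4, 0.9, 1.2]
units = collapse_repeats(speech_to_units(frames))
waveform = units_to_waveform(units)
print(units)  # [0, 1, 2]
```

The point of the intermediate units is that no written text is needed anywhere in the chain, which is what makes the approach a fit for a primarily spoken language such as Hokkien.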
So, is it perfect? Well, Meta says the Hokkien translator is still a work in progress, as the AI can only translate one sentence at a time. In other words, it will still take a while before you can use the Universal Speech Translator to impress anyone who speaks the dialect, let alone conduct actual back-and-forth translation. But on the flip side, the company is releasing the tool as open source so other researchers can build upon it.
Hokkien is the only unwritten language that the Universal Speech Translator is known to support so far, but expect more to be added as development progresses. Meta says that the project aims to “eventually allow real-time speech-to-speech translation across all extant languages, even primarily spoken ones.” Aside from breaking down language barriers to ease communication for everyone, the company also made it clear that the tool will play a part in its ambitious Metaverse concept in the long run.