Advances in AI-enabled language translation hold special promise for the developing world
The Association for Computing Machinery presents the ACM A.M. Turing Award annually. Often referred to as the “Nobel Prize of Computing,” the A.M. Turing Award is accompanied by a US$1 million prize funded by Google.
This past year, Geoffrey Hinton, Yoshua Bengio and Yann LeCun received the award for their contributions to machine learning using deep neural networks.
Deep learning is one of the most transformative technologies of artificial intelligence (AI) research, and has resulted in major breakthroughs in areas including computer vision, speech recognition, language processing, and robotics.
One of the most interesting advances made possible through deep learning technologies is machine translation, or the ability of computers to translate between languages. Using a new process called neural machine translation, AI language algorithms have resulted in far more precise language translations than were previously thought possible. Unlike earlier approaches to AI translation (such as statistical machine translation, which translated sentence fragments), neural machine translation translates entire sentences.
These very recent advances in machine translation and speech recognition, combined with the proliferation of smartphones around the world, mean that people can bridge language gaps simply by carrying their phones and using one of the many new language translation apps now available.
The potential of these new capabilities is far greater than helping a tourist find their way to a museum or restaurant during a holiday.
In the developing world, language barriers can significantly impede education, healthcare, and economic development—and even contribute to inter-communal violence. The challenge of overcoming language barriers comes into fuller view when you realize that there are some 2,000 languages spoken in Africa alone.
For example, in a 2006 study, researcher Michael Levin found that in a large pediatric hospital in Cape Town, South Africa, only 6% of medical interviews with the parents of patients were conducted in their first language. In the study, parents cited language and cultural barriers as the major impediments to their effective participation in the healthcare rendered to their children.
Another study focusing on South Africa underscored how access to online education could enhance people’s lives, if only language were not a barrier. Authors Jade Abbott and Laura Martinus pointed out that, while English accounts for 53.5% of Internet content, the remaining 10 official languages of South Africa account for just 0.1%. And, in the most urgent example, the nonprofit organization Translators without Borders notes that “basic phrases can save lives in a humanitarian emergency, yet often communication fails because humanitarian aid workers and the people affected do not speak the same language.”
Great strides have already been made in deploying these new AI technologies in Africa. Google Neural Machine Translation, which offers translation between English and 103 languages around the world, now supports translation for 13 African languages, including Igbo, Swahili, and Zulu, three of the most widely spoken on the continent. It is hard to quantify the benefits these new translation capabilities have already fostered.
At the same time, significant challenges remain. For current neural machine translation approaches to work well, the translation program must first have access to a considerable volume of text in each of the languages (or in the language pair) being translated. This may not be a problem when translating between English and Chinese, where countless volumes have already been translated, but it is a problem when translating between, for example, English and Sepedi, another of the official languages of South Africa.
Languages that do not have considerable volumes of text available for translation are often referred to as “low-resource” languages. A language can be “low-resource” for a variety of reasons: it may be primarily oral rather than written, it may lack standardized spelling, or it may vary too much across dialects.
So a new goal for the AI community is finding ways to develop neural machine translation for these low-resource languages. Promising approaches are being explored, and the academic community and leading tech companies are actively involved in the effort. For example, just last month, Facebook announced research grants for computer scientists who can develop robust translation algorithms for low-resource languages.
Because of the leapfrog advances that we have seen in this area of AI in just the last 15 years, the worldwide computing community is excited to see what innovations are on the horizon and how these technologies might continue to improve the human condition.