What are the biggest challenges facing AI translation technology?

Published on July 22nd, 2020

Artificial intelligence has progressed a lot in recent years thanks to developments in machine learning, neural networks and data science. It now seems every form of software imaginable has AI alternatives that promise to automate human thinking and decision making – everything from autocorrect to music composition.

Translation technology is no different, and it is one of the areas driving some of the strongest innovation in AI development. Tech giants like Google and Microsoft are investing billions in their respective translation systems and leading the race in a lucrative market worth more than £40 billion.

However, some persistent challenges are holding back progress.

Algorithmic bias is difficult to overcome

Algorithmic bias is one of the most talked-about problems in AI technology, and it is also proving to be one of the most difficult to overcome. The stark reality is that algorithms inherit bias from the humans who build and train them, and from the social environment we live in.

Imagine an algorithm designed to predict someone's job using only an image of their face. We know that jobs are filled disproportionately by people of particular genders, races, ages and other groups, and the algorithm will adopt the same imbalance from the data it is trained on.

If the majority of programmers are white males, then the algorithm is going to develop a bias that reflects this data.
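To make the point concrete, here is a minimal sketch of how a model can inherit a skew from its training data. Everything in it is an assumption for illustration: the group names, the counts and the simple "most frequent label" approach are made up, and real systems are far more complex.

```python
from collections import Counter

# Hypothetical, made-up training data: (group, job) pairs with a deliberate skew.
TRAINING_DATA = (
    [("group_a", "programmer")] * 80
    + [("group_a", "nurse")] * 20
    + [("group_b", "programmer")] * 20
    + [("group_b", "nurse")] * 80
)


def train(examples):
    """Count how often each job appears for each group."""
    counts = {}
    for group, job in examples:
        counts.setdefault(group, Counter())[job] += 1
    return counts


def predict(counts, group):
    """Return the job seen most often for this group during training."""
    return counts[group].most_common(1)[0][0]


model = train(TRAINING_DATA)
print(predict(model, "group_a"))  # the majority job for group_a in the data
print(predict(model, "group_b"))  # the majority job for group_b in the data
```

Nothing in the code mentions bias, yet the predictions simply repeat whatever imbalance the data contains.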

With AI translation specifically, one of the most prevalent forms of bias relates to gender, for example when a system assumes that nurses are female and doctors are male. The issue is made more complex by the fact that languages do not mark gender in the same way English does: some, such as Turkish, use a single gender-neutral pronoun where English has to choose between "he" and "she". Google has tried to solve this by providing translations using both genders where relevant.
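The idea of offering both genders can be sketched in a few lines. This is a toy illustration only, not how Google implements it: the function name and the pronoun set are hypothetical, and it assumes the rest of the sentence has already been translated into English.

```python
# Hypothetical set of source-language pronouns that carry no gender.
# "o" is the Turkish third-person singular pronoun (he / she / it).
GENDER_NEUTRAL_PRONOUNS = {"o"}


def translate_with_both_genders(pronoun, english_predicate):
    """Return one English rendering per gender when the source pronoun is neutral.

    `english_predicate` is assumed to be the already-translated rest of the
    sentence, e.g. "is a doctor".
    """
    if pronoun.lower() in GENDER_NEUTRAL_PRONOUNS:
        return [f"She {english_predicate}", f"He {english_predicate}"]
    raise ValueError(f"unsupported pronoun in this sketch: {pronoun!r}")


# Turkish "O bir doktor" has no gender, so both English options are offered.
for option in translate_with_both_genders("O", "is a doctor"):
    print(option)
```

Rather than guessing a gender from the training data, the sketch hands the choice back to the reader.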

Machine translation is susceptible to various other forms of bias, and they are not always moral issues. Algorithms often develop a bias towards length that results in translations being shorter than they should be, and machine output can even cause bias in the human translators who quality check it, because they are influenced by the material in front of them.
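One simple way to look for length bias is to compare word counts between the source and its translation. The sketch below assumes a made-up expected ratio and tolerance; real ratios vary by language pair, and both function names are hypothetical.

```python
def length_ratio(source, translation):
    """Ratio of translation word count to source word count."""
    source_words = source.split()
    if not source_words:
        raise ValueError("source text is empty")
    return len(translation.split()) / len(source_words)


def flag_short_translations(pairs, expected_ratio=1.0, tolerance=0.3):
    """Return the (source, translation) pairs whose ratio is suspiciously low.

    expected_ratio and tolerance are illustrative assumptions, not
    established values for any language pair.
    """
    threshold = expected_ratio - tolerance
    return [(s, t) for s, t in pairs if length_ratio(s, t) < threshold]


# Made-up examples: the second translation drops half of the sentence.
source = "The committee approved the proposal after a long debate"
pairs = [
    (source, "Le comité a approuvé la proposition après un long débat"),
    (source, "Le comité a approuvé"),
]

for src, translation in flag_short_translations(pairs):
    print("Possibly truncated:", translation)
```

A check like this only flags candidates for a human to review. A short translation is not necessarily a wrong one.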

The big question is: how can a technology developed by humans, who are so susceptible to bias, ever be free from bias?

Word accuracy is improving… but not much else

Data science has progressed enormously in recent years, but most of this progress is the result of faster and more powerful computing technology that can crunch more numbers in a shorter timeframe. All of the advances in AI revolve around doing more, faster. This enables algorithms to compare more datasets, which results in more patterns being spotted.

This has resulted in greater word accuracy in AI translation, and there is good reason to be optimistic that the technology will eventually be accurate enough to rely on for professional use.

Word accuracy doesn’t even begin to solve AI translation’s language problems, though.

Humans do not simply communicate by using isolated words. Each word is paired with other words that affect its meaning, and together they build sentences, paragraphs and entire works of speech or writing. We provide context, imply meaning, use tone and articulation for emphasis, draw comparisons, employ metaphors and add colour with satire, irony and a wealth of linguistic characteristics that cannot be defined by rules.

Try explaining the concept of irony and how it is similar to, but also different from, sarcasm and satire. Now try to turn that explanation into a dataset that algorithms can apply to human language.

Forget translation for a moment: AI is not even capable of detecting these characteristics in language, let alone translating them (which can be challenging enough for expert human translators). If AI algorithms ever become smart enough to accurately transcribe human speech into text (coping with languages, accents, dialects, speech impediments and all the other characteristics the human brain naturally processes), we might be able to address the bigger challenges of AI translation.
