Is Machine Translation Good Enough for Subtitling?

Published on April 17th, 2023

As AI technology improves, we have to keep revisiting questions such as: is machine translation good enough for subtitling? With GPT-4 impressing the world with its capabilities, you might be wondering how long it will be before machines replace human translators for tasks like subtitling.

So let’s explore the current state of AI-powered machine translation and how close it is to matching professional language experts.

How good is machine translation for subtitles?

Many different studies have tried to answer this question, but the results can vary. If we look at one specific use case – for example, YouTube auto subtitles and captions – the average accuracy ranges between 60% and 70%.

This means we should expect roughly a third of the subtitled output to be incorrect.
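
The figures above don’t say how accuracy was measured. One common way to score an automatic transcript is word error rate (WER), which counts the word-level edits needed to turn the machine output into a human-checked reference. The snippet below is a minimal sketch of that idea, not the method used by YouTube or any particular study; the function name and the sample sentences are our own illustrations.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is assumed to be a human-checked transcript and
    `hypothesis` the machine output (both hypothetical inputs).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one wrong word out of three.
wer = word_error_rate("we need subtitles", "we knead subtitles")
print(f"WER: {wer:.2f}, word accuracy: {1 - wer:.2f}")
```

Scored this way, an accuracy of 60% to 70% corresponds to a word error rate of roughly 0.3 to 0.4, although different studies may count errors differently.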

That doesn’t sound very reliable, but we have to remember how much work the algorithms are doing here. They are not simply translating text from one language to another. First, they have to recognise the speech in the original content and then translate it into the target language.

The accuracy of machine-translated subtitles will vary, depending on several factors, including:

  • The source language of the footage
  • The target language of the subtitles
  • The linguistic distance between both languages
  • Audio quality in the footage
  • Dialogue speed and clarity
  • The number of speakers
  • Accents, interruptions, intonation, background noise and other interference

Given the challenges AI translation algorithms face with subtitling, a 60%-70% success rate is still impressive. For the casual viewer, this may be enough to provide a passable viewing experience too.

Keep in mind that the average accuracy rate will be lower for language pairs with a greater linguistic distance. For example, English to Spanish should yield better results than English to Chinese or English to Greek.

Translating subtitles: Understanding the scale of the challenge

If you need 100% accuracy in your translated subtitles, you can’t rely on AI translation alone. In fact, by definition, AI translation can’t generate subtitles without the help of other technologies. Platforms like YouTube combine several technologies to produce auto-generated subtitles – namely speech recognition, transcription and machine translation.

Speech recognition is a challenge in itself. We have access to plenty of AI-powered speech recognition tools these days, such as Google’s Live Transcribe, so we know how reliable (or unreliable) this technology can be. Where it falls short, this isn’t because the technology itself is flawed; it is because speech recognition is an incredibly complex task with several layers of difficulty.

Transcription itself is actually the easiest part of this process for machines, but it relies entirely on the accuracy of speech recognition.

By the time you throw translation into the mix, it is not simply a case of asking algorithms to accurately recognise and transcribe speech. You are asking the technology to convert this text accurately into different languages with different lexicons, grammar rules, cultural backgrounds and many other complexities.
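
To see why chaining these stages is so demanding, it helps to look at how errors can compound. The snippet below is a rough sketch under a strong simplifying assumption: each stage makes mistakes independently, so the per-stage accuracies multiply. The stage names and percentages are hypothetical illustrations, not measured figures for any real platform.

```python
from math import prod


def end_to_end_accuracy(stage_accuracies: list[float]) -> float:
    """Estimate pipeline accuracy as the product of per-stage accuracies.

    Assumes errors in each stage are independent, which is a
    simplification; real systems may do better or worse than this.
    """
    for accuracy in stage_accuracies:
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError("each accuracy must be between 0 and 1")
    return prod(stage_accuracies)


# Hypothetical per-stage figures, chosen only for illustration.
stages = {
    "speech recognition and transcription": 0.85,
    "machine translation": 0.80,
}

overall = end_to_end_accuracy(list(stages.values()))
for name, accuracy in stages.items():
    print(f"{name}: {accuracy:.0%}")
print(f"estimated end-to-end accuracy: {overall:.0%}")
```

The point of the sketch is that two stages which each look respectable on their own can still produce a noticeably weaker result once they are chained together.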

Will AI ever crack translation and subtitling?

The truth is, we don’t really know if AI will ever match human linguistic capabilities. Humans are remarkably competent at communicating and understanding complex meanings. The act of formulating and understanding speech is a cognitive marvel in itself, and we have enriched our languages with incredible depth over thousands of years.

We have sarcasm, irony, satire and a wealth of subtext that we can change by simply shifting tone and intonation.

We have metaphors, colloquialisms and polysemy – even contronyms, which are words that can mean the opposite of themselves.

With implied meaning, context and so much packed into everything we say, algorithms will struggle to match human comprehension. Forget about accurate translation into other languages – at least, for now.

Despite all this, AI-powered machine translation has made remarkable progress in recent decades and continues to do so. Some of this progress has come from algorithmic breakthroughs, but most of it has come from advances in computational power.

Essentially, today’s algorithms are capable of running vast numbers of calculations that simply weren’t possible in the same time frame ten years ago.

The boost in speed and power is helping machine translation tools automate more of the translation work for language experts. The share of work that can be automated will only increase with time, but we don’t truly know how far machine translation can go – or how quickly.

What we do know is that the technology is here to stay, and it will play a much bigger role in tasks like translating subtitles, multimedia content and voiceovers. However, human translators will be doing the bulk of this complex work for quite some time still, including post-editing machine-translated output.

Need a better system for translating subtitles?

If you need translated subtitles, or you are looking for a faster, more effective way to produce them using machine translation, our language technology experts can help. Please complete the form on our contact page with your subtitling translation request and we will get back to you.
