AI dubbing tools promise to revolutionise international media. In theory, AI algorithms can generate dubbing tracks for a fraction of the price of traditional production methods. Now, we are seeing a wave of AI dubbing tools hitting the market, and high-profile names like Netflix are striking deals with dubbing providers to auto-dub feature films.
Anyone in the language services industry understands the hype around AI tools, and how good (or bad) they can really be on projects like these. In this article, we discuss AI dubbing, how good these tools are and whether you should consider using them for your own projects.
How good AI dubbing is depends on the goals of a given project and the technology being used. In the age of the deepfake, AI dubbing tools can now not only produce dialogue in other languages but also replicate the original actor’s voice and deepfake their facial movements to match the language they are supposedly speaking.
In theory, the days of watching Hollywood stars speak in Japanese with mismatched voices and lip movements are over.
Now, we can use a variety of tools to translate the original dialogue, match the voice and performance of the original speaker, and simulate lip movements so it looks like they are actually speaking that language.
The idea is that this technology can make productions accessible to audiences around the world.
AI dubbing services are a combination of multiple different services:
- Transcription: If a text version of the dialogue isn’t available, it needs transcribing into the original language.
- Translation: This translates the original dialogue into each target language.
- Voice simulation: AI algorithms analyse the voice of the original speaker in order to match it in other languages.
- Dubbing: AI algorithms generate the dubbing track in the target language, matching the original speaker’s voice.
- Synching: AI algorithms generate and add mouth movements to the speaker on-screen to match the dialogue of each target language.
- Quality assessment: Each language version of the footage is reviewed, analysing the quality of translation, dubbing track, voice generation and synching.
- Review process: Any issues flagged in the QA stage are reviewed and fixed before each language version is signed off.
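The stages above can be sketched as a simple pipeline. Everything here is illustrative: the function names, data shapes and stub implementations are assumptions for the sake of the sketch, not any real dubbing provider's API.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the AI dubbing pipeline described above.
# Each function stands in for a whole service (ASR, MT, voice cloning,
# lip-synching, QA); real systems would call external models here.

@dataclass
class DubbingResult:
    language: str
    transcript: str
    translation: str
    issues: list = field(default_factory=list)

def transcribe(audio: str) -> str:
    # Transcription: produce a text version of the original dialogue.
    return f"transcript of {audio}"

def translate(text: str, target: str) -> str:
    # Translation: render the dialogue in the target language
    # (ideally reviewed by a human translator).
    return f"[{target}] {text}"

def synthesise_voice(text: str, voice_profile: str) -> str:
    # Voice simulation + dubbing: generate a dubbing track that
    # matches the original speaker's voice.
    return f"audio({voice_profile}: {text})"

def lip_sync(video: str, dub_track: str) -> str:
    # Synching: regenerate on-screen mouth movements to match the track.
    return f"{video} + synced({dub_track})"

def quality_check(translation: str, dub_track: str) -> list:
    # Quality assessment: flag issues for the review stage.
    issues = []
    if not translation:
        issues.append("empty translation")
    if not dub_track:
        issues.append("missing dubbing track")
    return issues

def dub(video: str, audio: str, voice_profile: str, target: str) -> DubbingResult:
    transcript = transcribe(audio)
    translation = translate(transcript, target)
    dub_track = synthesise_voice(translation, voice_profile)
    lip_sync(video, dub_track)
    issues = quality_check(translation, dub_track)
    return DubbingResult(target, transcript, translation, issues)

result = dub("film.mp4", "film.wav", "lead_actor", "ja")
print(result.language, result.issues)
```

Note that each stage consumes the previous stage's output, which is exactly why an error early on (say, in translation) ripples through every step that follows.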
As you can see, a lot of work goes into producing quality AI dubbing tracks. This isn’t a case of pushing a button and waiting for one algorithm to do all the magic.
Many AI dubbing services rely entirely on human translators to guarantee the quality of the text feeding into the AI dubbing process itself. This is because any issues at the translation stage have serious knock-on effects on every later stage of production.
That being said, not all projects require the complete package of dubbing, voice matching and lip-synching.
For example, you might be producing a documentary and simply need narration auto-dubbed for each language. With no on-screen speakers, you don’t have to worry about lip-synching and you may not want to match the voice of the original narrator, either. So it’s important to understand the goals and needs of each project.
Each element of AI dubbing has its own weaknesses, so these need to be considered in isolation. The extent of these weaknesses also varies with the methods a dubbing service uses. For example, when a company uses human translators to translate the dialogue instead of relying on AI translation, that particular weakness is removed.
AI algorithms can simulate the voices of people or create stylised voices from scratch. So, in theory, someone can have Keanu Reeves narrate their next documentary or generate the voice of a relaxing, female narrator in her mid-30s.
The catch is that every nuance, tone, intonation and variation has to be generated by the algorithm for it to exist in the AI-generated dialogue. That is not fully possible yet, and it requires a lot of training data.
By extension, the quality of AI dubbing tracks depends on the amount of data used to create them. If the input data doesn’t include all of the nuance, depth, emotional range, style and realism you need, it is going to fall flat.
If these characteristics are all equally important in your project, AI dubbing may struggle to deliver.
Unnatural lip movements are the first and most notable weakness of AI dubbing, at least when AI lip-synching is used. They can be easy to spot and distracting for the viewer.
It all comes down to how much quality creators need from dubbing services. Companies that need premium quality will have to take a more traditional route, using professional translators, as the output of AI services will not be good enough on its own, not yet, anyway.
The burning question is how much further the technology can improve, and how quickly.
In the meantime, if you need dubbing services, including adding some AI tech into the mix, our team can help. Get in touch with us by filling out the form on our contact page and we’ll get right back to you.