
Machine translation: NMT translates literature with 25% flawless rate

Published on February 9th, 2018

New research into neural machine translation (NMT) claims the technology can translate literature with a 25% flawless rate. That might not sound like a lot, but the complexity of translating creative literature makes this an impressive result for the technology.

The most famous uses of NMT are platforms like Google Translate and Skype Translator, even if the technology behind them remains a mystery to most users. Neural machine translation uses deep learning to apply context to words and phrases, thus removing one of the biggest barriers between machine translation and its application in the real world.

NMT hits 25% flawless rate with literature

Natural speech is complex, and the biggest challenge for machine translation is understanding the meaning behind our choice of words, let alone capturing that same meaning in another language. For example, the word “set” can be a noun, verb or adjective, with various meanings depending on where and how you use it in a sentence.

Neural machine translation aims to understand the meaning behind words and phrases by predicting word order and applying context via deep learning. This enables the technology to decide when “set” is being used as a noun and when it’s being used as a verb.

This is a crude example but it helps explain what’s going on at the algorithm level.
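
To make the “set” example concrete, here is a minimal sketch of using context to guess a word's role. It is a hand-written lookup on the preceding word, not a neural model and not how any real NMT system works. The function name and the cue-word lists are assumptions invented for illustration. An NMT model learns this kind of contextual signal from data rather than from fixed rules.

```python
# Toy illustration only: a hand-written context rule, not a neural model.
# The cue-word sets below are assumptions made up for this sketch.

NOUN_CUES = {"a", "the", "this", "that", "chess", "tea"}  # words that often precede a noun
VERB_CUES = {"to", "will", "please", "they", "we", "i"}   # words that often precede a verb


def guess_role_of_set(sentence: str) -> str:
    """Guess whether 'set' is a noun or a verb from the word just before it."""
    words = sentence.lower().replace(".", "").split()
    if "set" not in words:
        return "absent"
    index = words.index("set")
    # The only "context" used here is the single preceding word.
    previous = words[index - 1] if index > 0 else ""
    if previous in NOUN_CUES:
        return "noun"
    if previous in VERB_CUES:
        return "verb"
    return "unknown"


if __name__ == "__main__":
    for example in ["She bought a chess set.", "Please set the table.", "Set in stone"]:
        print(example, "->", guess_role_of_set(example))
```

The last example returns "unknown" because there is no preceding word to look at, which shows how quickly fixed rules run out and why learned context matters.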

A far more complex use case is using NMT to translate works of literature, and this is precisely what Dr. Antonio Toral, Assistant Professor at the University of Groningen, and Prof. Andy Way, Professor in Computing and Deputy Director of the EU’s ADAPT Centre for Digital Content Technology, did in their research project.

They used neural machine translation to translate twelve well-known novels:

  1. Auster’s Sunset Park (2010)
  2. Collins’ Hunger Games #3 (2010)
  3. Golding’s Lord of the Flies (1954)
  4. Hemingway’s The Old Man and the Sea (1952)
  5. Highsmith’s Ripley Under Water (1991)
  6. Hosseini’s A Thousand Splendid Suns (2007)
  7. Joyce’s Ulysses (1922)
  8. Kerouac’s On the Road (1957)
  9. Orwell’s 1984 (1949)
  10. Rowling’s Harry Potter #7 (2007)
  11. Salinger’s The Catcher in the Rye (1951)
  12. Tolkien’s The Lord of the Rings #3 (1955)

The duo trained an NMT model to translate the novels from English into Catalan and compared its output against that of a phrase-based statistical machine translation (PBSMT) system on the same works of literature. They found that the NMT model was 11% more successful at translating the novels accurately, with 17%-34% of native speakers comparing the results to the quality expected from human translators.

What does this really tell us about machine translation?

The findings published by Dr. Antonio Toral and Prof. Andy Way illustrate how far machine translation has come in recent years, largely thanks to machine learning and related technologies. Half a decade ago, it would have been unimaginable that an algorithm could be anything better than useless at translating works of literature and other creative pieces.

Of course, a 25% flawless rate is nowhere near good enough for use in real-world scenarios, but the technology doesn’t need to translate Harry Potter. If it can progress enough to translate conversations with any accuracy, or translate even 50% of a document for human translators, the technology will make a huge impact on the way we deal with language barriers.

The reason for a test of this nature is to put NMT in one of the most challenging scenarios possible and see how well it can do. The results are pretty impressive, too, which gives us even more reason to be optimistic that machine translation can one day play a genuine role in the way people and businesses communicate with each other.
