March is machine translation month

Well, five years ago I returned to the issue of machine translation, having first addressed it five years before that, when a firm offered on-the-fly translation for my website. So it is almost time for a reprise.[1]

In 2005 I tested the service I was offered and found it almost, but not quite, usable. The paragraph I set was chosen (almost at random) from the Hypertext Bible: Amos commentary; it concerned city gates.

In 2010 I tested Google Translate and found that its results for rendering my paragraph into French were not as good as the effort from five years previously, though mysteriously, after rendering the French again into German, a back translation into English was actually better than the 2005 effort.

So how has Google improved its translation service over the last half decade?

Here are the results of translating into French side by side:

2005 (localtranslation.com):
Dans la ville antique du proche Orient les portes n’étaient ni simplement des entrées, ni seulement utilisé pour la protection militaire. Comme point potentiellement faible dans les défenses, les portes d’Israelite ont muré des villes ont typiquement eu trois chambres donner quatre ensembles d'”portes” et d’espaces défendus entre. Les portes de Hazor (laissé – projetez ci-dessus) et de Gezer dès l’exposition de Solomon cette construction triple.

2010 (Google):
Dans l’ancienne ville près des portes-Orient ne sont ni seulement les entrées, ni seulement utilisés pour la protection militaire. Comme un point de faiblesse potentielle dans la défense, aux portes des villes d’Israël paroi généralement avait trois chambres donnant quatre séries de “portes” et défendu les espaces entre les deux. Les portes de Hatsor (à gauche – plan ci-dessus) et Gezer du temps de Salomon montrer cette construction triple.

2015 (Google):
Dans les portes de la ville antique du Proche-Orient ne étaient ni Entrées de simple, ni seulement utilisé pour la protection militaire. Comme un point potentiellement faible dans les défenses, les portes de villes fortifiées israélites avaient généralement trois chambres donnant quatre séries de «portes» et défendus espaces entre. Les portes de Hazor (à gauche ci-dessus ) – Plan et Gezer du temps de Salomon montrent cette triple construction.

Perversely, or more likely because of the complex interactions between translation decisions that must (I assume) be made by machines as they are by human translators, Google has got both better and worse in the last five years. The opening phrase is now almost as good (though, confusingly, not in its syntax) as the ten-year-old localtranslation.com effort. Some of the changes are bizarre: in 2010 Google cleverly gave Hazor its French spelling, Hatsor, but in 2015 it has returned to the more usual English rendering. On the whole, Google in 2015 is still less good at syntax than the 2005 localtranslation.com effort.

In 2010 I remarked that, oddly, though Google’s French was much less good than the 2005 localtranslation.com effort, its back translation from German was better. Here then are the 2015 results:

2005 (localtranslation.com):
In for the military protection antiken of the Near East the doors were neither simple from the entrances nor only use city. When potentially weak point in the defenses the doors have typically three chambers immured of Israelite of cities had to give four whole of “doors” and from areas to which were defended, between. The doors of Hazor (calmly – above you plan) and of Gezer of the exhibition of Solomon to this three-way construction.

2010 (Google):
In the old city near the East doors are not only inputs or only military for protection. As a point of potential weakness in the defense at the gates of the walled cities of Israel usually had three rooms are four categories of “doors” and defended the gaps. The gates of Hazor (left – show on plan) and Gezer time of Solomon, that tripled the building.

2015 (Google):
In the doors of the old city of the Middle East neither simple one- or just for the military protection were used . As a potential weakness in the defense of the doors of Jewish fortress towns usually had three rooms with four series of “gates” and defended spaces between . The gates of Hazor (top left) – Plan and Gezer Solomon’s show time for the triple structure .

So, five more years on, Google’s back translation is much improved, but the translation itself is arguably not improved at all! How can we explain this? I think by noticing the strengths and weaknesses of Google’s approach. It is weak on syntax, often offering a pedestrian word-for-word translation; however, it is good at spotting contextual cues. (Witness the French rendering Hatsor in 2010, mysteriously dropped in 2015: have more Francophone websites taken to using the more literal and less phonetic rendering in this period?) This combination is perfect for providing renderings in successive languages that will not produce hilarious “mistakes” when back translated into the original language. That is good news for the Google programmers: malicious reviewers (like me) will get little fodder from Google. But it is bad news for users, because what we actually need is not brilliant results from an artificial translate/back translate exercise (no matter how many or few intermediate languages we use), but rather a decent, understandable translation. That goal is at least as far away in 2015 as it was in 2010 or 2005 :(
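To make the point concrete, here is a minimal Python sketch of a translate/back translate check. It is a toy under loud assumptions: the four-word lexicon, the `word_for_word` "translator" and the `round_trip_score` helper are all invented for illustration, and nothing here reflects how Google's system actually works. It only shows that a pedestrian word-for-word rendering can come back perfectly while the French in the middle is poor.

```python
from difflib import SequenceMatcher

# Toy word-for-word lexicon (hypothetical, for illustration only).
EN_TO_FR = {"the": "le", "old": "ancien", "city": "ville", "gates": "portes"}
FR_TO_EN = {fr: en for en, fr in EN_TO_FR.items()}


def word_for_word(text, lexicon):
    """Translate each word independently, keeping the source word order."""
    return " ".join(lexicon.get(word, word) for word in text.lower().split())


def round_trip_score(source, forward, back):
    """Return the forward translation, the back translation, and the
    word-level similarity (0.0 to 1.0) between source and back translation."""
    translated = forward(source)
    restored = back(translated)
    score = SequenceMatcher(
        None, source.lower().split(), restored.lower().split()
    ).ratio()
    return translated, restored, score


source = "the old city gates"
translated, restored, score = round_trip_score(
    source,
    lambda text: word_for_word(text, EN_TO_FR),
    lambda text: word_for_word(text, FR_TO_EN),
)
print(translated)  # le ancien ville portes
print(restored)    # the old city gates
print(score)       # 1.0
```

The round trip scores a perfect 1.0, yet "le ancien ville portes" is not something a French reader would thank you for. The score measures only whether the two directions undo each other, not whether the intermediate text is any good.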

  1. Being impatient, and forgetful, I will not wait till the 30th or 31st, but will jump in today :)

2 comments on “March is machine translation month”

  1. Mike Crudge

    I find this a fascinating (and patient) bit of research Tim!

    1. tim

      Not too much research, but interesting suggestions. If I am right I think they have taken a false step by using back translation as a shortcut for testing their system.