Making Machine Translation Work for Your Software Business

Automatic translation tools have been a mainstay of the business world for some time now. Whether you want to get an idea of what insights an article in Chinese has to offer, or decipher an e-mail sent between Portuguese colleagues – just a few clicks here and there, and you’ve hopped the language barrier.

However, it’s a completely different situation when a software company wants to use machine translation (MT) to produce professional-grade content, such as software documentation or UI strings. This is an area in which popular free MT services like Google Translate typically fall short. So how do you make MT work for your business to achieve maximum productivity? Milengo has summarized the most important aspects of a successful approach to MT below.

Selecting a technology provider

The market

Today’s MT systems for business use are offered as subscription services in the cloud. The major tech companies – Amazon Web Services, Google, Microsoft, and IBM – are leaders in this field. But Cologne-based company DeepL has also been making major waves in the localization industry, not to mention regional service providers like Yandex for Russian and Baidu for Chinese.

For corporate customers, these providers offer an optional paid service that safeguards business data, as well as a REST API for connecting the MT system to the customer’s internal software.

The quality

Generally speaking, you can’t go wrong with the service providers listed above – the differences in quality are minimal, and prices work out roughly the same. As a market leader in the areas of deep learning and artificial intelligence, Google supplies the most reliable results across all languages, while DeepL relies on Linguee’s high-quality translation corpora and produces particularly fluent texts thanks to the specific configuration of its convolutional neural network.

You should also bear in mind that the quality won’t just differ by provider, but can also vary depending on the language combination, as the latest technology – neural machine translation (NMT) – isn’t always used. Outdated statistical machine translation is still commonplace for some languages.

 


The proof is in the pudding

Whatever you do, don’t just blindly trust an MT service provider’s claim that it produces “near-human translation quality” – this won’t give you any information on whether the machine will also produce good results for your company’s software documentation. That’s why you should test the output yourself – and by that we don’t mean just throwing a few sentences into Google Translate or DeepL. You’re best off taking a sample text of between 1,000 and 2,000 words that’s representative of the text type and the subject area you want to have translated. Then, you should have experts like technical writers or professional translators scrutinize the output quality, as today’s machine translations tend to appear readable and trustworthy – but on closer inspection, grave content errors or incorrect references can quickly become apparent.
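As a minimal sketch of how such a test sample could be put together, the snippet below draws random paragraphs from your existing content until the sample lands in the 1,000 to 2,000 word range. The function name `pick_sample` and its parameters are our own illustrative choices, and splitting on whitespace is only a rough word count.

```python
import random


def pick_sample(paragraphs, min_words=1000, max_words=2000, seed=0):
    """Draw random paragraphs until the sample holds min_words..max_words.

    Hypothetical helper: names and defaults are assumptions for illustration.
    Returns the chosen paragraphs and their total word count.
    """
    rng = random.Random(seed)      # fixed seed keeps the sample reproducible
    shuffled = list(paragraphs)
    rng.shuffle(shuffled)

    sample, total = [], 0
    for paragraph in shuffled:
        words = len(paragraph.split())   # rough count: whitespace-separated tokens
        if total + words > max_words:
            continue                     # skip paragraphs that would overshoot
        sample.append(paragraph)
        total += words
        if total >= min_words:
            break
    return sample, total
```

If your content pool is too small, the returned total will stay below `min_words`, which is itself a useful signal that the sample is not yet representative.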

Processing translation files

Website content or UI strings tend not to be a nice block of text in a Word file that can be conveniently transferred into a machine translation tool by copy and paste.

When it comes to more complex document types like XML and PowerPoint, the formatting can have a significant impact on the MT output’s quality.

The following problems typically occur in the target file after MT pre-translation:

  • Incorrect line breaks mean that the translation actually restarts in the middle of the sentence
  • Additional spaces are added before special characters (. , ? !)
  • The document’s formatting is scrambled
  • The markup in the document’s source code is corrupted
  • Important terms – such as your company’s name – are translated incorrectly

Inconsistencies of this nature result in higher translation costs, as more corrections have to be carried out during the post-editing step. To prevent this, well-engineered procedures are crucial when it comes to pre- and post-processing translation files – which in turn requires a sound knowledge of file formats and how they function.
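To make this more concrete, here is a rough sketch of what pre- and post-processing steps for some of the problems above might look like. All function names and the `__TERM0__` placeholder format are assumptions for illustration. Whether a given MT engine leaves such placeholders untouched is something you would need to verify with your own provider.

```python
import re


def join_broken_lines(text):
    """Pre-processing: merge hard line breaks that fall mid-sentence.

    A single newline is replaced by a space unless it follows sentence-ending
    punctuation or is part of a blank-line paragraph break.
    """
    return re.sub(r"(?<![.!?:\n])\n(?!\n)", " ", text)


def protect_terms(text, terms):
    """Pre-processing: swap do-not-translate terms for opaque placeholders."""
    mapping = {}
    for index, term in enumerate(terms):
        token = f"__TERM{index}__"   # assumed placeholder format
        if term in text:
            text = text.replace(term, token)
            mapping[token] = term
    return text, mapping


def restore_terms(text, mapping):
    """Post-processing: put the protected terms back into the MT output."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def fix_punctuation_spacing(text):
    """Post-processing: remove spaces inserted before . , ? !

    Note: French typography expects a space before ? and !, so this step
    should only be applied for target languages where it is appropriate.
    """
    return re.sub(r"[ \t]+([.,?!])", r"\1", text)
```

In practice, steps like these would run per segment, with the pre-processing applied before the text is sent to the MT system and the post-processing applied to what comes back.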

Making the most of customization

Out of the box, no MT provider’s underlying system is tailored to your company’s requirements. These generic systems translate everything from recipes, to poems, to your technical documentation using the same resources. That is why the precision of your corporate language is lost when your texts are reproduced using machine translation.

For that reason, an MT system has to be trained for a specific purpose in order to achieve maximum efficiency. This is where customization services come in handy – but this option really only applies to your company if it already has a sufficient volume of translation data. Ideally, you should have at least 20,000 high-quality translation memory segments at the ready.
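Before approaching a provider about customization, it can help to check how many usable segments your translation memory actually contains. The sketch below is one possible way to do that, assuming the segments have already been exported as (source, target) pairs. The filters are simple heuristics of our own, not an industry definition of "high quality", and the names `usable_segments` and `ready_for_customization` are hypothetical.

```python
MIN_SEGMENTS = 20_000  # threshold taken from the recommendation above


def usable_segments(pairs, max_ratio=3.0):
    """Filter (source, target) pairs with a few rough quality heuristics."""
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue                      # empty on one side
        if source == target:
            continue                      # probably left untranslated
        longer = max(len(source), len(target))
        shorter = min(len(source), len(target))
        if longer / shorter > max_ratio:
            continue                      # suspicious length mismatch
        if (source, target) in seen:
            continue                      # exact duplicate
        seen.add((source, target))
        kept.append((source, target))
    return kept


def ready_for_customization(pairs):
    """True if enough usable segments remain after filtering."""
    return len(usable_segments(pairs)) >= MIN_SEGMENTS
```

The `source == target` filter will also discard legitimately identical segments such as product names, so treat the resulting count as a conservative estimate.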

Post-editing in CAT tools

Despite rapid technological progress, machine-translated texts are often still prone to errors. Within a business context, this can make it difficult for end users to comprehend them, or in a worst-case scenario, users may be misled by poorly translated software documentation. That’s why, in order to guarantee the best possible quality, the MT output should be revised by an experienced editor who is not only familiar with the subject area and the source language, but is also au fait with the target language.

“Post-editing” is usually carried out in CAT tools, which now all feature their own plugins for selected MT systems. This allows pre-translation to be carried out directly in the translation tool. However, the problem with this approach is that these plugins are usually built on underdeveloped interfaces. For example, formatting and special characters in the source document are not handled correctly, which can lead to poor-quality output, especially for more complicated file formats. That’s why you should avoid these kinds of plugins where possible if you want to err on the side of caution. Instead, we recommend that you have the content pre-translated externally and integrate it in the tool yourself. This, however, necessitates an in-depth knowledge of software localization processes.
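One building block of such an external pre-translation step is masking inline markup before the text reaches the MT system and restoring it afterwards. The following is only a sketch under our own assumptions: the numbered `{0}` placeholder format and the function names are illustrative, and the simple pattern would clash with UI strings that already contain `{0}`-style variables.

```python
import re

TAG_PATTERN = re.compile(r"<[^>]+>")          # naive match for inline tags
PLACEHOLDER_PATTERN = re.compile(r"\{(\d+)\}")


def mask_tags(segment):
    """Replace each inline tag with a numbered placeholder such as {0}."""
    tags = []

    def stash(match):
        tags.append(match.group(0))
        return f"{{{len(tags) - 1}}}"

    return TAG_PATTERN.sub(stash, segment), tags


def tags_intact(translated, tags):
    """Check that every placeholder survived MT exactly once."""
    found = sorted(int(n) for n in PLACEHOLDER_PATTERN.findall(translated))
    return found == list(range(len(tags)))


def unmask_tags(translated, tags):
    """Put the original tags back; call tags_intact() first."""
    return PLACEHOLDER_PATTERN.sub(lambda m: tags[int(m.group(1))], translated)
```

For example, "Click <b>Save</b> to continue." would be sent to the MT system as "Click {0}Save{1} to continue.". Segments that fail the `tags_intact` check can then be flagged for the post-editor instead of being imported with broken markup.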

In a nutshell

Nowadays, machine translation services are conveniently available over the cloud and across the board – but you’ll only achieve the best possible quality and maximum productivity with professional pre- and post-processing of your translation files, as well as customized connections between MT systems and your workflows.