Can Artificial Intelligence solve the translation challenge in Learning?
By Stefan Haenisch, Ingo Schulz, Michael Pflanz
Providing learning content in a learner’s native language has always been a major challenge for knowledge transfer in global environments. Despite all technological advances, the process has remained highly manual – slow, cumbersome, and expensive. Once content is available in a source language, translators are hired, typically through external agencies, to translate it manually into the required languages. Then, to ensure that business-specific terminology and context were translated correctly, local experts carry out another intensive quality assurance step, which often takes longer than the translation itself because of resource bottlenecks. Multiply this by lots of content and lots of languages, add that the original source content may change while translation projects are already underway, and you soon run into unsolvable scalability and funding challenges. Companies like SAP have succeeded to the highest extent possible in providing a lot of content in local languages. Still, we cannot provide full coverage in all languages; instead, we focus on the languages customers demand most while trying to minimize the time-to-market gap. Not a perfect solution, but for the most part “good enough”.
This need for immediate availability of translated learning content, however, has grown drastically with the advent of the digital economy, mainly driven by the following two factors:
- More and more learning is consumed in digital formats as opposed to traditional in-person classroom training. In classroom training, a huge advantage from a language perspective is that the instructor acts as a ‘human bridge’ between the content and the learner. For example, teaching English course content in Germany or France with a German or French instructor can still work well in many cases. If there is no instructor between content and learner, the need for translated content increases considerably.
- With the speed of innovation moving faster than ever and software solutions moving to the cloud with quarterly updates, the original content changes so quickly that there’s hardly any time for translations. If you were to follow traditional processes, the translated version would already be outdated on the day it is released!
The good news is that machine translation has made dramatic advances, too. Most of us have used tools like Google Translate in our personal lives and are impressed by the results. From a learning perspective, it would be a dream if those engines were good enough to provide instant translations of all learning content into all required languages, at the right quality – all the challenges above would be immediately resolved! But how ready for learning is machine translation technology? What can already be achieved in today’s reality, and what do we expect to see in the future?
Machine Translation – the new hype
The topic of machine translation has been around for quite a while. It started in the 1950s with rule-based machine translation (RBT), followed by Statistical Machine Translation (SMT) in the 1990s. SMT was already quite successful and brought machine translation to everyone’s attention, but people often remember only the big failures, because SMT isn’t that good when words need to be reordered. It works well only for certain language pairs, and in many cases the resulting translation isn’t really fluent.
For a few years now, there has been a new kid on the block: Neural Machine Translation (NMT). It has become one of the successful areas of machine learning – besides autonomous driving, image recognition, and classification tasks. The new NMT approaches benefit from the availability of powerful and affordable hardware based on graphics processing units (GPUs) and from decades of human-translated content – ideal prerequisites for training a machine learning model that tries to simulate the neural network of a human brain.
As NMT delivers significantly better results than SMT, it became one of the machine learning darlings: machine translation was a keynote topic at the big developer conferences from Microsoft, Google, and even SAP (check out Jürgen Müller’s keynote from SAPPHIRE Now 2017). In addition, new gadgets like Google’s new earbuds also attracted a lot of attention and publicity: “Real-time translation in up to 40 languages”, “A human dream comes true” … and so a hype was born.
Neural machine translation – a hype with promising results
SAP has built its own SAP NMT in the context of the SAP Translation Hub. The SAP NMT currently supports translations between ten language pairs. It’s available as an alpha translation service to developers in the SAP Business API hub on the SAP Leonardo Machine Learning Foundation and can also be tested via SAP Translation Hub.
And the SAP NMT delivers pretty impressive translation results. In our first internal assessments, which used NMT on key-user and end-user tutorials for three language pairs, we were able to show that
- Human translators preferred NMT translations over SMT translations more than five times as often
- NMT translations were about twice as fast to correct during post-editing
- NMT translations required far fewer corrections than SMT overall; for example, more than 60% of English-to-Chinese translations didn’t require any human post-editing at all
Correlation between CharacTER (Translation Edit Rate on Character Level) and post-editing time
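To give a feel for what a character-level edit rate measures, here is a minimal Python sketch. It is a simplification and not the CharacTER metric itself: it only counts character insertions, deletions, and substitutions between the machine output and its post-edited version, and it leaves out the shift operations that the full metric handles. The function names and the choice to normalize by the length of the machine output are assumptions made for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_edit_rate(machine_output: str, post_edited: str) -> float:
    """Simplified character-level edit rate (hypothetical helper).

    0.0 means the post-editor changed nothing; higher values mean more edits.
    Normalizing by the machine output length is a choice made for this sketch.
    """
    if not machine_output:
        return 0.0 if not post_edited else 1.0
    return levenshtein(machine_output, post_edited) / len(machine_output)


if __name__ == "__main__":
    mt = "Roast the Turkey for two hours."
    edited = "Roast the turkey for two hours."
    print(f"edit rate: {char_edit_rate(mt, edited):.3f}")
```

A sentence that needs no post-editing scores 0.0, so a low score on this kind of measure would be expected to go along with short post-editing times.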
In summary, NMT results require much less human post-editing, the translations are more fluid than with SMT, and it’s really promising … but you shouldn’t expect the same translation quality as from human translators. You should expect that issues will remain, including some funny errors. For example, I looked at a cookbook on Amazon that had been automatically translated into German: one of the recipes translated turkey (the bird) as Turkey (the country) … In some cases NMT will produce strange translations, and parts of sentences might be missing. But if 70-80% quality fits your translation use case, for example when it supports you in learning about a certain topic, it’s already a brilliant tool today.
Real-Life check with 10,000 openSAP Learners
To put NMT to the ultimate test with external users, SAP Globalization Services teamed up with openSAP in December 2017 and used NMT to translate the English transcripts of the openSAP course Enterprise Machine Learning in a Nutshell. openSAP is SAP’s Massive Open Online Course (MOOC) platform with a global audience. For this pilot course, we provided fully machine-translated video subtitles in four languages – German, French, Spanish, and Portuguese. During the course, we collected user feedback to check whether the quality was acceptable and whether the subtitles were helpful in the learning process. We received good feedback, with over 80% of learners stating that they would like to see more machine-translated subtitles on openSAP, even if the quality is less than 100%. We also saw a large number of learners volunteering to support our future efforts to improve the quality of translations with NMT.
So today, does machine learning solve the learning globalization challenge already?
Well, regarding the dream of getting instant, perfect translations of all learning content out of the box, we’re certainly not quite there yet. However, machine translation already serves interesting use cases where ‘good enough’ rather than perfection is the goal. In addition, combined with some human post-editing, it can provide significant efficiency gains. At SAP, we’re looking closely at how we can make the most of this today. And if you consider that today, in such a hybrid scenario, more than 70% of machine-translated sentences don’t need to be touched anymore, we’re very confident that the ‘translation dream’ will come true completely not too many years from now.
I wonder when we'll get to a place where the translations are seamless. For example, you talk to me in German - I have an in-ear device that immediately translates it to English. I talk in English - you hear me in German.
I think we are headed that way for the future. I heard just yesterday that a bot had to be slowed down because it would answer questions immediately after they were entered. People found that creepy. So if bots are that quick, and with neural machine translation, it just seems like the next natural step.
And if I really want to step into the science fiction world: at some point implants could be used to do the above. 🙂
Interesting thoughts, and I'm glad the openSAP courses are now open to a global audience. It's an excellent way to get knowledge and a great way to test things out.
Thanks for your comments ... and you're right: we are still in testing mode and trying to understand the possibilities. We're still far away from something like the Babel fish ... although translations with Google's earbuds are a nice gimmick.
Thank you for this interesting article! There is one question that comes into my mind though.
There is an ongoing discussion about whether machine translation could replace human translation. As you said, 60-70 years ago the quality of machine-translated texts wasn't even comparable to what it is today. In recent years, however, there has been huge development, and the texts produced by machines are no longer just simple "word for word" translations, which were sometimes hard to understand. The machines now consider the syntax and grammatical rules of the target language as well as the context, and thus produce, I would even say, good-quality texts 😉 However, in my opinion translating is a very creative process, also in technical areas. There are still quite a few universities in Germany offering applied translation studies. What future can we imagine for people who have a degree in this area and work as translators? Should they only carry out copy-editing tasks?
That's an interesting thought that I had also. I think it will change the translator's job. But I'd love to hear what Stefan Haenisch is thinking. I don't think we are there yet. But I wonder how fast we will get to the point where a translator's job isn't needed, or whether it will simply change.
Hi Michelle, hi Inna,
You are absolutely right that translation is a very creative process, and I can't foresee companies like SAP basing translations of, for example, their marketing collateral solely on machine translation ...
Nevertheless, we definitely see opportunities for neural machine translation to better support translators with an improved initial translation of the source text, to speed up the process, and let the translator focus more on passages that really need their creativity.
In addition, there will be opportunities for collateral that is not translated today simply because of scalability, cost, and efficiency constraints ... NMT might be a way to provide good enough initial translations of such content. This might even further increase the need for translation experts to support final quality checks, as more content might get pushed through translation processes ...
So we definitely see a strong need for translation experts in the future as well; they stand to benefit from improvements in neural machine translation engines that support their work.
Thank you for this interesting update on the use of neural machine translation engines for the translation of educational materials in openSAP.
The bottom line, as you justly point out, is that
"NMT results require much less human post-editing, the translations are more fluid than with SMT, and it’s really promising, … but you shouldn’t expect the same translation quality as from human translators" ...
Compromising quality for the sake of instant access to information might be acceptable, provided there is a clear, express, and disclosed tolerance for possible substantial errors and liabilities.
My question is whether such tolerance and the implicit acceptance of possible substantial errors could undermine trust in an information product, be it a training session, a product description, or something else, especially when there is no way to verify the output in the target language (e.g. Google-translated health advice on the Internet - sometimes hilarious).
Will translators become the quality auditors of machine-translated text? Will this completely change the business model within the language industry and raise the professional profile and qualification requirements of translators?
Very interesting point. It brings me back to a discussion I had last year with some SAP Community members about using translation tools to translate posts in this Community instead of having spaces for different languages, where people can read and write in their own language without the fear of misunderstanding the solution or the question because of substantial errors in the translation.