• RPTimes




30 December 2016

The Big Data approach: does using Google Translate™ breach GDPR?

Written by Maria Green, Posted in RPTInformed


Data security in this incontrovertibly digital world is, by now, at the centre of any business strategy. In response, the General Data Protection Regulation (GDPR) has been confirmed and is due to come into effect in 2018, with the aim of modernising the current data protection directive. Most organisations that work in the EU, or those external to it that collect EU citizen data, will have to comply with the new regulation, with hefty financial penalties for those who do not conform.

This is (hopefully) old news to anyone handling consumer or respondent data. The ramifications of the changes are significant and the potential to get ‘caught out’ is a reality. In the July 2016 GDPR update for EphMRA Members, one of the Data Protection Principles noted as being reformed pertains to keeping personal data secure by ‘appropriate organizational and technical means’. With these new regulations set to operate with extraterritorial effect, it got us thinking about one of the most common shortcuts taken by researchers, analysts and consultants alike when processing multilingual data.


Google and the Big Data approach

Dramatic changes are starting to take place in the free, online translation world. Google’s latest advance in machine learning has led to its big translation upgrade, built through a technique known as ‘deep learning’. Deep learning takes a Big Data approach by using networks of mathematical functions, inspired by studies of mammalian brains, which rely on vast stores of data. In fact, the networks rely on an open database of all source and translated material ever input, which is kept and used to improve the translation tool’s learning ability. The initial results of this new technological approach have shaken up the Machine Translation world quite favourably.


Well, it’s not quite ending world hunger, but that does sound fabulous, doesn’t it? The development of Google Translate™ means it has become widely used in the market research (MR) industry to get quick results from multilingual open-ended responses (OEs) and back translations, with little or no impact on the bottom line. If you are not linguistically bountiful, and do not happen to have a working knowledge of dozens of languages, then Google Translate™ (and others) can provide a free, fast and easily accessible solution that recognises the feedback language and gives an instantly generated translation.

Free, quick, and nearing the mark… if the alarm bells aren't ringing already, then there are some things of which you should be aware. 


Does using Google Translate™ really breach GDPR?

Firstly, ANYONE can go on to Google and agree to help 'contribute' to its learning. This process involves reviewing and verifying sentences that have been input by others, from and into your chosen language. Really, anyone can do this. No translation qualifications or even basic linguistic tests are required. See for yourself:

[Image: how can you help]

Secondly, let’s come back to that open database of all source and translated material ever input, because this is exactly where it may become a sticky wicket for the Insight industry when the GDPR comes into force in 2018.

Online figures suggest that even back in 2013, Google was reporting upwards of 200 million users of its free machine translation services. Aside from the obvious pitfalls for businesses using these free online tools (miscommunications, unintentional cultural insults, invalid insight), there is another underlying risk that many fail to identify.

Read further into some of the T&Cs of Google Translate™ and you discover that “When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content.” (see ‘Your Content in our Services’).

By inputting information into no-cost translation tools, you could unwittingly be making confidential information and intellectual property available on the World Wide Web. This includes data under protection of a signed NDA or MCA with clients and suppliers, as well as any personal data given to you by respondents with the legally binding promise of keeping it ‘confidential’.


How to get around it


So, how does the GDPR define ‘personal data’? The new definition is more expansive than its predecessor and provides for a wide range of personal identifiers to constitute personal data, reflecting changes in technology and the way that organisations now collect information about people.

A telephone number, IP address, full name, post code, even a job title or support ticket reference number may now be considered as a personal identifier. Personal data that has been pseudonymised – e.g. key-coded – can also fall within the scope of the GDPR depending on how difficult it is to attribute the pseudonym to a particular individual.

One way to ‘get around it’ and keep using Google Translate™ would be to vet data and remove all personal identifiers before submitting it to the open database. Here is a real-life example (all identifiers are fictional and do not knowingly relate to any individual person or people):

  • Your customer satisfaction survey carried out in Italy returns this open-ended response (OE):
    ‘mi piace molto chiara camaro perche quando parla e’ chiaro. gli darei 90/100 o 95/100 solamente perche’ non mi a risposto ieri pero puo chiamarmi 06 39 15 03 1 grazie’
  • Google Translate™ generates this output:
    ‘I like it very clear camaro because when he speaks, and‘ very clear. I'd give it 90/100 or 95/100 only because’ I can not answer The recently but call me 06 39 15 03 1 thanks’
  • An Italian speaker deciphers it as this:
    ‘I really like Chiara Camaro because she speaks clearly. I would give her 90/100, or 95/100 [as a score], only because she didn’t reply to me yesterday, however, can she call me on 06 39 15 03 1, thank you’
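As a rough illustration of that vetting step, here is a minimal Python sketch that swaps obvious identifiers for placeholders before any text leaves your systems. Everything in it is an assumption made for illustration: the `redact` function, the two regular expressions and the `[PHONE]`, `[EMAIL]` and `[NAME]` placeholders are hypothetical, and simple patterns like these will not catch every identifier the GDPR covers. Note in particular that the name has to be supplied by hand, because no pattern will spot ‘chiara camaro’ written in lower case.

```python
import re

# Hypothetical patterns, for illustration only. They catch digit runs that
# look like phone numbers and strings that look like email addresses.
PHONE = re.compile(r"\b\d(?:[ .-]?\d){6,14}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact(text, known_names=()):
    """Replace phone numbers, email addresses and listed names with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    # Names cannot be detected reliably by pattern, so they are passed in
    # explicitly (e.g. from a staff list) and matched case-insensitively.
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text


verbatim = (
    "mi piace molto chiara camaro perche quando parla e' chiaro. "
    "gli darei 90/100 o 95/100 solamente perche' non mi a risposto ieri "
    "pero puo chiamarmi 06 39 15 03 1 grazie"
)

print(redact(verbatim, known_names=["chiara camaro"]))
```

Run on the verbatim above, the sketch replaces the employee name and the phone number while leaving the 90/100 and 95/100 scores untouched. A human check would still be needed before anything is submitted, since job titles, reference numbers and other identifiers would pass straight through.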

A more realistic option would be to invest in a low-cost Machine Translation tool, outside of the free-translation stratosphere, to stay compliant with the new GDPR. You could most certainly decide that you want to keep hold of as many of your pennies as possible, and who could blame you in an industry that is estimated to lose nearly 5% of its turnover as a result of the Brexit vote? However, the old adage “if something seems too good to be true, it probably is” springs to mind. Even paid-for, deep learning, advanced online translation tools cannot mitigate the risks that these types of unedited translations pose to business integrity, and they are still far from capable of navigating the intricacies and minefields of idiomatic, poorly written verbatims.




Finally, just in case we thought leaving the EU might ‘get around it’: one of the major developments of this reform is that the scope of European law is about to get much wider. All companies collecting European citizens’ data, regardless of being based inside or outside of Europe, will have to comply with this legislation. The GDPR is set to become a central part of business life within a couple of years’ time and there is a lot of work to be done before then, adapting processes to comply by the time it comes into force. Put simply, all organisations should probably start a GDPR compliance project and keep a lookout for breaches that could, quite easily, slip under the radar.


NB: Google, as a company, addresses the General Data Protection Regulation in several areas, e.g. https://cloud.google.com/security/compliance/eu-data-protection/



About the Author

Maria Green


Maria is a BA (Hons) Psychology and Business Studies graduate with a passion for creativity and effective communication, who believes that only the highest standards of quality and service will do… which is a good job really, as this is what RPT do so well!


Blog Disclaimer

All data and information provided on this blog is for informational purposes only. www.rptranslate.com makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information on this site and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.
