Andy Green, Technical Content Specialist at Varonis, says:


We all understand the risks in accidentally revealing a social security number. But are there other pieces of less identifying or even anonymous information that taken together act like a social security number? The European Union is breaking new ground on consumer privacy as it begins to reform its own regulations. The EU’s broader ideas on personal identity have even made their way across the pond into proposed new US regulations.

The history of the European Union’s consumer privacy and data security regulations begins with its 1995 Data Protection Directive–or EU 96/46/EC for security wonks. EU directives provide guidance to its member nations’ legislatures, who then are free to craft their own specific laws. The DPD has been influential in shaping the vocabulary and, less charitably, the jargon of the consumer privacy discussion on both sides of the Atlantic.

In the US, the starting point for discussion on data security is Sarbanes-Oxley, which became law in 2002. In comparing and contrasting the two, it’s fair to say the DPD was more focused on securing consumer information, but more inclusive—unlike SOX–in covering both public and private companies. To this day in the US there’s currently no single comprehensive law on consumer privacy.

The EU’s original directive is significant because it defined personal data as “information relating to an identified or identifiable person.” For example, by EU rules, street address, name, and phone number are personal data; height, eye color, and model of car you drive are not. This notion of personal data as a type of key is part of the definition used in privacy laws outside the EU–including the US. In North America, though, we’ve come up with our own term for personal data, calling it instead “personally identifiable information” or PII.

By the way, the EU regulators intentionally created a less explicit definition of personal data so that it would encompass new technologies. In 2012, data related to an identifiable person could now be an email address, IP address, and for some EU nations, even a photo image. 

To bring the story up to date, security experts began to realize that along with personal data there was other data–let’s call it quasi-personal–that if released could also be used to relate back to an individual. The data magic to accomplish identification typically requires matching a collection of anonymous data points– birth dates (or years), zip codes, ethnicity, and perhaps even car model driven–against publicly available databases. 

For example, there are well documented cases involving anonymized hospital discharge records subsequently used to re-identify the original patients!

With Facebook now up to 1 billion active users, it’s fair to say that the Web is overflowing with personal data at all levels of detail. Essentially social networks have provided hackers—the new ominous player on the scene—with a huge public repository to match against (c.f. Matt Honan).

To get a better understanding of how it’s possible to re-identify an individual, let’s review a variation on the aforementioned case. While the technique is not always guaranteed to uniquely identify a person (this depends on the available related information), it can often produce a narrowed down list of highly likely subjects.

Suppose, for argument’s sake, a European mortgage company analyzes a health report from a large public hospital. The records show that five individuals were being treated for a rare disease. Their ages were also published. Assuming the patients live near the hospital, the mortgage lender then simply filters its database on zip code and birth year. Working with a smaller set of records, it then scans social media sites or other online forums, filtering on the retrieved names and other data, all the while looking, for say, “get well” messages. If it finds a few matches, and with the additional new data points from the social site … I think you see where this is leading.

The good news is that the EU countries have long recognized that their laws have not kept pace. And the EU governing body is currently in the process of reformingthe 1995 directive, taking into account the new realities of public data on the Web and the blurring of personal and anonymous data. To get a sense of the EU’s new thinking on personal data, refer to this work-in-progress paper.

And there are also rumblings of change in the US along the same lines as the EU reforms. Keep an eye here to Data Center Post and to our own blog at blog.varonis.com, where we’ll be writing more about US laws and what this will all mean for your company’s data protection policies in future posts.