Data Ownership

Data Mining, The Internet, and Counter Intelligence

We can, and do, talk about data privacy and ownership until we're blue in the face; we talk about how seriously screwed up some of the National Security Agency's actions were. But we rarely harp on the obvious fact: all the security in the world doesn't do a damn bit of good if you give your information away. One ambitious JSON project over on GitHub, called Looking Glass, is capitalizing on exactly that.

In fact, as of this publication, the project has data mined 139,361 resumes belonging to military and civilian officials within our nation's Intelligence, Surveillance, and Reconnaissance (ISR) fields.  Those handy little endorsements that job seekers use to categorize themselves (e.g. "Security Clearance" or "ISR") are exactly what data miners have been more than willing to scoop up; a rough sketch of how little effort that takes follows below.
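To make that concrete, here is a minimal, purely illustrative Python sketch of the kind of keyword filtering such a project could perform. The file name, field names, and keyword list are all assumptions on my part for illustration, not anything published by Looking Glass.

```python
import json

# Illustrative keywords a scraper might match against publicly listed
# endorsements; the actual criteria used by any real project are assumptions here.
KEYWORDS = {"security clearance", "isr", "sigint", "counterintelligence"}


def flag_profiles(path="profiles.json"):
    """Return names of profiles whose public endorsements mention a watched keyword."""
    with open(path) as f:
        # Assumed format: a JSON list of objects with "name" and "endorsements" fields.
        profiles = json.load(f)

    flagged = []
    for profile in profiles:
        endorsements = [e.lower() for e in profile.get("endorsements", [])]
        if any(keyword in endorsement
               for endorsement in endorsements
               for keyword in KEYWORDS):
            flagged.append(profile["name"])
    return flagged


if __name__ == "__main__":
    print(flag_profiles())
```

The uncomfortable part is how little effort this takes: everything a script like this reads was volunteered publicly by the very people it flags.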

Big Data and Privacy

Earlier this week, the President's Council of Advisors on Science and Technology (PCAST) released a seventy-two-page report on the intersection of big data and privacy, bearing the unoriginal title Big Data and Privacy: A Technological Perspective.  It starts by establishing the groundwork for the traditional definition of privacy, as defined by Samuel Warren and Louis Brandeis in 1890, who stipulated that privacy infractions can occur in one of four ways:

  1. Intrusion upon seclusion.  If a person intentionally intrudes upon the solitude of another person (or their affairs), and the intrusion is seen as "highly offensive," then an invasion of privacy has occurred.
  2. Public disclosure of private facts.  If a person publishes private facts about someone's life, even if true, an invasion of privacy has occurred.
  3. Defamation, or the publication of untrue facts, is an invasion of privacy.
  4. Removing personal control of an individual's name and/or likeness for commercial gain is an invasion of privacy.

These infractions basically come down to removing the control an individual has over various aspects of their life (being left alone, selective disclosure, and reputation), and PCAST tends to agree, noting several times throughout the report the need for selective sharing and anonymity.  The report goes on to address a few shifts in our mindset about privacy that are needed to enable the successful implementation of its five recommendations:


  • We must first acknowledge that intercepting private communications has become easier than ever.
  • We need to extend "the home as one's castle" to become "the castle in the clouds."
  • Inferred private facts are just as stolen as real data.
  • The misuse of data, and the loss of selective anonymity that comes with it, is the key issue.


The report goes on to state that the majority of the concern is with the harm done by the use of personal data, and that the historic way of preventing that misuse has been to control access to the data; a measure that is no longer possible in today's nebulous world of data ownership.

Personal data may never be, or have been, within one's possession.

From public cameras and sensors to other people using social media, we simply have no control over who collects data about us, and we likely never will again.  That raises the question of who owns the data and who controls it.

And while the Electronic Frontier Foundation would complain (again) that the report failed to address metadata (in spite of it equating metadata with actual data in the first few pages), it comes on the eve of a unanimous House vote to rein in the National Security Agency, making this a big week for big data privacy advocates.



Why It's Not About Privacy

I've faced some opposition recently over my view that the Electronic Frontier Foundation did a disservice to its constituents by focusing so much of its effort on privacy rather than data ownership.  With that in mind, I pose two ethical scenarios to help illustrate my (and the Guardian's) point: solving the data ownership debate will solve far more than just the privacy debate.

Our laws are focused on data collection, but the mere existence of data is not the concern; the usage and sharing of data is.  In today's interconnected world, individuals are less concerned about what a given company knows about them than about how that information is used and with whom it is shared.  These are issues that cannot be solved while we limit the scope of the conversation to privacy; they must be evaluated in the larger discussion of establishing ethical data ownership legislation.