It has been almost a month since the massive Cambridge Analytica x Facebook improper-user-data-exfiltration mess (don’t call it a data breach) came to light. The news is settling down, even as the real numbers come out of Facebook, including a possible 600,000 Canadians affected.
I’ve been mulling over how I feel about it and I’ve finally come to a conclusion.
As much as I’d like to see this as a catalyst for people to start finding (and building) alternatives to Facebook’s walled garden of exploitation, I don’t think Facebook did anything wrong.
The basic narrative of the Cambridge Analytica story seems to be that Facebook tricked average Americans into opting to share all their Facebook data with some benign-looking app (like a quiz), which in turn gave the app maker further access to the victim’s friends’ data, without those friends’ permission. In other words, if your friends fell for this ploy, Facebook’s API gave the app maker access to your data without your permission.
I don’t believe there is any truth to this assumption. Facebook’s API never granted access to this level of data about friends (let alone friends-of-friends). They are not that stupid.
I was involved in building Facebook app integration during the time that Cambridge Analytica gathered their data, and I read Facebook’s Open Graph API documentation numerous times. Unfortunately that version of the API no longer seems to be available online, but I was able to find some old how-to videos referencing it.
As far as I can piece together, the only data about your friends that Facebook ever provided via the API was their full name and user id. Any data about your likes, political affiliation, family connections, marital status, or anything else that could be used for “psychographic” modelling was never available via your friends.
These personal details were available to anyone and everyone via your public profile! Assuming, that is, that you hadn’t opted out of sharing this info (and I really doubt most users were giving their privacy settings much thought before they learned the name Cambridge Analytica).
In order for Cambridge Analytica and others to mine this data, they would have had to write bots to scrape data directly from your public-facing profile. In the past, it was very easy to gain access to these profiles in a programmatic way. Anybody could simply load http://facebook.com/profile.php?id= with your ID to see your public profile. Even a non-programmer can see how easy it would be to generate a list of targets for a bot to crawl.
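To make that last point concrete, here is a minimal sketch of how trivially such a target list could be generated. It only builds URL strings and makes no network requests. The function name, the starting ID, and the idea that IDs can simply be counted upward are my own assumptions for illustration, not anything taken from Facebook’s documentation.

```python
from typing import List

# Illustrative only: this builds URL strings and never contacts any server.
# The URL pattern is the old one mentioned above; the ID values are made up.
BASE_URL = "http://facebook.com/profile.php?id="


def build_profile_urls(start_id: int, count: int) -> List[str]:
    """Return `count` profile URLs for consecutive numeric IDs (an assumption)."""
    return [f"{BASE_URL}{user_id}" for user_id in range(start_id, start_id + count)]


if __name__ == "__main__":
    # Hypothetical starting ID, chosen arbitrarily for the example.
    for url in build_profile_urls(1000, 3):
        print(url)
```

A real crawler would need far more than this, but the list of targets itself is a one-liner.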
At some point, Facebook started closing this “profile.php” access point as they rolled out usernames (I’m ohryanca). Once that was locked down, it became more complicated to scrape content and the bad actors became more clever.
I’m pretty sure I’m right
In a blog post yesterday Facebook announced an enormous array of restrictions to their APIs (which are undoubtedly pissing off a lot of sketchy developers). Regarding account recovery, they mentioned the following:
…malicious actors have also abused [account recovery] features to scrape public profile information by submitting phone numbers or email addresses they already have through search and account recovery. Given the scale and sophistication of the activity we’ve seen, we believe most people on Facebook could have had their public profile scraped in this way. So we have now disabled this feature. We’re also making changes to account recovery to reduce the risk of scraping as well.
As much as I hate to say it, I don’t think Facebook did anything wrong. Their APIs never fed this data to any and every app developer who wanted it. Cambridge Analytica and friends had to jump through additional hoops. They took actions that were outside of the normal/approved methods Facebook expected and allowed app makers to use to access our data.
Facebook simply built a reasonable public profile feature meant to allow you to use Facebook as a home on the web. A URL to share outside the platform.
They built a reasonable account recovery feature that allowed users to recover their logins in standard, non-controversial ways.
There is no evidence that Facebook’s APIs allowed access to the type of data Cambridge Analytica took advantage of. They were just outplayed by an opponent who thought of clever ways to get what it needed.
In case the mainstream media has lulled you into a false sense of whatever, the Democrats have this data too (and then some).
Here is footage of Carol Davidsen (VP of political technology at Rentrak) at a conference in 2015 gleefully explaining how the Obama campaign mapped THE ENTIRE SOCIAL GRAPH OF THE UNITED STATES who were on Facebook at the time of the 2012 election. The techniques she describes are strikingly similar to what Cambridge Analytica is accused of.