My Thoughts on Facebook and Cambridge Analytica

It has been almost a month since the massive Cambridge Analytica x Facebook improper-user-data-exfiltration mess (don’t call it a data breach) came to light. The news is settling down, despite the real numbers coming out of Facebook and the possibility that 600,000 Canadians were affected.

I’ve been mulling over how I feel about it and I’ve finally come to a conclusion.

As much as I’d like to see this as a catalyst for people to start finding (and building) alternatives to Facebook’s walled garden of exploitation, I don’t think Facebook did anything wrong.


The basic narrative of the Cambridge Analytica story seems to be that Facebook tricked average Americans into opting to share all their Facebook data with some benign-looking app (like a quiz), which in turn gave the app maker further access to the victims’ friends’ data, without those friends’ permission. In other words, if your friends fell for this ploy, Facebook’s API gave the app maker access to your data without your permission.

I don’t believe there is any truth to this assumption. Facebook’s API never granted access to this level of data about friends (let alone friends-of-friends). They are not that stupid.

I was involved in building Facebook app integrations during the time that Cambridge Analytica gathered their data, and I read Facebook’s Open Graph API documentation numerous times. Unfortunately that version of the API no longer seems to be available online, but I was able to find some old how-to videos referencing it.

As far as I can piece together, the only data about your friends that Facebook ever provided via the API was their full name and user id. Any data about your likes, political affiliation, family connections, marital status, or anything else that could be used for “psychographic” modelling was never available via your friends.

However!

These personal details were available to anyone and everyone via your public profile! That is, assuming you hadn’t opted out of sharing this info (and I really doubt most users were giving their privacy settings much thought before they learned the name Cambridge Analytica).

In order for Cambridge Analytica and others to mine this data, they would have had to write bots to scrape it directly from your public-facing profile. In the past, it was very easy to reach these profiles programmatically. Anybody could simply load http://facebook.com/profile.php?id= with your ID to see your public profile. Even a non-programmer can see how easy it would be to generate a list of targets for a bot to crawl.
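
To make the "list of targets" point concrete, here is a minimal sketch in Python. It only builds URL strings from the address pattern quoted above and never makes a request. The idea that user IDs are consecutive integers is my assumption for illustration, and `candidate_profile_urls` is a made-up helper name, not anything from Facebook.

```python
from itertools import islice
from typing import Iterator

# URL pattern quoted in the post; this access point has since been locked down.
PROFILE_URL = "http://facebook.com/profile.php?id="


def candidate_profile_urls(start_id: int) -> Iterator[str]:
    """Yield profile URLs for consecutive numeric IDs.

    Hypothetical helper: assumes IDs are plain consecutive integers.
    """
    user_id = start_id
    while True:
        yield f"{PROFILE_URL}{user_id}"
        user_id += 1


# Only builds strings; nothing is requested over the network.
first_three = list(islice(candidate_profile_urls(4), 3))
for url in first_three:
    print(url)
```

A dozen lines is all it takes to produce an endless crawl list, which is why an ID-based public URL was such an easy target.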

At some point, Facebook started closing this “profile.php” access point as they rolled out usernames (I’m ohryanca). Once that was locked down, it became more complicated to scrape content and the bad actors became more clever.

I’m pretty sure I’m right

In a blog post yesterday Facebook announced an enormous array of restrictions to their APIs (which are undoubtedly pissing off a lot of sketchy developers). Regarding account recovery, they mentioned the following:

…malicious actors have also abused [account recovery] features to scrape public profile information by submitting phone numbers or email addresses they already have through search and account recovery. Given the scale and sophistication of the activity we’ve seen, we believe most people on Facebook could have had their public profile scraped in this way. So we have now disabled this feature. We’re also making changes to account recovery to reduce the risk of scraping as well.

Conclusion

As much as I hate to say it, I don’t think Facebook did anything wrong. Their APIs never fed this data to any and every app developer who wanted it. Cambridge Analytica and friends had to jump through additional hoops. They took actions that were outside of the normal, approved methods Facebook expected and allowed app makers to use to access our data.

Facebook simply built a reasonable public profile feature meant to allow you to use Facebook as a home on the web. A URL to share outside the platform.

They built a reasonable account recovery feature that allowed users to recover their logins in standard, non-controversial ways.

There is no evidence that Facebook’s APIs allowed access to the type of data Cambridge Analytica took advantage of. They were just outplayed by an opponent who thought of clever ways to get what it needed.

PS

In case the mainstream media has lulled you into a false sense of whatever: the Democrats have this data too (and then some).

Here is footage of Carol Davidsen (VP of political technology at Rentrak) at a conference in 2015 gleefully explaining how the Obama campaign mapped THE ENTIRE SOCIAL GRAPH OF THE UNITED STATES, or at least of everyone who was on Facebook at the time of the 2012 election. The techniques she describes are strikingly similar to what Cambridge Analytica is accused of.

Nobody blogs anymore and this is a bad thing

To confirm my suspicion about the lack of blogging, I took some time to compile some stats on the roughly 450 normal non-celebrity human beings I follow on Twitter. I counted all the people I follow who list a blog in their bio or within one click of the link in their bio (to account for “about me” landing pages).

I found that only 93 had a functioning blog attached to their account. Of those 93, only 42 had published one or more blog posts in 2018. In other words, 55% of the real humans I follow who list a blog have abandoned blogging. A small handful of the blogs I looked at had not even been updated in the past 5 years (why you would even bother linking this in your bio is beyond me).
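
For anyone who wants to check my math, here is the arithmetic as a quick Python sketch. It reads the counts as 93 accounts with a working blog out of roughly 450 followed, 42 of them active in 2018. The variable names are mine, and the 450 is only the rough figure from above, so the shares based on it are approximate.

```python
# Counts taken from the post; "following" is only a rough figure.
following = 450      # non-celebrity accounts followed (approximate)
with_blog = 93       # accounts with a functioning blog
active_2018 = 42     # of those, blogs with at least one post in 2018

abandoned = with_blog - active_2018

share_with_blog = with_blog / following        # share of follows that have a blog
share_abandoned = abandoned / with_blog        # share of blogs with no 2018 posts
share_active_overall = active_2018 / following # share of follows actively blogging

print(f"Have a blog:      {share_with_blog:.0%}")
print(f"Abandoned blogs:  {share_abandoned:.0%}")
print(f"Actively blogging: {share_active_overall:.0%}")
```

The 55% figure is relative to the 93 people who list a blog at all, not to everyone I follow.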

Here’s the really interesting thing though…
I had never read a post by nearly any of those 42 active bloggers I identified. I simply wasn’t aware they existed.

Blogging has always suffered from discoverability issues. Discoverability is hard without a centralized platform like Twitter, Tumblr, WordPress.com, etc. But I think it’s a solvable problem.

We need blogging…

I’m sure many smarter people have shared their thoughts on the importance of blogging.

Very simply put, decentralized, self-published content, free of corporate or advertiser control, is kinda sorta the dream of the internet.

In 2018, it’s easier than ever.

Sunday Links: Deep-Sea Diving, Spotting Fake Reviews

Treasures from the Wreck of the Unbelievable

In 2010, artist Damien Hirst funded an undersea exploration of a mysterious shipwreck, seemingly on a whim. Netflix just released a documentary about the expedition, which ended up uncovering one of the biggest finds of ancient art ever seen. Growing up, I loved watching afternoon CBC documentaries about this sorta thing, and this film is one of the best I’ve ever seen. If you’re looking for something to scratch that Jacques Cousteau, Steve Zissou itch, I’d highly recommend this one.

Spoiler alert: there is actually a very huge twist that is so well done, I did not fully understand it until Googling specifics about the film before writing this blog post. If you’ve never heard of this one, definitely do not research before watching and absolutely do not read this article.

Fakespot.com

Ever wonder if Amazon product reviews are legit? Wonder no more. Fakespot.com uses an algorithm to give a product review section a confidence grade. As I find myself buying more and more things on Amazon, I think I’ll be using this more often.

Sunday Links: Hackers, Hot Dogs and Rhinos

A scary story, a funny video and an interesting photo for your Sunday afternoon pleasure.

Scary story.
A Hackernoon contributor writes a very plausible story about how a bad actor might go about injecting password- and credit-card-stealing code into any number of websites, in a way that would be nearly undetectable. Spoiler alert: It relies on NPM.

Looking back on these golden years, I can’t believe people spend so much time messing around with cross-site scripting to get code into a single site. It’s so easy to ship malicious code to thousands of websites, with a little help from my web developer friends.

I’m harvesting credit card numbers and passwords from your site. Here’s how. by David Gilbertson.

Video.

I’m not really into prank videos, but this one is supremely funny and so innocent.

Picture.

Elasmotherium

A giant unicorn rhinoceros named Elasmotherium roamed the plains of Siberia 29,000 years ago. In many ways, I find these prehistoric animals much more interesting than dinosaurs. (Unfortunately, I couldn’t track down the original source of this photo.)

2017 Podcast Picks

I haven’t done one of these lists in a few years. Looking back through my archives, I found my first list from 2008. Many of those podcasts have faded out of existence, and I no longer listen to any of the others, with the exception of Daily Tech News Show, a spiritual successor to Buzz Out Loud. If you’re curious, here are my lists from 2009, 2011 and 2012.

I subscribe to a lot of podcasts, so I’ll just highlight a few shows I added to my subscriptions in the past year or two.

99% Invisible

Hosted by smooth-voiced Roman Mars, this weekly show is ostensibly about architecture and design. Almost every week I find myself learning a bit of trivia or a little behind-the-scenes information that changes how I think about the way the world is constructed.

Website
Wikipedia

Episodes to check out:

Oyster-techture — The surprising importance of oysters in NYC’s past and future.
Coal Hogs Work Safe — How stickers promote workplace safety in mining.
Half Measures — The history of metrification in the USA.

Reply All

Reply All is kind of like a cross between “Behind the Music” and Encyclopedia Brown for the internet. I previously highlighted their episode covering the history of LiveJournal in Russia and the real possibility that it’s now an FSB spy tool.

Website
Wikipedia

Episodes to check out:

Long Distance – Part I & Part II — Host Alex Goldman receives a call from a telephone scammer, befriends him and travels to India to investigate their operation.
Antifa Supersoldier Spectacular — Hosts discuss the origin of “Milkshake Duck” and other Twitter weirdness.
The Case of the Phantom Caller — A woman in New Jersey is getting strange phone calls to her office from unknown numbers. The hosts investigate and uncover an interesting scam.

Stuff You Should Know

This show has been around since 2008, and I’m really surprised I had not heard of it until this year. Twice per week the hosts spend about 45 minutes doing a deep dive on a pretty-much-random topic. I’m not sure how else to describe it.

Website
Wikipedia

 

Episodes to check out:

Cake: So Great. So, So Great — The history of cake is more interesting than I would have guessed.
Who Committed the 1912 Villisca Ax Murders — A murder mystery from 1912 and possibly the origin of the ax murder trope.
How Multiple Sclerosis Works — The title says it all.