A few followup thoughts regarding Monday’s post about setting up a personal VPN.
Self-Sufficient, DIY Internet
All the Facebook Cambridge Analytica nonsense has really emphasized how dependent we have become on third-party services and social networks.
As I thought about it, the idea of being self-sufficient online really started to appeal to me. I mean, this blog has always been independent, fully controlled by me. As a web developer with full-stack devops ninja experience, I have all the skill and experience I need to set up any sort of web service I want.
So when I thought about the reasons for using a VPN regularly and the likelihood that I’d have to pay for a decent service, I wanted to see if I could do it myself. On servers I own.
I think there are more opportunities to DIY online, to rely less on dubious third parties.
Peace of Mind
As I alluded to in my first post, the real-world security threats associated with public WiFi are only a minor concern; I’m not generally too worried most of the time.
That said, this little icon next to my WiFi connection gives me such a massive sense of security and peace of mind. The fact that it auto-connects without me having to take any action is just the icing on the cake.
Streisand is an anti-censorship tool designed to bypass draconian government censorship like China’s Great Firewall. If you don’t live in China, do you really need to worry about censorship? Probably — and if you hang around the right subreddits — increasingly so.
Canada’s telcos are presently lobbying for a censorship regime. Perhaps the first draft targets content most of us would agree is “bad,” but who knows what the next version will look like.
Even if you’re less paranoid, there’s a good chance your workplace or school is filtering some content. Maybe it’s not content you bump into very often. But even if they are not filtering traffic, they’re almost certainly collecting your web traffic. That’s something I’ve never been too comfortable with.
A VPN allows you to take back your online freedom whenever you’re using a work, school or any other network that distrusts you.
Bypassing Geographic Restrictions
In case you missed it, VPNs allow you to bypass geographic content restrictions. When you use a VPN, your traffic originates from the IP address of the VPN server. And since cloud providers host servers in many physical locations, you can easily bypass any geo-restrictions based on IP address.
If you missed Monday’s post you can read it here:
Skill Level, Novice: To set this up you’ll want to be mildly comfortable with the command line. But you won’t necessarily need to know (or care) about the technologies involved.
Way back in 2010, Firesheep scared my pants off. I was traveling for work when it dropped and I became acutely aware of just how vulnerable my data was on airport WiFi. In the 8 years since then, HTTPS everywhere has become a reality and the threat of bad actors sniffing your web traffic is nearly a thing of the past.
But I’m still paranoid. And today I finally did something about it.
Streisand is an open-source project with the goal of defeating censorship. The best way to defeat local censorship is a secure, undetectable VPN connection (usually in a foreign country). The goal of defeating censorship aligns nicely with the goal of hardening your internet connection.
Streisand is essentially an installer for a set of VPN tools, which you install on a cloud-hosted server that you control. The project presently supports Amazon EC2, Azure, DigitalOcean, Google Compute Engine, Linode, and Rackspace. You simply run a few commands, select a few options (the defaults are totally fine), and Streisand does the rest.
If you’ve ever run apt-get or set up Homebrew on macOS, you should have no problem setting this up. Streisand’s installation instructions are well written and easy to follow (jump right to the instructions here).
Much to my surprise — unlike many of these types of command-line driven projects — I ran into absolutely zero issues during the install.
It gets even easier.
If that doesn’t sound easy enough — get this — Streisand copies over an HTML document with an incredibly easy-to-use guide, pre-filled with all the configuration settings you need for your server. It’s dead simple to share this with anybody you choose.
Bonus points: Auto-Connect on public WiFi.
The last time I used the TunnelBear app, I noticed an advanced setting to auto-connect on all WiFi networks except for a whitelist of trusted ones. That way, when you’re on your secure home, work, or other trusted WiFi network, you don’t waste VPN bandwidth or take the potential performance hit.
Unfortunately, iOS doesn’t support settings like this natively.
In order to accomplish this, you have to create a custom .mobileconfig file. These files are huge XML documents that you probably shouldn’t write by hand.
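To give you a taste of what’s inside one of these files, here’s a rough sketch of the VPN on-demand fragment. The keys come from Apple’s configuration-profile format, but the SSID names are placeholders — the full document (payload identifiers, certificates, and so on) is best left to a generator tool:

```xml
<!-- Hypothetical fragment of a .mobileconfig VPN payload.
     "HomeWiFi" and "WorkWiFi" are placeholder SSIDs. -->
<key>OnDemandEnabled</key>
<integer>1</integer>
<key>OnDemandRules</key>
<array>
    <!-- Stay disconnected on trusted networks -->
    <dict>
        <key>Action</key>
        <string>Disconnect</string>
        <key>SSIDMatch</key>
        <array>
            <string>HomeWiFi</string>
            <string>WorkWiFi</string>
        </array>
    </dict>
    <!-- Auto-connect on any other WiFi network -->
    <dict>
        <key>Action</key>
        <string>Connect</string>
        <key>InterfaceTypeMatch</key>
        <string>WiFi</string>
    </dict>
</array>
```

The rules are evaluated top to bottom, so the trusted-network whitelist has to come before the catch-all connect rule.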
I am hosting my Streisand VPN on Linode, my go-to host for the past several years. Their lowest-tier server is more than powerful enough to host a VPN. And they generously include 1 TB of transfer. For US$5/mo.
The $5/mo price point is competitive with many of the popular VPN services. Except, since you’re self-hosting, you are not limited to one user. You can freely hand out the Streisand connection details to friends and family.
One of the most powerful aspects of the internet and open-source software is the ability to take control of everything yourself. As someone with the skills to do this myself, I am going to make a concerted effort to take control of more things and be less dependent on untrustworthy third parties.
Running my own VPN is just one small step.
I wrote a short follow-up post you might enjoy:
It has been almost a month since the massive Cambridge Analytica x Facebook improper-user-data-exfiltration mess (don’t call it a data breach) came to light. The news is settling down, even as the real numbers come out of Facebook, with as many as 600,000 Canadians possibly affected.
I’ve been mulling over how I feel about it and I’ve finally come to a conclusion.
As much as I’d like to see this as a catalyst for people to start finding (and building) alternatives to Facebook’s walled garden of exploitation, I don’t think Facebook did anything wrong.
The basic narrative of the Cambridge Analytica story seems to be that Facebook tricked average Americans into sharing all their Facebook data with some benign-looking app (like a quiz), which in turn gave the app maker further access to the victims’ friends’ data, without those friends’ permission. In other words, if your friends fell for this ploy, Facebook’s API gave the app maker access to your data without your permission.
I don’t believe there is any truth to this assumption. Facebook’s API never granted access to this level of data about friends (let alone friends-of-friends). They are not that stupid.
I was involved in building Facebook app integrations during the period when Cambridge Analytica gathered their data, and I read Facebook’s Open Graph API documentation numerous times. Unfortunately, that version of the API documentation no longer seems to be available online, but I was able to find some old how-to videos referencing it.
As far as I can piece together, the only data about your friends that Facebook ever provided via the API was their full name and user id. Any data about your likes, political affiliation, family connections, marital status, or anything else that could be used for “psychographic” modelling was never available via your friends.
These personal details were available to anyone and everyone via your public profile! Assuming you hadn’t opted out of sharing this info (and I really doubt most users were giving their privacy settings much thought before they learned the name Cambridge Analytica).
In order for Cambridge Analytica and others to mine this data, they would have had to write bots to scrape it directly from your public-facing profile. In the past, it was very easy to access these profiles programmatically: anybody could simply load http://facebook.com/profile.php?id= followed by your ID to see your public profile. Even a non-programmer can see how easy it would be to generate a list of targets for a bot to crawl.
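To illustrate just how low the bar was, here’s a hypothetical sketch of generating such a target list. The URL pattern matches the old profile.php endpoint; the IDs are placeholders, not real accounts, and real crawlers were presumably more sophisticated:

```python
# Hypothetical sketch: building a list of public-profile URLs to
# crawl, back when profile.php accepted sequential numeric IDs.
BASE = "http://facebook.com/profile.php?id="

def profile_urls(start_id, count):
    """Return sequential profile URLs starting at start_id."""
    return [f"{BASE}{uid}" for uid in range(start_id, start_id + count)]

# Placeholder IDs, for illustration only.
urls = profile_urls(100, 3)
print(urls[0])  # http://facebook.com/profile.php?id=100
```

A few lines of code and a counter — that’s the entire “exploit.”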
At some point, Facebook started closing this profile.php access point as they rolled out usernames (I’m ohryanca). Once that was locked down, scraping content became more complicated and the bad actors became more clever.
I’m pretty sure I’m right
In a blog post yesterday, Facebook announced an enormous array of restrictions to their APIs (which are undoubtedly pissing off a lot of sketchy developers). Regarding account recovery, they mentioned the following:
…malicious actors have also abused [account recovery] features to scrape public profile information by submitting phone numbers or email addresses they already have through search and account recovery. Given the scale and sophistication of the activity we’ve seen, we believe most people on Facebook could have had their public profile scraped in this way. So we have now disabled this feature. We’re also making changes to account recovery to reduce the risk of scraping as well.
As much as I hate to say it, I don’t think Facebook did anything wrong. Their APIs never fed this data to any and every app developer who wanted it. Cambridge Analytica and friends had to jump through additional hoops, taking actions outside the normal, approved methods by which Facebook expected and allowed app makers to access our data.
Facebook simply built a reasonable public profile feature meant to allow you to use Facebook as a home on the web. A URL to share outside the platform.
They built a reasonable account recovery feature that allowed users to recover their logins in standard, non-controversial ways.
There is no evidence that Facebook’s APIs allowed access to the type of data Cambridge Analytica took advantage of. They were just outplayed by an opponent who thought of clever ways to get what it needed.
In case the mainstream media has lulled you into a false sense of whatever: the Democrats have this data too (and then some).
Here is footage of Carol Davidsen (VP of political technology at Rentrak) at a conference in 2015, gleefully explaining how the Obama campaign mapped THE ENTIRE SOCIAL GRAPH of everyone in the United States who was on Facebook at the time of the 2012 election. The techniques she describes are strikingly similar to what Cambridge Analytica is accused of.
To confirm my suspicion about the lack of blogging, I took some time to compile some stats on the roughly 450 normal, non-celebrity human beings I follow on Twitter. I counted all the people who list a blog in their bio, or within one click of the link in their bio (to account for “about me” landing pages).
I found that only 93 had a functioning blog attached to their account. Of those 93, only 42 had published one or more blog posts in 2018. In other words, 55% of those bloggers have abandoned blogging. A small handful of the blogs I looked at had not even been updated in the past 5 years (why you would even bother linking this in your bio is beyond me).
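For the record, here’s the arithmetic behind those figures, using the counts from my survey:

```python
# Recomputing the follower-survey stats quoted above.
followed = 450       # non-celebrity humans surveyed
with_blog = 93       # had a functioning blog linked
active_2018 = 42     # published at least one post in 2018

abandoned = with_blog - active_2018               # 51 dormant blogs
abandoned_pct = round(abandoned / with_blog * 100)
print(f"{abandoned_pct}% of the bloggers have gone quiet")  # 55%
```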
Here’s the really interesting thing though…
I had read posts by almost none of those 42 active bloggers I identified. I simply wasn’t aware they existed.
Blogging has always suffered from discoverability issues. Discoverability is hard without a centralized platform like Twitter, Tumblr, WordPress.com, etc. But I think it’s a solvable problem.
We need blogging…
I’m sure many smarter people have shared their thoughts on the importance of blogging.
Very simply put: decentralized, self-published content, free of corporate or advertiser control, is kinda sorta the dream of the internet.
In 2018, it’s easier than ever.