Categories
From The Archives Web Development

better bandwidth protection: revisited

I meant to post this a couple of days after my initial bandwidth protection post, but alas, updating this site is usually the last thing on my mind.

Firstly, I glossed over something I probably should have explained in more detail: the php file masquerades as the media file. The media files should not be in a web-accessible location, so that it is not possible for anyone to link directly to the media file itself. To accomplish this you first need to send the proper content-type header, to tell the client it’s receiving media, not a php file (webmaster-toolkit.com has a good list of mime-types). For instance, if you’re protecting a Real video file, you’d want:

header("Content-Type: application/vnd.rn-realmedia");

I’ve found that some browsers choke if they’re not given the proper file extension, so you’ll want to have .rm at the end of your request_uri, something like:

mediafiles.php?uid=uidstring&itemid=id&arbitraryvalue=somethingirrelevant.rm

In case it is not completely clear, you do not necessarily want your code to do anything with the arbitrary value, it’s just there as a placeholder to tack on the file extension.

Next you’ll need to pass on the contents of the media file (after doing database queries or whatever is necessary to figure out the file path). In my original example I used the include() function. That was actually a pretty bad choice: php evaluates the contents of the file being include()d, and therefore will eat up some cpu cycles and potentially do really bad things if it happens to find a <? somewhere.

A function like readfile() would be a much better choice, since it passes the file through without evaluating it. Also, some feedback I received on digg suggests that php might bring your server to its knees if you try to process a file larger than 100MB in this manner. My testing on my PII 400 fedora box did not encounter any problems, but it was far from scientific.
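Here is a minimal sketch of that hand-off, assuming the file path has already been looked up and checked. The function name send_media() and its parameters are made up for illustration; they are not from my original code.

```php
<?php
// Hypothetical helper: stream an already-validated media file to the client.
function send_media(string $path, string $mimeType): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    // Only send headers if no output has gone out yet.
    if (!headers_sent()) {
        header('Content-Type: ' . $mimeType);
        header('Content-Length: ' . filesize($path));
    }
    // readfile() copies the raw bytes to output; unlike include(), nothing is parsed as php.
    return readfile($path) !== false;
}
```

If large files do cause memory trouble, output buffering is the first thing I would check, since a buffer makes php hold the whole file before sending it.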

Security

I would also like to point out that my code snippets were not meant to be usable example code, but rather very brief outlines to help illustrate my ideas. As such, my code actually suffers from the filename/path security hole I “paid lip service to.” I assumed that you would be able to figure out how to write the code yourselves. Here is an old post on NotIan.net that illustrates the bad things that can happen if you include filenames as request parameters, but fail to check the integrity of said filename/path.
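For what it’s worth, one way to close that hole might look like the sketch below. It assumes all media lives in a single directory; resolve_media_path() and its arguments are hypothetical names, not part of my earlier snippets.

```php
<?php
// Hypothetical helper: map a requested file name to a real path inside $baseDir, or null.
function resolve_media_path(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }
    // basename() throws away directory components such as "../../etc/".
    $candidate = realpath($base . DIRECTORY_SEPARATOR . basename($requested));
    // realpath() returns false for missing files and resolves symlinks,
    // so the prefix check also rejects links that point outside $baseDir.
    if ($candidate === false || strpos($candidate, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return is_file($candidate) ? $candidate : null;
}
```

Looking the file up by a database id, as in the itemid parameter above, avoids the problem entirely because no file name ever comes from the request.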

Proxies

Apparently AOL (and some other ISPs) use a system of rotating proxies, in which each http request may be shunted through a different proxy server, i.e. a different IP address – even within the same page. This makes IP based filtering completely unreliable for those visitors. I’m unsure how much internet traffic is routed through this sort of system, and so I’m unsure how large a problem this might be.

A Lighter Approach

Fear not, there is another, similar approach that does not need IP addresses at all. The same concept of obfuscated keys can be applied to a system for expiring links based on time.

In the first protection scheme, we essentially expired any keys which did not match the current IP address. In the last example, I included a date() value. Well, it should be fairly obvious that if you drop the ip address shenanigans you’ll get a key which expires based on date. The meat of a key generation function that expires daily would look like this:

$key = md5(date('zY').'MYSECRETSTRING');

Re: MYSECRETSTRING
Adding a secret string makes the hash unique to your site and harder to duplicate. I’m no cryptologist, and my gut told me that a static token in every seed might make the hash easier to crack by giving an attacker a pattern to work from. As far as I can tell the opposite is the bigger risk: without the secret, anyone who guesses the date format can compute today’s key themselves, so keep the string private and make it long.

an important note:
Because our example uses hashing, your key generation cannot actually calculate an expiry DATE. The hash cannot be reversed, so your validation function cannot retrieve the expiry date to determine whether it has passed. The only thing it can do is evaluate whether the current hash matches the requested hash. For the same reason, the date format has to be exactly as precise as the duration of the key. An hourly key would have to use ‘HzY’ (24-hour format of an hour, day of year, year). If you use the 12-hour format of the hour instead, the key will be valid in both the morning and the afternoon. If you do not include the year, the key will be valid every year on the given day. And so on. Note too that such a key expires when the clock rolls over, not a fixed time after it was issued.
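A rough sketch of the generate-and-compare idea follows. The function names are hypothetical, the secret is a placeholder, and I’ve used gmdate() rather than date() so the result does not depend on the server’s time zone; hash_equals() is simply a safer way to compare two hashes.

```php
<?php
// Hypothetical helpers; the secret and the format strings are placeholders.
function make_key(string $secret, string $format, int $timestamp): string
{
    // Same recipe as the one-liner above: formatted date, then the secret, then md5().
    return md5(gmdate($format, $timestamp) . $secret);
}

function key_is_valid(string $requested, string $secret, string $format, ?int $now = null): bool
{
    // The hash cannot be reversed, so rebuild the current key and compare.
    return hash_equals(make_key($secret, $format, $now ?? time()), $requested);
}

$secret  = 'MYSECRETSTRING';
$morning = gmmktime(10, 0, 0, 3, 15, 2024); // 10:00 UTC
$evening = gmmktime(22, 0, 0, 3, 15, 2024); // 22:00 UTC, same day

$key = make_key($secret, 'HzY', $morning);
var_dump(key_is_valid($key, $secret, 'HzY', $morning + 3599)); // bool(true): still the same hour
var_dump(key_is_valid($key, $secret, 'HzY', $evening));        // bool(false): a different hour
// With the 12-hour 'h', the morning key is accepted again in the evening.
var_dump(key_is_valid(make_key($secret, 'hzY', $morning), $secret, 'hzY', $evening)); // bool(true)
```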

Conversely, if you had reason to only allow access to content at certain times of day, or on certain days, you could include ONLY that component in the date() format. A key built from ‘H’ alone, for example, is valid during the same hour every day.


Better Bandwidth Theft Protection

Bandwidth theft (sometimes referred to as hotlinking) is the bane of the internet, in some people’s opinions. Bandwidth theft is the practice of linking or embedding someone else’s content within your own page, without the owner’s permission. Posting l0lzzz ROFTLMAO images on forum message boards is probably the most common form of bandwidth theft, often costing innocent photobucket accounts. When theft also occurs with larger media such as mp3s or streaming video, it grossly increases bandwidth costs and reduces ad revenue for sites that might have advertisements on a “media player” page.

An obvious solution might be setting up a foolproof user authentication system, forcing users to log in before they can eat up your precious bandwidth. This may work well for porn sites and music stores, but it’s really not very practical – and somewhat annoying – on an average site.

HTTP_REFERER

A common method for locking down media content is the use of Apache’s rewrite engine in an .htaccess file to check the HTTP_REFERER, e.g.:

RewriteEngine On

RewriteCond %{REQUEST_FILENAME} .*jpg$|.*gif$|.*png$ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !yoursite\.com [NC]
RewriteCond %{HTTP_REFERER} !friendlysite\.com [NC]
RewriteCond %{HTTP_REFERER} !google\. [NC]
RewriteCond %{HTTP_REFERER} !search\?q=cache [NC]
# The conditions above do nothing without a rule after them. This one is an assumption:
# it answers 403 Forbidden; the article redirects to a php page instead.
RewriteRule .* - [F]

In fact, this is the only method I have been able to find online. And this does work very well in a lot of situations. A List Apart even has an elegant solution using php, redirecting the user to a nicely formatted page when a regular link is clicked. See: Smarter Image Hotlinking Prevention (the above code snippet was stolen from said article).

Streaming Media

A major problem arises when you need to protect your streaming media or any files which are not normally viewed by a web browser. When a media player like Real, Winamp or Windows Media Player requests a file, it does not send the HTTP_REFERER header to your webserver. [Although I suspect they might, I have not bothered to research whether embedded players send this header, as it is trivial to copy the src from an <embed> tag, then open it in a media player and share it with your friends on a message board.]

What we need is some form of anonymous authorization, to prove that a request is coming from our site. Authorization normally requires at least two pieces of information, a unique username and its corresponding password. But didn’t I say that logging in was annoying? Yes, yes I did. Read on.

Instead of a username we can use the one reasonably unique parameter of any internet connection: the ip address. And we will use our favorite web programming language to verify that a given ip address has visited our website. We could set up a fancy IP logging database – but this could potentially get out of hand very quickly. Why make your webserver do more than it has to?

A better solution

The solution I came up with tonight is an encrypted unique session id. This sort of thing is probably overly obvious to anyone who’s into cryptography, but I’m not, so I have no idea if this whole thing sounds really basic.

I’m getting a little ahead of myself.
The first thing we need to do is have php handle all media requests, something as simple as mediafiles.php?file=filename.ext. This way we obscure the location of the actual files, forcing users to link to our php. You could even use an id for a database record containing the fileinfo or the file itself, obviously. Just make sure to validate the file parameter so that you don’t reveal your complete server directory structure, allowing your site to be hacked on national tv (yes, i’m looking your way junos.ca). Now that we have php handling our file requests we can do some validation.

Next, we “encrypt” the IP address and add it to the request uri. This way we can use the IP address as both username and password. A couple of good “encryption” functions (strictly speaking, one-way hashes) are available in php, namely md5() and crypt(). Observe the following code:

media link:

<a href="mediafiles.php?file=filename.ext&amp;uid=<?php echo md5($_SERVER['REMOTE_ADDR']); ?>">hot video</a>

mediafiles.php code snippet:

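<?php
// Both parameters are required; bail out quietly otherwise.
if(!isset($_GET['file']) || !isset($_GET['uid'])){
    exit;
}

// Serve the file only if the uid matches the hash of the requesting ip.
if($_GET['uid'] == md5($_SERVER['REMOTE_ADDR'])){
    // Outline only: 'file' is not validated here, and include() evaluates the file as php.
    include('/path/to/'.$_GET['file']);
}
?>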

When some luser posts your link to their favorite forum it will look something like: http://www.yoursite.com/mediafiles.php?file=filename&uid=1f3870be274f6c49b3e31a0c6728957f. And when some other forum user clicks the link the md5 hash of their ip obviously will not match the original user’s.

There we have it. We’ve validated a user without any actual user validation.

In my opinion, simply using md5() is not quite good enough. It wouldn’t be too hard for an observant user to expose our little sleight of hand. If they recognize the md5 hash in a link, they may create an md5() hash of their own ip address and discover that they suddenly have access to our locked media. As added security, I suggest either creating your own pseudo-encryption method, or completely altering the ip address before md5()ing it. You might want to use the ip address to generate another number, or use only certain parts of the address.
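One off-the-shelf way to get the same effect, instead of inventing a pseudo-encryption method, is to mix in a secret that only the server knows. The sketch below swaps md5() for php’s hash_hmac(); make_uid(), uid_matches() and the secret are hypothetical names of my own.

```php
<?php
// Hypothetical helpers; $secret is a placeholder that must never appear in a page or url.
function make_uid(string $ip, string $secret): string
{
    // hash_hmac() mixes the secret into the hash, so knowing the ip alone is not enough.
    return hash_hmac('sha256', $ip, $secret);
}

function uid_matches(string $uid, string $ip, string $secret): bool
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals(make_uid($ip, $secret), $uid);
}
```

In the link you would echo make_uid($_SERVER['REMOTE_ADDR'], $secret), and mediafiles.php would call uid_matches() with the uid from the request.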

We can even cause the uid to expire by adding a date() value to our encryption function.

An example:

function encrypt_ip($ip){
   $quads = explode('.', $ip);
   $seed = $quads[0]+$quads[1]+$quads[2]+$quads[3] + date('z');
   return md5($seed);
}

To spell it out for you: if my ip address is 127.0.0.1, the quads sum to 128, and with date('z') at 213 today’s $seed value equals 341. This uid will expire tomorrow. Summation probably is not the best way to randomize the ip address, as it actually makes the ip less unique. As you can easily see, a number of other ip addresses sum to the same value. Someone who actually knows mathematics can tell you the odds. Note: whatever you do to generate $seed must not include any random numbers, or any values (besides the date) that will change – we need encrypt_ip() to consistently generate the same values for a given ip address.
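To make the collision problem concrete, here is a small comparison. Both functions are illustrative variants of my own, and the day of the year is passed in as a parameter (normally it would be (int) date('z')) so the example gives the same result on any day.

```php
<?php
// Hypothetical variants; $dayOfYear is passed in to keep the example repeatable.
function encrypt_ip_sum(string $ip, int $dayOfYear): string
{
    // Same idea as encrypt_ip() above: add the four quads and the day of the year.
    $seed = array_sum(array_map('intval', explode('.', $ip))) + $dayOfYear;
    return md5((string) $seed);
}

function encrypt_ip_full(string $ip, int $dayOfYear): string
{
    // Keeps every quad in its own position, so reordered addresses no longer collide.
    return md5($ip . '|' . $dayOfYear);
}

var_dump(encrypt_ip_sum('127.0.0.1', 213) === encrypt_ip_sum('1.0.0.127', 213));   // bool(true): both seeds are 341
var_dump(encrypt_ip_full('127.0.0.1', 213) === encrypt_ip_full('1.0.0.127', 213)); // bool(false)
```

The second variant is still guessable by anyone who works out the recipe, so it only fixes the collisions, not the md5() problem described earlier.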

I’ve spent far too much time writing this.

My next post will contain some suggestions for making this process a little more transparent.