better bandwidth protection: revisited

I meant to post this a couple of days after my initial bandwidth protection post, but alas, updating this site is usually the last thing on my mind.

Firstly, I glossed over something I probably should have explained in more detail: the PHP file masquerades as the media file. The media files themselves should not be in a web-accessible location, so that it is not possible for anyone to link directly to a media file. To accomplish this you first need to send the proper Content-Type header, to tell the client it’s receiving media and not a PHP file (webmaster-toolkit.com has a good list of MIME types). For instance, if you’re protecting a RealVideo file, you’d want:

header("Content-Type: application/vnd.rn-realmedia");

I’ve found that some browsers choke if they’re not given the proper file extension, so you’ll want to have .rm at the end of your request_uri, something like:

mediafiles.php?uid=uidstring&itemid=id&arbitraryvalue=somethingirrelevant.rm

In case it is not completely clear, your code does not need to do anything with the arbitrary value. It’s just there as a placeholder to tack the file extension onto.

Next you’ll need to pass on the contents of the media file (after doing database queries or whatever is necessary to figure out the file path). In my original example I used the include() function. That was actually a pretty bad choice: PHP evaluates the content of any file being include()d, so it will eat up some CPU cycles and potentially do really bad things if it happens to find a <? somewhere in the media data.

A function like readfile() would be a much better choice, since it passes the file through without evaluating it. Also, some feedback I received on digg suggests that PHP might bring your server to its knees if you try to process a file larger than 100MB in this manner. My testing on my PII 400 Fedora box did not turn up any problems, but it was far from scientific.
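For the simple case, a single readfile() call after the Content-Type header is enough. If large files do turn out to be a problem, one commonly used alternative is to read and echo the file in small chunks. Here is a rough sketch of that idea; the function name send_media_file(), its parameters, and the chunk size are made up for illustration, and it has not been tested under real load.

```php
<?php
// Sketch only: send_media_file() and the 8192-byte chunk size are
// assumptions for illustration, not part of the original example.

/**
 * Send a media file to the client one chunk at a time.
 * Returns the number of bytes that were sent.
 */
function send_media_file(string $path, string $mimeType, int $chunkSize = 8192): int
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return 0;
    }

    // Headers can only be set before any output has gone out.
    if (!headers_sent()) {
        header('Content-Type: ' . $mimeType);
        header('Content-Length: ' . filesize($path));
    }

    $sent = 0;
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            break;
        }
        echo $chunk;              // pass the bytes straight through, unevaluated
        $sent += strlen($chunk);
    }
    fclose($handle);

    return $sent;
}
```

A call would look something like send_media_file($pathFromDatabase, 'application/vnd.rn-realmedia'), where $pathFromDatabase stands in for whatever path your own lookup returns.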

Security

I would also like to point out that my code snippets were not meant to be usable example code, but rather very brief outlines to help illustrate my ideas. As such, my code actually suffers from the filename/path security hole I “paid lip service to.” I assumed that you would be able to figure out how to write the code yourselves. Here is an old post on NotIan.net that illustrates the bad things that can happen if you include filenames as request parameters but fail to check the integrity of said filename/path.
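Looking the path up from an item id in your database avoids the problem entirely. If a filename ever does arrive in the request, though, a check along the following lines is one way to refuse anything that escapes the media directory. This is only a sketch: resolve_media_path() and its parameters are hypothetical names, and your own rules may need to be stricter.

```php
<?php
// Sketch only: resolve_media_path() is a hypothetical helper.

/**
 * Return the real path of $requestedName inside $baseDir,
 * or null if it points anywhere else (or does not exist).
 */
function resolve_media_path(string $baseDir, string $requestedName): ?string
{
    // Reject embedded null bytes outright.
    if (strpos($requestedName, "\0") !== false) {
        return null;
    }

    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }

    // realpath() collapses "../" and follows symlinks; it returns
    // false when the target does not exist.
    $full = realpath($base . DIRECTORY_SEPARATOR . $requestedName);
    if ($full === false || !is_file($full)) {
        return null;
    }

    // Accept the file only if it still lives under the media directory.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if (strncmp($full, $prefix, strlen($prefix)) !== 0) {
        return null;
    }

    return $full;
}
```

With this in place, a request for something like ../../etc/passwd comes back as null instead of a path, and the script can simply send a 404.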

Proxies

Apparently AOL (and some other ISPs) use a system of rotating proxies, in which each HTTP request may be shunted through a different proxy server, i.e. a different IP address, even within the same page. This makes IP-based filtering completely unreliable for those visitors. I’m unsure how much internet traffic is routed through this sort of system, and so I’m unsure how large a problem this might be.

A Lighter Approach

Fear not, there is a similar approach that does not use IP addresses at all. The same concept of obfuscated keys can be applied to a system for expiring links based on time.

In the first protection scheme, we essentially expired any keys which did not match the current IP address. In the last example, I included a date() value. Well, it should be fairly obvious that if you drop the IP address shenanigans you’ll get a key which expires based on date. The meat of a key generation function that expires daily would look like this:

$key = md5(date('zY') . 'MYSECRETSTRING');

Re: MYSECRETSTRING
Adding a secret string makes the hash unique to your site and harder to duplicate: without it, anyone who guesses the date format can compute the key themselves. I’m no cryptologist, but my gut tells me that a static token in every seed might give an attacker a pattern to work from, so keep the string long and private.
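To make the idea a bit more concrete, here is a minimal sketch of a generate/validate pair for a daily key. The function names and the BWP_SECRET constant are placeholders of mine, and the optional timestamp parameter exists only so the behaviour can be checked for a given moment.

```php
<?php
// Sketch only: function names and BWP_SECRET are placeholders.

const BWP_SECRET = 'MYSECRETSTRING'; // replace with your own private value

/** Build the key for the day containing $timestamp (defaults to now). */
function make_daily_key(?int $timestamp = null): string
{
    $timestamp = $timestamp ?? time();
    // 'z' is the day of the year, 'Y' the four-digit year;
    // "." is PHP's string concatenation operator.
    return md5(date('zY', $timestamp) . BWP_SECRET);
}

/** A requested key is valid only if it matches the key for that day. */
function is_valid_daily_key(string $requestedKey, ?int $timestamp = null): bool
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals(make_daily_key($timestamp), $requestedKey);
}
```

The page that links to the media would embed make_daily_key() in the URL, and the media script would call is_valid_daily_key() on the value it receives before sending anything.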

An important note:
Because our example uses hashing, your key generation cannot actually calculate an expiry DATE. The hash cannot be reversed, so your validation function cannot retrieve the expiry date to determine whether it has passed. The only thing it can do is evaluate whether the current hash matches the requested hash. For this reason the date format has to be exactly as precise as the duration of the key. An hourly key would have to use ‘HzY’ (24-hour format of the hour, day of year, year). If you use the 12-hour format of the hour instead, the key will be valid in both the morning and the afternoon. If you do not include the year, the key will be valid every year on the given day. And so on.
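The effect of the format string is easy to check for yourself. In this small sketch, make_key() is a hypothetical helper that takes the timestamp as a parameter, so the same key can be computed for any moment in time:

```php
<?php
// Sketch only: make_key() is a hypothetical helper for experimenting
// with different date() formats.

function make_key(string $dateFormat, int $timestamp, string $secret): string
{
    return md5(date($dateFormat, $timestamp) . $secret);
}

$secret = 'MYSECRETSTRING';
$nineAm = mktime(9, 0, 0, 6, 1, 2024);
$ninePm = mktime(21, 0, 0, 6, 1, 2024);

// 'H' is the 24-hour clock, so the morning key is dead by the evening.
var_dump(make_key('HzY', $nineAm, $secret) === make_key('HzY', $ninePm, $secret)); // bool(false)

// 'h' is the 12-hour clock: 9am and 9pm both format as "09", so the keys collide.
var_dump(make_key('hzY', $nineAm, $secret) === make_key('hzY', $ninePm, $secret)); // bool(true)

// Without the year, January 1st produces the same key every year.
$newYear2024 = mktime(0, 0, 0, 1, 1, 2024);
$newYear2025 = mktime(0, 0, 0, 1, 1, 2025);
var_dump(make_key('z', $newYear2024, $secret) === make_key('z', $newYear2025, $secret)); // bool(true)
```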

Conversely, if you had reason to only allow access to content at certain times of day, or on certain days, you could specify ONLY that date() parameter.
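As a rough illustration, a key built from nothing but the day of the week ('D') keeps working on that weekday, week after week. The function names below are hypothetical, and the timestamp is passed in only so the result can be checked for a specific day.

```php
<?php
// Sketch only: both function names are hypothetical.

/** Build a key from a literal date value such as "Sat". */
function make_recurring_key(string $dateValue, string $secret): string
{
    return md5($dateValue . $secret);
}

/** Check a key against whatever $dateFormat yields at $timestamp. */
function is_valid_recurring_key(
    string $requestedKey,
    string $dateFormat,
    int $timestamp,
    string $secret
): bool {
    $expected = make_recurring_key(date($dateFormat, $timestamp), $secret);
    return hash_equals($expected, $requestedKey);
}

$secret = 'MYSECRETSTRING';

// 'D' formats Saturday as "Sat", so this key matches on any Saturday.
$saturdayKey = make_recurring_key('Sat', $secret);
```

Swapping 'D' for 'H' would give a key that only works during one particular hour of the day, every day.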