The Cost Of Popularity

[Image: Bandwidth usage during April 2006 for http://www.lattimore.id.au]

Halfway through last month, I upgraded the version of WordPress that powers this site. During the upgrade process, I neglected to update the .htaccess file with additional configuration options. For those unaware, the .htaccess file is a plain text file used by the Apache web server to allow per-site, per-directory configuration of the web server.

As most are aware, hosting a web site generally costs money. Like most things which cost money, the people paying the bills generally like to get as much bang for their buck as possible. Unfortunately, there are people who take it upon themselves to reduce the bang, to the point where it ends up costing the site owner more money to host their site.

[Image: Bandwidth usage during May 2006 for http://www.lattimore.id.au]

The scenario I speak of very nearly happened to this site over the last fortnight. The additional configuration options which I neglected to add back into the .htaccess file were used to stop people linking directly to my images from another website; it’s generally referred to as hot linking.

Once I had realised that I hadn’t restored those options, I immediately checked my traffic logs to see whether it had done any damage yet. At that point, somewhere around the 12th of April, there was no noticeable impact, so as a test I thought I would allow Google access to the images via http://images.google.com.au. A few days later the impact was clear: daily data consumption had increased from approximately 60MB to 200MB.

For the first week of May, daily consumption carried on from April. From the 8th of May onwards, something changed dramatically, as the site started to push out literally double my previous daily peak! If the traffic trend from April continued for an entire month (~200MB/day), I would exceed my monthly hosting plan and be billed an additional AU$600! When I checked how much traffic my site had been moving in the last couple of days, I was shocked to find that if that trend continued (> 400MB/day), I’d be billed in excess of AU$1800 for a month of hosting!
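To put those daily figures into monthly terms, here is a rough Python sketch. It assumes a 30-day month and 1000MB to the gigabyte, and the function name is made up for illustration. It only projects volume; the plan allowance and excess rate behind the dollar figures aren't stated here, so they are left out.

```python
DAYS_IN_MONTH = 30  # assumption: a round 30-day billing month


def projected_monthly_mb(daily_mb: float, days: int = DAYS_IN_MONTH) -> float:
    """Project a month's transfer (in MB) from a steady daily figure."""
    return daily_mb * days


# Daily figures quoted in the post: before, the April trend, and the May trend.
for label, daily in [("before", 60), ("April trend", 200), ("May trend", 400)]:
    gigabytes = projected_monthly_mb(daily) / 1000
    print(f"{label}: {gigabytes:.1f} GB/month")
```

On those assumptions, roughly 1.8GB a month becomes 6GB at the April rate and 12GB at the May rate.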

To make sure this doesn’t happen again, I have re-enabled the hot linking protection through the .htaccess file. If you’re having a similar problem, you could achieve a similar outcome as follows:

  1. <IfModule mod_rewrite.c>
  2. RewriteEngine On
  3. RewriteBase /
  4. RewriteCond %{HTTP_REFERER} !^$
  5. RewriteCond %{HTTP_REFERER} !^http://(www\.)?lattimore\.id\.au [NC]
  6. RewriteRule \.(gif|jpe?g)$ - [NC,F]
  7. </IfModule>

Each line of the previous block is explained as follows:

  1. Check if the mod_rewrite module is available in Apache
  2. Enable the rewrite engine
  3. Set the base to the root of the domain
  4. Check that the referring information sent by your browser is not blank
  5. Check that the referring information sent by your browser does not start with http://lattimore.id.au or http://www.lattimore.id.au (case insensitive)
  6. Do a case insensitive check for whether the request ends in .gif, .jpg or .jpeg and, if so, return a HTTP Forbidden to deny access
  7. Close the block opened on line 1

In a language that makes sense, it says that if you’re trying to access images on my site and you aren’t viewing them from my site or entering the URL directly into your browser, you will be denied access to the image. So, if you were to link directly to one of my images from another website, the referring information won’t be blank and it won’t be this site, so you won’t be given access to the image.
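If you’d like to try that logic out without touching a live server, it can be mimicked in a few lines of Python. This is only a sketch of the decision described above, not how Apache works internally: the function name is hypothetical, the example referrers are placeholders, and the two regular expressions are copied from the block with the [NC] flag treated as a case insensitive match.

```python
import re

# Patterns copied from the .htaccess block; [NC] is modelled with re.IGNORECASE.
OWN_SITE = re.compile(r"^http://(www\.)?lattimore\.id\.au", re.IGNORECASE)
IMAGE_PATH = re.compile(r"\.(gif|jpe?g)$", re.IGNORECASE)


def is_forbidden(path: str, referer: str) -> bool:
    """Return True when the rules would answer the request with 403 Forbidden."""
    if not IMAGE_PATH.search(path):
        return False  # not a .gif, .jpg or .jpeg request, so the rule never applies
    if referer == "":
        return False  # blank referrer (URL typed directly) is allowed
    if OWN_SITE.search(referer):
        return False  # the image is being viewed from this site
    return True  # an image request referred from somewhere else


print(is_forbidden("images/photo.jpg", ""))
print(is_forbidden("images/photo.jpg", "http://www.lattimore.id.au/"))
print(is_forbidden("images/photo.jpg", "http://example.com/gallery.html"))
```

The first two calls print False and the third prints True, which matches the plain-language description: only an image requested with a non-blank referrer from another site is refused.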

If you did want to give Google Images access to your files, you could insert the two lines below after line 5 above:

  1. RewriteCond %{HTTP_REFERER} !google\. [NC]
  2. RewriteCond %{HTTP_REFERER} !search\?q=cache [NC]
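To see what those two extra conditions let through, here is a small Python sketch of the combined checks. As before, it is an approximation rather than Apache’s own behaviour; the function name and the example referrers are invented for illustration, and each pattern is copied from a RewriteCond line with [NC] treated as case insensitive.

```python
import re

# One pattern per RewriteCond; a request is refused only if NONE of them match.
ALLOWED_REFERERS = [
    re.compile(r"^$"),                                                # blank referrer
    re.compile(r"^http://(www\.)?lattimore\.id\.au", re.IGNORECASE),  # this site
    re.compile(r"google\.", re.IGNORECASE),                           # Google pages
    re.compile(r"search\?q=cache", re.IGNORECASE),                    # cached copies
]
IMAGE_PATH = re.compile(r"\.(gif|jpe?g)$", re.IGNORECASE)


def is_forbidden(path: str, referer: str) -> bool:
    """Return True when the extended rules would answer with 403 Forbidden."""
    if not IMAGE_PATH.search(path):
        return False  # the rule only applies to .gif, .jpg and .jpeg requests
    return not any(pattern.search(referer) for pattern in ALLOWED_REFERERS)


print(is_forbidden("photo.jpg", "http://images.google.com.au/imgres?q=x"))
print(is_forbidden("photo.jpg", "http://example.com/gallery.html"))
print(is_forbidden("photo.jpg", "http://example.com/?ref=google.com"))
```

This prints False, True and False. The last case is worth noting: because the google\. pattern isn’t anchored, any referrer that merely contains “google.” somewhere in it is let through as well.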

Traffic to your website is generally “a good thing”™. However, if you’re the one paying the bills, you’ll be well served to keep an eye on your website hosting statistics to make sure you aren’t going to get a nasty surprise.

4 thoughts on “The Cost Of Popularity”

  1. correction for your explanation of the regex:
    doesnt jpe?g mean “jpg or jpeg” and not “jpe or jpeg” as you have indicated in your post?

  2. Jake,

    You’re right – I didn’t notice that I’d written that in the explanation. The RegEx is intended to match jpg or jpeg.

    I’ve amended point #6.

    Al.
