Although some people favour GZIP compression (compressing web pages on the fly; browser support is sketchy) over the various caching methods (both client side and server side), I would have to recommend using both. That is, if you can afford the overhead created by compressing each file on the fly; to put it into perspective, Amazon can't, I can. Which ballpark do you belong to?
GZIP compression comes as a free Apache module, surprisingly named mod_gzip. It seems easy enough to install and get running but I've never done it so distance travelled may vary. Anyway, mod_gzip works by compressing textual files like (X)HTML, CSS and XML on the server, with the browser decompressing them client side, all in the blink of an eye. It is claimed that a compression rate of up to 80% can be achieved but I usually find that I get a compression rate of around 40-50% which, although smaller, still makes it a worthwhile venture.
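As I said, I haven't installed mod_gzip myself, but as a rough illustration of the same on-the-fly compression done at the PHP level instead, output buffering with ob_gzhandler compresses a script's output before it leaves the server, provided the browser advertises gzip support (the page content here is just a placeholder):

<?php
// Sketch of on-the-fly compression in PHP rather than mod_gzip.
// ob_gzhandler checks the browser's Accept-Encoding header itself and
// only compresses the output if gzip (or deflate) is supported.
ob_start('ob_gzhandler');
?>
<html>
<head><title>Compressed page</title></head>
<body>
<p>This markup is gzipped before it is sent, if the browser can handle it.</p>
</body>
</html>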
The real benefit of using GZIP compression becomes apparent when you use it in conjunction with client side caching, invoked through ETags and the HTTP 304 Not Modified response. GZIP compression reduces the initial download of the file, but if the client refreshes, the page is resent even if it is unmodified. This eats away at the bandwidth you saved by compressing your files. A much better scenario would have the browser check whether the file has changed and re-download it only if necessary.
This can be achieved using HTTP headers, the little snippets of text which hold information about the requested file. Conveniently, they are sent along with the requested file.
PHP's header() function allows us to insert or overwrite the headers which are sent to the client along with the file. As the headers must be sent before the file, the calls must be placed at the top of the script or PHP will throw errors around like a toddler (let's not get into output buffering). So now that we can edit the headers, we can insert an ETag, which is basically a generated hash that changes each time the file is updated. Not a bad indicator for checking whether the file has changed, then. A few if/elses later and you've hacked up an ETag-based caching mechanism, much like Jordan Russell's I imagine.
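For the curious, a minimal sketch of what such an ETag check might look like; the file name is just an example, and a real setup would generate the hash from whatever your page is actually built from:

<?php
// Sketch of an ETag-based cache check. "content.html" is a placeholder.
$file = 'content.html';

// Generate an ETag that changes whenever the file changes.
$etag = '"' . md5_file($file) . '"';

// Headers must go out before any of the page itself.
header('ETag: ' . $etag);

// If the client already holds this exact version, tell it to use its copy.
if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
    trim($_SERVER['HTTP_IF_NONE_MATCH']) === $etag) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

// Otherwise send the full page as normal.
readfile($file);
?>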
Hope you have a nice day tomorrow. A barrel of laughs shall be had by all. That's an order. A time for giving and a time for receiving. Have a good one.
Following on from yesterday's piece about client side caching, we can explore the ingenious world that is server side caching. Buckle your socks (or something like that), we're going in.

First of all, why would you want to cache something server side? There is no bandwidth to be saved, right? Correct, but something of equal importance can easily be affected depending on how busy the server is: server overhead. Both database overhead and script processing overhead can lead to timeouts, incorrect data and ultimately ugly error messages for your users.

A great example of this was on Simon Willison's weblog. A few months ago he went from being a popular weblogger to a very popular weblogger, which of course meant that his pages were being read more often by more people than ever before. His homebrew weblogging system couldn't take the increased demand, and database connection errors plagued his site. So why did that happen? Well, as his homebrew weblogging system used a MySQL database to store posts and comments, each time somebody accessed a page the script would have to go and fetch the data from the MySQL database, format it and then display it. This works well if only a few people are accessing the page, but once it reaches the limit all hell breaks loose.

Imagine the MySQL database as a house and the database table as a room in the house. Each time somebody requests a web page, a little man has to run into the house, climb the stairs, run along the corridor and enter the room. He then has to search through the thousands of items in the room, finally locating the one he is after. Now he runs back down the corridor, down the stairs and out the door. Finally he gets outside, where the angry crowd ask, "What took you so long?". Don't you feel sorry for the little man?

This can all be avoided by creating a server side cache. For example, if you knew a page only changed once a day, you could have the little man fetch it once a day and photocopy it outside on the conveniently placed photocopier. Now all he has to do is press a button, wait a second or two and hand it out. Much simpler than all that running. As Simon is an ingenious programmer, he came up with a simple and effective method for caching his pages. He wrote about it, of course.
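A minimal sketch of one way to do this in PHP, not necessarily Simon's exact method; the cache path and the build_page() helper are placeholders for whatever expensive work your own pages do:

<?php
// Stand-in for the expensive work: database queries, formatting, etc.
function build_page() {
    return '<html><body><p>Built at ' . date('r') . '</p></body></html>';
}

$cache_file = '/tmp/homepage.cache'; // the photocopy
$max_age    = 24 * 60 * 60;          // one day, in seconds

if (file_exists($cache_file) && (time() - filemtime($cache_file)) < $max_age) {
    // Fresh enough: hand out the photocopy, no trip into the house needed.
    readfile($cache_file);
} else {
    // Stale or missing: do the expensive work once...
    $html = build_page();

    // ...store the photocopy for the next visitor, then send it.
    $fp = fopen($cache_file, 'w');
    fwrite($fp, $html);
    fclose($fp);

    echo $html;
}
?>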
Safari, Apple's very own little browser, is not perfect. In fact it is quite some way off being perfect. Regular crashes, connection timeouts and lame tab support aside, Safari has one major problem and that is with caching (update: this seems to have been fixed in 10.3.2, but as the rest of the post is still relevant...).

Now caching is an ancient, lost and forgotten art. Many people have harped on about the importance of caching, but I feel it needs to be said again. Caching in its simplest form (when taken in a file retrieval context) works like so: a user requests a file from a server. The user's client checks for a local copy of the file and then does one of two things: either it finds it and displays it, or it doesn't find it and goes and collects it from the server. Now this is where it gets a little more tricky. What if it does find a copy of the file locally but the remote file has been updated? Well, the client must allow for this with one of the following methods.
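By way of illustration, one common way a server and client handle that last case is the Last-Modified / If-Modified-Since exchange; a rough PHP sketch of the server's side of it (the file name is just an example):

<?php
// Sketch of the Last-Modified / If-Modified-Since exchange.
$file          = 'content.html'; // placeholder
$last_modified = filemtime($file);

// Tell the client when this copy last changed.
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $last_modified) . ' GMT');

// If the client's cached copy is at least as new as ours, don't resend it.
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $last_modified) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

readfile($file);
?>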
"If only those Google ads were gone we would be so close to perfection..." -- 9 Rules.The ads are staying, indefinitely. To be almost close to perfection is good enough for me.