Category: code

  • SSL Certificate Expiry Warning Script

    With SSL becoming ever more widespread on the web (Google ranks SSL sites higher, and you can have your site added to the HSTS preload list, so that browsers try HTTPS before falling back to HTTP), it is a good idea to start using SSL yourself.

    The downside: you need to get a certificate from a CA, or certificate authority, that browsers already trust. This usually costs money, though there are some services that give you a certificate for free. The free certificates only last for one year or less, which means you need to request and install a new certificate frequently, especially when you have multiple domains.

    Now it can happen to anyone, even Microsoft (Windows Azure Service Disruption from Expired Certificate), that you forget to renew (and update) your certificate in time.

    There is a nice service (interestingly enough not served over HTTPS) that will send you an e-mail when a certificate is due to be renewed. But as with any web service, you can unfortunately never be sure how long it’s going to live.

    So I have created a script that I run through a cronjob every day; it sends me a notification e-mail several times in advance (1, 2, 7, 14, 30, and 60 days ahead), so that you are not dependent on a third party to get notified about expiries. As it should be with cronjobs, there is no output when there is nothing to report (and thus no e-mail).

    Here is the script:

    # temporary file collecting the expiry dates
    CertExpiries=$(mktemp)

    for i in /etc/certificates/*.pem; do
    	echo $(basename $i): $(openssl x509 -in $i -inform PEM -text -noout -enddate | grep "Not After" | tail -1 | awk '{print $4, $5, $7}') >> $CertExpiries
    done

    Date=$(date -ud "+1 day" | awk '{print $2, $3, $6}')
    Expiries=$(grep "$Date" $CertExpiries)
    if [ $? -eq 0 ]; then
    	echo These Certificates expire TOMORROW!
    	echo "$Expiries"
    fi

    for i in 2 7 14 30 60; do
    	Date=$(date -ud "+$i day" | awk '{print $2, $3, $6}')
    	Expiries=$(grep "$Date" $CertExpiries)
    	if [ $? -eq 0 ]; then
    		echo These Certificates expire in $i days:
    		echo "$Expiries"
    	fi
    done

    rm $CertExpiries
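    The same date-matching idea can be sketched in Python (a hypothetical helper for illustration, not part of the script above; file names and dates are made up):

```python
from datetime import date, timedelta

WARN_DAYS = [1, 2, 7, 14, 30, 60]

def expiry_warnings(expiries, today):
    """expiries: mapping of certificate name -> expiry date.
    Returns {days_ahead: [names]} for certificates that hit a warning offset."""
    warnings = {}
    for days in WARN_DAYS:
        target = today + timedelta(days=days)
        names = [name for name, d in expiries.items() if d == target]
        if names:  # like the cron script: stay silent when there is nothing to report
            warnings[days] = names
    return warnings

certs = {"example.pem": date(2015, 1, 2), "blog.pem": date(2015, 3, 2)}
print(expiry_warnings(certs, date(2015, 1, 1)))
# → {1: ['example.pem'], 60: ['blog.pem']}
```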

  • Add a Rate Limit to Your Website

    Suppose you have a resource on the web (for example an API) that either generates a lot of load or is prone to abuse through excessive use; then you want to rate-limit it. That is, only a certain number of requests is allowed per time period.

    A possible way to do this is to use Memcache to record the number of requests received per a certain time period.

    Task: Only allow 1000 requests per 5 minutes

    First attempt:
    The naive approach would be to have a key rate-limit-$ip (where $ip would be the client’s IP address) with an expiration time of 5 minutes (aka 300 seconds) and increment it with every request. But consider this:

    10:00: 250 reqs -> value 250
    10:02: 500 reqs -> value 750
    10:04: 250 reqs -> value 1000
    10:06: 100 reqs -> value 1250 -> fails! (though there were only 850 requests in the last 5 minutes)

    What’s the problem?

    Memcache renews the expiration time with every set.

    Second attempt:
    Have a new key every 5 minutes: rate-limit-${minutes modulo 5}. This circumvents the key-expiration problem but creates another one:

    10:00: 250 reqs -> value 250
    10:02: 500 reqs -> value 750
    10:04: 250 reqs -> value 1000
    10:06: 300 reqs -> value 300 -> doesn’t fail! (though there were 1050 requests in the last 5 minutes)

    Third attempt:
    Store the value for each minute separately: rate-limit-$hour$minute. When checking, query all the keys of the last 5 minutes to calculate the number of requests in that window.

    Sample code:

    $requests = 0;
    foreach ($this->getKeys($minutes) as $key) {
        $requests += $this->memcache->get($key);
    }
    $this->memcache->increment($key, 1);
    if ($requests > $allowedRequests) throw new RateExceededException;
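    For illustration, the same sliding-window logic can be sketched in Python, with a plain dict standing in for Memcache (class and key names are my own, not the php-ratelimiter API):

```python
class RateLimiter:
    """Minimal sliding-window counter; a dict stands in for Memcache."""
    def __init__(self, allowed, window_minutes=5):
        self.allowed = allowed
        self.window = window_minutes
        self.counts = {}  # "rate-limit-<minute>" -> request count

    def hit(self, minute):
        # sum the counters of the last <window> minutes, then count this request
        keys = ["rate-limit-%d" % (minute - i) for i in range(self.window)]
        requests = sum(self.counts.get(k, 0) for k in keys)
        self.counts[keys[0]] = self.counts.get(keys[0], 0) + 1
        return requests < self.allowed  # True -> request is allowed

rl = RateLimiter(allowed=1000)
# replay the example from the text (minutes of the day: 600 = 10:00)
for minute, reqs in [(600, 250), (602, 500), (604, 250)]:
    for _ in range(reqs):
        rl.hit(minute)
# at 10:06 the 250 requests from 10:00 have dropped out of the 5-minute window
print(rl.hit(606))  # → True: only 750 requests counted in the last 5 minutes
```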

    For your convenience I have open sourced my code at github: php-ratelimiter.

  • munin smart plugin: ignore error in the past

    As a hard drive in my server failed, my hosting provider exchanged the drive for another one, which obviously had some sort of error in its past but now seems to be fully OK again. I would have wished for a drive without any problems, but as my server uses RAID 1, I can live with that.

    I do my monitoring with Munin, and for monitoring my hard drives I use the smart plugin. This plugin also monitors the exit code of smartctl. smartctl sets bit no. 6 of its exit code if there was an error in the past, so although everything is alright now, the exit code is always 64.

    The smart plugin treats any exit code > 0 as an error, i.e. it now always reports a problem.

    I could set the threshold to 65, but then I wouldn’t be notified of other errors which essentially makes the plugin useless.

    I asked at Serverfault but no one seems to have a solution for that.

    So I attacked the problem on my own and patched the plugin. In the source code the important line is here:

    if exit_status!=None :
        # smartctl exit code is a bitmask, check man page.

    which I have modified to look like this:

    if exit_status!=None :
        # smartctl exit code is a bitmask, check man page.
        # filter out bit 6
        num_exit_status &= 191
        if num_exit_status<=2 : exit_status=None

    Now it doesn’t bug me anymore when bit 6 is set, but if any other bit goes on again, I will still be notified. The most interesting part is the line with the bitwise operation with 191: this is 10111111 in binary, so doing an AND operation with the current value sets bit 6 to 0 while leaving the other bits untouched.

    Therefore a value of 64 (as my drive reports) is reported as 0, while a value of 8 would remain at 8. And, very importantly, a value of 72 (bit 6 set as always, plus bit 3 set because the disk is failing) would also be reported as 8.
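    The arithmetic is easy to check on its own (values taken from the examples above):

```python
MASK = 0b10111111  # decimal 191: every bit set except bit 6 (value 64)

print(64 & MASK)  # → 0: "error in the past" alone is masked away
print( 8 & MASK)  # → 8: a real problem bit passes through unchanged
print(72 & MASK)  # → 8: 64 + 8, the stale bit is dropped, the real one stays
```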

    And there we have another reason why it is good to have firm knowledge of how bits and bytes behave in a computer. Saved me from a warning message every 5 minutes :-)

  • preg_match, UTF-8 and whitespace

    Just a quick note: be careful when using the whitespace escape \s in preg_match when operating on UTF-8 strings.

    Suppose you have a string containing a dagger symbol. When you try to strip all whitespace from the string like this, you will end up with an invalid UTF-8 character:

    $ php -r 'echo preg_replace("#\s#", "", "†");' | xxd
    0000000: e280

    (On a side note: xxd displays all bytes in hexadecimal representation. The resulting string here consists of two bytes e2 and 80)

    \s stripped away the a0 byte. I was unaware that this character was included in the whitespace list, but actually it represents the non-breaking space.

    So make sure to use the u (PCRE_UTF8) modifier, as it makes the engine aware that the a0 byte “belongs” to the dagger:

    $ php -r 'echo preg_replace("#\s#u", "", "†");' | xxd
    0000000: e280 a0

    By the way, trim() doesn’t strip non-breaking spaces and can therefore safely be used on UTF-8 strings. (If you still want to trim non-breaking spaces with trim, read this comment.)

    Finally, here you can see which of the first 255 characters are matched by \s, without and with the u modifier.

    $ php -r '$i = 0; while (++$i < 256) echo preg_replace("#[^\s]#", "", chr($i));' | xxd
    0000000: 090a 0c0d 2085 a0
    $ php -r '$i = 0; while (++$i < 256) echo preg_replace("#[^\s]#u", "", chr($i));' | xxd
    0000000: 090a 0c0d 20

    Functions operating just on the ASCII characters (with a byte code below 128) are generally safe, as the multi-byte characters of UTF-8 have a leading bit of one (and are therefore above 128).
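    The corruption can be reproduced at the byte level in Python (a sketch of what the byte-wise \s strip effectively did; it assumes nothing beyond the UTF-8 encoding of the dagger):

```python
s = "\u2020"                       # the dagger sign (†)
b = s.encode("utf-8")
assert b == b"\xe2\x80\xa0"        # three bytes, the last one is 0xa0

broken = b.replace(b"\xa0", b"")   # strip 0xa0 as if it were plain whitespace
assert broken == b"\xe2\x80"
try:
    broken.decode("utf-8")
except UnicodeDecodeError:
    print("e2 80 alone is not valid UTF-8")
```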

  • Restoring single objects in mongodb

    Today I had the need to restore single objects from a mongodb installation. mongodb offers two tools for this, mongodump and mongorestore, both of which seem to be designed to only dump and restore whole collections.

    So I’ll demonstrate the workflow just to restore a bunch of objects. Maybe it’s a clumsy way, but we’ll improve this over time.

    So, we have an existing backup, done with mongodump (for example through a daily, over-night backup). This consists of several .bson files, one for each collection.

    1. Restore a whole collection to a new database: mongorestore -d newdb collection.bson
    2. Open this database: mongo newdb
    3. Find the items you want to restore through a query, for example: db.collection.find({"_id": {"$gte": ObjectId("4da4231c747359d16c370000")}});
    4. Back on the command line again, just dump these lines to a new bson file: mongodump -d newdb -c collection -q '{"_id": {"$gte": ObjectId("4da4231c747359d16c370000")}}'
    5. Now you can finally import just those objects into your existing collection: mongorestore -d realdb dump/newdb/collection.bson

  • trac Report for Feature Voting

    I use trac for quite a few projects of mine. Recently I tried to find a plugin for deciding which features to implement next. Usually trac hacks has something in store for that, but not this time.

    I wanted to be able to create a ticket and then collect user feedback as comments for the feature, with each piece of feedback being a vote for that feature.

    After searching for a bit I came up with a solution by using just a report with a nicely constructed SQL query.

    SELECT p.value AS __color__,
       t.type AS `type`, id AS ticket, count(tc.ticket) as votes, summary, component, version, milestone,
       t.time AS created,
       changetime AS _changetime, description AS _description,
       reporter AS _reporter
      FROM ticket t, ticket_change tc, enum p
      WHERE t.status <> 'closed'
    AND tc.ticket = t.id and tc.field = 'comment' and tc.newvalue like '%#vote%'
    AND p.name = t.priority AND p.type = 'priority'
    GROUP BY id, summary, component, version, milestone, t.type, owner, t.time,
      changetime, description, reporter, p.value, status
    HAVING count(tc.ticket) >= 1
     ORDER BY votes DESC, milestone, t.type, t.time

    So just by including “#vote” in a comment, it counts towards the number of votes. You can of course change this marker text to anything you want.

    I hope this can be useful for someone else, too.

  • iOS 2011 Alarm Clock Bug

    Just to add to the speculation about the causes of the 2011 alarm clock bug of iOS where the one-time alarms would not activate on January 1 and January 2, 2011.

    My guess is that the code that sets off the alarm takes day, month and year into account when checking whether the alarm should go off. But instead of the “normal” year, an ISO-8601 week-numbering year could have been used. That year is determined by the week a day belongs to (with Monday as the first day of the week): week 52 lasted from December 27, 2010 to January 2, 2011, so for those days the ISO year remains 2010.

    When setting the date to January 1, 2012, the alarm doesn’t go off either (week 52 of 2011). This supports my theory and also means that this wasn’t a one-time issue and requires a bug fix by Apple.
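    Python’s datetime module exposes the same ISO week-numbering year, so the theory is easy to check (a quick sketch, not Apple’s actual code):

```python
from datetime import date

# isocalendar() returns (ISO year, ISO week, ISO weekday)
print(date(2011, 1, 1).isocalendar()[:2])  # → (2010, 52): ISO year is still 2010
print(date(2012, 1, 1).isocalendar()[:2])  # → (2011, 52): same effect a year later
print(date(2011, 1, 3).isocalendar()[:2])  # → (2011, 1): that Monday starts week 1
```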

  • Debugging PHP on Mac OS X


    I have been using Mac OS X as my primary operating system for a few years now, and only today did I find a very neat way to debug PHP code the way it is common for application code (i.e. stepping through code for debugging purposes).

    The solution is a combination of Xdebug and MacGDBp.


    I am using the PHP package by Marc Liyanage almost ever since I have been working on OS X, because it’s far more flexible than the PHP shipped with OS X.

    Unfortunately, installing Xdebug the usual way via pecl install xdebug doesn’t work. But on the internetz you can find a solution to this problem.

    Basically you need to download the source tarball and configure it with the magic command CFLAGS='-arch x86_64' ./configure --enable-xdebug. (The same works for installing APC, by the way.)

    /usr/local/php5/php.d $ cat 50-extension-xdebug.ini


    Now you can use MacGDBp. There is an article on Particletree that describes the interface in a little more detail.

    I really enjoy this method: I only fire up this external program when I want to debug some PHP code, and can continue to use my small editor, so I don’t have to switch to a huge IDE to accomplish the same.

  • Website Optimization, a book by Andrew B. King

    Website Optimization

    This time I’m reviewing a book by Andy King. Unlike High Performance Web Sites by Steve Souders, it doesn’t solely focus on the speed side of optimization, but adds the art of Search Engine Optimization to form a compelling mix in a single book.

    If you have a website that underperforms your expectations, this single book can be your one-stop shop to get all the knowledge you need.

    Andy uses interesting examples of how he succeeded in improving his clients’ pages that illustrate well what he describes in theory before. He not only focuses on how to make your website show up at high ranks in search engines (what he calls “natural SEO”), but also discusses in detail how to use pay per click (PPC) ads to drive even more people to one’s site. I especially liked how Andy describes how to find the best keywords to pick and also describes how to monitor success of PPC.

    The part about optimization for speed feels a little too separate in the book. It is a good read and provides content similar to Steve Souders’ book, though the level of detail feels a little awkward considering how different the audience for the SEO part is. Still, programmers can easily gain deep knowledge about how to make a page load fast.

    Unfortunately Andy missed out a little on bringing this all into the grand picture. Why would I want to follow not only SEO but also optimize the speed of the page? There is a chapter meant to “bridge” the topics, but it turns out to be about how to properly do statistics and use the correct metrics. Important, but not enough to really connect the topics (and actually I would have expected this bridging beforehand).

    Altogether I would have structured things a little differently. For example: it’s the content that makes search engines find the page and makes people return to it, yet Andy explains how to pick the right keywords for the content first and tells the reader how to create that content only afterwards.
    Everything is there; I had just hoped for a different organization of things.

    All in all, the book really deserves the broad title “Website Optimization.” Other books leave out SEO which usually is the thing that people mean when they want to optimize their websites (or have them optimized).

    I really liked that the topics are combined in one book, and I highly recommend it to everyone who wants to get his or her website in shape.

    The book has been published by O’Reilly in July 2008, ISBN 9780596515089. Also take a look at the Website Optimization Secrets companion site.

    Thanks to Andy for providing me a review copy of this book.

  • Upgrade WordPress Script

    Whenever a new version of WordPress comes out (as just WordPress 2.6 did), it is somewhat of a pain to upgrade it.

    But not for me anymore, because I have created a small (and simple) script some versions ago which I would like to share with you.

    $ cat
    mv www wordpress
    wget http://wordpress.org/latest.tar.gz
    tar --overwrite -xzf latest.tar.gz
    rm latest.tar.gz
    mv wordpress www
    www is my document root and the script sits outside of it. It downloads the most recent version and extracts it, overwriting the already existing files. The script doesn’t contain anything extraordinary, but it makes upgrading really easy.

    Of course this script is only useful if you have SSH access to your web server, but if you do, it might ease the (almost too frequent) pain of upgrading WordPress.

  • bash completion for the pear command

    I am only scratching my own itch here, but maybe someone can use it or expand from it.

    I always found it annoying that pear run-tests <tab> offers all files instead of just *.phpt files. That is what this snippet fixes.

    Paste this into the file /opt/local/etc/bash_completion on OSX (for me it is just before _filedir_xspec()) or into a new file /etc/bash_completion.d/pear on Debian.

    # pear completion
    have pear &&
    _pear()
    {
        local cur prev commands options command

        COMPREPLY=()
        cur=${COMP_WORDS[COMP_CWORD]}
        prev=${COMP_WORDS[COMP_CWORD-1]}

        commands='build bundle channel-add channel-alias channel-delete channel-discover channel-info channel-update clear-cache config-create config-get config-help config-set config-show convert cvsdiff cvstag download download-all info install list list-all list-channels list-files list-upgrades login logout makerpm package package-dependencies package-validate pickle remote-info remote-list run-scripts run-tests search shell-test sign uninstall update-channels upgrade upgrade-all'

        if [[ $COMP_CWORD -eq 1 ]] ; then
            if [[ "$cur" == -* ]]; then
                COMPREPLY=( $( compgen -W '-V' -- $cur ) )
            else
                COMPREPLY=( $( compgen -W "$commands" -- $cur ) )
            fi
        else
            command=${COMP_WORDS[1]}
            case $command in
            run-tests)
                # only offer *.phpt files for run-tests
                _filedir 'phpt'
                ;;
            *)
                _filedir
                ;;
            esac
        fi

        return 0
    } &&
    complete -F _pear $default pear

    Then re-source your bashrc or logout and re-login.

    I am far from being an expert in bash_completion programming, so I hope someone can go on from here (or maybe has something more complete lying around?).

  • High Performance Web Sites, a book by Steve Souders

    I’d like to introduce you to this great book by Steve Souders. There already have been several reports on the Internet about it, for example on the Yahoo Developers Blog. There is also a video of Steve Souders talking about the book.

    The book is structured into 14 rules, which, when applied properly, can vastly improve the speed of a web site or web application.

    Alongside the book he also introduced YSlow, an extension for the Firefox extension Firebug. YSlow helps developers see how well a site complies with the rules Steve has set up.

    I had the honour of doing the technical review on this book, and I love it. Apart from some standard techniques (for example employing HTTP headers like Expires or Last-Modified/ETag), Steve certainly has some tricks up his sleeve:

    For instance, he shows how to reduce the number of HTTP requests for first-time visitors (by inlining the script sources) while still filling up their cache for the next page load (see page 59ff).

    The small downside of this book is that some rules need to be taken with care when applied to smaller environments; for example, it does not make sense (from a cost-benefit perspective) for everyone to employ a CDN. A book just can’t be perfect for all readers.

    If you are interested in web site performance and have a developer background, then buy this book (or read it online). It is certainly something for you.

    The book has been published by O’Reilly in September 2007, ISBN 9780596529307.


  • Subversion: The Magic of Merging

    When programming professionally, Subversion is a must-have. The same goes for system administration: it’s quite a good idea to keep your configuration files (e.g. in Linux the whole /etc/ directory) in a Subversion checkout.

    So the goal of Subversion (or any other source control system) is to allow you to do something Apple will introduce with its new Leopard operating system: Time Machine. Go back in time (and restore a version of a file as it was on day x).

    Using Subversion on a daily basis is quite easy. Just check in (svn ci) your changes after you have completed a certain task. When you work collaboratively, and someone else has committed some changes, you do a svn up and the changes of the others are applied to your codebase.

    That’s all you basically need. But how can you go back in time now?

    So you poke around a bit and find that svn up has a parameter -r which lets you put your checkout into the state it was in at a certain revision.

    Let’s suppose we know that something was OK on Monday and is not today. So let’s use the command from above to see what it looked like.

    ~/project/trunk$ svn up -r {2006-10-09} app.php
    U app.php

    Voila, there it is. Now we choose to use that code now and throw away all changes that have been committed since. We modify the file a bit and do a check in:

    ~/project/trunk$ svn ci -m "revert to monday" app.php
    Sending app.php
    svn: Commit failed (details follow):
    svn: Your file or directory 'app.php' is probably out-of-date
    svn: The version resource does not correspond to the resource within the transaction. Either the requested version resource is out of date (needs to be updated), or the requested version resource is newer than the transaction root (restart the commit).

    Uh.. OK. So you probably know that error message already. It is also returned when you want to check in a file that has been changed by someone else since your last svn up.

    When you check something into a subversion repository, one of the basic rules is that the file you want to commit is “up to date”, i.e. the revision number of your local file (updated by svn up) equals the number in the repository (on the server).

    Ok, so, let’s update our checkout so we can re-run the check in.

    ~/project/trunk$ svn up
    G app.php

    So you discover that the changes made since then have been re-inserted into the file. Maybe Subversion has even alerted you of a conflict, because you changed some lines that had also been modified since Monday.

    Great! Basically we are back to where we started.

    Let’s not resign here, but rather use the appropriate command: svn merge. That command is mostly known for merging changes from one branch of development to another. But it can also help you to go back in time.

    svn merge takes a revision range, specifying which changes are to be merged, and a source: the part of the Subversion repository that should be searched for those changes.

    Usually one would find this command used in a way like:

    ~/project/trunk$ svn merge -r 15:26 ../branches/first_release/
    G app.php

    So with the two revisions you specify a range of changes to be merged into the current checkout. OK, so how does this help us here?

    You can also specify revisions backwards to go back in time. So to undo the command from before you can write:

    ~/project/trunk$ svn merge -r 26:15 ../branches/first_release/
    G app.php

    To put it simply, Subversion generates a diff behind the scenes that incorporates the changes between the given revisions. The changes are then merged into your files the same way the patch command (Linux, Unix, OS X, …) does it. When going back in time, this corresponds to patch’s -R parameter, which applies a patch in the reverse direction. Voila.

    So as a final solution this leaves us with:

    ~/project/trunk$ svn merge -r HEAD:{2006-10-09} .
    U app.php
    ~/project/trunk$ svn ci -m "revert to monday" app.php
    Sending app.php
    Transmitting file data .
    Committed revision 27.

    For further questions, the Subversion FAQ is a good starting point once you know exactly what you want (i.e. the correct terminology). (For example, reverting does not mean going back to a previous version of a file, but rather removing the changes you made locally.)

    There is the subversion book (also published by O’Reilly), of which the Guided Tour is a good starting point.

    The process I described above as a trial and error is also described in that book at Undoing changes.

    Also OSCON: Subversion Best Practices, a transcript of a talk given by the subversion creators (Ben Collins-Sussman and Brian W. Fitzpatrick) by Brad Choate has some good tips.

    Have fun :)


  • JavaScript Tricks And Good Programming Style

    Note that this is an updated version. Original version can be found here.

    Thanks to the commenters I have updated this post with some better tricks.

    In a loose series I’d like to point out a few of them. As I am currently mostly programming in JavaScript, I will write most of my samples in that language; also some of the tricks I mention only apply to JavaScript. But most of them apply to most programming languages around.

    Optional parameter and default value #
    When defining a function in PHP you can declare optional parameters by giving them a default value (something like function myfunc($optional = "default value") {}).

    In JavaScript it works a bit differently:

    var myfunc = function(optional) {
        if (typeof optional == "undefined") {
            optional = "default value";
        }
        // ...
    };
    This is a clean method to do it. Basically I pretty much recommend the use of the typeof operator.

    Michael Geary (his comment) pointed out this solution that I like.

    var myfunc = function(optional) {
        if (optional === undefined) {
            optional = "default value";
        }
        // ...
    };
    The solutions mentioned (if (!optional), optional = optional || "default value", and the like) have problems when you pass 0 (zero) or null as an argument.

    Commenters said that the 0/null problem is not a real one, as you would not use the idiom in such situations. I would not say so: in an AJAX world where you often serialize back to a server/database, a 0/1 to false/true mapping has to be established, and for default values this matters.

    In case you just need to make sure that an object is not null I do prefer the mentioned

    myobject = myobject || { animal: "dog" };

    end update

    Parameters Hints #
    The larger your app gets, the more functions you use throughout it. That also creates a maintenance problem: as each function can take multiple arguments, it is not unlikely that you forget what those parameters were for (especially boolean ones) or mix up their order (I am especially gifted at that).

    So what I do is this (update: substituted the variables with comments, end update):

    var myfunc2 = function(title, enable_notify) {
        // [...]
    };
    myfunc2(/* title */ "test", /* enable_notify */ true);

    This piece of code relies on the functionality of programming languages that the return value of an assignment is the assigned value. (This is something that you should also maintain in your app, for example with database storage calls, give the assignment value as a return value. It’s minimal effort and you might be glad at some point that you did it).

    If you do this you can see at any point in the code, what parameters the function takes. Of course this is not always useful, but especially for functions with many parameters it gets very useful.

    Search JavaScript documentation #
    When I need some documentation for JavaScript I use the mozilla development center (mdc). To quickly search for toLocaleString, I use Google:

    As I am a German speaker I also use the excellent (though a bit out-dated) JavaScript section SelfHTML. I use the downloaded version on my own computer for even faster access.

    The self variable #

    … should be avoided. Even if someone like Douglas Crockford (creator of JSON) uses it and calls it that.

    Let me quote Jack Slocum who put it best:

    // used to fix "this" prob with Function.apply to give call proper scope
    // nice method to put in your lib
    function delegate(instance, method) {
        return function() {
            return method.apply(instance, arguments);
        };
    }

    function Animal(name) {
        this.name = name;
        this.hello = function() {
            alert("hello " + this.name);
        };
    }

    var dog = new Animal("Jake");
    var button = {
        onclick : delegate(dog, dog.hello)
    };

    I removed my code as it can be considered obsolete by this.
    end update

    Reduce indentation amount #

    I have removed the code because it leads people into believing something different than I meant. So let me put it differently:

    What I am opposing is white space deserts. If you have many levels of indentation then probably something is wrong.

    If a for loop only applies to a handful of cases, don’t indent the whole loop in an if clause but rather catch the other cases at the top.
    Often it is advisable to move longer functionality to a function (there is a good reason for that name) that you call throughout a loop.
    end update

    That’s all for now, to be continued. Further readings on this blog:

    Even though some commenters disagreed with what I said, I think posts like this are very much needed in the blogosphere. Even if they are not free of errors on the first take, great people can help improve them. I would appreciate it if more people took that risk.
    end update


  • JavaScript Tricks And Good Programming Style – Original Version

    Note that there is an updated version

    I have been programming for about 10 years now, and I am always longing to improve my code. Over time I have added a few habits that I consider good practices and that increase the quality of my code.

    In a loose series I’d like to point out a few of them. As I am currently mostly programming in JavaScript, I will write most of my samples in that language; also some of the tricks I mention only apply to JavaScript. But most of them apply to most programming languages around.

    Optional parameter and default value #
    When defining a function in PHP you can declare optional parameters by giving them a default value (something like function myfunc($optional = "default value") {}).

    In JavaScript it works a bit differently:

    var myfunc = function(optional) {
        if (typeof optional == "undefined") {
            optional = "default value";
        }
        // ...
    };
    This is a clean method to do it. Basically I pretty much recommend the use of the typeof operator. Some people would do the above with an if (!optional), but my version works cross-browser (e.g. Safari will throw an error when you try to negate null).

    Parameters Hints #
    The larger your app gets, the more functions you use throughout it. That also creates a maintenance problem: as each function can take multiple arguments, it is not unlikely that you forget what those parameters were for (especially boolean ones) or mix up their order (I am especially gifted at that).

    So what I do is this:

    var myfunc2 = function(title, enable_notify) {
        // [...]
    };
    myfunc2(title = "test", enable_notify = true);

    This piece of code relies on the functionality of programming languages that the return value of an assignment is the assigned value. (This is something that you should also maintain in your app, for example with database storage calls, give the assignment value as a return value. It’s minimal effort and you might be glad at some point that you did it).

    If you do this you can see at any point in the code, what parameters the function takes. Of course this is not always useful, but especially for functions with many parameters it gets very useful.

    Also be aware that this overrides variables of the same name in the scope from which you call the function. You might mini-namespace the variables, e.g. with a letter+underscore prefix (p_title, p_enable_notify).

    Search JavaScript documentation #
    When I need some documentation for JavaScript I use the mozilla development center (mdc). To quickly search for toLocaleString, I use Google:

    As I am a German speaker I also use the excellent (though a bit out-dated) JavaScript section SelfHTML. I use the downloaded version on my own computer for even faster access.

    The self variable #
    This technique comes from Private Members in JavaScript by Douglas Crockford. By assigning a value in a function to this.value it will be publically accessible afterwards.

    function Animal(name) {
        this.name = name;
        var self = this;
        this.hello = function() {
            alert("hello " + self.name);
            //alert("hello " + this.name); // would fail
        };
    }
    var dog = new Animal("Jake");
    button = {
        onclick : dog.hello
    };

    The cause of this problem is that the this keyword receives different values in different contexts. See here for a closer explanation.

    The problem with this solution is that I am not absolutely sure whether it creates a memory leak in Internet Explorer.
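    Stripped of the DOM parts, a minimal sketch of the self technique (returning instead of alerting so it can run anywhere):

    ```javascript
    function Animal(name) {
      this.name = name;
      var self = this;                 // capture the instance
      this.hello = function () {
        return "hello " + self.name;   // `self` survives any rebinding of `this`
      };
    }

    var dog = new Animal("Jake");
    var handler = dog.hello;           // e.g. assigned as an event handler
    var greeting = handler();          // still "hello Jake"
    ```

    Even though handler is called without any object context, the closure over self keeps the right instance in reach.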

    Reduce indentation amount #
    One of the most annoying things I find in other people’s code is (multiply) nested if clauses, something like this:

    var arr = ["dog", "cat"];
    var action = 'greet';
    for (i = 0, ln = arr.length; i < ln; i++) {
        animal = arr[i];
        if (animal == "cat") {
            alert("hello " + animal);
        }
    }

    This is only a short example, but I have often seen this go 10 levels deep into nested clauses. I suggest using break and continue (next in Perl):

    var arr = ["dog", "cat"];
    for (i = 0, ln = arr.length; i < ln; i++) {
        animal = arr[i];
        if (animal != "cat") continue;
        alert("hello " + animal);
    }

    This accomplishes the same with only one level of indentation. One more example for a function:

    function greet_animal(animal) {
        if (typeof animal == "undefined") return;
        if (animal != "cat") return;
        alert("hello " + animal);
    }

    JavaScript is one of the few languages where you can leave the return value empty (i.e. typeof greet_animal() == "undefined"). You might rather use return false so that you can easily determine whether the function failed for some reason.
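    The same function with return false, as suggested, so the caller can detect failure (returning the greeting instead of alerting it keeps the sketch self-contained):

    ```javascript
    function greet_animal(animal) {
      if (typeof animal == "undefined") return false;  // fail early: no argument
      if (animal != "cat") return false;               // fail early: not interested
      return "hello " + animal;
    }

    greet_animal();       // false
    greet_animal("dog");  // false
    greet_animal("cat");  // "hello cat"
    ```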

    javascript, tricks, coding practices

  • Firefox 1.5, XmlHttpRequest, req.responseXML and document.domain

    Recently I have been working on a web application, extending it with an iframe on another subdomain.

    When you set up communication with an iframe on another subdomain, it works by setting document.domain in both pages. Pretty nice and straight forward.
    But it can mess up the rest of your page.

    As soon as you have set document.domain, you should be able to do an XHR to your original domain according to the same-origin policy.

    This will work in IE, Safari, and Opera.
    This will not work in Firefox 1.0. This is very awkward but at least it has been fixed in 1.5.
    So it will work in Firefox 1.5. But:

    The responseXML object is useless: you can’t access it, you receive a Permission Denied error when trying to access its content (e.g. documentElement). Very annoying.
    Even stranger, responseText is still readable. What’s the reason for this? Is there some security risk I am unaware of, or is it a plain bug?

    As the responseText is available there is a pretty simple fix: re-parse the XML, which is kinda stupid and CPU-intensive if you have a lot of them. Something like:

    var doc = (new DOMParser()).parseFromString(req.responseText, "text/xml");

    I have some sample code available here.
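    The workaround can be wrapped in a small helper; this is a sketch of my own (getResponseDoc and the injected parse callback are my names, not from the sample code): try responseXML first, fall back to re-parsing responseText.

    ```javascript
    // Hypothetical helper: prefer responseXML, fall back to re-parsing responseText.
    // `parse` is injected so the DOM-dependent part stays pluggable, e.g.
    // function (t) { return (new DOMParser()).parseFromString(t, "text/xml"); }
    function getResponseDoc(req, parse) {
      try {
        if (req.responseXML && req.responseXML.documentElement) {
          return req.responseXML;   // fine in IE, Safari, Opera
        }
      } catch (e) {
        // Firefox 1.5 throws "Permission Denied" here once document.domain is set
      }
      return parse(req.responseText);
    }

    // Simulated request object, for illustration only:
    var broken = {
      responseText: "<root/>",
      get responseXML() { throw new Error("Permission denied"); }
    };
    var doc = getResponseDoc(broken, function (t) { return "parsed:" + t; });
    // doc is "parsed:<root/>"
    ```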

    Apparently a bug report has been filed. No response from developers. Great.
    Unfortunately it has only been filed for OS X, but it also affects Firefox on Windows.

    Mozilla guys, fix this ASAP.

    Update 2007-06-21: Things seem to start moving, we will likely have a fix for Firefox 3.

    firefox bug, document.domain, XmlHttpRequest, responseXML

  • Misuse of the Array Object in JavaScript

    There is a very good post about Associative Arrays considered harmful by Andrew Dupont.

    The title is a bit misleading but correct. When you come across a piece of JavaScript like this
    foo["test"] = 1;
    there is nothing wrong with it. It’s the basic usage scheme of associative arrays. Or should I rather say objects?

    In languages such as PHP, using arrays like this is perfectly correct: $foo = array("test" => 1);

    In JavaScript
    var foo = new Array();
    foo["test"] = 1;

    works but does not do what you want.
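    To see what goes wrong, compare the length property (a small sketch; the key "test" is stored as a plain object property, not as an array element):

    ```javascript
    var foo = new Array();
    foo["test"] = 1;   // becomes an ordinary property of the array object
    foo.length;        // 0 -- index-based loops will never see "test"

    var bar = {};      // what you actually want
    bar["test"] = 1;   // same as bar.test = 1
    ```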

    I don’t need to repeat Andrew’s really great post, but basically you should use Object instead of Array.

    var foo = new Object(); // same as var foo = {};
    foo["test"] = 1; // same as foo.test = 1;

    Now go and read Andrew’s post.

    via Erik Arvidsson.

    btw: that post led me to Object.prototype is verboten, which explains for me why my for (i in myvar) {} loops never worked correctly. I was using prototype.js version < 1.4 (which messed with Object.prototype).

    javascript, array, object, prototype.js, Object.prototype

  • Welcome naked!

    Today is CSS Naked Day. Take this chance to visit my site in person (= with your browser).
    Looking quite okay naked, too ;)

    css naked day, css

  • A better understanding of JavaScript

    I’ve been working with JavaScript for years. It was my replacement for a server-side language when I couldn’t afford to buy web space in the mid-90s. Still, as the language becomes popular again, I realized that while I understood the basics, there was much more to the language.


    So I dug into the topic a little deeper. I can highly recommend reading the blogs of all the great JavaScript guys like Alex Russell (of Dojo), Aaron Boodman, Erik Arvidsson (both at Google), Douglas Crockford (at Yahoo). (Give me more in the comments ;)

    So, JavaScript is easy to start with. You can take a procedural approach like in C. You declare a function, you call a function.

    A Survey of the JavaScript Programming Language (by Douglas Crockford) does an amazing job at explaining the notable aspects of the language on a quite short page.

    I want to point out the most interesting points for me:

    Subscript and dot notation
    You can access a member of an object by using two different notations:

    var y = { p: 1 };
    alert(y["p"]); // subscript notation
    alert(y.p); // dot notation

    The great difference is that with subscript notation you can also access members whose names are reserved words (of which there are quite a few in JavaScript). Dot notation is shorter and more convenient.
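    For example (the property names here are my own illustration), reserved words as keys only work reliably with subscript notation:

    ```javascript
    var styles = { "class": "big", "float": "left" };  // reserved words as keys
    styles["class"];   // "big" -- subscript notation always works
    // styles.class    // a syntax error in many older JavaScript engines
    ```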

    Different meanings of the this keyword
    Consider this piece of code creating a small object.

    <a href="#" id="link">click here</a>
    <script type="text/javascript">
    var myobject = {
        id: 'obj',
        method: function() {
            alert(this.id);
        }
    };
    var l = document.getElementById("link");
    l.onclick = myobject.method;
    </script>

    When you call myobject.method();, this points to the current object and you receive an alert box with the text ‘obj’. But there are exceptions:

    If you call this function from within an HTML page via an onclick event, this refers to the calling object (i.e. the link). You will therefore receive an alert box containing ‘link’ as the message.

    This can be useful in many cases, but if you want to access “your” object, you can’t. Aaron Boodman proposed a function that was eventually named hitch:

    function hitch(obj, meth) {
        return function() { return obj[meth].apply(obj, arguments); };
    }

    You’d use it like this: l.onclick=hitch(myobject, 'method'); Now the this keyword points at the correct object.
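    The hitch function can be exercised without a browser; a quick sketch (returning a value instead of alerting so the effect is visible):

    ```javascript
    function hitch(obj, meth) {
      return function () { return obj[meth].apply(obj, arguments); };
    }

    var myobject = {
      id: "obj",
      method: function () { return this.id; }
    };

    var bound = hitch(myobject, "method");
    bound();   // "obj" -- `this` points at myobject, however the function is called
    ```

    No matter what object the returned function is later attached to, apply forces this back to myobject.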

    You could also change the function to something like this and still use the previous notation:

    method: function() {
        if (this != myobject) { return myobject.method(arguments); }
        // [...]
    }

    Creating objects with new
    I was always wondering how to create objects from a class, as I am used to in other programming languages, where instantiating creates the object according to the “building instructions” of a class.

    Douglas shows this in more detail on his Private Members in JavaScript page.

    I’ve quickly hacked together this example:

    var x = function () {
        var created = new Date();
        this.when = function () { alert(created); };
    };

    var p, u = new x();
    window.setTimeout("n()", 1000);

    function n () {
        p = new x();
        alert(typeof p.created);
    }

    You receive 2 objects p and u that have different creation times. They also have a private variable created which is only accessible via the public function when (because specified via this).

    So even as you create an object using the new Object() or {} notation, you only receive a static object. If you want to instantiate objects from a template, you need to create the template as a function.
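    The same pattern, returning values instead of alerting, shows that created really is private (the names mirror the example above):

    ```javascript
    var X = function () {
      var created = new Date();                     // private: visible only to closures below
      this.when = function () { return created; }; // privileged accessor
    };

    var a = new X();
    typeof a.created;   // "undefined" -- not a public member
    a.when();           // the Date object, reachable only through the accessor
    ```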

    The example above already demonstrated closures. It is only because closures exist in JavaScript that private variables are possible at all.

    A closure is, to put it simply, a function within another function. The inner function has access to its parent’s variables, but not the other way round.

    Altogether, a function is just another data type that can be assigned to a variable. Therefore these two notations can be used (almost) interchangeably:

    function test() { alert(new Date()); }
    var test = function() { alert(new Date()); }

    The ominous prototype “object” is a way of using the this keyword from “outside”.
    Modifying the piece of code from before:

    var x = function () {
        var created = new Date();
    };
    x.prototype.when = function () { alert(created); };

    But there’s a pitfall. The created variable is private. Even though the function when is now a member of the object x, it does not “see” the variable created. So in the original example the function when had privileged access (see Private Members in JavaScript).
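    A runnable sketch of the pitfall (typeof is used so the missing variable does not throw):

    ```javascript
    var X = function () {
      var created = new Date();   // private to the constructor
    };
    X.prototype.when = function () {
      return typeof created;      // the prototype method does not see `created`
    };

    var obj = new X();
    obj.when();   // "undefined" -- no privileged access from the prototype
    ```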

    All in all I see that JavaScript is a powerful language. Many things can be accomplished in an elegant (and sometimes quite unusual) way. (Curried JavaScript even demonstrates how to use it as a functional programming language.)

    I realize that there is a nice and clean solution for almost every problem you come across. This is where libraries come into play. The downside: you can quickly add tons of libraries, leading to large page sizes and memory consumption.

    dojo for example is a really great library that provides you with numerous well thought-out functions, making your life a lot easier. But the size is 132 KB just for the basic functions, more than a megabyte all in all. It avoids loading everything at once through an on-demand loading mechanism (dojo.require).

    In my opinion we’d need something like a local library storage. A Firefox extension would be a nice first step.
    As far as I have looked into the topic, though, there are some difficulties. Foremost, there is a problem with namespaces: Firefox clearly separates JS code belonging to extensions from code coming from the web. A good thing, security-wise, but a hindrance in this case.

    Maybe some Firefox guru can tell me a way to circumvent this; I think it might be worth a shot.


    javascript, object, dojo, library

  • 10 Realistic Steps to a Faster Web Site

    I complained before about bad guides to improve the performance of your website.


    I’d like to give you a more realistic guide on how to achieve that goal. I have written my master’s thesis in computer science on this topic and will refer to it throughout the guide.

    1. Determine the bottleneck
    When you want to improve the speed of your website, you feel that it’s somehow slow. There are various points that can affect the performance of your page. Here are the most common ones.

    Before we move on: always answer each question with your target audience in mind.

    1.1. File Size
    How much data is the user required to load before (s)he can use the page?

    A frequent question is how much data your web page is allowed to have. You cannot answer this unless you know your target audience.

    In the early years of the internet one would suggest a maximum size of 30k for the whole page (including images, etc.). Now that many people have broadband connections, I think we can push the limit to between 60k and 100k. Still, you should consider lowering the size if you also target modem users.

    Still, the less data you require to download, the faster your page will appear.

    1.2. Latency
    The time it takes between your request to the server and when the data reaches your PC.

    This time is the sum of twice the network latency (which depends on the uplink of the hosting provider, the geographical distance between server and user, and some other factors) and the time it takes the server to produce the output.

    Network latency can hardly be optimized without moving the server, so this guide will not cover it.
    The server’s processing time, on the other hand, combines complex time factors and most often contains much room for improvement.

    2. Reducing the file size
    First, you need to know how large your page really is. There are some useful tools out there. I picked Web Page Analyzer which does a nice job at this.

    I suggest not spending too much time on this, unless your page size is larger than 100kb. So skip to step 3.

    Large page sizes are nowadays often caused by large JavaScript libraries. Often you only need a small part of their functionality, so you could use a cut-down version of it. For example when using prototype.js just for Ajax, you could use pt.ajax.js (also see moo.ajax), or the moo.fx as a replacement.

    Digg for example used to have about 290kb, they now have reduced the size to 160kb by leaving out unnecessary libraries.

    Also large images can cause large file sizes, this is often caused by the wrong image format. A rule of thumb: JPG for photos, PNG for most other aspects, especially if plain colors are involved. Also: use PNG for screen shots, JPGs are not only larger but also look ugly. You can also use GIF instead of PNG when the image has only few colors and/or you want to create an animation.

    Often, large images are scaled down via the HTML width and height attributes. You should instead scale them in your graphics editor; this also reduces the file size.

    Old HTML style can also cause large file size. There is no need for thousands of tags anymore. Use XHTML and CSS!

    A further important step towards smaller size is on-the-fly compression of your content. Almost all browsers support gzip compression. For an Apache 2 web server, for example, the mod_deflate module can do this transparently for you.

    If you don’t have access to your server’s configuration, you can use the zlib for PHP or for Django (Python) there is GZipMiddleware, Ruby on Rails has a gzip plugin, too.

    Beware of compressing JavaScript, there are quite some bugs with Internet Explorer.

    And for heaven’s sake, you can also strip the white space after you’ve completed the previous steps.

    3. Check what’s causing a high latency
    As mentioned, the latency can be caused by two large factors.

    3.1. Is it the network latency?
    To determine whether the network latency is the blocking factor, you can ping your server. This can be done from the command line via the ping command.

    If your server admin has disabled the ping function, you can also use a traceroute, which uses another method to determine the time: tracert (Windows) or traceroute (Unix).

    If you address an audience that is geographically not very close to you, you can also use a service such as Just Ping which pings the given address from 12 different locations in the world.

    3.2. Does it take too long to generate the page?
    If the ping times are ok, it might take too long to generate the page. Note that this applies to dynamic pages, for example written in a scripting language such as PHP. Static pages are usually served very quickly.

    You can measure the time it takes to generate the page quite easily: save a time stamp at the beginning of the page and subtract it from the time stamp taken when the page has been generated. For example, in PHP you do it like this:

    <?php
    // Start of the page
    $start_time = explode(' ', microtime());
    $start_time = $start_time[1] + $start_time[0];
    ?>

    and at the end of the page:

    <?php
    $end_time = explode(' ', microtime());
    $total_time = $end_time[0] + $end_time[1] - $start_time;
    printf('Page loaded in %.3f seconds.', $total_time);
    ?>

    The time needed to generate the page is now displayed at the bottom of it.

    You can also compare the time between loading a static page (often a file ending in .html) and a dynamic one. I’d advise using the first method, though, because you are going to need it to go on optimizing the page.

    You can also use a Profiler which usually offers even more information on the generation process.

    For PHP you can, as a first easy step, enable Output Buffering and restart the test.

    Also you should consider testing your page with a benchmarking program such as ApacheBench (ab). This will stress the server via requesting several copies at once.

    It is difficult to say what time suffices for generating a web page. It depends on your own requirements. You should try to keep the generation time under 1 second, as this is a delay which users usually can cope with.

    3.3. Is it the rendering performance?
    This plays only a minor role in my guide, but still this can be a reason why your page takes long to load.

    If you use a complex table structure (which can render slowly), you most probably are using old style HTML, try to switch to XHTML and CSS.

    Don’t use overly complex JavaScript; slow scripts combined with onmousemove events make a page really sluggish. If your JavaScript makes the page load slowly (you can use a technique similar to the PHP time measuring, using (new Date()).getTime()), you are doing something wrong. Rethink your concept.
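    A minimal sketch of timing a block of script, analogous to the PHP example (getTime() yields milliseconds since the epoch, so subtracting two readings gives a duration; the loop is just a stand-in for the code under suspicion):

    ```javascript
    var start = (new Date()).getTime();

    // ... the code under suspicion ...
    var s = 0;
    for (var i = 0; i < 1000000; i++) { s += i; }

    var elapsed = (new Date()).getTime() - start;   // duration in milliseconds
    ```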

    4. Determine the lagging component(s)
    As your page usually consists of more than one component (such as header, login window, navigation, footer, etc.), you should next check which one needs tuning. You can do this by integrating a few of the measuring fragments into the page, which will show you several split times throughout the page.

    The following steps can now be applied to the slowest parts of the page.

    5. Enable a Compiler Cache
    Scripting languages recompile their scripts upon each request. Since a script is requested far more often than it changes, it makes no sense to compile it over and over (especially once core development has finished).

    For PHP there is amongst others APC (which will probably be integrated with PHP 6), Python stores a compiled version by itself.

    6. Look at the DB Queries
    At university you are taught the most complex queries with lots of JOINs and GROUPs, but in real life it can often be useful to avoid JOINs between (especially large) tables. Instead you do multiple SELECTs, which can be cached by the SQL server. This is especially true if you don’t need the joined data for every row. It really depends on your application, but trying without a JOIN is often worth it.

    Ensure that you use query folding (also called a query cache, such as the MySQL Query Cache). In a web environment the same SELECT statements are executed over and over; this almost screams for a cache (and explains why avoiding JOINs can be much faster).

    7. Send the correct Modification Data
    Dynamic Web pages often make one big mistake: They don’t have their date of last modification set. This means that the browser always has to load the whole page from the server and cannot use its cache.

    In HTTP there are various headers important for caching: for 1.0 there is the Last-Modified header which plays together with the browser-sent If-Modified-Since (see specification). HTTP 1.1 uses the ETag (so called Entity Tag) which allows different last modification dates for the same page (e.g. for different languages). Other relevant headers are Cache-Control and Expires.

    Read on about how to set the headers correctly and respond to them, for HTTP 1.0 and 1.1.
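    The HTTP 1.0 logic can be sketched in a few lines of JavaScript (the helper name is mine, for illustration): compare the client’s If-Modified-Since header with the page’s last modification date and answer 304 Not Modified when the cached copy is still good.

    ```javascript
    // Hypothetical helper: should the server answer "304 Not Modified"?
    function notModified(ifModifiedSince, lastModified) {
      if (!ifModifiedSince) return false;      // client has no cached copy
      var cached = Date.parse(ifModifiedSince);
      if (isNaN(cached)) return false;         // unparseable header, play it safe
      return lastModified.getTime() <= cached; // nothing changed since the cached copy
    }

    notModified("Sun, 01 Jan 2006 00:00:00 GMT", new Date(Date.UTC(2005, 0, 1)));  // true
    notModified("Sun, 01 Jan 2006 00:00:00 GMT", new Date(Date.UTC(2007, 0, 1)));  // false
    ```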

    8. Consider Component Caching (advanced)
    If optimizing the database does not improve your generation time enough, you are most likely doing something complex ;)
    So for public pages it’s very likely that you will present two users with the same content (at least for a specific component). So instead of doing complex database queries, you can store a pre-rendered copy and use that when needed, to save time.

    This is a rather complex topic but can be the ultimate solution to your performance problems. You need to make sure that you don’t deliver a stale copy to the client, and you need to think about how to organize your cache files so you can invalidate them quickly.

    Most web frameworks give you a hand when doing component caching: for PHP there is Smarty’s template caching, Perl has Mason’s Data Caching, Ruby’s Rails has Page Caching, Django supports it as well.

    This technique can eventually lead to a result where loading your page does not need any request to the database at all. This can be a favorable result, as the connection to the database is often the most obvious bottleneck.

    If your page is not that complex you could also consider just caching the whole page. This is easier but makes the page usually feel less up-to-date.

    One more thing: If you have enough RAM you should also consider storing the cache files in a RAM drive. As the data is discardable (as it can be re-generated at any time) a loss when rebooting would not matter. Keeping disk I/O low can boost the speed once again.
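    The idea behind component caching can be sketched in a few lines (all names here are illustrative, not from any particular framework): store a pre-rendered copy and regenerate it only after a time limit.

    ```javascript
    var cache = {};  // component name -> { body, time }

    // Hypothetical helper: return a cached rendering if it is fresh enough,
    // otherwise call the expensive render function and store the result.
    function renderCached(name, maxAgeMs, now, render) {
      var entry = cache[name];
      if (entry && now - entry.time < maxAgeMs) {
        return entry.body;                     // fresh pre-rendered copy, no DB work
      }
      var body = render();                     // expensive: queries, templating, ...
      cache[name] = { body: body, time: now };
      return body;
    }

    var calls = 0;
    function renderNav() { calls++; return "<ul>...</ul>"; }

    renderCached("nav", 60000, 0, renderNav);      // renders (calls == 1)
    renderCached("nav", 60000, 30000, renderNav);  // served from cache (calls still 1)
    renderCached("nav", 60000, 90000, renderNav);  // stale, renders again (calls == 2)
    ```

    Real implementations differ mainly in where the copies live (files, RAM, memcached) and in how invalidation is triggered, but the decision logic is this simple.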

    9. Reducing the Server Load
    Consider that your page loads quickly and everything looks alright, but when too many users access the page, it suddenly becomes slow.

    This is most likely due to a lack of resources on the server. You cannot add an indefinite amount of CPU power or RAM into the server but you can handle what you’ve got more carefully.

    9.1. Use a Reverse Proxy (needs access to the server)
    Whenever a request needs to be handled, a whole copy (or child process) of the web server executable needs to be held in memory. Not only for the time of generating the page but also until the page has been transferred to the client. Slow clients can cost performance. When you have many users connecting, you can be sure that quite a few slow ones will block the line for somebody else just for transferring back the data.

    So there is a solution for this: the well-known Squid proxy has an HTTP acceleration mode which handles communication with the client. It’s like a secretary that handles all communication.

    It waits patiently until the client has filed its request, asks the web server for the response, quickly receives it (while the web server moves on to the next request), and then patiently returns the file to the client.

    Also the Squid server is small, lightweight, and specialized for that task. Therefore you need less RAM for more clients which allows a higher throughput (regarding served clients per time unit).

    9.2. Take a lightweight HTTP Server (needs access to the server)
    Often people also say that Apache is quite huge and does not do its work quickly enough. Personally I am satisfied with its performance, but when it comes to dealing with scripting languages that handle their web server communication via the (Fast)CGI interface, Apache is easily trumped by a lightweight alternative.

    It’s called LightTPD (pronounced “lighty”) and does a good job at doing that special task very quickly. You can already see from a configuration file that it keeps things simple.

    I suggest testing both scenarios to see whether you gain from using LightTPD or should stay with your old web server. The Apache web server is stable and built on long experience in the web server business, but LightTPD is taking its chance.

    10. Server Scaling (extreme technique)
    Once you have gone through all the steps and your page still does not load fast enough (most obviously because of too many concurrent users), you can now duplicate your hardware. Because of the previous steps there isn’t too much work left.

    The reverse proxy can act as a load balancer by sending its requests to one of the web servers, either quasi-randomly (Round Robin) or driven by server load.

    All in all, the main strategy for a large page is a combination of caching and intelligent handling of resources. While the first 7 steps apply to any page, the last 3 are usually only useful (and needed) for sites with many concurrent users.

    The guide shows that you don’t need a special server to withstand slashdotting or digging.

    Further Reading
    For more detail on each step I recommend taking a look at my diploma thesis.

    MySQL tuning is nicely described in Jeremy Zawodny’s High Performance MySQL. A presentation about how Yahoo tunes its Apache Servers. Some tips for Websites running on Java. George Schlossnagle gives some good tips for caching in his Advanced PHP Programming. His tips are not restricted to PHP as a scripting language.


    performance, tuning, website

  • Speed up your page, but how?

    Today I ran across the blog entry by Marcelo Calbucci, called "Web Developers: Speed up your pages!".

    It’s a typical example of good idea, bad execution. Most of the points he mentions are really bad practice.

    He suggests reducing traffic (and therefore loading time) by removing whitespace from the source code, writing all code in lower case (for better compression?!?), reducing code by writing invalid XHTML, and keeping JavaScript function and variable names short. This is nit-picking, and it results in a maintenance nightmare.

    For big sites, e.g. Google, the whitespace reduction tricks make sense, but they have enormous numbers of page impressions. Saving 200 bytes tops by stripping whitespace is nearly worthless for smaller sites, and not worth the trouble. Additionally, I bet that Google does not maintain that page as such, but has created some kind of conversion script.

    Other thoughts are quite nice but commonplace. Most of the comments (e.g. by Sarah) posted at that article reflect my opinion quite well and deal with each point in more detail.

    For most dynamic pages the bottleneck for responding to a client request is the script loading (or running) time. I suggest the writer read some articles about server-side caching (my thesis also deals with that topic) and optimization.

    Often the latency between client and server is also responsible for considerable delays. As the client has to parse the HTML file to decide which files to load next, delays can add up.

    All in all, it’s a good idea to deal with the loading time of a page. But you have to search at the right place.

    web, speed, xhtml, script, caching

  • Using a Feedback Form for Spam

    Have you ever received weird spam via the feedback form of your site? Something with your own address as sender or with some Mime stuff in the body? Your form is likely to be misused for spamming.

    How does it work?

    For PHP, for example, there is the mail function that can be used to easily send an e-mail. Most probably you’d use some code like this to send the message from your feedback form.

    <?php
    $msg .= "Name: " . $_POST["name"] . "\n";
    $msg .= "E-Mail: " . $_POST["email"] . "\n";
    $msg .= $_POST["msg"];
    mail("", "feedback from my site", $msg);
    ?>

    That’s simple and works well, but it’s a little annoying if you want to answer that e-mail: you click the e-mail address to open a new message and have to paste the whole message into the new window for quoting. There’s an easy solution: pretend that the e-mail comes from the customer requesting the info. This can simply be done via the additional_headers parameter of mail.

    <?php
    $sender = "From: " . $_POST["email"] . "\r\n";
    // even nicer, shows the name, not the address:
    $sender = "From: " . $_POST["name"] . " <" . $_POST["email"] . ">\r\n";
    mail("", "feedback from my site", $msg, $sender);
    ?>

    Well. We’ve just introduced 2 potential spamming opportunities. Why? Let’s see. For mail transport we use SMTP. Our outgoing mail might look like this (generated by mail).

    From: tester <>
    Subject: feedback from my site

    this is my message

    (Before, the From would have looked something like From:
    So if the spammer manages to insert another header field (like To, CC, or BCC), not only would we receive that e-mail, but so would the address entered as CC. This works by inserting a line break, followed by the additional header line, into the name or e-mail address.
    Although this is usually not possible through a normal textbox, a POST request containing that line break and the malicious CC can easily be constructed.

    So be sure to strip out at least the characters \r and \n from the name and e-mail address, or just strip out any non-Latin characters (people with German umlauts in their names, for example, will have to live with that).

    So a quite good method would be to use this piece of code:

    $name = preg_replace("|[^a-z0-9 \-.,]|i", "", $_POST["name"]);
    $email = preg_replace("|[^a-z0-9@.]|i", "", $_POST["email"]);
    $sender = "From: " . $name . " <" . $email . ">\r\n";
    mail("", "feedback from my site", $msg, $sender);
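    The same idea as a JavaScript sketch (the function name and the sample payload are mine, for illustration): strip CR and LF so an attacker cannot smuggle in extra header lines.

    ```javascript
    // Hypothetical sanitizer for user-supplied header values.
    function sanitizeHeaderValue(value) {
      return value.replace(/[\r\n]+/g, "");
    }

    var name = "tester\r\nCC: victim@example.com";   // an injection attempt
    var safe = sanitizeHeaderValue(name);
    // safe is "testerCC: victim@example.com" -- still ugly, but only one header line
    ```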

    The conclusion is simple (and always the same one): Never trust any data you receive from a user.
    Verify all data you receive and strip potentially harmful characters. Common bad characters are:

    • for mails: \r, \n,
    • for HTML: <, > (you could use htmlspecialchars for that),
    • for URLs: &, =,
    • complete the list in the comments ;)

    Ah, the conclusion. Never trust any data you receive from a user.

    spam, form, e-mail, smtp

  • Eclipse Everywhere. Buah.

    It’s been a little quiet lately. This is because I am working on a cute little project that I will be able to present soon. More when the time is ready.

    There has been a rumor lately that Zend (developer of PHP) will release a PHP framework. This is nothing new; there has been an IDE (Zend ) for a long time now. But it will be based on Eclipse.

    Also, Macromedia announced that their new Flex 2.0 environment (Flashbuilder) will be based on Eclipse.

    Why on earth Eclipse?! I think it is one of the slowest IDEs available. It’s based on Java, which makes it slow already, and it’s unbelievably bloated.

    I just can’t understand why developers would use such a tool. I am not willing to buy a GHz monster PC just to have an editor running there. That’s a pure waste of money and electricity. Emacs is kinda slow already but it runs on a few MHz.

    Can anyone explain to me why to use such a monster?

    I thought that maybe everything had changed for the better by now and downloaded the whole thing. That’s 100 MB already, which gives a hint of how much memory it will consume. Still, I started it. It took more than 2 minutes on my PowerBook G4. Hello? The features it provides are so not worth that.

    I can recommend TextMate (best completion) and EditPlus (best integrated (S)FTP). These are fast, neat text editors. That’s what I want.

    eclipse, zend, php, flex, textmate, editplus

  • Caching of Downloaded Code: Testing Results

    Today I did some experimenting with the caching of downloaded code (or On-Demand Javascript, whatever you want to call it).

    I’ve set up a small testing suite that currently tests 3 different ways of downloading code: script-tag insertion via DOM, XmlHttpRequest as a GET and XHR as a POST.

    These are my results for now:

    Method       IE6         Firefox 1.07  Firefox 1.5b2  Safari 2.0  Opera 8.5
    script_dom   cached      cached        cached         cached      cached
    xhr_post     not cached  not cached    not cached     not cached  not cached
    xhr_get      cached      not cached    cached         not cached  not cached

    (Results are the same for Win and OS X where both browsers are available (FF & Opera))

    Safari Code Downloading Cache Test

    This gives an interesting picture: Firefox does not seem to cache any scripts, neither the ones loaded via DOM nor those loaded via XHR. Only IE loads an XHR GET request from cache.

    I’ve got the script in my public testing area, so you can test it for your own browser. Please do so and correct my values if you receive different results.

    The sources of my tests are available, too: index.phps and js.phps. I did my testing using the latest prototype.js library. Maybe I will try it later with another library (e.g. with

    I’d be interested in more ways to download code (especially via document.write, since I haven’t been able to include this properly in my tests) and in your results for other browsers. Just leave a comment.

    UPDATE: I have now included the Expires header field with the Javascript file. Now Firefox in both versions caches the script with script_dom; in version 1.5b2 it also caches XHR GET requests.

    XmlHttpRequest, caching, prototype.js, test

  • Better code downloading with AJAX

    I’ve been playing with Code downloading (or Javascript on Demand) a little more.

    Michael Mahemoff pointed me to his great Ajaxpatterns site, in which he suggests a different solution:

    if (!self.uploadMessages) { // Load only if it does not already exist
        var head = document.getElementsByTagName("head")[0];
        var script = document.createElement('script');
        script.type = 'text/javascript';
        script.src = "upload.js";
        head.appendChild(script);
    }

    Via DOM manipulation, a new script tag is added to our document, which loads the new script via its ‘src’ attribute. I have put a working example here. As you can see, this does not even need an XmlHttpRequest (XHR from here on), so it will also work in browsers that don’t support it.

    So why use this approach and not mine? Initially I thought it was not as good as doing it via XHR, because with XHR you receive direct feedback (i.e. a function call) when the script has been loaded. That is per se not possible with this technique. But, as in good ol’ times, a simple function call at the end of the script file will do the same job (compare the source code of the last example with this one (plus load.js)).
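
    The “function call at the end of the file” trick can be sketched as follows (the registry and the names loadScript/scriptLoaded are mine, not from the linked examples):

```javascript
// Loader that remembers one callback per URL; the loaded file
// announces itself by calling scriptLoaded(...) as its last statement.
var pendingCallbacks = {};

function loadScript(url, callback) {
  pendingCallbacks[url] = callback;
  var script = document.createElement('script');
  script.type = 'text/javascript';
  script.src = url;
  document.getElementsByTagName('head')[0].appendChild(script);
}

// The very last line of e.g. load.js would then be:
//   scriptLoaded('load.js');
function scriptLoaded(url) {
  var callback = pendingCallbacks[url];
  delete pendingCallbacks[url];
  if (callback) callback();
}
```

    This way the loading code gets its “direct feedback” without any XHR involved.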

    Using this method to load code later on also provides another “feature” (thanks to Erik Arvidsson for the hint): unlike with XHR, Firefox also caches scripts loaded this way. There seems to be disagreement about whether this is a bug or a feature (people complain that IE caches such requests, while it can be quite useful in this scenario).

    When using dynamically generated Javascript code you will also have to keep your HTTP headers in mind (dynamically generated scripts usually don’t send caching headers by default). The headers Cache-Control and Last-Modified will usually do (see section 6.1.2 of my thesis).
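
    For illustration, a response for the generated script might carry headers along these lines (the values are purely illustrative):

```
HTTP/1.1 200 OK
Content-Type: text/javascript
Last-Modified: Mon, 10 Oct 2005 08:00:00 GMT
Cache-Control: max-age=86400
```

    When the browser later revalidates with If-Modified-Since, the script should answer with 304 Not Modified instead of regenerating the code.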

    The method above is also the one used by Dojo, as a Dojo developer (David Schontzler) commented. He says that Dojo also loads only the stuff the programmer needs, so little overhead can be expected from this project.

    Alex Russell from Dojo also left a comment, about bloated Javascript libraries. He has some good points to make about script size (read for yourself); I just want to quote the best part of his posting:

    So yes, large libraries are a problem, but developers need some of the capabilities they provide. The best libraries, though, should make you only pay for what you use. Hopefully Dojo and JSAN will make this the defacto way of doing things.

    So hang on for Dojo, they seem to be on the right track (coverage of Dojo to follow).

    Finally I want to thank you all for your great and insightful comments!

    ajax, dojo, code downloading, javascript on demand, caching, http headers