10 Realistic Steps to a Faster Web Site

I have complained before about bad guides on improving the performance of your website.


I’d like to give you a more realistic guide on how to achieve that goal. I wrote my diploma thesis in computer science on this topic and will refer to it throughout the guide.

1. Determine the bottleneck
When you set out to improve the speed of your website, usually all you know is that it somehow feels slow. There are various points that can affect the performance of your page; here are the most common ones.

Before we move on: always answer each of the following questions with your target audience in mind.

1.1. File Size
How much data does the user have to load before (s)he can use the page?

How much data is a web page allowed to have? That is a frequent question, and you cannot answer it without knowing your target audience.

In the early years of the internet, the common suggestion was a maximum of 30 KB for the whole page (including images, etc.). Now that many people have a broadband connection, I think we can push that to somewhere between 60 KB and 100 KB. You should still consider lowering the size if you also target modem users.

Still, the less data you require to download, the faster your page will appear.

1.2. Latency
The time between sending your request to the server and the data reaching your PC.

This time is the sum of twice the network latency (which depends on the uplink of the hosting provider, the geographical distance between server and user, and some other factors) and the time the server needs to produce the output.

Network latency can hardly be improved without moving the server, so this guide will not cover it.
The server’s processing time, on the other hand, is made up of many factors and most often contains much room for improvement.

2. Reducing the file size
First, you need to know how large your page really is. There are some useful tools out there; I picked Web Page Analyzer, which does a nice job at this.

Unless your page is larger than 100 KB, I suggest not spending too much time on this and skipping ahead to step 3.

Large page sizes are nowadays often caused by large JavaScript libraries. Often you only need a small part of their functionality, so a cut-down version can do. For example, when using prototype.js just for Ajax, you could use pt.ajax.js (also see moo.ajax), or moo.fx as a script.aculo.us replacement.

Digg, for example, used to weigh about 290 KB; by leaving out unnecessary libraries they have reduced that to 160 KB.

Large images can also drive up the file size, often because the wrong image format was chosen. A rule of thumb: JPEG for photos, PNG for most other purposes, especially where flat colors are involved. Use PNG for screen shots, too; JPEG screen shots are not only larger but also look ugly. GIF is still an option when the image has only a few colors and/or you want an animation.

Another frequent mistake is scaling large images down via the HTML width and height attributes. Do the scaling in your graphics editor instead; this also reduces the file size.

Old-style HTML can also cause a large file size; there is no need for thousands of presentational tags anymore. Use XHTML and CSS!

A further important step towards a smaller size is on-the-fly compression of your content. Almost all browsers support gzip compression. On an Apache 2 web server, for example, the mod_deflate module can do this transparently for you.

If you don’t have access to your server’s configuration, you can use PHP’s zlib extension; for Django (Python) there is GZipMiddleware, and Ruby on Rails has a gzip plugin, too.
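For the PHP route, a minimal sketch using the zlib extension’s ob_gzhandler callback looks like this; it must run before any output is sent:

    <?php
    // Compress all output of this script with gzip, but only if the
    // browser announced support via its Accept-Encoding header.
    ob_start('ob_gzhandler');
    ?>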

Beware of compressing JavaScript, though: Internet Explorer has quite a few bugs in this area.

And for heaven’s sake, you can also strip the white space after you’ve completed the previous steps.

3. Check what’s causing a high latency
As mentioned, the latency can be caused by two large factors.

3.1. Is it the network latency?
To determine whether the network latency is the blocking factor, you can ping your server. From the command line: ping servername.com

If your server admin has disabled ping responses, you can use a traceroute instead, which measures the time by other means: tracert servername.com (Windows) or traceroute servername.com (Unix).

If your audience is not geographically close to you, you can also use a service such as Just Ping, which pings the given address from 12 different locations around the world.

3.2. Does it take too long to generate the page?
If the ping times are OK, it might take too long to generate the page. Note that this applies to dynamic pages, for example those written in a scripting language such as PHP; static pages are usually served very quickly.

You can measure the generation time quite easily: save a time stamp at the beginning of the page and subtract it from a second time stamp taken once the page has been generated. In PHP it looks like this:

    <?php
    // Start of the page: record a time stamp.
    $start_time = explode(' ', microtime());
    $start_time = $start_time[1] + $start_time[0];
    ?>

and at the end of the page:

    <?php
    // End of the page: take a second time stamp and print the difference.
    $end_time = explode(' ', microtime());
    $total_time = $end_time[0] + $end_time[1] - $start_time;
    printf('Page loaded in %.3f seconds.', $total_time);
    ?>

The time needed to generate the page is now displayed at the bottom of it.

You could also compare the time to load a static page (often a file ending in .html) with that of a dynamic one, but I’d advise using the first method, because you are going to need it to go on optimizing the page.

You can also use a profiler, which usually offers even more information on the generation process.

For PHP you can, as a first easy step, enable output buffering and re-run the test.
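Enabling output buffering is a one-liner at the top of the script; instead of sending the output in many small chunks, PHP collects it in a buffer and flushes it when the script finishes:

    <?php
    // Buffer all output and send it to the client in one go.
    ob_start();
    ?>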

You should also consider testing your page with a benchmarking program such as ApacheBench (ab), which stresses the server by requesting the page many times at once.
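For example, the following invocation (with a placeholder URL) issues 100 requests, 10 of them concurrently, and reports timing statistics:

    ab -n 100 -c 10 http://www.example.com/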

It is difficult to say what generation time is acceptable; it depends on your own requirements. Try to keep it under 1 second, a delay which users can usually cope with.

3.3. Is it the rendering performance?
This plays only a minor role in this guide, but it can still be a reason why your page takes long to load.

If you use a complex table structure (which can render slowly), you are most probably using old-style HTML; try switching to XHTML and CSS.

Don’t use overly complex JavaScript: slow scripts combined with onmousemove events, for example, make a page feel really sluggish. You can measure this with a technique similar to the PHP timing above, using (new Date()).getTime(). If your JavaScript makes the page load slowly, you are doing something wrong; rethink your concept.

4. Determine the lagging component(s)
As your page usually consists of more than one component (header, login box, navigation, footer, etc.), you should next check which one needs tuning. You can do this by inserting several of the measuring fragments into the page, which will show you split times throughout the page.
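A small helper in the spirit of the timing code above can print the split times as HTML comments, so they don’t disturb the layout (checkpoint() is a made-up name, and microtime(true) requires PHP 5):

    <?php
    $GLOBALS['last_split'] = microtime(true);
    // Print the seconds elapsed since the previous checkpoint.
    function checkpoint($label)
    {
        $now = microtime(true);
        printf("<!-- %s: %.3f s -->\n", $label, $now - $GLOBALS['last_split']);
        $GLOBALS['last_split'] = $now;
    }
    ?>

Call checkpoint('header'); after the header, checkpoint('navigation'); after the navigation, and so on, then read the split times from the page source.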

The following steps can now be applied to the slowest parts of the page.

5. Enable a Compiler Cache
Scripting languages recompile a script upon each request. As there are far more requests than changes to the script, it makes no sense to compile it over and over (especially once core development has finished).

For PHP there is, amongst others, APC (which will probably be integrated into PHP 6); Python stores a compiled version by itself.
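Once the APC extension is installed, enabling it comes down to a few php.ini lines; this is just a sketch, and the shared memory size (in MB) depends on the amount of code you have:

    extension=apc.so
    apc.enabled=1
    apc.shm_size=32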

6. Look at the DB Queries
At university you are taught the most complex queries with lots of JOINs and GROUP BYs, but in real life it can often be useful to avoid JOINs between (especially large) tables. Instead, you issue multiple simple SELECTs, which the SQL server can cache. This is especially true if you don’t need the joined data for every row. It really depends on your application, but trying it without a JOIN is often worth it.
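As a sketch (the table, the columns, and the query() helper are made up), fetching the user names separately instead of joining them to every post row could look like this:

    <?php
    // Instead of:
    //   SELECT p.*, u.name FROM posts p JOIN users u ON u.id = p.user_id
    // issue two simple statements that repeat verbatim across requests
    // and can therefore be answered from the query cache.
    $posts = query('SELECT * FROM posts ORDER BY created DESC LIMIT 20');

    $ids = array();
    foreach ($posts as $post) {
        $ids[$post['user_id']] = true; // collect the distinct user ids
    }
    $users = query('SELECT id, name FROM users WHERE id IN ('
                 . implode(',', array_keys($ids)) . ')');
    ?>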

Ensure that you use query folding (also called a query cache, such as the MySQL Query Cache). In a web environment the same SELECT statements are executed over and over again, which almost screams for a cache (and explains why avoiding JOINs can be much faster).
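For MySQL, the query cache is configured in my.cnf; the sizes here are arbitrary examples:

    [mysqld]
    # reserve 32 MB for cached SELECT results; type 1 caches all cacheable queries
    query_cache_size = 32M
    query_cache_type = 1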

7. Send the Correct Modification Date
Dynamic web pages often make one big mistake: they don’t set their date of last modification. This means that the browser always has to load the whole page from the server and can never use its cache.

In HTTP there are various headers important for caching: HTTP 1.0 has the Last-Modified header, which works together with the browser-sent If-Modified-Since (see the specification). HTTP 1.1 adds the ETag (entity tag), which allows different last-modification states for the same page (e.g. for different languages). Other relevant headers are Cache-Control and Expires.

Read on about how to set the headers correctly and respond to them, for HTTP 1.0 and for 1.1.
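As a minimal PHP sketch (assuming you can determine the time of the last change, here via a hypothetical get_last_change() helper), answering a conditional request looks like this:

    <?php
    // Unix time stamp of the page's last change (assumed to be available).
    $last_modified = get_last_change();
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $last_modified) . ' GMT');

    // If the browser's cached copy is still current, send 304 and no body.
    if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
        strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $last_modified) {
        header('HTTP/1.0 304 Not Modified');
        exit;
    }
    ?>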

8. Consider Component Caching (advanced)
If optimizing the database does not improve your generation time enough, you are most likely doing something complex ;)
For public pages it is very likely that you will present two users with the same content (at least for a specific component). So instead of running the complex database queries again, you can store a pre-rendered copy and use that when needed, to save time.

This is a rather complex topic but can be the ultimate solution to your performance problems. You need to make sure that you don’t deliver a stale copy to the client, and you need to think about how to organize your cache files so you can invalidate them quickly.
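Stripped to the bone, a file-based component cache can look like this sketch (the path, the lifetime, and the expensive render_navigation() function are assumptions):

    <?php
    $cache_file = '/tmp/cache/navigation.html'; // assumed cache location
    $max_age    = 300;                          // seconds until considered stale

    if (file_exists($cache_file) && time() - filemtime($cache_file) < $max_age) {
        // Fresh enough: serve the pre-rendered copy.
        readfile($cache_file);
    } else {
        // Stale or missing: regenerate, store for the next request, deliver.
        $html = render_navigation();
        file_put_contents($cache_file, $html);
        echo $html;
    }
    ?>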

Most web frameworks give you a hand with component caching: PHP has Smarty’s template caching, Perl has Mason’s data caching, Ruby on Rails has page caching, and Django supports it as well.

This technique can eventually lead to the point where loading your page requires no database requests at all. That is a favorable result, as the connection to the database is often the most obvious bottleneck.

If your page is not that complex, you could also consider caching the whole page. This is easier but usually makes the page feel less up to date.

One more thing: if you have enough RAM, you should also consider storing the cache files on a RAM drive. As the data is discardable (it can be regenerated at any time), losing it on a reboot does not matter. Keeping disk I/O low can boost the speed once again.
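On Linux, for example, you could mount a RAM-backed tmpfs over the cache directory (path and size are just examples):

    mount -t tmpfs -o size=64m tmpfs /var/cache/mysite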

9. Reducing the Server Load
Suppose your page loads quickly and everything looks alright, but when too many users access it, it suddenly becomes slow.

This is most likely due to a lack of resources on the server. You cannot put an indefinite amount of CPU power or RAM into the server, but you can handle what you’ve got more carefully.

9.1. Use a Reverse Proxy (needs access to the server)
Whenever a request is handled, a whole copy (or child process) of the web server executable is held in memory, not only while the page is generated but until the page has been fully transferred to the client. Slow clients therefore cost performance: when many users connect, you can be sure that quite a few slow ones will block the line for somebody else just to have the data transferred back.

There is a solution for this: the well-known Squid proxy has an HTTP acceleration mode which takes over the communication with the client. It’s like a secretary that handles all correspondence.

It waits patiently until the client has filed its request, asks the web server for the response, receives it quickly (while the web server moves on to the next request), and then patiently returns the file to the client.

The Squid server is also small, lightweight, and specialized for this task. You therefore need less RAM for more clients, which allows a higher throughput (in served clients per unit of time).
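With Squid 2.5, the accelerator setup boils down to a few squid.conf lines; this is a sketch that assumes Apache has been moved to port 8080 on the same machine:

    http_port 80
    httpd_accel_host 127.0.0.1
    httpd_accel_port 8080
    httpd_accel_single_host on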

9.2. Take a lightweight HTTP Server (needs access to the server)
People often say that Apache is quite huge and does not do its work quickly enough. Personally, I am satisfied with its performance, but when it comes to scripting languages that handle their web server communication via the (Fast)CGI interface, Apache is easily trumped by a lightweight alternative.

It’s called LightTPD (pronounced “lighty”) and does that special task very quickly. You can already see from its configuration file that it keeps things simple.

I suggest testing both scenarios to see whether you gain from using LightTPD or should stay with your old web server. The Apache web server is stable and built on long experience in the web server business, but LightTPD is taking its chance.

10. Server Scaling (extreme technique)
Once you have gone through all the steps and your page still does not load fast enough (most likely because of too many concurrent users), you can duplicate your hardware. Thanks to the previous steps, there isn’t too much work left.

The reverse proxy can act as a load balancer by distributing requests across the web servers, either in turn (round robin) or driven by server load.

Conclusion
All in all, the main strategy for a large page is a combination of caching and intelligent handling of resources. While the first 7 steps apply to any page, the last 3 are usually only useful (and needed) for sites with many concurrent users.

The guide shows that you don’t need a special server to withstand slashdotting or digging.

Further Reading
For more detail on each step, I recommend taking a look at my diploma thesis.

MySQL tuning is nicely described in Jeremy Zawodny’s High Performance MySQL. There is also a presentation on how Yahoo tunes its Apache servers, and some tips for websites running on Java. George Schlossnagle gives good advice on caching in his Advanced PHP Programming; his tips are not restricted to PHP as a scripting language.
