Title: 10 Realistic Steps to a Faster Web Site
Author: Alex Kirk
Published: February 2, 2006
Last modified: August 17, 2007

---

# 10 Realistic Steps to a Faster Web Site

February 2, 2006

I have [complained before](https://alex.kirk.at/2006/01/03/49/) about bad guides to improving the performance
of your website.


I’d like to give you a more realistic guide on how to achieve that goal. I have written
my [master’s thesis in computer science](https://alex.kirk.at/papers/caching-strategies/diploma_thesis.html)
on this topic and will refer to it throughout the guide.

**1. Determine the bottleneck**
 When you want to improve the speed of your website, the first step is to find out
what actually makes it slow. Various factors can affect the performance
of your page. Here are the most common ones.

Before we move on: always answer each of the following questions with
your target audience in mind.

**1.1. File Size**
 How much data does the user have to load before they can
use the page?

How much data is a web page allowed to have? That is a frequent question, and you
cannot answer it unless you know your target audience.

In the early years of the internet the common advice was a maximum of 30 KB for the whole
page (including images, etc.). Now that many people have a broadband connection,
I think we can push the limit to somewhere between 60 KB and 100 KB. You should still
consider lowering the size if you also target modem users.

Still, the less data you require to download, the faster your page will appear.

**1.2. Latency**
 The time between sending a request to the server and the data
reaching your PC.

This time is the sum of twice the network latency (which depends on the uplink
of the hosting provider, the geographical distance between server and user, and
some other factors) and the time the server needs to produce the output.

Network latency can hardly be optimized without moving the server, so this guide
will not cover it. The server’s processing time, however, is made up of several
factors and most often leaves much room for improvement.

**2. Reducing the file size**
 First, you need to know how large your page really
is. There are some useful tools out there; I picked [Web Page Analyzer](http://www.websiteoptimization.com/services/analyze/index.html),
which does a nice job at this.

I suggest not spending too much time on this unless your page is larger than
100 KB; if it is smaller, skip ahead to step 3.

Large page sizes are nowadays often caused by large JavaScript libraries. Often
you only need a small part of their functionality, so a cut-down version may do.
For example, if you use prototype.js just for Ajax, you could switch to [pt.ajax.js](https://alex.kirk.at/2005/10/05/prototypejs-just-for-ajax/)
(also see [moo.ajax](http://www.mad4milk.net/entry/moo.ajax)), or to [moo.fx](http://moofx.mad4milk.net/)
as a script.aculo.us replacement.

[Digg](http://digg.com/), for example, used to weigh [about 290kb](http://project-2501.net/?view=document&id=525);
they have since reduced that to [160kb](http://www.websiteoptimization.com/services/analyze/wso.php?url=digg.com)
by leaving out unnecessary libraries.

Large images can also cause large file sizes, often because of the wrong
image format. A rule of thumb: use JPEG for photos and PNG for most everything else,
especially when flat colors are involved. Use PNG for screen shots, too; JPEGs are
not only larger there but also look ugly. GIF instead of PNG makes sense when the
image has only a few colors and/or you want to create an animation.

Large images are also often scaled down via the HTML `width` and `height` attributes.
Do the scaling in your graphics editor instead; this reduces the file size as well.

Old-style HTML can also cause large files. There is no need for thousands of
layout tags anymore. Use [XHTML](http://www.w3.org/TR/xhtml11/) and [CSS](http://www.w3.org/Style/CSS/)!

A further important step towards smaller pages is compressing your content on the fly.
Almost all browsers support [gzip compression](http://en.wikipedia.org/wiki/Gzip).
For an Apache 2 web server, for example, the [mod_deflate](http://httpd.apache.org/docs/2.0/mod/mod_deflate.html)
module can do this transparently for you.

If you don’t have access to your server’s configuration, you can use the [zlib extension](http://php.net/zlib)
for PHP; for Django (Python) there is [GZipMiddleware](http://www.djangoproject.com/documentation/middleware/#django-middleware-gzip-gzipmiddleware),
and Ruby on Rails has a [gzip plugin](http://wiki.rubyonrails.org/rails/pages/Output+Compression+Plugin),
too.

Beware of compressing JavaScript, though: there are [quite](http://support.microsoft.com/kb/312496)
[a few](http://support.microsoft.com/kb/871205) bugs in Internet Explorer.

And for heaven’s sake, you can also strip the white space after you’ve completed
the previous steps.
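How much compression saves is easy to demonstrate. Here is a quick sketch in Python (the markup string is made up for illustration): typical HTML is highly repetitive, so gzip shrinks it drastically.

```python
import gzip

# Typical HTML repeats the same tags and class names over and over,
# which is exactly what gzip compresses best.
html = ("<div class='post'><p>Some repetitive markup.</p></div>\n" * 200).encode()
compressed = gzip.compress(html)

print(f"{len(html)} bytes raw, {len(compressed)} bytes gzipped")
```

On real pages the savings are usually smaller than in this artificial example, but a factor of three to five for HTML is common.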

**3. Check what’s causing a high latency**
 As mentioned, high latency can stem
from two main factors.

**3.1. Is it the network latency?**
 To determine whether network latency is
the blocking factor, you can ping your server. This can be done from the command
line via `ping servername.com`.

If your server’s admin has disabled ping responses, you can run a traceroute instead,
which uses a different method to measure the time: `tracert servername.com` (Windows)
or `traceroute servername.com` (Unix).

If you address an audience that is geographically not very close to you, you can
also use a service such as [Just Ping](http://www.just-ping.com/) which pings the
given address from 12 different locations in the world.

**3.2. Does it take too long to generate the page?**
 If the ping times are OK,
it might simply take too long to generate the page. Note that this applies to dynamic
pages, for example those written in a scripting language such as PHP; static pages
are usually served very quickly.

You can measure the generation time quite easily: save a time stamp at the beginning
of the page and subtract it from the time stamp taken once the page has been generated.
In PHP, for example:

```php
<?php
// Start of the page
$start_time = explode(' ', microtime());
$start_time = $start_time[1] + $start_time[0];
?>
```

and at the end of the page:

```php
<?php
$end_time = explode(' ', microtime());
$total_time = $end_time[1] + $end_time[0] - $start_time;
printf('Page loaded in %.3f seconds.', $total_time);
?>
```

The time needed to generate the page is now displayed at the bottom of it.

You can also compare the load time of a static page (often a file ending
in .html) with that of a dynamic one. Still, I’d advise using the first method,
because you are going to need it to go on optimizing the page.

You can also use a [Profiler](http://en.wikipedia.org/wiki/Profiler_(computer_science))
which usually offers even more information on the generation process.

For PHP you can, as a first easy step, enable [Output Buffering](http://php.net/ob_start)
and restart the test.

You should also consider testing your page with a benchmarking program such as [ApacheBench (ab)](https://alex.kirk.at/papers/caching-strategies/diploma_thesisch4.html#x8-510004.8).
It stresses the server by issuing many requests at once.

It is difficult to say what generation time is acceptable; that depends
on your own requirements. Try to keep it under one second, a delay
which users can usually cope with.

**3.3. Is it the rendering performance?**
 This plays only a minor role in my guide,
but it can still be a reason why your page takes a long time to load.

If you use a complex table structure (which can render slowly), you are most probably
using old-style HTML; try switching to XHTML and CSS.

Don’t use overly complex JavaScript: slow scripts combined with `onmousemove`
events, for example, make a page really sluggish. You can time JavaScript much like
the PHP example above, using `(new Date()).getTime()`. If your JavaScript makes
the page load slowly, you are doing something wrong. Rethink your concept.

**4. Determine the lagging component(s)**
 As your page usually consists of more
than one component (such as header, login window, navigation, footer, etc.), you
should next check which of them needs tuning. You can do this by integrating several
of the measuring fragments into the page, which will give you split times throughout
the page.
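In Python, for example, such split-time measurement might be sketched like this (the class and label names are made up; the PHP snippets from step 3.2 work analogously):

```python
import time

# Record a labeled split time after each page component so the
# slowest one stands out.
class SplitTimer:
    def __init__(self):
        self.start = time.perf_counter()
        self.splits = []

    def split(self, label):
        # Time elapsed since the start of page generation.
        self.splits.append((label, time.perf_counter() - self.start))

timer = SplitTimer()
# ... render the header here ...
timer.split("header")
# ... render the navigation here ...
timer.split("navigation")

for label, seconds in timer.splits:
    print(f"{label}: {seconds:.3f}s")
```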

The following steps can now be applied to the slowest parts of the page.

**5. Enable a Compiler Cache**
 Scripting languages recompile a script on
every request. As the vast majority of requests hit an unchanged script, it makes no
sense to compile it over and over (especially once core development has
finished).

For PHP there is, among others, [APC](http://pecl.php.net/apc) (which will probably
be integrated into [PHP 6](http://www.php.net/~derick/meeting-notes.html#add-an-opcode-cache-to-the-distribution-apc));
Python stores a [compiled version](http://www.python.org/doc/2.2.3/tut/node8.html#SECTION008120000000000000000)
by itself.

**6. Look at the DB Queries**
 At university you are taught complex queries with lots of JOINs
and GROUP BYs, but in real life it can often pay off to avoid JOINs between
(especially large) tables. Instead you issue multiple simple SELECTs, which the
SQL server can cache. This is especially true if you don’t need the joined data for every
row. It really depends on your application, but trying without a JOIN is often worth
it.

Ensure that you use query folding (also called a query cache, such as the [MySQL Query Cache](https://alex.kirk.at/papers/caching-strategies/diploma_thesisch4.html#x8-350004.3.2)):
in a web environment the same SELECT statements are executed over and over, which
almost screams for a cache (and also explains why avoiding JOINs can be so much faster).
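To illustrate the idea, here is a sketch using Python's built-in SQLite (table and column names are made up): the same data fetched once with a JOIN and once with two simple SELECTs that are stitched together in code.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Alex');
    INSERT INTO posts VALUES (1, 1, 'Hello'), (2, 1, 'World');
""")

# One query with a JOIN ...
joined = db.execute("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON authors.id = posts.author_id
    ORDER BY posts.id
""").fetchall()

# ... or two simple SELECTs, combined in application code. Each simple
# statement is identical on every request, which makes it an ideal
# candidate for the server's query cache.
posts = db.execute("SELECT title, author_id FROM posts ORDER BY id").fetchall()
authors = dict(db.execute("SELECT id, name FROM authors").fetchall())
stitched = [(title, authors[author_id]) for title, author_id in posts]

print(joined == stitched)  # True: both approaches yield the same rows
```

Whether the two-query variant is actually faster depends on your data and your database; measure before committing to either style.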

**7. Send the correct Modification Date**
 Dynamic web pages often make one big
mistake: they don’t set their date of last modification. This means that the
browser always has to load the whole page from the server and can never use its cache.

In HTTP there are various headers important for caching: HTTP 1.0 has the
`Last-Modified` header, which plays together with the browser-sent `If-Modified-Since`
(see the [specification](http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.1)).
HTTP 1.1 adds the `ETag` (so-called [Entity Tag](http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.2)),
which allows different last modification dates for the same page (e.g. for different
languages). Other relevant headers are `Cache-Control` and `Expires`.

Read on about [how to set the headers correctly and respond to them (1.0)](https://alex.kirk.at/papers/caching-strategies/diploma_thesisch6.html#x17-720006.1.1)
and [1.1](http://simon.incutio.com/archive/2003/04/23/conditionalGet).
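The HTTP 1.0 logic can be sketched in Python like this (the function shape and names are my own; web frameworks usually offer this out of the box):

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(last_modified, if_modified_since):
    """Return (status, headers) for a page last changed at last_modified."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        client_copy = parsedate_to_datetime(if_modified_since)
        if client_copy >= last_modified:
            return 304, headers   # "Not Modified": the browser cache is fresh
    return 200, headers           # send the full page

changed = datetime(2006, 2, 2, 12, 0, tzinfo=timezone.utc)
print(conditional_get(changed, "Thu, 02 Feb 2006 12:00:00 GMT")[0])  # 304
print(conditional_get(changed, None)[0])                             # 200
```

A 304 response carries no body, so a fresh browser cache saves the full transfer, not just the generation time.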

**8. Consider Component Caching** (advanced)
 If optimizing the database does not
improve your generation time enough, you are most likely doing something complex ;)
For public pages it’s very likely that you will present two users with the same
content (at least for a specific component). So instead of running complex database
queries every time, you can store a pre-rendered copy and use it when needed, to save time.

This is a rather complex topic, but it can be the ultimate solution to your performance
problems. You need to make sure that you never deliver a stale copy to the client,
and you need to think about how to organize your cache files so you can invalidate them
quickly.

Most web frameworks give you a hand when doing component caching: for PHP there 
is [Smarty’s template caching](https://alex.kirk.at/papers/caching-strategies/diploma_thesisch9.html#x25-1040009),
Perl has [Mason’s Data Caching](http://www.masonhq.com/docs/manual/Devel.html#data_caching),
Ruby’s Rails has [Page Caching](http://api.rubyonrails.com/classes/ActionController/Caching/Pages.html),
Django [supports it as well](http://www.djangoproject.com/documentation/cache/).
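The idea behind all of these helpers can be sketched in a few lines of Python (names and the timeout value are made up; real frameworks add locking and smarter invalidation):

```python
import os
import tempfile
import time

# Minimal component cache: store a rendered fragment on disk and reuse
# it until it is older than max_age seconds.
CACHE_DIR = tempfile.mkdtemp()

def cached(name, max_age, render):
    path = os.path.join(CACHE_DIR, name)
    try:
        if time.time() - os.path.getmtime(path) < max_age:
            with open(path) as f:   # fresh copy on disk: reuse it
                return f.read()
    except OSError:
        pass                        # not cached yet
    html = render()                 # the expensive part (queries, templating)
    with open(path, "w") as f:
        f.write(html)
    return html

calls = []
def render_navigation():
    calls.append(1)
    return "<ul><li>Home</li></ul>"

cached("navigation", 60, render_navigation)
cached("navigation", 60, render_navigation)
print(len(calls))  # 1: the second request was served from the cache
```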

This technique can eventually lead to a page that loads without any request to
the database at all. That is a favorable result, as the connection to the database
is often the most obvious bottleneck.

If your page is not that complex, you could also consider just caching the whole
page. This is easier, but usually makes the page feel less up-to-date.

One more thing: if you have enough RAM, you should also consider storing the cache
files on a RAM drive. As the data is discardable (it can be re-generated at any
time), losing it on a reboot does not matter, and keeping disk I/O low boosts the
speed once again.

**9. Reducing the Server Load**
 Suppose your page loads quickly and everything
looks alright, but when too many users access it at once, it suddenly becomes slow.

This is most likely due to a lack of resources on the server. You cannot put an
indefinite amount of CPU power or RAM into the server, but you can handle what you’ve
got more carefully.

**9.1. Use a Reverse Proxy** (needs access to the server)
 Whenever a request is
handled, a whole copy (or child process) of the web server executable has to be
held in memory, not only while generating the page but until the page has been
fully transferred to the client. Slow clients therefore cost performance:
with many users connecting, you can be sure that quite a few slow ones
will block the line for somebody else just to transfer the data back.

There is a solution for this: the well-known Squid proxy has an [HTTP Acceleration](https://alex.kirk.at/2005/11/29/squids-http-acceleration-mode/)
mode in which it handles all communication with the client, like a secretary.

It waits patiently until the client has filed its request, asks the web server for
the response, receives it quickly (so the web server can move on to the
next request), and then patiently returns the file to the client.

The Squid server is also small, lightweight, and specialized for this task. You
therefore need less RAM for more clients, which allows a higher throughput (in
served clients per unit of time).

**9.2. Take a lightweight HTTP Server** (needs access to the server)
 People often
say that Apache is quite huge and does not do its work quickly enough. Personally
I am satisfied with its performance, but when it comes to scripting
languages that talk to the web server via the (Fast)CGI interface,
Apache is easily trumped by a lightweight alternative.

It’s called [LightTPD](http://www.lighttpd.net/) (pronounced “lighty”) and does
that special task very quickly. You can already see from a [configuration file](http://www.lighttpd.net/documentation/configuration.html)
that it keeps things simple.

I suggest testing both scenarios to see whether you gain from using LightTPD or
should stay with your old web server. The Apache web server is stable and built on
long experience in the web server business, but LightTPD is taking its chance.

**10. Server Scaling** (extreme technique)
 If you have gone through all the steps
and your page still does not load fast enough (most likely because of too many
concurrent users), you can now duplicate your hardware. Thanks to the previous
steps, there isn’t much work left to do.

The Reverse Proxy can act as a load balancer by sending its requests to one of the
web servers, either quite-randomly ([Round Robin](http://en.wikipedia.org/wiki/Round-robin))
or server load driven.
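The round-robin variant is trivial to sketch in Python (host names are placeholders):

```python
from itertools import cycle

# Round-robin dispatch, the simplest load-balancing strategy: hand each
# incoming request to the next server in a fixed rotation.
backends = cycle(["web1.internal", "web2.internal", "web3.internal"])

def pick_backend(request):
    # A load-driven balancer would inspect server metrics here instead.
    return next(backends)

seq = [pick_backend(i) for i in range(4)]
print(seq)  # ['web1.internal', 'web2.internal', 'web3.internal', 'web1.internal']
```

This works well when all servers are equally powerful; load-driven dispatch pays off when they are not.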

**Conclusion**
 All in all, the main strategy for a large page is a combination of
caching and intelligent handling of resources. While the first seven steps apply
to any page, the last three are usually only useful (and needed) on sites with
many concurrent users.

The guide shows that you don’t need a special server to withstand [slashdotting](http://slashdot.org/)
or [digging](http://digg.com/).

**Further Reading**
 For more detail on each step I recommend taking a look at my
[diploma thesis](https://alex.kirk.at/papers/caching-strategies/diploma_thesis.html).

MySQL tuning is nicely described in [Jeremy Zawodny’s](http://jeremy.zawodny.com/blog/)
[High Performance MySQL](http://highperformancemysql.com/). There is also a presentation
on how [Yahoo tunes its Apache servers](http://public.yahoo.com/~radwin/talks/yapache-apachecon2005.htm),
and some tips for [websites running on Java](http://www.javaperformancetuning.com/tips/j2ee_srvlt.shtml).
[George Schlossnagle](http://www.schlossnagle.org/~george/blog/) gives some good
caching tips in his [Advanced PHP Programming](http://www.samspublishing.com/bookstore/product.asp?isbn=0672325616&rl=1);
they are not restricted to PHP as a scripting language.

