Speed up your page, but how?

Today I ran accross the blog entry by Marcelo Calbucci, called "Web Developers: Speed up your pages!".

It’s a typical example of good idea, bad execution. Most of the points he mentions are really bad practice.

He suggests reducing traffic (and therefore loading time) by removing whitespace from the source code, to write all code in lower case (for better compression?!?), reduce code by writing invalid xhtml, and to keep javascript function names and variables short. This is nit-picking. And results in a maintenance nightmare.

For big sites, e.g. Google, the white space reduction tricks make sense. But they have enourmous numbers of page impressions. Saving 200 bytes tops by stripping whitespace is nearly worthless for smaller sites. And not worth the trouble. Additionally I bet that Google does not maintain that page as such, but has created some kind of conversion script.

Other thoughts are quite nice but commonplace. Most of the comments (e.g. by Sarah) posted at that article reflect my opinion quite well and deal with each point in more detail.

For most dynamic pages the bottleneck for responding to a client request is the script loading (or running) time. I suggest the writer to read some articles about server caching (my thesis also deals with that topic) and optimization.

Often also the latency between client and server can be held responsible for considerable delays. As the client has to parse the HTML file to decide what files to load next, delays can sum up.

All in all, it’s a good idea to deal with the loading time of a page. But you have to search at the right place.

web, speed, xhtml, script, caching

Using a Feedback Form for Spam

Have you ever received weird spam via the feedback form of your site? Something with your own address as sender or with some Mime stuff in the body? Your form is likely to be misused for spamming.

How does it work?

For PHP, for example, there is the mail function that can be used to easily send an e-mail. Most probably you’d use some code like this to send the message from your feedback form.

< ?php $msg .= "Name: " . $_POST["name"] . "\n"; $msg .= "E-Mail: " . $_POST["email"] . "\n"; $msg .= $_POST["msg"]; mail("my.e.mail@addr.es", "feedback from my site", $msg); ?>

That’s simple and works well, but it’s a little annoying if you want to answer that e-mail. You click the e-mail address to open a new message and have to paste the whole message into the new window for quoting. There’s an easy solution: Pretend that the e-mail comes from the customer requesting some info. This can be simply done via the additional_headers parameter of mail.

< ?php $sender = "From: " . $_POST["email"] . "\r\n"; $sender = "From: " . $_POST["name"] . " <" . $_POST["email"] . ">\r\n"; // even nicer, shows the name, not the address
mail("my.e.mail@addr.es", "feedback from my site", $msg, $sender);
?>

Well. We’ve just introduced 2 potential spamming opportunities. Why? Let’s see. For mail transport we use SMTP. Our outgoing mail might look like this (generated by mail).

From: tester < test@test.com>
To: my.e.mail@addr.es
Subject: feedback from my site

this is my message

(Before, the From would have looked something like From: webserver@mydomain.com)
So if the spammer manages to insert another field (like To, CC, or BCC), not only we would receive that e-mail but also the guy entered as CC. This works by inserting a line break into the name or e-mail address. For example, for a given name such as

Alex
CC: other@e.mail.addr.es

that would be the case.
Although this is usually not possible through a normal textbox () a post request can easily be constructed containing that linebreak and the malicious CC.

So be sure to strip out at least the characters \r and \n from name or e-mail address or just strip out any non-latin characters (people with german umlauts in their names, for example, will have to live with that).

So a quite good method would be to use this piece of code:

$name = preg_replace("|[^a-z0-9 \-.,]|i", "", $_POST["name"]);
$email = preg_replace("|[^a-z0-9@.]|i", "", $_POST["email"]);
$sender = "From: " . $name . " < " . $email . ">\r\n";
mail("my.e.mail@addr.es", "feedback from my site", $msg, $sender);

The conclusion is simple (and always the same one): Never trust any data you receive from a user.
Verify all data you receive and strip potentially harmful characters. Common bad characters are:

  • for mails: \r, \n,
  • for HTML: < , > (you could use htmlspecialchars for that),
  • for URLs: &, =,
  • complete the list in the comments ;)

Ah, the conclusion. Never trust any data you receive from a user.

spam, form, e-mail, smtp