Title: Page 97 – Alex Kirk

---

 * 
   ## [New Feature for HN Collapsible Threads: Collapse Whole Thread](https://alex.kirk.at/2011/12/06/new-feature-for-hn-collapsible-threads/)
   
 * December 6, 2011
 * I have added a feature to the [HN Collapsible Threads bookmarklet](https://alex.kirk.at/2010/02/16/collapsible-threads-for-hacker-news/)
   that enables you to close a whole thread from any point within the thread:
 * ![](https://alex.kirk.at/wp-content/uploads/sites/2/2011/12/hn-update.png "hn-update")
 * This is useful when you are reading a thread, decide that you have had enough
   of it, and want to move on to the next one. Previously you had to scroll all
   the way up to the top post and collapse that one.
 * Drag this to your bookmarks bar: [collapsible threads](javascript:(function()%7Bvar%20s=document.createElement('script');s.type='text/javascript';s.src='http://ajax.googleapis.com/ajax/libs/jquery/1.4.1/jquery.min.js';document.documentElement.childNodes%5B0%5D.appendChild(s);s=document.createElement('script');s.type='text/javascript';s.src='https://alex.kirk.at/js/hackernews-collapsible-threads-v4.js';document.documentElement.childNodes%5B0%5D.appendChild(s);%7D)();)
 * [Install Greasemonkey script](https://alex.kirk.at/js/hacker_news_comment_coll.user.js)
 * [Web](https://alex.kirk.at/category/web/)
 * 
   ## [preg_match, UTF-8 and whitespace](https://alex.kirk.at/2011/10/01/preg_match-utf-8-and-whitespace/)
   
 * October 1, 2011
 * Just a quick note: be careful when using the whitespace escape `\s` in `preg_match`
   on UTF-8 strings.
 * Suppose you have a string containing a dagger symbol (†, encoded in UTF-8 as 
   the three bytes `e2 80 a0`). When you try to strip all whitespace from the string
   like this, you end up with an invalid UTF-8 sequence:
 * `$ php -r 'echo preg_replace("#\s#", "", "†");' | xxd`
   `0000000: e280`
 * (On a side note: `xxd` displays all bytes in hexadecimal representation. The 
   resulting string here consists of the two bytes `e2` and `80`.)
 * `\s` stripped away the `a0` byte, the last of the dagger's three bytes. I was 
   unaware that this byte was included in the whitespace list, but taken on its
   own (as in Latin-1) it represents the **non-breaking space**.
 * So use the [u (PCRE_UTF8) modifier](http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php),
   which makes the pattern aware that the `a0` “belongs” to the dagger:
 * `$ php -r 'echo preg_replace("#\s#u", "", "†");' | xxd`
   `0000000: e280 a0`
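 * The effect can be reproduced independently of the PCRE version or locale in use. The following is a minimal sketch, not code from the original post: it uses `str_replace` to remove the `a0` byte directly (standing in for what a byte-wise `\s` may do) and assumes the `mbstring` extension is available for `mb_check_encoding`.

   ```php
   <?php
   // The dagger (U+2020) is encoded in UTF-8 as the three bytes e2 80 a0.
   $dagger = "\xE2\x80\xA0";

   // Removing the a0 byte on its own cuts the character apart
   // and leaves a truncated, invalid sequence behind.
   $broken = str_replace("\xA0", "", $dagger);
   echo bin2hex($broken), "\n";                                  // e280
   var_dump(mb_check_encoding($broken, 'UTF-8'));                // bool(false)

   // With the u modifier the subject is decoded as UTF-8 first, so the
   // dagger stays whole and only the space and the tab are removed.
   $clean = preg_replace('/\s/u', '', "a " . $dagger . "\tb");
   echo bin2hex($clean), "\n";                                   // 61e280a062
   ```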
 * By the way, `trim()` doesn’t strip non-breaking spaces and can therefore safely
   be used on UTF-8 strings. (If you still want to trim non-breaking spaces with
   `trim`, [read this comment on PHP.net](http://php.net/manual/en/function.trim.php#98812).)
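 * One way to trim real non-breaking spaces (U+00A0, encoded as `c2 a0` in UTF-8) is to name the code point explicitly in a pattern with the u modifier. This is a sketch and not necessarily the approach from the linked comment; `trim_nbsp` is a hypothetical helper name.

   ```php
   <?php
   // Hypothetical helper: strips ASCII whitespace and U+00A0 from both ends.
   // Falls back to the input if preg_replace fails (e.g. on invalid UTF-8).
   function trim_nbsp(string $s): string
   {
       return preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $s) ?? $s;
   }

   $nbsp   = "\xC2\xA0";       // non-breaking space in UTF-8
   $dagger = "\xE2\x80\xA0";   // dagger in UTF-8
   $padded = $nbsp . " " . $dagger . " " . $nbsp;

   // Plain trim() stops at the first byte of the non-breaking space.
   echo bin2hex(trim($padded)), "\n";       // c2a020e280a020c2a0

   // The helper removes the padding but leaves the dagger intact.
   echo bin2hex(trim_nbsp($padded)), "\n";  // e280a0
   ```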
 * Finally, here you can see the single-byte characters matched by `\s`, first without
   and then with the u modifier:
 * `$ php -r '$i = 0; while (++$i < 256) echo preg_replace("#[^\s]#", "", chr($i));' | xxd`
   `0000000: 090a 0c0d 2085 a0`
   `$ php -r '$i = 0; while (++$i < 256) echo preg_replace("#[^\s]#u", "", chr($i));' | xxd`
   `0000000: 090a 0c0d 20`
 * Functions operating only on ASCII characters (with a byte value below 128) are
   generally safe, as every byte of a multi-byte UTF-8 character has its leading
   bit set to one (and is therefore 128 or above).
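 * This property is easy to check by hand. The sketch below counts the bytes with the high bit set and then removes an ASCII space with a byte-wise function; `count_high_bytes` is a hypothetical helper introduced only for illustration.

   ```php
   <?php
   // Hypothetical helper: counts the bytes in $s whose high bit is set (>= 0x80).
   function count_high_bytes(string $s): int
   {
       $count = 0;
       for ($i = 0, $n = strlen($s); $i < $n; $i++) {
           if (ord($s[$i]) >= 0x80) {
               $count++;
           }
       }
       return $count;
   }

   $dagger = "\xE2\x80\xA0";   // dagger in UTF-8

   // All three bytes of the dagger are outside the ASCII range.
   echo count_high_bytes($dagger), " of ", strlen($dagger), "\n";  // 3 of 3

   // An ASCII needle can therefore never match inside a multi-byte character.
   $stripped = str_replace(' ', '', "a " . $dagger . " b");
   echo bin2hex($stripped), "\n";                                  // 61e280a062
   ```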
 * [Code](https://alex.kirk.at/category/code/), [PHP](https://alex.kirk.at/category/code/php/)
