Saturday, February 27, 2010

When a Unix dot means a whole different world...

Subtitle: new ways to shoot yourself in the foot.

A week ago I was changing ownership on a web site subdirectory:
chown -R darren:www *

There were some files/directories that start with a dot, such as .htaccess, so I then did:
chown -R darren:www .*

If you just screamed out "Nooooo!!!!" you are allowed to polish your Unix Guru badge and wear it with pride. Personally I just sat there wondering why it was taking so long to return and why my disk was suddenly sounding so busy.

That's right. The shell expands .* to include "..", which refers to the parent directory. So chown had gone up a directory, then was using -R to recurse into every subdirectory on my web site, destroying previous ownership settings with abandon.

I realized what was happening after a few moments and killed it, but the damage was done, and I'm still discovering small parts of web sites that got broken. But a question for the gurus: how should I change all files and directories in a tree, including those that start with a dot? Do I need to use some complicated find and chown combination command?
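
One possible answer, as a sketch rather than a definitive recipe: name the directory itself instead of globbing, or use a glob that cannot match "." or "..". The throwaway tree under mktemp and the variable names (demo, owner, safe_matches) are made up for illustration, and the sketch uses your own uid:gid so it runs without root; on the real site the owner would be darren:www.

```shell
# Build a throwaway tree: site/ is the directory we mean to change,
# sibling/ stands in for the rest of the web root that must stay untouched.
demo=$(mktemp -d)
mkdir -p "$demo/site/sub" "$demo/sibling"
touch "$demo/site/.htaccess" "$demo/site/index.html" "$demo/site/sub/.hidden"
cd "$demo/site" || exit 1

# Hypothetical owner for the demo; in the post this would be darren:www.
owner="$(id -u):$(id -g)"

# Option 1: no glob at all. "." is the current directory, and -R
# descends from there, dot files included. It never goes up.
chown -R "$owner" .

# Option 2: the find and chown combination. find starts at "." and
# only ever walks downwards.
find . -exec chown "$owner" {} +

# Option 3: if you really want a glob for the dot files, use one that
# needs a non-dot after the first dot (plus "..?*" for names that
# start with two dots). Neither pattern can match "." or "..".
safe_matches=$(for f in .[!.]* ..?*; do
  if [ -e "$f" ]; then printf '%s\n' "$f"; fi
done)
echo "$safe_matches"
```

Option 1 is probably all that was needed here: cd into the subdirectory and run chown -R on "." rather than on * and then .*.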

Tuesday, February 16, 2010

Google Messing Up I18n

People often seem to think Google can do no wrong; an image helped no doubt by being surrounded by companies such as Microsoft and Apple. But, even with all those computer science PhDs it snapped up, it sometimes just does not get it.

Once a month or so, Google resets back to showing me the search results in Japanese, and only searching Japanese pages. My browser is set to say I prefer English pages, but it ignores that and uses my IP address. Japanese people use a Japanese browser that will send a header saying they want to see Japanese. Unless they actually want to see English. Why is Google ignoring this?

Perhaps Google needs fewer PhDs and more people who know what this header means:
Accept-Language: en-gb,en;q=0.5

Google (!) tells me Google need to look at section 14.4 of RFC 2616.
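
For anyone wondering what that header actually says, here is a rough sketch of reading it. Each comma-separated entry is a language, an optional q value gives its weight, and an entry without q counts as q=1. The variable names (header, ranked, preferred) are my own, and this is only an illustration of the idea, not how any real server does it.

```shell
# The header value from above: British English first, then any
# English at half weight.
header='en-gb,en;q=0.5'

# Split on commas, pull out each language and its q value
# (defaulting to 1), then sort by weight, highest first.
# Weights are scaled to integers so sorting does not depend on locale.
# Note: entries with equal weight are not kept in header order here.
ranked=$(printf '%s\n' "$header" |
  tr ',' '\n' |
  awk -F';' '{
    lang = $1; q = 1
    gsub(/[ \t]/, "", lang)
    for (i = 2; i <= NF; i++) {
      p = $i; gsub(/[ \t]/, "", p)
      if (p ~ /^q=/) q = substr(p, 3) + 0
    }
    printf "%d %s\n", int(q * 1000 + 0.5), lang
  }' |
  sort -rn |
  awk '{ print $2 }')

# The first line is the language the user most wants to see.
preferred=$(printf '%s\n' "$ranked" | head -n 1)
echo "$preferred"
```

Nothing in there says Japanese, which is the whole point.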

But, the user-unfriendliness goes deeper. Here is how I get out of Japanese mode.
Click: 検索オプション (search options)
Find: 言語 検索対象にする言語 (language: the languages to search)
and scroll down to find and select 英語 (English)

No! No, no, no! It just goes to show, even though I read Japanese I still messed up. Here is the correct way:

In the top-right of the search results click "設定 ▼" (settings) then 検索設定 (search settings).
In the line that says 表示言語の設定 (display language setting) scroll down to select 英語 (English).
Then find the button in the top-right that says 保存 (save). When it pops up something that looks like an error message (but is actually a success message), press OK.

Hopefully I'm good for another month.

But the point of my rant is that the only piece of English on any of those pages was "google". Not a single helpful line such as "switch this page to English". Perhaps they should use the Accept-Language header to decide what language the user would most like to see a "switch this page to XXX" link in?

Well, rant over. Google have been doing this for 10 years, so I don't expect they're going to change any time soon.

Thursday, February 4, 2010

Facebook Making Me Redundant

Back in 2007 I wrote an article on how to port your PHP applications into C++ (in the Oct 2007 issue of PHP Architect), showing the speed-up and memory-usage benefits. It's a nice idea, though I've only done it in a production project once. The programmer man-hours are very rarely worth it.

Well, Facebook have gone and automated the process, calling it HipHop. Over 90% of Facebook's traffic (400 billion PHP-based page views every month) is now running on their HipHop system, i.e. on PHP pages that have been compiled into C++.

This is very interesting to me as I have recently been writing less and less in C++, more and more in PHP. Even a tree searcher for computer go I have been doing in PHP, as an experiment to see how it goes. On reflection that particular experiment is probably a failure - I need to set bits in a 32-bit or 64-bit integer and PHP puts a layer of abstraction in the way.

The main PHP script is very slow (it's been running non-stop for 5-6 weeks so far, with months left to run) but ironically all that time is being spent in calling other applications (written in C++!) to do the hard calculations. However I've another script that analyzes the data, needing to hold it all in memory at once, and that is showing signs of cracking - I'm already having to set my memory_limit to 1.5G, and that will probably hit 4G when the other script has chomped through all my data.

HipHop might be my saviour.

P.S. PHP Architect magazine also think this is going to be a very important technology.