Monday, September 12, 2011

node.js: Good Tutorial

Chatting with a friend at the weekend, node.js came up; I'd only vaguely heard of it, but apparently it is what all the kewl kids are into. Server-side javascript, right up my street.

I took a very quick look yesterday, and it sounded interesting: especially for speed-critical websites. So, today I took a deeper look. First up you should know the official website has a documentation link that only goes to the API documentation. In real terms that means node.js is officially undocumented. Then from the Wiki I found a link to a free e-book called Mastering Node. I won't give the link as it is (currently) over-priced: poorly organized and unfinished.

I was getting a bit demoralized when a search found http://www.nodebeginner.org/. Now this is more like it. In fact this is an excellent tutorial that goes right from raw beginner to a reasonably complex mini-application. (I read it in HTML format, but as an ebook it is 60 pages, so you can get an idea of how involved it is going to get.)

I followed along, and it was fairly easy, though I did have a lot of trouble outputting the POST-ed data. It always said "undefined" in my browser. I could do a console.log() of the variable just before and just after writing it to the browser and it was set correctly. I cleared the cache repeatedly, and tried a second browser. Annoyingly I didn't solve that problem in the end.
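For reference, here is a rough sketch of how POST-ed data is usually gathered in node.js: the body arrives in pieces through "data" events and is only complete once "end" fires. This is not a diagnosis of the "undefined" problem above, just an illustration of the mechanism; collectBody is a made-up helper name, not something from the tutorial or the node.js API.

```javascript
// Hypothetical helper: gathers a request body from 'data'/'end' events.
// `request` is assumed to be an http request object (or anything else
// that emits 'data' and 'end' the same way).
function collectBody(request, done) {
  var body = '';
  request.on('data', function (chunk) {
    // Chunks may be Buffers; += converts each one to a string.
    // (Calling request.setEncoding('utf8') first is safer for
    // multi-byte characters split across chunks.)
    body += chunk;
  });
  request.on('end', function () {
    // Only at this point is the whole body available.
    done(body);
  });
}
```

Anything that reads the body before "end" has fired will see an incomplete (or empty) value, which is one easy way to get surprising output.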

The main point of this blog post is to recommend the above tutorial/ebook if you're interested in getting a feel for node.js, but What Do I Think About node.js?

Hhhmmm. First and foremost, it should only be used by expert programmers. It is a bit like C++ in that it is going to be easy to shoot yourself in the foot. Asynchronous programming is hard. But if you use a synchronous programming style you will lock up the whole server. I'm thinking in terms of the web server example application here (which is the use-case I had in mind for it). Asynchronous programming is hard. Yes, I know I already said that, but I don't think you thought through what that means in the Real World. What will happen is programmers will take a short-cut: they'll use little bits of synchronous code for jobs they know are so quick that no-one will notice. (Even the above tutorial does this: fs.renameSync.) The problem is that, about 1% of the time, any job involving an I/O device (like a hard disk file system or a TCP/IP socket) will take 10 times longer than average to finish. (I made that statistic up, but the principle is true, so stay with me...)
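To make the short-cut concrete, here is a minimal sketch of the same file move written both ways. fs.renameSync and fs.rename are real node.js calls; the two wrapper functions are hypothetical names used only for illustration.

```javascript
var fs = require('fs');

// Hypothetical wrapper: the tempting short-cut. Nothing else in the
// process runs until the rename has finished (or failed).
function moveBlocking(from, to) {
  fs.renameSync(from, to);
}

// Hypothetical wrapper: the asynchronous version. The call returns
// immediately and `done(err)` is invoked later, once the rename completes.
function moveNonBlocking(from, to, done) {
  fs.rename(from, to, done);
}
```

The blocking version is one line shorter at the call site and needs no callback, which is exactly why it is tempting.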

What does that mean? When that happens it will lock the whole web server up, and all the other requests will block. Take the extreme case of doing a sync action that results in a time-out because the resource has gone offline. If the time-out is 30 seconds, the whole web server is down for 30 seconds. All of your customers are getting time-outs. Every image comes up broken.
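A small sketch of that lock-up, assuming a busy loop as a stand-in for a slow synchronous I/O call (busyWait and measureTimerDelay are made-up names). A timer is asked to fire immediately, but it cannot run until the blocking work lets go of the event loop; every pending web request would be stuck in the same queue.

```javascript
// Stand-in for a slow synchronous call: spins until `ms` milliseconds pass.
function busyWait(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) { /* block the event loop */ }
}

// Schedules a 0 ms timer, then blocks for `blockMs`. The timer callback
// reports how long it actually had to wait before it was allowed to run.
function measureTimerDelay(blockMs, report) {
  var scheduled = Date.now();
  setTimeout(function () {
    report(Date.now() - scheduled);
  }, 0);
  busyWait(blockMs);
}

measureTimerDelay(200, function (delayMs) {
  console.log('0 ms timer actually fired after ' + delayMs + ' ms');
});
```

Swap the 200 for 30000 and you have the 30-second time-out case.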

Another nice thing about the Node Beginner tutorial is the links to deeper information... and nested in one of the comments is a link to a paper comparing threads and events: "Why Events Are A Bad Idea". Well, their conclusion is in the title, but if you look at their charts the important thing to learn is that well-written event handling code and well-written threading code are basically as quick as each other. (You won't ever reach the right side of the charts in a real website on a single server; their example is just serving a static image. So the differences are only of interest to academics.)

But, although dealing with threads is hard, asynchronous programming is even harder, IMHO.

Now, if node.js had a web server module where it maintained a thread pool and each new request got its own thread, then I could program in a synchronous style in my thread, happy in the knowledge I won't break the web server, and also happy I'll be able to meet my deadlines...
