Node File Read Performance

In my last post, Node.js Really Can Scale, I demonstrated the impressive scalability of Node.js and, more generally, of the evented model, whereby increasing concurrency barely affects total throughput. You may have noticed, however, that 1000 reqs took about 17 secs, giving us 58 reqs/second. That's pretty bad, but it mostly reflects node's slowness when transferring in binary mode. Apparently the same test runs twice as fast today, but more interestingly, performance improved dramatically again when I reduced the chunk size of reads from ~500 KiB (the full file size) to the default size of 4 KiB. The larger chunk size took 44% more time. I'm using fs.createReadStream to perform the reads, and on each read the data is written out to the client. The smaller chunk size means that for that 500 KiB file, res.write() gets called 125 times instead of once, yet that doesn't even seem to matter. In terms of total throughput, I was able to reach ~150 reqs/second, or ~10 MiB/second. While that's a big improvement over the previous numbers, node really is not a speed demon when it comes to serving files.
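For reference, here's a minimal sketch of the read-then-write loop described above. It is not the app's actual code: sendFile and expectedChunks are names made up for illustration, and it assumes a current Node release, where the read chunk size is set through the highWaterMark option (the option name and its default may well differ from the version these benchmarks ran on).

```javascript
const fs = require('fs');

// How many 'data' events (and therefore res.write() calls) a file needs
// when it is read in chunks of at most chunkSize bytes.
function expectedChunks(fileSize, chunkSize) {
  return Math.ceil(fileSize / chunkSize);
}

// Stream the file at `path` to the response `res`, writing each chunk
// out as soon as it has been read. (Hypothetical helper, sketch only.)
function sendFile(path, res, chunkSize) {
  // highWaterMark caps the size of each chunk handed to the 'data' handler.
  const stream = fs.createReadStream(path, { highWaterMark: chunkSize });
  stream.on('data', function (chunk) { res.write(chunk); });
  stream.on('end', function () { res.end(); });
  // Close the response on a read error instead of leaving the client hanging.
  stream.on('error', function () { res.end(); });
  return stream;
}
```

With a 500 KiB file, expectedChunks(500 * 1024, 4 * 1024) comes out to 125 writes, versus a single write when the chunk size equals the file size. Note that this mirrors the write-per-read approach from the post and ignores backpressure; stream.pipe(res) would take care of that.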

 
I'm not exactly sure why this is faster, but I have a hunch that it has to do with binary data in Node.js (and V8) needing to be represented in UTF-16, and/or with string concatenation being worse than O(n). I'm not sure what its cost is in V8, but I would guess that flushing data more frequently into the kernel buffer (which, from my horribly limited and outdated kernel networking knowledge, I'd guess is O(n)) is vastly more efficient. I've got no idea how efficient UTF-16 conversion is, but it doesn't sound fast at all.
 
Strangely, those same benchmarks I ran yesterday seem to be running twice as fast in terms of throughput, and I don't know why, since I changed nothing, and the server was unloaded. The 44% improvement is still present when I change the chunk size, but there must have been some unknown factor I missed.

Node.js Really Can Scale

EDIT: The concurrency #s are accurate, but in terms of speed it's actually faster by 150%; see this post for more info.

[Graph: Node_js_concurrency_2]

 

Node.js really does scale. Check out the following graph of performance for an app I recently wrote for work (all times are in milliseconds, with a total of 1000 reqs).

Each request makes 1 memcached call and then reads a 500 KiB file (these happen in serial), which is then written to the socket. This is on a 2.33 GHz Xeon with 4 GB of RAM, unloaded, running Ubuntu Karmic. The file stays in the OS cache since it gets hit so often, so HD performance doesn't affect this. I had to stop after 500 connections because node won't open 500 file descriptors at a time. The file sending was handled by my fork of node-paperboy.
 
This app always pegged a single core (node's evented design doesn't use SMP capabilities). I'd have to think that if you ran one node per core you'd get even better performance. Hopefully I'll have time later to set up haproxy as a load balancer and try this.
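As a rough sketch of the one-node-per-core idea, and explicitly not the haproxy setup mentioned above, current Node releases ship a cluster module that forks worker processes sharing one listening port. The names startPerCore and workerCount are made up for illustration, and whether this actually improves on the numbers here is untested.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// One worker per core, never fewer than one.
function workerCount() {
  return Math.max(1, os.cpus().length);
}

// In the primary process, fork one worker per core; in each worker,
// start an HTTP server. The cluster module lets the workers share `port`.
function startPerCore(port, handler) {
  // isPrimary is the current name; older releases only have isMaster.
  const isPrimary = cluster.isPrimary !== undefined
    ? cluster.isPrimary
    : cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount(); i++) {
      cluster.fork();
    }
  } else {
    http.createServer(handler).listen(port);
  }
}
```

Calling startPerCore(8000, handler) at the bottom of the server script would start the whole thing; the port and handler are whatever the app already uses.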

PaNU

The Paleo Diet is a sexy proposition: eat tons of meat, and look smugly upon the world, secure in your knowledge that those around you are living less virile and more foolish lives. As that last sentence shows, it's a diet that's easy to praise and easy to poke fun at. On the one hand it's a kind of anti-diet; on the other it seems so much like the sort of thing that was designed to catalyze a populist uprising that it's easy to sneer at. It also looks like the kind of thing that'll be ripe for parody in 10 years, when everyone's moved on.

With all that being said, PaNU, a Paleo Diet blog written by an MD, is a fascinating, if verbose, read. I find it easy to agree with the anti-processed-food, anti-corn-fed-beef, anti-cereal-grain propaganda, but I find it harder to believe in the strong pro-meat bias Paleo practitioners show. From what I know, paleo diets varied geographically to a large (and in fact largely unknown) extent, and the prevalence of predominantly carnivorous diets seems unsettled.

I'm pretty fascinated by it, but what I see is a lot of speculation, with a lot of science about why it should work but, since it's so new, no studies as to whether it does work, especially as practiced in a modern context. Of course, like any diet conceived of in one's lifetime, there's no real way to know what its long-term effects are, as even the early adopters will be growing old alongside you, at which point the data isn't so useful.

So, what am I doing for food at the moment?

1. Eating more vegetables
2. Eating plants with my grains (because I do love my grains)
3. Eating more meats
4. Nearly completely stopped eating sugar
5. Cooking as much as possible myself

As for exercise, I've been doing what PaNU recommends already, based on research I obtained from other sources quite some time ago. Most of my exercise is strength training, and the cardio I do is predominantly intervals. I actually made these choices mainly because, from the research I've read, they give the most bang for the (time) buck, and I only have so much time to work out.

Non-blocking Image Preloads

Preloading images with conventional JavaScript techniques can be problematic when you don't want the preloading to block all other requests. Since web browsers by default only allow two connections to a given server at the same time, if you preload, say, 20 images from that server and then issue another request, whether for another image or for some AJAX, that request will be blocked until those images are done loading.

I recently encountered this issue writing a Colorbox photo gallery, where I wanted all the full-sized images to preload in the background. Problems arose if someone clicked on one of the last images while the large sizes were still preloading. Things would freeze, as the request for that image was simply added to the request queue.

The solution? Well, a truly optimal solution would be to manage your own global image queue, but since time didn't allow for that, a second-best solution is posted below, which is to restrict the preload script to downloading a single image at a time. This code is jQuery flavored, but obviously needn't be.

To kick it off:

$.sPreLoad.run('url1.jpg','url2.jpg')
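For illustration only, here's a minimal sketch of how a one-image-at-a-time preloader could work. It is not the sPreLoad source: preloadSequentially is a made-up name, it's plain JavaScript rather than jQuery, and the createImage and onDone parameters are assumptions added so the idea can be tried outside a browser.

```javascript
// Sketch only: load images strictly one after another, so preloading
// never ties up more than one connection at a time.
function preloadSequentially(urls, createImage, onDone) {
  // In a browser, default to real Image objects.
  createImage = createImage || function () { return new Image(); };

  var queue = urls.slice(); // copy, so the caller's array is left untouched
  var loaded = [];

  function next() {
    if (queue.length === 0) {
      if (onDone) { onDone(loaded); }
      return;
    }
    var url = queue.shift();
    var img = createImage();
    // Move on whether the image loads or fails, so that one bad URL
    // cannot stall the rest of the queue.
    img.onload = function () { loaded.push(url); next(); };
    img.onerror = function () { next(); };
    img.src = url; // assigning src starts the download
  }

  next();
}
```

In a page this would be called as preloadSequentially(['url1.jpg', 'url2.jpg']). A click on a not-yet-loaded image then competes with at most one preload request instead of waiting behind the whole batch.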

Full Source:

Clojure is next

I've been mulling over learning a new language for the last few months. I've done bits of reading here and there on Haskell, Lisp (via SICP), and a couple of other languages. The more time I spend looking, the more Clojure looks like the answer. Haskell looks fascinating, but at the end of the day, perhaps too baroque and just not as useful. Lisp + the power and libs of the JVM seems flat-out awesome. Worth reading are these excellent slides by Clojure's creator: