Kernel Async IO, not worth it?

I'm currently preparing slides for my presentation about async IO at next week's Los Angeles Hacker News meetup, and in the process I stumbled across this interesting thread on kernel AIO on the libevent mailing list. This segment of a post from William Ahern stood out in particular. To put this in context, the question is whether to use the kernel's own AIO API or to roll your own with a thread pool, making one blocking call per thread.

There are many ways to do AIO. I know of none that don't use threads, either kernel threads or user process threads. Now, the threads could be "light weight", but there's still a calling context that blocks on an operation. But no implementation is equivalent to, say, how the socket subsystem [state machines] work, where the only state that needs to be maintained is a queue of waiting objects (where object != thread). To put it simply, all the AIO implementations use too much memory and do too much book keeping (like thread scheduling) than strictly necessary, because kernels don't have a way to "poll" on VM pages (as opposed to polling on disk interrupts, which you can accomplish when doing direct I/O).

From what I know, Node.js manages its own thread pool, as does Ruby's EventMachine (using Ruby green threads). If anyone has any more info on this, just tweet me at @andrewvc with what you know.
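To make the "roll your own" option concrete, here is a minimal sketch of a thread pool where each worker makes one ordinary blocking read at a time. It uses Python's standard library rather than anything Node.js or EventMachine actually ship, and the names read_file, read_all, and demo are made up for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    # A plain blocking read. It ties up only the worker thread running it.
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, workers=4):
    # Hypothetical helper: hand each path to the pool and collect the
    # results in submission order. The caller only waits on futures.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_file, p) for p in paths]
        return [f.result() for f in futures]


def demo():
    # Write three small files to a temp directory, then read them back
    # through the pool.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"chunk{i}.bin")
            with open(path, "wb") as f:
                f.write(bytes([i]) * 4)
            paths.append(path)
        return read_all(paths)
```

This is exactly the bookkeeping the quote complains about: every in-flight read costs a whole thread (stack plus scheduling), where a socket-style state machine would only need an entry in a wait queue.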

ZeroMQ, It's a big deal

ZeroMQ's got some interesting ideas, but unfortunately hasn't quite gotten the press it deserves. I'm just starting to toy around with it, mostly due to Zed Shaw's use of it in Mongrel2. Regardless of Mongrel2, ZeroMQ IS fascinating. Why? A few reasons. I'll get to some of them in a bit, but first I'll let this short block of text from the Ruby docs for the ZMQ::Socket class explain:

Generally speaking, conventional sockets present a synchronous interface to either connection-oriented reliable byte streams (SOCK_STREAM), or connection-less unreliable datagrams (SOCK_DGRAM). In comparison, 0MQ sockets present an abstraction of an asynchronous message queue, with the exact queueing semantics depending on the socket type in use. Where conventional sockets transfer streams of bytes or discrete datagrams, 0MQ sockets transfer discrete messages.

0MQ sockets being asynchronous means that the timings of the physical connection setup and teardown, reconnect and effective delivery are transparent to the user and organized by 0MQ itself. Further, messages may be queued in the event that a peer is unavailable to receive them.

Conventional sockets allow only strict one-to-one (two peers), many-to-one (many clients, one server), or in some cases one-to-many (multicast) relationships. With the exception of ZMQ::PAIR, 0MQ sockets may be connected to multiple endpoints using connect(), while simultaneously accepting incoming connections from multiple endpoints bound to the socket using bind(), thus allowing many-to-many relationships.

The full ZeroMQ Ruby docs are a good read, as are the papers on http://www.zeromq.org/.

So, why else go ZMQ? Well, it's got interchangeable transports: ultra-fast inter-thread messaging, inter-process communication, TCP, and multicast.
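Here is a rough sketch of what interchangeable transports look like in practice. It assumes the pyzmq Python binding is installed (the docs quoted above are for the Ruby binding, which I am not using here). The helper name echo_once and the endpoint inproc://demo are made up for illustration. The same send and receive code runs over inter-thread messaging and over TCP, and only the address string changes.

```python
import zmq


def echo_once(ctx, bind_addr):
    # Hypothetical helper: bind one PAIR socket, connect another to it,
    # and bounce a single message between them.
    server = ctx.socket(zmq.PAIR)
    client = ctx.socket(zmq.PAIR)
    # Fail with zmq.Again instead of hanging if nothing arrives in 2s.
    server.setsockopt(zmq.RCVTIMEO, 2000)
    client.setsockopt(zmq.RCVTIMEO, 2000)
    try:
        if bind_addr.startswith("tcp://"):
            # Let the OS pick a free port so the sketch does not collide
            # with anything already listening.
            port = server.bind_to_random_port(bind_addr)
            endpoint = f"{bind_addr}:{port}"
        else:
            server.bind(bind_addr)
            endpoint = bind_addr
        client.connect(endpoint)

        # send() queues the message and returns; 0MQ handles the
        # connection setup and delivery in the background.
        client.send(b"hello")
        request = server.recv()
        server.send(request.upper())
        return client.recv()
    finally:
        client.close(linger=0)
        server.close(linger=0)


if __name__ == "__main__":
    ctx = zmq.Context()
    print(echo_once(ctx, "inproc://demo"))   # inter-thread transport
    print(echo_once(ctx, "tcp://127.0.0.1"))  # TCP transport
    ctx.term()
```

Note that inproc endpoints are only visible to sockets created from the same context, which is why both sockets share ctx.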

These messages can be exchanged between programs written in any language that supports ZMQ. At the moment, that's all your faves, from C/C++ to Java, to Ruby, to Python, and more.

Another good source for ZMQ info is this excellent post on Nicholas Piel's blog, which is full of diagrams and wonderful explanations.

At any rate, I'm just getting started with ZMQ. As my exploration continues, I'll update this blog.