Learn ZeroMQ in Ruby With These Examples...

I've started releasing a series of learn-by-example ZeroMQ lessons on GitHub; you'll be able to receive updates with a simple git pull. There wasn't much in the way of docs, manuals, or guides for ZeroMQ in the Ruby world, so hopefully this will help fill that void.

Most of the current docs are either for the C API or in the man pages. While fairly comprehensive, they're daunting, dense, and take a lot of time to go through.

These examples are a work in progress, and I'd appreciate feedback and contributions as I move forward. If you finish these examples, I highly recommend reading the ZeroMQ man pages and the official site, even if they're a bit dense, for a more in-depth look at ZeroMQ.

Tailing MongoDB Capped Collections in Ruby

This wasn't documented anywhere, so I figured others might find it useful. In MongoDB you can tail a capped collection, much like using the tail -f command. This works because Mongo reads through cursors, and a tailable cursor stays open after it has returned the last document instead of closing. Tailing's a nifty trick, especially if you're using Mongo for logging.
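Here's a minimal sketch of what tailing can look like. It assumes the 2.x mongo gem, where find accepts a cursor_type option; the driver API from when this post was written was different, so treat this as an illustration rather than the original code. The tail_capped helper and the database and collection names in the usage comment are made up for the example.

```ruby
# Yields every document already in a capped collection, then keeps yielding
# new ones as they are inserted, much like `tail -f`.
#
# `collection` is assumed to be a Mongo::Collection from the mongo gem (2.x).
# `tail_capped` is a hypothetical helper name, not part of the driver.
def tail_capped(collection, filter = {})
  # :tailable_await asks the server to keep the cursor open after the last
  # document and to wait a short while for new data before returning.
  collection.find(filter, cursor_type: :tailable_await).each do |doc|
    yield doc
  end
end

# Usage (needs a running mongod and `gem install mongo`; names are examples):
#
#   require 'mongo'
#   client = Mongo::Client.new(['127.0.0.1:27017'], database: 'logs')
#   client[:requests, capped: true, size: 1_048_576].create
#   tail_capped(client[:requests]) { |doc| puts doc.inspect }
```

Tailable cursors only work on capped collections, which is why the usage comment creates one explicitly.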

I'm using Phil Burrows' mongo_db_logger to log all Rails requests to a capped collection at the moment. I'd recommend dropping it into your Rails app, after modifying it to read your Mongo config, so you always have it ready.

ZeroMQ: What You Need to Know Braindump

In a recent email conversation, another Rubyist (the awesome Ilya Grigorik) asked me about getting started with ZeroMQ.

This was my caffeine-addled response, which others may find useful as well:

I'm going to warn you: some things you hear when you first read about ZMQ sound crazy. If there's one thing to be said for ZeroMQ, it's this sentence from the Mongrel2 book: "[ZeroMQ is] sockets the way programmers think sockets work".

The more time I spend with ZeroMQ, the less I can think of a reason I'd ever again have to open up a raw TCP or UDP socket, except in extraordinary circumstances. I think of ZMQ as common IPC and network communication patterns abstracted into messages and sockets that don't require a broker infrastructure. The whole message queue aspect of it is great, but ZMQ is really designed for a whole range of situations you'd never use AMQP for, and IMHO, that's the truly interesting thing about it. You don't really run ZMQ brokers (mostly); you communicate socket to socket using queue semantics.

The mongrel2 book has the most concise rundown of ZMQ, and IMHO it best frames how to use it properly. It's not very in-depth, maybe 15 minutes of reading, but it's probably the single best introduction to ZeroMQ.

http://mongrel2.org/doc/tip/docs/manual/book.wiki#x1-590005.2

After that, I'd read the rbzmq rdocs for ZMQ::Socket. It's actually a fantastic rdoc, and a better intro to ZMQ than a lot of the docs on the official ZMQ site. It's kind of sad that someone wrote that fantastic doc and it doesn't get any publicity.

Another good post on ZMQ is this blog post:

The mongrel2 project is probably the best example of a project that is aggressively using ZMQ in an interesting way. You could check out the m2r Ruby adapter (http://github.com/perplexes/m2r) to get an idea of ZMQ use in an actual app. You may notice a lot of people using ffi-rzmq rather than the zmq gem. They're almost completely compatible at the API level, except that ffi-rzmq also works on non-MRI/YARV Rubies. I'm personally using it exclusively.

The project I'm working on, dripdrop, is just a simple serialization format and reactor API on top of ZMQ, to make building an app out of a lot of ZMQ building blocks easy and to take as much of the tedium out as possible. It initially started because I wanted a simple async interface to events on multiple servers running Rails, but I think it made me realize ZMQ needs a proper super-simple API, preferably one that integrates non-ZMQ exit points, like HTTP, Websockets, XMPP, etc.

DripDrop is definitely still at the playing-around-with-different-ideas stage, and isn't really stable yet. I'm happy with the serialization format though (which is pretty much just BERT with a header for PUB/SUB filtering at the ZMQ level, similar in approach to mongrel2's format, but using BERT). I already have plans to redo most of it, and I need to start adding support for socket types other than pub/sub. I really do like pub/sub though. It may not have the right semantics as far as making sure messages get delivered, but it enables semi-aspect-oriented architectures where random other processes can hook into a message stream (am I abusing the term aspect-oriented?).
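To make the "header for PUB/SUB filtering" idea concrete, here's a rough sketch. It is not DripDrop's actual wire format: the Envelope module, the NUL separator, and the topic names are all made up for illustration, and JSON from the standard library stands in for BERT so the example runs without extra gems. The point is only that ZMQ SUB sockets filter by prefix-matching the raw message bytes, so putting a plain topic string first lets subscribers filter without decoding the body.

```ruby
require 'json'

# Hypothetical envelope: "<topic>\0<serialized body>".
# JSON is a stand-in for BERT here; the layout is an assumption, not
# DripDrop's real format.
module Envelope
  SEPARATOR = "\0"

  # Builds the frame that would be handed to a PUB socket.
  def self.encode(topic, body)
    "#{topic}#{SEPARATOR}#{JSON.generate(body)}"
  end

  # Splits a received frame back into its topic and decoded body.
  def self.decode(frame)
    topic, payload = frame.split(SEPARATOR, 2)
    [topic, JSON.parse(payload)]
  end

  # Mimics what a SUB socket does with a subscription: a plain prefix
  # match on the bytes of the message, with no knowledge of the body.
  def self.matches?(frame, subscription)
    frame.start_with?(subscription)
  end
end

frame = Envelope.encode('rails.request', 'path' => '/users', 'ms' => 12)
Envelope.matches?(frame, 'rails.') # => true
Envelope.matches?(frame, 'mongo.') # => false
topic, body = Envelope.decode(frame)
```

With a real socket, the string passed to matches? would instead be set as the subscription on the SUB side, and the filtering would happen inside ZMQ.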

What I plan on working on is redoing it style-wise into a simple DSL like this (this is a simple forwarder that uppercases and copies everything to a websocket and a ZMQ pub socket as an example):


I like the conciseness of this. I can see integrating ZMQ, HTTP, Websockets and other protocols in an async fashion, using this style of API, as worthwhile. What do you think? Biting off too much?

Anyway, I've clearly had too much coffee :)

Thanks Chuck! ffi-rzmq and zmqmachine gems released. 1.8.x compat as well!

Big thanks to Chuck Remes for releasing the ffi-rzmq and zmqmachine gems on rubygems.org. This release also makes ffi-rzmq 1.8.7 compatible. I haven't had a chance to test this out much (and neither has Chuck; thanks for humoring me with this patch, Chuck!), but it seems stable to me. This is great news for my new project dripdrop (which is still in a very tumultuous phase right now), and for the Ruby mongrel2 driver m2r.

ffi-rzmq definitely seems to be the preferred way of connecting to ZMQ with Ruby. Don't, however, skip the official rbzmq docs, which are a fantastic intro to ZMQ itself.

App monitoring/messaging with ZeroMQ + Ruby = dripdrop

Just this week I had an idea for realtime app stats/messaging using ZeroMQ. I wanted to be able to view events from my app in real time, and be able to archive them or do any random thing with them. Hence, dripdrop was born. It's a pretty cool way of collecting stats or performing async tasks. There's a full description of it on the GitHub project page at: http://github.com/andrewvc/dripdrop.

I'm mostly leveraging the awesome libraries that are zmq, zmqmachine, bert, and em-websocket.

I've diagrammed my use case for it below:

My New Job: Developer at Online Greetings Company Cocodot

So, it's been a month, and I still haven't put it up here: I'm now working as a Rails developer for online invitations and greetings company cocodot.

If you haven't seen cocodot, it's a pretty interesting concept that I'm quite excited about. We occupy the greetings space, but have better execution than the competition. We probably have the best designed greetings and cards of anyone out there, thanks to an amazing creative staff. The wedding invitations system is pretty slick as well; it's probably the best way to do online wedding invitations at the moment.

A Beautiful Photo, by Kertesz

Andre Kertesz was an amazing photographer. This photo is my current desktop background.

Sendgrid + Sendgrid Toolkit is pretty awesome

Sendgrid's a great way to send email from your app. They provide statistics, reliable delivery, and a fantastic XML/JSON API. If you're using Ruby, I recommend checking out Sendgrid Toolkit, which I just found out about.

Sendgrid Toolkit is pretty much a thin wrapper using jnunemaker's HTTParty, which I hadn't heard of until now. HTTParty's a pretty cool way to painlessly create an interface to a Web API.

Sendgrid Toolkit's use of HTTParty makes adding functionality very easy. I added bounce retrieve and delete functionality by merely adding the small number of lines seen in the code sample.

Kernel Async IO, not worth it?

I'm currently preparing my slides for my presentation about async IO at next week's Los Angeles Hacker News meetup, and in the process stumbled across this interesting thread on kernel AIO on the libevent mailing list. This segment of a post from William Ahern stood out in particular. To put this in context, the question is whether to use the kernel's own AIO API, or just roll your own with a thread pool that makes one blocking call per thread.

There are many ways to do AIO. I know of none that don't use threads, either kernel threads or user process threads. Now, the threads could be "light weight", but there's still a calling context that blocks on an operation. But no implementation is equivalent to, say, how the socket subsystem [state machines] work, where the only state that needs to be maintained is a queue of waiting objects (where object != thread). To put it simply, all the AIO implementations use too much memory and do too much book keeping (like thread scheduling) than strictly necessary, because kernels don't have a way to to "poll" on VM pages (as opposed to polling on disk interrupts, which you can accomplish when do doing direct I/O).

From what I know, Node.js manages its own thread pool, as does Ruby's EventMachine (using Ruby green threads). If anyone has any more info on this, just tweet me, @andrewvc, with what you know.
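For anyone wondering what "roll your own" looks like, here's a bare-bones sketch of the thread pool approach: a fixed set of worker threads, each making one blocking read at a time, with finished results handed back through a queue. The BlockingIOPool class and its method names are made up for this example; this isn't how Node.js or EventMachine actually implement it.

```ruby
require 'thread'
require 'tempfile'

# Hypothetical helper: a fixed pool of worker threads, each doing one
# blocking File.read at a time, so the caller never blocks on disk.
class BlockingIOPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; a nil job means "stop".
        while (job = @jobs.pop)
          path, callback = job
          callback.call(File.read(path)) # the one blocking call per thread
        end
      end
    end
  end

  # Queues a read and returns immediately; the block runs on a worker thread.
  def read(path, &callback)
    @jobs << [path, callback]
  end

  # Sends one stop marker per worker, then waits for queued jobs to finish.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

# Usage: results come back through a thread-safe queue that the main loop
# can drain whenever it likes.
file = Tempfile.new('aio')
file.write('hello from disk')
file.flush

results = Queue.new
pool = BlockingIOPool.new(2)
pool.read(file.path) { |data| results << data }
pool.shutdown
puts results.pop # prints "hello from disk"
```

Every in-flight read ties up a whole thread here, which is exactly the memory and scheduling overhead the quote above complains about. The sketch also skips error handling, so an exception in File.read would kill that worker.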