ThoughtBlog: On XMPP and building distributed applications – 2/19
So, I’ve been spending a lot of time thinking, prototyping, and just generally mucking around with what a federated Get Satisfaction would look like. I’ve got a Curio document bursting at the seams with possibilities, problems, questions, etc., and felt like it might help the process to try and put this exploration into prose. In general these ThoughtBlog posts will be out-loud wonderings… feel free to comment and contribute. Most of the stuff I’ll be writing about will be musings and questions (to myself and to the world).
I would expect that these entries will end and start at odd places… don’t expect any grand conclusions.
So, I’ve got it stuck in my head that I want XMPP to be the glue that binds a sharded Get Satisfaction. Whether it’s the right choice or just mental masturbation, I don’t know yet… but I’m having a blast.
So why XMPP?
Besides the enthusiasm that has built around Jabber over the past couple of years, when I look at it objectively there seem to be many parallels between how I think of clusters and what IM is. One of the reasons, in my mind, that Erlang is so successful at building distributed systems is that its messaging system makes very few guarantees: as an actor, the only way you can be assured that a message has been delivered is if the recipient sends a message back (more specific details at [http://www.erlang.org/faq/faq.html#AEN1189]).
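To make the "only a reply proves delivery" idea concrete, here is a minimal sketch in Python. It is not Erlang and not XMPP; the Mailbox class, the message fields, and the lossy flag are all hypothetical names invented for illustration.

```python
from collections import deque


class Mailbox:
    """Hypothetical actor mailbox. Sending returns nothing, so the sender
    learns nothing about whether the message arrived."""

    def __init__(self, name, lossy=False):
        self.name = name
        self.lossy = lossy      # assumption: a lossy mailbox silently drops everything
        self.inbox = deque()

    def deliver(self, msg):
        if not self.lossy:
            self.inbox.append(msg)


def send(to, msg):
    """Fire and forget: no return value, no error on loss."""
    to.deliver(msg)


def receiver_step(me):
    """Handle one queued message by sending an ack back to its sender."""
    if me.inbox:
        msg = me.inbox.popleft()
        send(msg["from"], {"type": "ack", "id": msg["id"], "from": me})


def confirmed(sender, msg_id):
    """The sender only knows a message arrived once a matching ack is in its inbox."""
    return any(m["type"] == "ack" and m["id"] == msg_id for m in sender.inbox)


a = Mailbox("a")
healthy = Mailbox("healthy")
broken = Mailbox("broken", lossy=True)

send(healthy, {"type": "work", "id": 1, "from": a})
send(broken, {"type": "work", "id": 2, "from": a})
receiver_step(healthy)
receiver_step(broken)
```

After this runs, confirmed(a, 1) is true and confirmed(a, 2) is false, and the sender cannot tell a dropped message from a slow one without some extra mechanism such as a timeout.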
Now, it may be that XMPP provides a more lossy delivery mechanism than Erlang messaging does, but hopefully it maintains the same general idea as Erlang: delivery is guaranteed as long as nothing breaks. When things break, we should be able to detect that from a system outside of the core messaging system (Erlang uses linked processes for this). I still haven’t figured out how best to implement such a system using XMPP.
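One possible shape for a failure detector that lives outside the messaging path is a plain heartbeat timeout. The sketch below is only an assumption about how that could look; it is not how Erlang links work and not something XMPP prescribes, and HeartbeatMonitor and the node names are hypothetical. In an XMPP setting the "beat" might plausibly be driven by presence updates, but that is a guess.

```python
class HeartbeatMonitor:
    """Hypothetical out-of-band failure detector: nodes report in periodically,
    and any node that has been silent for longer than `timeout` is flagged."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def beat(self, node, now):
        # Record the most recent time we heard from this node.
        self.last_seen[node] = now

    def dead_nodes(self, now):
        # A node counts as dead once its silence exceeds the timeout.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=30)
monitor.beat("app1", now=0)
monitor.beat("app2", now=0)
monitor.beat("app1", now=25)   # app1 keeps reporting in, app2 goes quiet
suspects = monitor.dead_nodes(now=40)
```

The clock is passed in explicitly so the sketch stays deterministic; a real monitor would read the time itself and would need to decide what to do about nodes it has never heard from.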
Another appealing reason is toolchain support. I imagine having an XMPP MUC (Multi-User Chat, think IRC) room in which every node in the Get Satisfaction system resides. From that room I can command the entire cluster (reboot, report statistics, re-balance?). Starting a chat with firstname.lastname@example.org/console would pull up an IRB session over XMPP. Now, you might say “Why not SSH? it works already dumbass”, which is a fair point. The wow-factor for me is that the nodes in the cluster now have the ability to contact me directly. It’s one thing to log into the server and check to see that mongrels are running; it’s a whole other thing to have a server tell you it lost a mongrel process. Now, obviously you should have a monitoring system in place to notify you when a server goes down, but it is very appealing to me to aggregate all of those various webapps, emails, and console sessions into a single mechanism.
One day, I think it would be really fucking cool if the Get Satisfaction API provided two faces that worked in concert: a RESTful HTTP service like we have now for pulling from the data store, and an XMPP service that lets you receive push notifications of actions as they happen in the system. I think that would provide a pretty elegant and complete set of methods to interact with GS data.
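The room-as-control-channel idea can be sketched without any XMPP library at all. The following in-memory model is an illustration only: Room, Operator, Node, the "ping" and "stats" commands, and the worker counts are all hypothetical, and no real MUC API is used.

```python
class Room:
    """Hypothetical stand-in for a chat room: whatever one occupant says
    is relayed to every other occupant."""

    def __init__(self):
        self.occupants = []

    def join(self, occupant):
        self.occupants.append(occupant)

    def say(self, sender, body):
        for occupant in self.occupants:
            if occupant.name != sender:
                occupant.on_message(self, sender, body)


class Operator:
    """The human in the room; just records what it hears."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def on_message(self, room, sender, body):
        self.seen.append((sender, body))


class Node:
    """A cluster node that answers known commands and can also speak up unprompted."""

    def __init__(self, name, workers):
        self.name = name
        self.workers = workers
        self.commands = {"ping": self.ping, "stats": self.stats}

    def ping(self):
        return "pong"

    def stats(self):
        return f"workers={self.workers}"

    def on_message(self, room, sender, body):
        # Anything that is not a known command is ignored, so replies
        # from other nodes do not trigger further replies.
        handler = self.commands.get(body.strip())
        if handler is not None:
            room.say(self.name, handler())

    def worker_died(self, room):
        # The node contacts the room itself instead of waiting to be polled.
        self.workers -= 1
        room.say(self.name, f"lost a worker, {self.workers} left")


room = Room()
me = Operator("operator")
app1, app2 = Node("app1", workers=3), Node("app2", workers=2)
for occupant in (me, app1, app2):
    room.join(occupant)

room.say(me.name, "stats")   # command the whole cluster at once
app2.worker_died(room)       # a node reports a problem on its own
```

Note that in this sketch any occupant can issue commands; a real deployment would presumably need to check who is allowed to say "reboot".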
What not to use XMPP for
Among the exercises I’m working through, I think it’s important to weigh each decision against what is currently available. By remembering to augment decisions with that consideration, I hope to avoid getting caught up in my own hype and making a poor decision based on my excitement: the “When all you have is a hammer, everything looks like a nail” problem.
One specific case so far is data retrieval and querying. As you may or may not know, XMPP defines an IQ (Info/Query) stanza type that is an analog to an HTTP request: you send an IQ stanza and you expect a response.
For querying the data store, using HTTP seems much nicer and is something I’m much more familiar with. I’m specifically talking here just about the GET & HEAD verbs: statelessness, idempotence, ubiquity. There isn’t really any benefit to using IQ stanzas to re-implement what would be done today in HTTP.
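For readers who have not seen one, an IQ exchange can be sketched with nothing but Python's standard XML module. The stanza shape (an iq element with type and id attributes, answered by a result or error carrying the same id) follows the core XMPP spec; the JID store.example.org, the stanza id, and the urn:example:gs:topics namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_iq_get(stanza_id, to, query_ns):
    """Build a request stanza: <iq type="get" id=... to=...><query xmlns=.../></iq>."""
    iq = ET.Element("iq", {"type": "get", "id": stanza_id, "to": to})
    # The payload namespace says what is being asked for (hypothetical here).
    ET.SubElement(iq, "query", {"xmlns": query_ns})
    return iq


def build_iq_result(request, from_jid):
    """Build the reply; it must echo the request's id so the two can be paired."""
    return ET.Element(
        "iq", {"type": "result", "id": request.get("id"), "from": from_jid}
    )


def is_response_to(request, response):
    """A reply is a result or error stanza whose id matches the request's id."""
    return (
        response.get("type") in ("result", "error")
        and response.get("id") == request.get("id")
    )


request = build_iq_get("q1", "store.example.org", "urn:example:gs:topics")
response = build_iq_result(request, "store.example.org")
wire = ET.tostring(request, encoding="unicode")
```

The id matching is what stands in for HTTP's one-response-per-request pairing, since replies on an XMPP stream can arrive in any order relative to other traffic.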
The problems so far
Getting my head wrapped around what is available in the XMPP RFCs, as well as all of the various extensions, is quite an undertaking. I always run into this problem: when presented with a hundred different ways to solve a problem, I have a super hard time choosing one. I try to break things down to make the decision easier, but usually I get past it by jumping in feet first.
Part of the issue is that the people out there who have built applications on top of XMPP seem to be tight-lipped; I spent a good deal of time wading through Google to find a “lessons learned” type post. This has been done before, right? I’m hoping that one of the side effects of these musings is that some guru comes out of the woodwork and learns it to me good.