ThoughtBlog: More on XMPP Apps. Writing-into-the-app edition – 2/21
Yesterday I went to sleep having talked about one method I proposed for writing into a data store federated over XMPP. Today I’ll continue the trend by talking about other methods that come to mind.
XMPP has Publish-Subscribe functionality layered on top of it (Relevant proposal). By itself, pubsub doesn’t give us write scalability; rather, it provides an extra layer of indirection between the app and the data nodes that will let us get tricky.
Similar to how we segment writes among data nodes in the direct messaging model, we would segment writes into a number of topics to which messages would be published. We could slice things along several lines: one example would be along class (all Topics get published to /topics, all Products to /products). Taking this approach, we are bound by the choice of using class names for topics: if the write needs of a single class exceed the capabilities of a single box, we are stuck.
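To make the class-based slicing concrete, here’s a minimal sketch of that routing rule. The names (`CLASS_TOPICS`, `topic_for`) are hypothetical, not from any real XMPP library; this only shows the mapping of a record’s class to the pubsub topic its writes get published to.

```python
# Hypothetical class-to-topic routing for the pubsub write path.
# Every write for a class lands on whichever data nodes subscribe
# to that class's topic.
CLASS_TOPICS = {
    "Topic": "/topics",
    "Product": "/products",
}

def topic_for(record_class: str) -> str:
    """Route a write to the pubsub topic for its class."""
    try:
        return CLASS_TOPICS[record_class]
    except KeyError:
        raise ValueError(f"no topic configured for class {record_class!r}")

# Note the limitation from above: one class maps to exactly one topic,
# so a class whose write volume outgrows a single box has nowhere to go.
```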
As I mentioned before, I would probably want to run experiments on each method to determine the best course for partitioning: my intuition says the appropriate choice would differ for each write pattern exposed throughout the system.
The benefit is that behind each topic could sit any number of data nodes waiting to write the published messages into the system. With that, we could get quicker redundancy by having each redundant node subscribe to the appropriate topic directly, rather than relying on replication to push changes around.
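The fan-out idea above can be sketched with an in-memory stand-in for a pubsub node. `Broker` and `DataNode` are toy names of my own, not a real XMPP API; the point is just that one publish reaches every subscribed data node, so redundancy falls out of subscription with no separate replication step.

```python
from collections import defaultdict

class Broker:
    """Toy stand-in for an XMPP pubsub node: fans each published
    message out to every data node subscribed to the topic."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, node):
        self.subscribers[topic].append(node)

    def publish(self, topic, message):
        for node in self.subscribers[topic]:
            node.handle(message)

class DataNode:
    """Redundant data node: writes every received message to its store."""
    def __init__(self, name):
        self.name = name
        self.store = []

    def handle(self, message):
        self.store.append(message)

broker = Broker()
a, b = DataNode("a"), DataNode("b")
broker.subscribe("/products", a)
broker.subscribe("/products", b)
broker.publish("/products", {"sku": 1})
# Both redundant nodes now hold the write, with no replication pass.
```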
Fault tolerance could be achieved in the same manner as in the system that directly messages data nodes, but we can do something even neater. Since we forgo replication and have each redundant data node receive write messages by subscribing to the same topic, a missed message or failed write can be requested from a peer data node instead of relying on repetitive messaging from the application node. This isn’t a silver bullet (every data node in the topic could fail to write the message successfully), but leveraging such a method could reduce the ‘length’ that a given data node would need to reach out to get a resend. Whether that is helpful for the system is beyond me. Now that I think about it, I just described replication…
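A minimal sketch of that peer-recovery step, assuming published messages carry sequence numbers (an assumption of mine, not something pubsub gives you for free): a node that notices a gap asks its peer subscribers for the missing sequence before falling back to the application node. All names here are hypothetical.

```python
class SubscriberNode:
    """Toy data node subscribed to a topic; stores messages by sequence
    number so peers can backfill each other's gaps."""
    def __init__(self, name):
        self.name = name
        self.store = {}  # seq -> message

    def handle(self, seq, message):
        self.store[seq] = message

    def fetch(self, seq):
        return self.store.get(seq)

    def recover(self, seq, peers):
        """Ask peer subscribers for a message this node missed.
        Returns None if every peer also failed to write it (the
        not-a-silver-bullet case), leaving a resend from the
        application node as the last resort."""
        for peer in peers:
            msg = peer.fetch(seq)
            if msg is not None:
                self.store[seq] = msg
                return msg
        return None

a, b = SubscriberNode("a"), SubscriberNode("b")
a.handle(1, "m1"); a.handle(2, "m2")
b.handle(1, "m1")            # b missed seq 2
b.recover(2, [a])            # backfilled from peer a, not the app node
```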
Next time I’ll get into my ideas for using Multi-User Chat (MUC) to achieve scalability, fault tolerance, and redundancy.