Combining Django and Node.js
An overview of the progr.es technology by Herman Schaaf, lead software engineer.
In Part 1 of this two-part series, I discussed how, in choosing the frameworks to power progr.es, we realized that it would be best to combine the strengths of two frameworks in particular: Django and Node.js. We chose Django to power the application for its robustness, thousands of libraries and our experience using it, and Node.js for its real-time (also called “comet”) capabilities. Combining these two was an unusual choice – one probably not attempted often up to now – and so we had to figure out a way to have the two frameworks communicate with each other. The common denominator we found for both of these is a message broker called RabbitMQ.
RabbitMQ, quite simply, keeps a queue of pending tasks until another program comes and asks for new tasks to perform. In Django, this is a standard method used to send emails asynchronously: an email needs to be sent, so you add the email to the RabbitMQ queue with Django, and a minute or so later another Python script comes along and sends this email. This way, the user never has to wait for the slow process of actually sending and delivering the mail.
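The deferred-work pattern described here can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the standard-library `queue.Queue` stands in for a RabbitMQ queue so the example runs without a broker, and the names `enqueue_email`, `drain_email_queue` and the `send` callback are hypothetical, not part of Django or RabbitMQ.

```python
import queue

# In-process stand-in for a RabbitMQ queue (assumption, for illustration only).
email_queue = queue.Queue()


def enqueue_email(to, subject, body):
    """Called during the web request: record the task and return immediately."""
    email_queue.put({"to": to, "subject": subject, "body": body})


def drain_email_queue(send):
    """Run later by a separate worker: pass every queued task to `send`.

    Returns the number of tasks processed.
    """
    processed = 0
    while True:
        try:
            task = email_queue.get_nowait()
        except queue.Empty:
            return processed
        send(task)
        processed += 1


if __name__ == "__main__":
    enqueue_email("user@example.com", "Welcome", "Thanks for signing up.")
    outbox = []
    # A real worker would hand the task to an SMTP client instead of a list.
    drain_email_queue(outbox.append)
    print(outbox[0]["to"])
```

The request handler only pays the cost of `put`, while the slow delivery step happens whenever the worker next drains the queue.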
In our case, however, we use RabbitMQ mainly because it is agnostic about the language that sends or requests tasks. There are Python and Node.js client libraries for communicating with RabbitMQ, so it functions as the perfect glue between the two. Every time something happens in our Django application that users should immediately know about, Django adds a notification task to RabbitMQ, and soon thereafter a Node script comes along, picks up the task and sends the notification to the client. The beauty of this is that the Node script never needs to understand the message it is sending, or duplicate any of the Django application logic. Django simply gives it a message to be sent to certain users, and Node.js (with socket.io) faithfully relays this message.
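One way to picture this division of labour is a message envelope in which only the recipient list is meant for the relay. The sketch below is an assumption about how such a message could look, not the actual progr.es format. The field names `recipients` and `payload` and the functions `build_notification` and `relay` are hypothetical, and the relay is written in Python only to keep the examples in one language (in the setup described above, a Node.js script with socket.io plays that role).

```python
import json


def build_notification(recipient_ids, payload):
    """Django side: wrap an application-defined payload in a routing envelope."""
    return json.dumps({"recipients": list(recipient_ids), "payload": payload})


def relay(raw_message, connections):
    """Relay side: read only the routing field and forward the payload as-is.

    `connections` maps a user id to a callable that delivers data to that
    user's open socket (hypothetical stand-in for socket.io connections).
    Returns the ids of the users the payload was delivered to.
    """
    envelope = json.loads(raw_message)
    delivered = []
    for user_id in envelope["recipients"]:
        deliver = connections.get(user_id)
        if deliver is None:
            continue  # this user has no open connection right now
        deliver(envelope["payload"])  # payload is never inspected or changed
        delivered.append(user_id)
    return delivered
```

Because `relay` touches nothing but the recipient list, new kinds of notifications can be added on the Django side without changing the relay at all.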
A High-Level Explanation
This communication between Django and Node.js using RabbitMQ as the middleman is how progr.es manages to do many of the really nifty things, and these three technologies form the entirety of our server-side application layer. Explaining the rest is now very simple.
As I mentioned earlier, we use the non-relational MongoDB as our storage. By design, only Django ever communicates with the database layer, retrieving and updating information as necessary. If it happens to be new information, Django simply passes it down to Node.js (through RabbitMQ) as well.
It is also Django that does the bulk of communication with the client. When someone requests a page on progr.es, Django evaluates the URL, calls the appropriate view, and returns the result of this view via standard HTTP (see the figure on progr.es architecture above).
Let’s look deeper into the more interesting case of a modern browser, with tabs being loaded through AJAX and updates being received in real time. How do we cache tabs, and make sure every tab is only loaded once but still kept up to date? For this we simply use a very basic (but extremely useful) feature of jQuery: in-memory HTML object representations.
With progr.es being as deceptively simple an application as it is, I expect the depth and intricacy of the technology behind it may have come as a surprise to some. In truth, the simplicity and efficiency of a web application is only achieved through careful planning and thinking about the complexities at the time of development, so the end user doesn’t have to. I hope this short two-part series helped explain some of those complexities, and the solutions we found to them.
Herman Schaaf is the lead software engineer for progr.es - you can follow him or ask questions @ironzeb on Twitter.