
Gluing it all together: progr.es Behind the Screens – Part 2

Combining Django and Node.js

An overview of the progr.es technology by Herman Schaaf, lead software engineer.

In Part 1 of this two-part series, I discussed how, in choosing the frameworks to power progr.es, we realized that it would be best to combine the strengths of two frameworks in particular: Django and Node.js. We chose Django to power the application for its robustness, its thousands of libraries and our experience with it, and Node.js for its real-time (also called “comet”) capabilities. Combining the two was an unusual choice, one probably not attempted often up to now, so we had to figure out a way for the two frameworks to communicate with each other. The common denominator we found is a message broker called RabbitMQ.

Figure: The progr.es Architecture. How the different frameworks communicate with each other.

RabbitMQ, quite simply, keeps a queue of pending tasks until another program comes along and asks for tasks to perform. In Django, this is a standard way to send emails asynchronously: when an email needs to be sent, Django adds it to the RabbitMQ queue, and a minute or so later another Python script picks it up and sends it. This way, the user never has to wait for the slow process of actually sending and delivering the mail.
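The pattern described above can be sketched without a broker at all. The following is a minimal illustration, not our production code: Python's standard `queue.Queue` stands in for RabbitMQ, and the names `enqueue_email`, `email_worker` and `sent` are made up for this example.

```python
import queue
import threading

# Stand-in for the RabbitMQ queue (assumption: a real setup would use a broker client).
email_queue = queue.Queue()
sent = []  # records what the worker "sent", for demonstration only


def enqueue_email(to, subject):
    """Called during the web request: returns immediately, no mail is sent here."""
    email_queue.put({"to": to, "subject": subject})


def email_worker():
    """Separate consumer: takes tasks off the queue and does the slow work."""
    while True:
        task = email_queue.get()
        if task is None:  # sentinel value: stop the worker
            email_queue.task_done()
            break
        # The slow delivery step would go here; we only record the task.
        sent.append(f"mail to {task['to']}: {task['subject']}")
        email_queue.task_done()


worker = threading.Thread(target=email_worker)
worker.start()
enqueue_email("user@example.com", "Welcome")
enqueue_email("other@example.com", "Project updated")
email_queue.put(None)  # ask the worker to stop once the queue is drained
email_queue.join()
worker.join()
print(sent)
```

The point of the design is that `enqueue_email` costs almost nothing, so the request that triggered the email can finish right away.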

In our case, however, we chose RabbitMQ mainly because it is agnostic about the language that sends or requests tasks. There are Python and Node.js client libraries for communicating with RabbitMQ, so it functions as the perfect glue between the two. Every time something happens in our Django application that users should immediately know about, Django adds a notification task to RabbitMQ, and soon thereafter a Node script comes along, picks up the task and sends the notification to the client. The beauty of this is that the Node script never needs to understand the message it is sending, or duplicate any of the Django application logic. Django simply gives it a message to be sent to certain users, and Node.js (with socket.io) faithfully relays this message.
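To make the "relay without understanding" idea concrete, here is a small sketch in Python. It is only an illustration under assumptions: the envelope format (`recipients` plus an opaque `body`), the function names, and the `connections` mapping are hypothetical, and in progr.es the relay role is played by Node.js with socket.io, not by Python.

```python
import json


def build_notification(recipients, body):
    """Publisher side (the Django role): wrap an opaque body in a routing envelope."""
    return json.dumps({"recipients": list(recipients), "body": body})


def relay(raw_message, connections):
    """Relay side (the Node.js role): read only the envelope, forward the body untouched."""
    envelope = json.loads(raw_message)
    delivered = []
    for user_id in envelope["recipients"]:
        send = connections.get(user_id)
        if send is not None:  # only users who are currently connected
            send(envelope["body"])
            delivered.append(user_id)
    return delivered


# Hypothetical connected clients: each "socket" is just a list we append to.
inbox = {"alice": [], "bob": []}
connections = {name: box.append for name, box in inbox.items()}

message = build_notification(["alice", "carol"], {"type": "note_added", "project": 7})
delivered = relay(message, connections)
print(delivered)
print(inbox)
```

Note that `relay` never inspects `body`. Whatever the application decides to put there passes through unchanged, which is why none of the application logic needs to be duplicated on the relay side.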

That’s how progr.es manages to immediately inform all users of notes or changes to a project, without anyone ever refreshing a page. This approach also has other advantages. For example, all changes are still communicated between Django and the client through standard HTTP POST, which maintains backwards-compatibility and keeps working even in the catastrophic event that the client’s JavaScript is outdated, disabled or not working.

A High-Level Explanation

This communication between Django and Node.js using RabbitMQ as the middleman is how progr.es manages to do many of the really nifty things, and these three technologies form the entirety of our server-side application layer. Explaining the rest is now very simple.

As I mentioned earlier, we use the non-relational MongoDB as our storage. By design, only Django ever communicates with the database layer, retrieving and updating information as necessary. If it happens to be new information, Django simply makes it trickle down to Node.js as well.

It is also Django that does the bulk of the communication with the client. When someone requests a page on progr.es, Django evaluates the URL, calls the appropriate view, and returns the result of this view over standard HTTP (see the figure on the progr.es architecture above).

Let us take as an example when a user is using the main application: the dashboard and projects. There are two possible cases here: in the first case, the user is loading the dashboard for the first time. If so, Django not only sends the full HTML page, but also all the CSS and JavaScript necessary for rendering and updating new tabs to be loaded in the future. In the second case, the user has already loaded the dashboard, and is requesting a new project tab to be opened. Django picks up that the request was made via AJAX, and only sends the HTML for the project in question, without the JavaScript and CSS, since those were already loaded when the dashboard opened. Once the client receives the page, JavaScript makes sure that the page gets rendered as intended, with the project showing below the newly opened tab.
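The two cases can be sketched as a single decision on the server. This is a framework-free illustration rather than our actual Django view: it relies on the fact that jQuery marks its AJAX requests with an `X-Requested-With: XMLHttpRequest` header, while the function names and the `/static/...` paths are assumptions made for the example.

```python
def is_ajax(headers):
    """jQuery marks its AJAX requests with the X-Requested-With header."""
    return headers.get("X-Requested-With") == "XMLHttpRequest"


def render_project(headers, project_html):
    """Return only the fragment for AJAX requests, the full page otherwise."""
    if is_ajax(headers):
        # CSS and JavaScript were already loaded with the dashboard.
        return project_html
    # First load: ship the page together with its assets (hypothetical paths).
    return (
        "<html><head>"
        '<link rel="stylesheet" href="/static/app.css">'
        '<script src="/static/app.js"></script>'
        "</head><body>" + project_html + "</body></html>"
    )


fragment = "<div class='project'>Project 7</div>"
print(render_project({"X-Requested-With": "XMLHttpRequest"}, fragment))
print(render_project({}, fragment))
```

For simplicity the sketch treats `headers` as a plain dictionary with exact key names; a real framework normalizes header names for you.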

But what if the user goes to a project page directly, maybe by opening a link from an email? This is where our approach really shines. If the user’s browser is modern enough (think anything after Internet Explorer 8, or similar), we redirect them to the dashboard page, with specific instructions to immediately open the project in a tab once the tab system is loaded. This way all the CSS and JavaScript still get loaded, and the entire situation is the same as when the user landed on the dashboard and then clicked a link to the project. Alternatively, if the user’s browser is too old or has JavaScript disabled, and therefore does not support most of our required functionality, we do not redirect to the dashboard but instead load the JavaScript and CSS only for the page in question. Updates will not happen in real-time and clicking on a new tab will reload the entire page. That is clunky and not as smooth as we wish the process to be, but a desperate user will be able to perform some basic tasks in a pinch.

Let’s look deeper into the more interesting case of a modern browser, with tabs being loaded through AJAX and updates being received in real-time. How do we cache tabs, and make sure every tab is only loaded once but still kept up-to-date? For this we simply use a very basic (but extremely useful) feature of jQuery: in-memory representations of HTML objects.

Figure: jQuery in-memory representation storage. progr.es uses in-memory representations of the full HTML structure of every page to seamlessly perform caching and live updates of loaded tabs.

Every time a tab is opened, we get the HTML from Django and keep a jQuery representation of this structure in JavaScript. When we want to show a tab that has been loaded before, we simply pull the jQuery object from memory, and put it inside the page’s document object model. If we receive new information on changes within that project, we again just access the in-memory jQuery representation, and make the changes as if the project is right there in the browser already. If the project happens to be showing while we make the changes, it will just show immediately. If it happens to not be showing, it’ll update in-memory and show the next time the user opens the tab. Either way, it doesn’t matter to us, since we only interact with the in-memory jQuery object. This allows us to load every page only once per session, and keep tab loading speed limited only by the user’s processor speed. The figure above does a good job of explaining this process more clearly.
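The caching logic itself is independent of jQuery, so it can be modelled in a few lines of Python. Treat this as a rough sketch of the idea only: in the browser the stored objects are jQuery objects, whereas here `xml.etree.ElementTree` trees stand in for them, and `TabCache`, `fetch` and `add_note` are names invented for the illustration.

```python
import xml.etree.ElementTree as ET


class TabCache:
    """Keeps one parsed, in-memory tree per tab; stands in for stored jQuery objects."""

    def __init__(self, fetch):
        self._fetch = fetch   # hypothetical loader, called at most once per tab
        self._tabs = {}
        self.visible = None   # id of the tab currently attached to the page

    def open(self, tab_id):
        if tab_id not in self._tabs:
            self._tabs[tab_id] = ET.fromstring(self._fetch(tab_id))
        self.visible = tab_id
        return self._tabs[tab_id]

    def add_note(self, tab_id, text):
        """Apply a live update to the in-memory tree, whether it is visible or not."""
        tab = self._tabs.get(tab_id)
        if tab is None:
            return  # never loaded: the first open() will fetch current HTML anyway
        note = ET.SubElement(tab, "p")
        note.text = text

    def render(self):
        return ET.tostring(self._tabs[self.visible], encoding="unicode")


fetches = []


def fetch(tab_id):
    """Pretend server call; records each request so we can count them."""
    fetches.append(tab_id)
    return f'<div id="{tab_id}"><h1>{tab_id}</h1></div>'


cache = TabCache(fetch)
cache.open("project-1")
cache.open("project-2")
cache.add_note("project-1", "New note")  # project-1 is hidden right now
cache.open("project-1")                   # served from memory, already updated
print(cache.render())
print(fetches)
```

Because updates always go to the stored tree, the code applying them never has to ask whether the tab is currently on screen, and each tab is fetched from the server only once.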

Conclusion

With progr.es being as deceptively simple an application as it is, I expect the depth and intricacy of the technology behind it may have come as a surprise to some. In truth, the simplicity and efficiency of a web application are only achieved through careful planning and thinking about the complexities at the time of development, so the end-user doesn’t have to. I hope this short two-part series helped explain some of those complexities, and the solutions we found to them.

If you have any questions you’d like answered in more detail, feel free to fire them to us at @progresapp or to me, Herman Schaaf, at @ironzeb on Twitter.

Herman Schaaf is the lead software engineer for progr.es - you can follow him or ask questions @ironzeb on Twitter.
