Project Totem – A Long Polling server (Part 1)

Normal Polling

Let’s start with normal polling. The browser simply runs some Javascript on a timer that repeatedly checks for new data on the server. The problem with this is there a trade off between latency and bandwidth. If you were to use a timer that ran every minute your server load would be minimal…. but there’d also be up to a minute before the user saw the changed data. You could drop it to a very short interval but you’d have a LOT of requests to your site.

The server setup is unchanged from a normal web server setup:

Long Polling

If we don’t want the bandwidth/latency trade off there is another way. You can use the timeout function of most AJAX libraries (I use jQuery) to perform ‘long polling’. Instead of asking the conversation between the browser and the server going like this:

“Anything new…………………..? Anything new…………………..? Anything new…………………..? Anything new…………………..?Anything new…………………..? Anything new…………………..?Anything new…………………..? Anything new…………………..? Anything new…………………..?”

It goes more like this:

“Tell me if anything new comes along in the next 20 seconds ………………………………………… ………………………………………………………………….

Nothing? OK, let’s try again…

Tell me if anything new comes along in the next 20 seconds ………………………………………… …………………………………………………………………."

a much more efficient use of bandwidth, but here’s the double bubble bonus. So long as the server it’s asking returns the new data and closes the connection there’s actually less latency. It doesn’t matter when the new data arrives, but with standard polling you have to wait until the next poll.

In flowchart form, long polling is super simple:

So we’re all sorted right? Not quite.

The problem with long polling using a regular web server is, it’s not very efficient. You end up with a LOT of open connections, and other than having IIS sit there spinning on each ‘poll’ page waiting for new data to come in, there’s not really a nice notification structure either. Apache is even worse on this front as it really dislikes connections being held open. Another minor snag is that you don’t want to query the original hostname for the data. Most browsers only allow you 2 connections per site, so if you tie up one on the polling there’s only one left to actually fetch data.

So, the answer is a dedicated polling server.

These things exist in the *nix world, most notably CometD, but it’s a lot to learn just to do something simple.

After 10 minutes of pontificating, I decided to do the obvious. Make my own! Project Totem is born. ( because a Totem is a ‘long pole’ and also as a nod to my friend Sam who runs Totem Development )

In essence it’s a very simple Windows Sockets application that just pushes Javascript back to the browser. The browser then executes that script and gets the data from the original web server.

The server generates a GUID that’s sent to each page in the polling javascipt. The server also tells Totem that it’s served that GUID, and that page needs to know about changes to data sets A, B and C.

The browser then polls Totem using the GUID, and if there’s nothing new the request will just time out after 20 seconds. It then polls again, and repeats polling using a 20 second timeout. The very millisecond that Totem gets a notification from the webtier that say data set A has changed, it returns the appropriate Javascript back to the browser and shuts the connection. The browser then does whatever you want to go get the data etc.

I’ll explain more about how I’m tracking keys/scripts and GUIDs etc in part 2 🙂