Getting Started with HTML5 WebSockets


HTML5 WebSockets?

Ever felt that in the so-called two-way Web, updating Web pages in real time is a pain? Well, the HTML5 specifications took care of that in a big way. Introducing WebSockets: the full power of a UNIX socket, within your Web server.

Seven years ago, AJAX and Web 2.0 were all that was cool. There have been a lot of changes since then. For one thing, HTML5 and CSS3 have pretty much taken the whole world by storm. Apart from making the jobs of novice Web developers just that much easier, they’ve given the more skilled people many more tools to work with. JavaScript has finally triumphed on the server side, thanks to Google’s V8 and Node.js. Cloud computing has finally taken off in a big way, and the result has been two very different, yet very similar products putting our Web applications on the cloud: Google App Engine and Windows Azure.

Some of the more low-key advances have been the product of what we call the “engineer’s itch”. We needed it, so we made it. Until now, if you wanted to talk to a Web page and continually update it with real-time data, you had two very inefficient solutions. You could hold open an HTTP Keep-Alive connection, over which the server would keep pushing data to the page. Depending on how you made it, this could even mean that, theoretically, the page would never finish loading. Or you could have a script on your page poll the Web server at regular intervals, using… yes, XMLHttpRequests to grab data off the server. Ouch!

WebSockets pretty much makes all of that obsolete.

The idea behind WebSockets

You all know what a socket is. If you don’t, go Google it.

All right, you didn’t go anywhere, so here’s what a socket is, for dummies — it’s basically an object where you attach a pipe. Now a pipe is another object; it behaves just like a pipe in your bathroom, except that what’s passing through it is not water but data. In the context of the Internet, the pipe is your Internet connection.

So the Web browser opens a socket and connects it to a pipe that goes to the Web server. The Web server has a socket open which is listening for connections — pipes — and once both the server and the client are connected through that pipe, they can talk together, and fetch and push data.
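
To make the pipe picture concrete, here is a minimal sketch using Python’s standard socket module. It relies on socket.socketpair(), which hands back two sockets that are already connected to each other, so no real network or port is involved. The names browser_end and server_end are just labels for this illustration, not anything a browser or Web server actually exposes.

```python
import socket

# socketpair() returns two sockets that are already joined by a "pipe".
browser_end, server_end = socket.socketpair()

# The "browser" pushes some bytes into its end of the pipe...
browser_end.sendall(b"hello server")

# ...and the "server" reads them out of the other end, then answers.
request = server_end.recv(1024)
server_end.sendall(b"got: " + request)

# The answer travels back through the same pipe.
reply = browser_end.recv(1024)
print(reply.decode())

browser_end.close()
server_end.close()
```

Either end can send at any time, which is exactly the conversational behaviour the rest of this article is after.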

Now sockets are a system feature. If you’re writing a Web application, you’re probably dealing with HTML, CSS and JavaScript, plus some kind of scripting code at the server end. You shouldn’t be dealing with sockets directly. An HTTP connection is an abstraction on top of a socket, and you should be staying within the HTTP connection. But an HTTP connection has limitations:

  1. You can only transmit one page of data per connection.
  2. You cannot transmit more than the content-length specified when the header is sent.
  3. HTTP connections are not conversational. You ask once, and you get once.
  4. HTTP connections can stay open for just as long as it takes to transmit the data.

Sure, there are ways around all these limitations. HTTP connections can be reused, Keep-Alives can be used to keep the connections open for the server to speak up later after the page has been loaded, and Web developers can choose what constitutes “one page” so that they can hack around the protocol to dynamically update pages. But these are all ugly hacks (which have a name — Comet; Google it sometime), and you really can’t have the conversational interface that a raw socket allows.

Enter WebSockets

WebSockets aren’t raw system sockets. Rather, they’re implemented as a special extension on top of HTTP (think of extensions like WebDAV and the like). When you open a WebSocket, it goes through an HTTP handshake, so you can use normal HTTP connection-establishing methods such as authentication or SSL. But after the handshake is over, what you’re left with is a perpetually open connection, where you can pump in and receive any data you like. It doesn’t even have to have a schema.
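
If you’re curious what the handshake involves: in RFC 6455, the browser sends a random Sec-WebSocket-Key header with its upgrade request, and the server proves it understood by replying with a Sec-WebSocket-Accept header derived from that key. Here is a small sketch of that derivation in Python. The function name accept_key is my own; the GUID and the sample key come from the RFC. Your server library does this for you, so treat it as an illustration only.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the example handshake in RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```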

In short, this means that you can use it just like a pure network socket.
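
Strictly speaking, your data doesn’t travel completely bare: each message is wrapped in a small frame, which the browser and the server library add and strip for you. As a rough sketch of what RFC 6455 describes, this is how an unmasked text frame (the kind a server sends to a browser) could be built by hand. The function name encode_text_frame is hypothetical, and frames going from browser to server are additionally masked, which this sketch leaves out.

```python
import struct


def encode_text_frame(text):
    """Build one unmasked (server-to-client) WebSocket text frame."""
    payload = text.encode("utf-8")
    header = bytearray([0x81])  # FIN bit set, opcode 0x1 = text
    length = len(payload)
    if length < 126:
        # Short payloads carry their length directly in the second byte.
        header.append(length)
    elif length < 65536:
        # Marker 126 means "a 16-bit length follows".
        header.append(126)
        header += struct.pack("!H", length)
    else:
        # Marker 127 means "a 64-bit length follows".
        header.append(127)
        header += struct.pack("!Q", length)
    return bytes(header) + payload


print(encode_text_frame("Hello"))  # b'\x81\x05Hello'
```

Two bytes of overhead for a short message is a lot less than a full set of HTTP headers on every poll.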

WebSockets are actually the bleeding-edge part of the upcoming set of Web standards. The protocol was only finalised as RFC 6455 as recently as December 2011, although it did exist as a number of drafts before that. In fact, one such version, Draft 76, was the standard for quite some time, until a gaping security hole was discovered.

With the standard being just two months old, precious few browsers support it. Mozilla Firefox prior to version 11 required a “Moz” prefix. Chrome has supported it since 14.0 but, surprisingly, Safari supports only the earlier, flawed Draft 76 version of the protocol, as does Opera. And needless to say, Internet Explorer doesn’t support any version of WebSockets. This is slated to change with Windows 8, because the Internet Explorer 10 Technology Preview already includes support for WebSockets.

This is all very well, but how on earth am I supposed to use it? Well, I hope you know Python.

Playing with WebSockets

No, Python isn’t the only way to use WebSockets, but for the purposes of this article, it’s good for a demonstration. Also, it’ll let me introduce another piece of Python technology that’s used much less than it should be: Tornado.

Tornado is an application server written in Python. It includes its own HTTP server, a template language, and a Web framework that can be used like an MVC framework, but is meant to be used in a much easier fashion.

What makes Tornado unique is that it’s asynchronous and real-time. Everyone using FriendFeed retains one connection to them — and the updates are pushed out in real time. That’s easily more than 10,000 connections per server. Yes, Tornado is a FriendFeed technology that was open-sourced after the takeover by Facebook.

Let’s get started. Make sure you’re using Python 2.6 or higher. It’s highly unlikely that Tornado is in your distro’s package repository, but it is available on PyPI, so you can just use the following command:

sudo easy_install tornado

And the Tornado package will be installed, along with all dependencies. If you’re the more adventurous type, you could download the source archive from Tornado Web Server, extract it, and use Setuptools to install it the standard way, as follows:

tar -xvzf tornado-2.2.tar.gz
cd tornado-2.2
python setup.py build
sudo python setup.py install

If you’re even more adventurous, and want to keep Tornado isolated, you could just extract the archive, and copy the tornado-2.2/tornado directory into your project. It’s all good.

Anyway, assuming that Tornado is installed (you can run the Python interpreter and import tornado.web without any errors), you’re set to go.
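
If you’d rather check from a script than from the interactive interpreter, something along these lines should do. It’s only a sketch, and it assumes a Python 3 interpreter, since importlib.util.find_spec() isn’t available on Python 2. The helper name tornado_is_installed is made up for this example.

```python
import importlib.util


def tornado_is_installed():
    """Return True if the tornado package can be found by this interpreter."""
    return importlib.util.find_spec("tornado") is not None


if tornado_is_installed():
    print("Tornado found; you're set to go.")
else:
    print("Tornado is missing; install it before continuing.")
```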

First of all, let us build an application that simply echoes back what we say to it. It’s lifted straight off the Tornado examples, with a few minor edits. Here’s our Python application that’ll run on the server:

#!/usr/bin/python2

import tornado.web
import tornado.websocket
import tornado.ioloop

# Serves the chat page for plain HTTP GET requests.
class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.render("front.html")

# Handles the WebSocket connection: echoes every message back to the sender.
class WebSocketHandler(tornado.websocket.WebSocketHandler):
    def on_message(self, message):
        self.write_message(u"Server echoed: " + message)

# Map URL patterns to their handlers.
application = tornado.web.Application([
    (r"/", MainHandler),
    (r"/websocket", WebSocketHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    # Start the event loop; this call blocks until the server is stopped.
    tornado.ioloop.IOLoop.instance().start()

Tornado is still undergoing heavy development, but the developers say that the code has been cleaned up enough so that, theoretically, only the parts of Tornado that need to be used have to be imported. It’s worked pretty well for me so far, so in the first three lines, I’m importing the core web module, the websocket module and the IO loop module (which ties all the modules together and runs the application).

Next up is the main handler class. In Tornado, you write a bunch of handler classes, each of which defines the HTTP methods you want to support as ordinary Python methods. In this case, the MainHandler class supports the HTTP GET method, and thus, I’ve sub-classed the tornado.web.RequestHandler class (which is the root class you need to sub-class in order to create a servable object) and overridden the get() method, so that it delivers the front.html page to the browser.

Then comes the WebSocket handler, which is the same deal as the MainHandler, except that WebSockets is a message-based protocol, so the only method you have to override is the on_message() function. The on_message() function is an event handler, which fires every time a message is received. As such, in our little program, all it does is echo back (using the write_message() method) whatever we’ve said, as a Unicode string.

Next up, I’ve instantiated the application, and mapped the / URL to MainHandler and the /websocket path to WebSocketHandler. In the if-clause that follows (which, for Python novices, plays the role of C’s main() function), I’ve instructed the app to listen on port 8888, and started the IO loop. Once you execute this file, our server is running.
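
If the list of (pattern, handler) pairs looks like magic, here is a toy model of the idea in plain Python. To be clear, this is not Tornado’s code: the two functions are hypothetical stand-ins for the handler classes, and dispatch() is my own simplification of what a URL router does with such a table.

```python
import re


# Hypothetical stand-ins for the two handlers in the server listing.
def main_page():
    return "front.html"


def websocket_endpoint():
    return "websocket connection"


# The same shape as the list passed to tornado.web.Application.
ROUTES = [
    (r"/", main_page),
    (r"/websocket", websocket_endpoint),
]


def dispatch(path):
    """Call the first handler whose pattern matches the whole path."""
    for pattern, handler in ROUTES:
        if re.fullmatch(pattern, path):
            return handler()
    return "404 not found"


print(dispatch("/"))           # front.html
print(dispatch("/websocket"))  # websocket connection
```

Because each pattern has to match the whole path, "/" doesn’t swallow "/websocket", which is why the two entries in the real application don’t step on each other.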

Now, we’ve got to interface with this on the client side. Note that I’m using the standard interface, so the only browser this will work on right now is Google Chrome/Chromium. It should work on Firefox 11 and above; for Firefox 6 through 10, use the prefixed MozWebSocket object instead of WebSocket in the JavaScript.

Without further ado, here’s the front.html file, with the script embedded in it:

<!DOCTYPE html>
<html>
    <head>
        <title>WebSockets Demonstrator For LFY</title>
    </head>
    <body style="font-family: Lekton; font-size: 24pt">
        <p style="width: 800px">Powered By WebSockets. Use the form below to chat with the server.</p>
        <div id="chatbox" style="font-size: 14pt; height: 500px; width: 800px; overflow: scroll; border: 1px solid black"></div>
        <form id="conversation" onsubmit="DispatchText()" action="javascript:void(0);">
            <input type="text" id="message" name="message" autocomplete="off" style="width:700px" />
            <input type="submit" id="sub" name="sub" value="Send" style="width:90px" />
        </form>
        <script type="text/javascript">
            var ws = new WebSocket("ws://localhost:8888/websocket");
            ws.onmessage = function(evt){
                // Show each message from the server as a new paragraph.
                var x = document.createElement("p");
                // textContent (not innerHTML) so the message is never parsed as markup.
                x.textContent = evt.data;
                document.getElementById("chatbox").appendChild(x);
            };

            function DispatchText(){
                var userInput = document.getElementById("message").value;
                document.getElementById("message").value = "";
                var x = document.createElement("p");
                x.textContent = "You sent: " + userInput;
                document.getElementById("chatbox").appendChild(x);
                ws.send(userInput);
            }
        </script>
    </body>
</html>

Before I start explaining this piece of code, let me make it very clear that to me, JavaScript is Satan himself. As a result, the code you’ve seen above has the potential to be very ugly. With that out of the way, let me explain what I’ve done.

The file is standard HTML5 with a valid Doctype (WebSockets arrived with the HTML5 family of specifications, though the JavaScript API itself doesn’t check the doctype). There’s all sorts of in-line styling, because that’s one less file to send off for publication. Believe me, the number of mistakes in publishing is directly proportional to the number of files my article is split into ;-) The message is: in production code, keep your styling information separate. Use a CSS file. I’ve also used the Lekton font and large sizes; this keeps the screenshot legible. Lekton isn’t a Web-safe font. I have it installed, but most don’t. However, it’s available from Google Web Fonts if you want to use it, so it’s all good!

Our little app in action

Coming down to the form, you’ll see that I’m hooking the onsubmit event to a JavaScript function, and using a JavaScript URL as the action. The idea is to make the form not refresh if I press Enter (the JS URL is a quick and dirty way of doing it), and the onsubmit event takes care of triggering the appropriate JS function when I submit the form.

Now, let’s come down to the script.

First of all, I’ve created a new WebSocket object. For versions of Firefox prior to version 11, you’ve got to use new MozWebSocket here. Anyway, the WebSocket points to the appropriate URL that we mapped our WebSocket handler to, in the Python server app. Notice the protocol: ws. If you want to use WebSockets over SSL, the protocol is wss.
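
The ws/wss choice simply mirrors http/https. If your server-side code ever needs to work out the right WebSocket URL for a given page (to drop it into a template, say), a sketch like the following would do it. The function name websocket_url and the default "/websocket" path are assumptions based on the route used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def websocket_url(page_url, path="/websocket"):
    """Map an http(s) page URL to the matching ws(s) URL on the same host."""
    parts = urlsplit(page_url)
    # Encrypted pages should talk to encrypted sockets.
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


print(websocket_url("http://localhost:8888/"))    # ws://localhost:8888/websocket
print(websocket_url("https://example.com/chat"))  # wss://example.com/websocket
```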

Just like in the server app, I need to override the onmessage() function, which again is an event handler that fires every time the server sends a message. What I’ve done here is a simple three-step procedure: I created a new HTML paragraph, put whatever the server sent into the paragraph, and appended it inside the big chatbox div.

Now let’s look at the more interesting DispatchText() function.

  1. Here, I first read the contents of the input field into a variable and immediately clear the field, so that it’s ready for the next message.
  2. Next, I do exactly what I did in the onmessage() function: I display what I typed in the chatbox.
  3. Finally, with the ws.send() function, I send what I typed to the server over the WebSocket.

That’s it. WebSockets is this easy to use.

Where to, from here?

Well, now that you know how to use a WebSocket, it’s time to leverage this thing.

Here are a few things to remember. A WebSocket is meant to be used with Web applications, and as such, is pretty difficult to use in standard websites with standard tools like PHP and Apache. Remember that applications like Tornado exist so that you can directly expose them to the world without a reverse-proxy in front. If you need to use authentication, you can do that through the WebSocket itself. You don’t need to use HTTP 1.1 authentication. You can, but again, it’s pretty darned difficult.
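
What might authentication “through the WebSocket itself” look like? One common pattern, and this is my assumption rather than anything the echo server above does, is to treat the first message on a connection as a token and refuse to do real work until it checks out. Here is a framework-free sketch; the AuthGate class and the token are hypothetical. In a Tornado handler, you would create one gate per connection and call handle() from on_message().

```python
import hmac


class AuthGate:
    """Tracks whether a single connection has presented a valid token yet."""

    def __init__(self, expected_token):
        self._expected = expected_token.encode("utf-8")
        self.authenticated = False

    def handle(self, message):
        """Return the reply for one incoming message."""
        if not self.authenticated:
            # Constant-time comparison avoids leaking how much of the token matched.
            if hmac.compare_digest(message.encode("utf-8"), self._expected):
                self.authenticated = True
                return "auth ok"
            return "auth required"
        return "Server echoed: " + message


gate = AuthGate("s3cret")
print(gate.handle("hello"))   # auth required
print(gate.handle("s3cret"))  # auth ok
print(gate.handle("hello"))   # Server echoed: hello
```

A real application would also want the token to travel over wss rather than plain ws, and to expire.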

If you do need to reverse-proxy a WebSocket, you could use Nginx with a patch, as described in Reverse-Proxying WebSockets with Nginx. Check Apache WebSocket if you want to use Apache.

Again, it’s easiest to use a full-blown application server, like Tornado or Node.js. If you want to use Node.js, you can use Socket.IO, which is a JavaScript module that works both with Node.js (to implement the server side) and in browsers (to implement the client side). It has the additional advantage of being able to emulate WebSocket functionality on browsers that don’t natively support HTML5 WebSockets, so with Socket.IO you can use WebSockets on IE 5.5 and above, Safari 3 and above, Firefox 5 and above, Chrome 4 and above, as well as Opera 10.61 and above. Check it out at Socket IO. You can use the client part of Socket.IO without using the server part, so you can use it with Tornado too.

If you’re the geeky type, you can check out WebSockets W3C Specification for the W3C specification and RFC 6455 — The WebSocket Spec for the RFC, which defines HTML5 WebSockets. You’ll want to explore WebSockets W3C Specification in detail if you’re interested in the nitty-gritty on the JavaScript interface to WebSockets. Of course, you don’t need to read any of it if you’re just going to use Socket.IO.

If you’re not going to use Socket.IO, you can check out CanIUse WebSockets, which is continually updated, to see a list of browsers that support WebSockets, along with the level of support they provide.

And finally, you may look at my JS code in the code listing above and laugh, cry or curse me to oblivion. Me, I’m simply waiting for browsers to start natively supporting CoffeeScript. Ha! Check out CoffeeScript if you don’t know what that is.

That’s all. Now go code.

Feature image courtesy: Phil Woodbridge. Reused under the terms of CC-BY-NC-SA 2.0 License.
