Introducing Digg's Streaming API
Digg has had an API for a while now, and we've learned quite a few important lessons along the way. Two of these lessons in particular have stood out. First, external developers will surprise us with their creativity time and time again. Second, that creativity can't flourish if our API's functionality always trails the functionality on our site.
Today we're releasing a new API feature which we hope will ease the process of using Digg's data and free developers to model and store the data however they want. We're excited to announce the Digg Streaming API.
Our Streaming API provides any developer with full access to a real-time stream of all submissions, Diggs and comments occurring on our site. You can connect to the stream via client-side JavaScript, or you can set up a script to pull all our content into your own databases and expose it however you want.
If you've used Twitter's Streaming API then you already understand the opportunity that awaits. If not, we're pretty sure that a few minutes of playing around with the API will spark your curiosity.
Using the API
The simplest way to get started with the Digg Streaming API is to follow this link: http://services.digg.com/2.0/stream?types=comment&return_after=1. After a moment or two, a comment should show up. The API is documented on developers.digg.com, but let's take a look at the supported parameters:
- return_after indicates the number of events after which to close the connection. By default the value is unlimited, unless you specify format=javascript, which forces the number of events to be 1 (to support JSON-P, more on that below).
- types specifies the types of events to include in the event stream, and should be a comma-separated list containing any of these values: "digg", "submission" and "comment". By default streams receive all three kinds of events.
- format determines the format of responses, and may be one of "json", "javascript" or "event-stream". If format is "javascript", then you must also specify the parameter "callback" with the name of the JavaScript function to wrap the response in (also note that if format is "javascript" then "return_after" is forced to have a value of 1). "event-stream" is an exciting format which allows some modern browsers to interact with streams of data (older browsers will have to rely on JSON-P and long-polling).
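To make the interaction between these parameters concrete, here is a minimal Python sketch that builds a stream URL and applies the rules described above. The function name build_stream_url and its argument names are our own invention for illustration, not part of the API; only the endpoint, the parameter names and their allowed values come from the list above.

```python
from urllib.parse import urlencode

STREAM_URL = "http://services.digg.com/2.0/stream"
VALID_TYPES = {"digg", "submission", "comment"}
VALID_FORMATS = {"json", "javascript", "event-stream"}


def build_stream_url(types=None, return_after=None, format=None, callback=None):
    """Build a Streaming API URL (hypothetical helper, for illustration only)."""
    params = {}
    if types:
        unknown = set(types) - VALID_TYPES
        if unknown:
            raise ValueError("unknown event types: %s" % sorted(unknown))
        # types is sent as a single comma-separated value.
        params["types"] = ",".join(types)
    if format is not None:
        if format not in VALID_FORMATS:
            raise ValueError("unknown format: %s" % format)
        params["format"] = format
    if format == "javascript":
        # JSON-P needs a callback name, and return_after is forced to 1.
        if not callback:
            raise ValueError("format=javascript requires a callback")
        params["callback"] = callback
        return_after = 1
    if return_after is not None:
        params["return_after"] = return_after
    # Keep commas readable in the query string instead of encoding them as %2C.
    query = urlencode(params, safe=",")
    return STREAM_URL + ("?" + query if query else "")


if __name__ == "__main__":
    print(build_stream_url(types=["comment"], return_after=1))
```

Called with types=["comment"] and return_after=1, the helper reproduces the link shown earlier. Leaving every argument out yields the bare endpoint, which per the defaults above streams all three event types with no event limit.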
To help you get started, we've put together a couple of examples:
- a streaming example using WebSockets by Can Duruk, with the code available on GitHub
- an OS X Dashboard Widget by Andrew Hedges
- and two examples, Python loading Digg Streaming API data into Redis and a Chrome browser extension by Will Larson.
As always, if you have any issues with our API, please contact us on the DiggAPI forum on Google Groups.
Our Implementation
To round out this post we wanted to discuss the implementation behind the Digg Streaming API. From development until deployment, the project took only a couple of days, and that's really a testament to the versatility of open source software these days. In particular, our API servers all run Tornado, the asynchronous HTTP server used originally by Friendfeed, and collecting and distributing the notifications is handled by Redis.
As each event comes into our system, we publish it to a Redis queue corresponding to its event type. Every Tornado process (we run at least one per core per server for the machines which serve the API) subscribes to the Redis queues for every event type, so each process receives a notification for every event. Upon receiving an event, the process checks whether it has any connected subscribers listening for that event type, and if so sends the notification down the existing connection to each client in the appropriate JSON/JSON-P/Event-Stream format.
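The fan-out step inside a single process can be sketched in a few lines of Python. This is an illustration of the idea rather than our production code: plain Python objects stand in for Redis and for Tornado's open connections, and the class names, the on_event hook and the exact framing of each format (a trailing newline for JSON, a "data:" line for event-stream) are assumptions made for the example.

```python
import json
from collections import defaultdict


def format_event(event, fmt, callback=None):
    """Serialize one event for a client (framing here is an assumption)."""
    body = json.dumps(event, sort_keys=True)
    if fmt == "json":
        return body + "\n"
    if fmt == "javascript":
        # JSON-P: wrap the payload in the caller's callback function.
        return "%s(%s);" % (callback, body)
    if fmt == "event-stream":
        # Server-sent events frame each message as "data: ..." plus a blank line.
        return "data: %s\n\n" % body
    raise ValueError("unknown format: %s" % fmt)


class Client:
    """Stand-in for one open HTTP connection (hypothetical)."""

    def __init__(self, types, fmt="json", callback=None):
        self.types = set(types)
        self.fmt = fmt
        self.callback = callback
        self.sent = []  # chunks that would be written to the socket

    def write(self, chunk):
        self.sent.append(chunk)


class Dispatcher:
    """One per server process: sees every event, forwards to local clients."""

    def __init__(self):
        self.clients_by_type = defaultdict(list)

    def connect(self, client):
        for event_type in client.types:
            self.clients_by_type[event_type].append(client)

    def on_event(self, event_type, event):
        # Would be invoked once for each message arriving from Redis.
        for client in self.clients_by_type.get(event_type, []):
            client.write(format_event(event, client.fmt, client.callback))


if __name__ == "__main__":
    dispatcher = Dispatcher()
    listener = Client(["comment"], fmt="event-stream")
    dispatcher.connect(listener)
    dispatcher.on_event("comment", {"id": 1})
    dispatcher.on_event("digg", {"id": 2})  # ignored: nobody local wants diggs
    print(listener.sent)
```

Indexing clients by event type keeps the per-event work proportional to the number of interested clients in that process, which matters because every process sees every event.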
And then, well, no, I guess that's really all there is to it. All in all, it may be a neat experience, but underneath is a very simple implementation. We're looking forward to seeing what amazing ideas the Digg Streaming API helps bring to fruition, and again, please let us know if you run into any issues by contacting us on the DiggAPI Google Group.
Thanks,
Will