The Digg Blog
Saying Yes to NoSQL; Going Steady with Cassandra
The last six months have been exciting for Digg's engineering team. We're working on a soup-to-nuts rewrite. Not only are we rewriting all our application code, but we're also rolling out a new client and server architecture. And if that doesn't sound like a big enough challenge, we're replacing most of our infrastructure components and moving away from LAMP.
Perhaps our most significant infrastructure change is abandoning MySQL in favor of a NoSQL alternative. To someone like me who's been building systems almost exclusively on relational databases for almost 20 years, this feels like a bold move.
What's Wrong with MySQL?
Our primary motivation for moving away from MySQL is the increasing difficulty of building a high-performance, write-intensive application on a data set that is growing quickly, with no end in sight. This growth has forced us into horizontal and vertical partitioning strategies that have eliminated most of the value of a relational database, while still incurring all the overhead.
Relational database technology can be a blunt instrument and we're motivated to find a tool that matches our specific needs closely. Our domain area, news, doesn't exact strict consistency requirements, so (according to Brewer's theorem) relaxing this allows gains in availability and partition tolerance (i.e. operations completing, even in degraded system states). We're confident that our engineers can implement application level consistency controls much more efficiently than MySQL does generically.
As our system grows, it's important for us to span multiple data centers for redundancy and network performance and to add capacity or replace failed nodes with no downtime. We plan to continue using commodity hardware, and to continue assuming that it will fail regularly. All of this is increasingly difficult with MySQL.
Choosing an Alternative
Digg is committed to the use and development of open source software and we're keen to avoid the cost of proprietary large-scale storage solutions. We were inspired by Google and Amazon's broad use of their non-relational BigTable and Dynamo systems. We evaluated all the usual open source NoSQL suspects. After considerable debate, we decided to go with Cassandra.
Simplistically, Cassandra is a distributed database with a BigTable data model running on a Dynamo-like infrastructure. It is column-oriented and allows for the storage of relatively structured data. It has a fully decentralized model; every node is identical and there is no single point of failure. It's also extremely fault tolerant; data is replicated to multiple nodes and across data centers. Cassandra is also very elastic; read and write throughput increase linearly as new machines are added.
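To make "every node is identical" and "replicated to multiple nodes" a little more concrete, here is a minimal Python sketch of Dynamo-style placement on a hash ring. It is an illustration only, not Cassandra's code: the Ring class, the node names, the use of MD5 and the replica count of 3 are all assumptions made for the example.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical ring of identical nodes; no node plays a coordinating role."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        # Each node owns the position given by hashing its own name.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def nodes_for(self, row_key):
        """First node at or after the key's token, plus the next ones clockwise."""
        start = bisect.bisect_left(self.tokens, token(row_key))
        count = min(self.replicas, len(self.entries))
        size = len(self.entries)
        return [self.entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replicas=3)
print(ring.nodes_for("story:42"))  # three distinct nodes hold this row
```

Because placement depends only on hashes, any node can work out where a row lives, and losing one node still leaves the other replicas available.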
We experimented on our live site, replacing a relatively high scale MySQL component with a Cassandra alternative. These tests went well. You can read more about these experiments here.
Where We Are
At the time of writing, we've reimplemented most of Digg's functionality using Cassandra as our primary datastore. We've supplemented Cassandra-based indexing using full text, relational and graph indexing systems. We're getting used to dealing with eventual consistency.
We've been working on Cassandra itself too. We've made massive performance improvements: increased comparator speed, added better compaction threading, reduced logging overhead, added row-level caching and implemented multi-get capability. We've also implemented native atomic counters using Zookeeper (you can probably guess why we were motivated to add that feature :)
We've tested and improved the operational capabilities of Cassandra: we upgraded its Rackaware capability, added slow query logging, improved the bulk import functionality and implemented Scribe support for improved logging. We've also done a ton of operational testing.
We're open sourcing all our work on Cassandra.
What's Next?
Currently our main focus is getting Digg's latest release into general availability, but we'll continue to lead the way in championing Cassandra's development and adoption.
If you're interested in joining a world-class team using cutting-edge NoSQL technology at scale, check out http://jobs.digg.com
Take it easy,
John Quinn. VP Engineering. (Digg: doofdoofsf, Twitter: doofdoofsf)
Best of Digg: February Comments
February is a short month, but there was no shortage of sass in Digg comments last month. Let’s get straight to the handful of users who kept us at Digg HQ smiling during an otherwise rainy, gray month in San Francisco.
The Jay/Conan battle may have calmed down, but among Digg users the rancor hasn’t subsided. So it was no surprise that Team Coco was quick to turn news of their TV hero’s new Twitter account into an epic Jay-bashing session.

Touching on the Eubanks comment in the example above, we really appreciated the gigantic circle spotlighting why exactly we should find this image submission funny, as did the commenter below. We more genuinely appreciated the string of sarcasm that followed.

It’s impossible to accurately express the awesomeness of this string of comments in a screenshot, but it’s definitely worth checking out the whole thing. Here’s a peek at the commenter love that followed this masterpiece:

In this example, we see that Digg users work hard for their comment lulz, even busting out Photoshop to add to the conversation. Head over to the permalink and click around for the full effect.

Finally, this short but sweet zinger, in response to a submission about Nickelback singer Chad Kroeger’s supposed opposition to a Facebook group about why his band sucks it hard, is not just a lol moment for many of us. You could go so far as to say it’s a rallying cry for all fans of music that doesn’t make you wanna bash your head against a wall.

Bonus: This isn’t about the comments, but rather a comment-focused submission from our friends at Funny or Die. Check out their video tracing the origin of a one-liner we’ve seen repeated on Digg one or two times: that’s what she said. And, as if the above examples aren’t enough to convince you that Digg users are awesome, the fine folks at Regretful Morning singled out the Digg community, featuring a take on a ditty we found particularly amusing in January. Bangin’, y/n?
Thanks to all the users who contribute funny, entertaining comments to Digg every day. We love your feedback, so hit us with any questions or comments by going to http://digg.com/contact or reach out via Twitter: @digg_community
Hasta pronto,
T.J.
10 Useful git commands
We have been using git for months now and I still learn something new about it almost every week. Here are some of the more useful commands that I have come across. Have some good ones yourself? Add them to the comments on the digg story.
- git commit --amend: Allows you to change the very last commit in your history. You can change just the message itself (great for typos), or you can run git add file.txt and then git commit --amend to add that file to the commit.
- git pull --rebase: This command will save any commits you might have in a temporary place, grab any new ones you might be missing from your tracking branch and stick yours back on the end. Keep the history clean and say goodbye to those merge commits.
- git checkout HEAD~3 path/to/file.txt: Don't like the recent changes to a given file? This will get the file back exactly as it was three commits ago (HEAD~3); point it at any other commit to pick a different moment in time.
- git cherry-pick 3c1f6a472: You can grab just a specific changeset and apply it to your history, no need to merge the entire branch. But be careful, the commit gets a new hash when it is applied.
- git stash: Not ready to commit yet but need to change gears and make a quick change for someone else? Save your working copy in a temporary place, then use git stash pop to get it back.
- git reset: Have some files ready to be committed, then you decide you're not ready yet? git reset will remove all the changes from the 'ready to be committed' state but leave them intact in your working copy.
- git reset HEAD^: Will remove the last commit from your history but keep the changes in your working copy.
- git reset --hard HEAD^: Like the command above, this will remove the last commit in your history, but it will also wipe away the changes in your working copy.
- git merge --squash: Have a bunch of work/changesets on a branch and are ready to merge it to master? Use git merge --squash to take all those changesets and merge them onto master in a 'ready to be committed' state. Then you can commit them all as a single changeset and keep the history nice and clean.
- git rebase -i HEAD~3: Have three commits that really should be one? This command will let you combine multiple commits into one or even split a single commit into multiple.
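If you want to see a few of these in action without risking a real repository, here is a small Python sketch that drives git in a throwaway directory. It assumes git is on your PATH; the helper names, the notes.txt file and the commit messages are made up for the example. It walks through git commit --amend, git reset HEAD^ and git reset --hard HEAD^.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed stdout."""
    cmd = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.com",
           "-c", "commit.gpgsign=false", *args]
    done = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return done.stdout.strip()


def subjects(repo):
    """Commit subject lines, newest first."""
    return git(repo, "log", "--format=%s").splitlines()


def demo():
    """Try amend, reset and hard reset in a scratch repository."""
    if shutil.which("git") is None:
        return None  # git is not installed; nothing to demonstrate
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        notes = repo / "notes.txt"
        git(repo, "init", "--quiet")

        notes.write_text("first draft\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "--quiet", "-m", "Add notse")  # typo in the message

        # git commit --amend: rewrite the message of the very last commit
        git(repo, "commit", "--quiet", "--amend", "-m", "Add notes")
        after_amend = subjects(repo)

        notes.write_text("first draft\nsecond thought\n")
        git(repo, "commit", "--quiet", "-am", "Second thought")

        # git reset HEAD^: drop the last commit, keep the edit in the working copy
        git(repo, "reset", "--quiet", "HEAD^")
        after_reset, kept = subjects(repo), notes.read_text()

        # Commit the edit again, then git reset --hard HEAD^ drops commit and edit
        git(repo, "commit", "--quiet", "-am", "Second thought")
        git(repo, "reset", "--quiet", "--hard", "HEAD^")
        return after_amend, after_reset, kept, notes.read_text()


if __name__ == "__main__":
    print(demo())
```

After the plain reset the second line is still in notes.txt; after the hard reset it is gone, which is exactly why --hard deserves a second look before you hit enter.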
If you'd like to use git in a world class engineering environment, check out the engineering job opportunities at Digg.com
Update on Digg Ads
Hi Everyone,
Digg Ads have been live on the site for just over four months now, and I thought it would be a good time to explain them in a little more depth and update folks on how they’re doing.
What are Digg Ads?
Digg Ads are the ads that you see with other stories on the homepage, or just below the story description on a permalink. The placement on the homepage is currently the third story down from the top; DiggAds look and feel just like regular content, but they are marked as “Sponsored by [Advertiser Name]”. You can Digg or bury them just like any other story but, unlike standard Digging and burying, your feedback affects the price the advertiser pays and how often the ad gets shown. Basically, the more Diggs an ad gets, the less that advertiser will pay overall. This model encourages advertisers to give us great content that’s relevant to the community.
How Do Digg Ads Work?
We charge advertisers on a Performance CPC (Cost per Click) basis where the advertiser tells us the most they will pay for a click on the ad, and we charge them a fluctuating rate for that click without ever exceeding their maximum. The exact rate depends on that ad’s quality score (how many Diggs and buries it has received) and the market rate, which is determined by the DiggAds auction.
The DiggAds Auction:
The DiggAds auction is a complex system designed by some of our math PhDs, but here is a simplified version of how it works: Advertisers tell us the maximum that they want to pay per click by giving us a “bid.” Each day we take those bids and apply a quality score to them based on how the Digg community likes the ads. Then we figure out how much ad space we have for that day, and we go down the list of ads assigning space to the highest bids and highest quality ads until we run out of room. (If it’s a first time advertiser, we assign a base score in order to get them started and to elicit feedback from users). In this way an advertiser that invests in quality content can pay less, and an advertiser that has a low quality ad is forced to bid higher. We think this system ultimately leads to better advertisements.
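For the curious, the simplified description above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the real DiggAds system: the Ad fields, the base score of 1.0, the idea of ranking by bid multiplied by quality score, and the slots_wanted unit of ad space are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

BASE_QUALITY = 1.0  # assumed starting score for first-time advertisers


@dataclass
class Ad:
    advertiser: str
    max_bid: float                   # most the advertiser will pay per click
    quality: Optional[float] = None  # None means no community feedback yet
    slots_wanted: int = 1            # hypothetical unit of ad space requested


def rank_score(ad):
    """Bid weighted by quality; multiplying the two is an assumption."""
    quality = BASE_QUALITY if ad.quality is None else ad.quality
    return ad.max_bid * quality


def allocate(ads, slots_available):
    """Walk the ranked list, handing out space until it runs out."""
    plan = []
    for ad in sorted(ads, key=rank_score, reverse=True):
        if slots_available == 0:
            break
        granted = min(ad.slots_wanted, slots_available)
        plan.append((ad.advertiser, granted))
        slots_available -= granted
    return plan


ads = [
    Ad("acme", max_bid=0.50, quality=0.8, slots_wanted=2),
    Ad("globex", max_bid=0.30, quality=1.5, slots_wanted=2),
    Ad("newcomer", max_bid=0.35, slots_wanted=2),  # first-time advertiser
]
print(allocate(ads, slots_available=3))  # [('globex', 2), ('acme', 1)]
```

In this toy run the well-liked ad from "globex" outranks the higher bid from "acme", which is the behavior the auction is meant to encourage.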
Some feedback so far:
So far, the feedback we’ve gotten from users has been mostly positive. We have heard from users that they would like to be able to comment on ads, which is on our product roadmap for this year. We have also heard that sometimes users bury ads but then still see them again later - this is often because advertisers test out many variations of the same ad. So, while we never show a user the same exact ad they buried, we may show a very similar ad from the same advertiser. Another issue we have noticed is that some advertisers keep their ads live for several weeks, which over time can lead to very high Digg counts. As a user, it is difficult to know if an ad has 2000 Diggs because it’s great content, or because it has been there for a long time. We plan to create more transparency around the issue of long-lived ads as we continue to iterate.
How are the DiggAds doing?
In the first four months, DiggAds has been extraordinarily successful for Digg. From a revenue perspective, things have been great – we view this as a positive sign that giving users control over the advertising they see is a good user experience. Advertisers have given us great feedback, and we have seen some big brands make repeat purchases after seeing initial success. One advertiser told us they cancelled all their other campaigns except for DiggAds because it was the only ad platform that worked. Paramount Studios credits DiggAds as a major contributor to the success of their campaign to get their movie, Paranormal Activity, released nationwide. One of my favorite brands, Threadless, has also been very successful - I purchased a Three Keyboard Cat Moon T-shirt through one of their ads so I know they got at least one customer!
Questions or Feedback:
As always, if you have any questions or feedback, please contact us here.
Thanks for reading!
-Bob
Going for the Gold: Digg Dialogg with US Olympic skier Julia Mancuso
Hey everyone -
With the 2010 Winter Olympics just days away, we couldn't think of a more timely Dialogg than chatting with one of America's top Olympians. So we're heading to the slopes and taking your questions to Gold Medal Alpine Skier Julia Mancuso before she again goes for the Gold.
You know the deal: you submit and Digg the questions you'd like to hear answered on this page, and we'll ask Julia the ones that receive the most Diggs. It's a simple concept, but as you may have seen yesterday in our Dialogg with Toyota's President Jim Lentz, it's a powerful platform that gives you exclusive access to some of the top newsmakers.
Submit and Digg questions from now until Friday, February 12th at noon PT, then mark your calendars: the interview will be posted here on Wednesday, February 17th.
Thanks,
Aubrey
Special Event: Digg Dialogg LIVE with Toyota's US President
Hi everyone -
Let's make some news!
In light of the recent controversy around the Toyota vehicle recall, Digg is hosting an exclusive Digg Dialogg LIVE event just days before the car maker testifies before Congress. We know there are a lot of open questions about how Toyota is dealing with this situation, and we're giving Toyota's top US executive, Jim Lentz, the opportunity to address them.
Digg is the only place you'll have the chance to pose questions directly to Toyota's top management, and get the answers you're looking for. So, submit and Digg questions from now until Monday, February 8th at 8AM PT. Then come back later that afternoon, at 2PM PT, to watch the live event.
This is your chance to shape the conversation and make the news on this important issue. We look forward to hearing what you have to say!
Thanks,
Mike
Best of Digg: January Comments
We all know that Digg is a fantastic resource for discovering the most interesting content the Web has to offer. But just as entertaining as the submissions are the conversations the Digg community builds around the news stories, funny images and quirky videos that hit the homepage.
With that in mind, we’ve decided to spotlight a few of the comments from the past month that have made us at Digg HQ laugh. All three of these are among January’s 10 most-dugg comments.
The most popular contribution of the month, with +2345 diggs at our last count, makes fun of annoying celebs who we’re equally impressed and disgusted by — in this case, Michael Cera. User XYconfiguration (who, with just 17 comments since November 2005, goes for quality over quantity) also provides prime fodder for other commenters — see user CaughtThinking’s take on Zooey Deschanel.
Also making us chuckle in January was this classic Digg switcheroo, with +1976 diggs at last check. User scstraus pulls us in with an intriguing and seemingly serious story, then flips the script, leading to (where else?) a bad-ass comment-off (and the G-rated rap that probably drove every parent of early-90s middle school students crazy). Yo homes, to Bel-Air!

Hey, I see what you did there! Proving that the comments Diggers provide are as funny as the much-loved stories The Onion submits, user dawnraid101 hits us with the pun factor in his 1565-digg beauty. A tip o’ the hate and a very good gay to you, sir.
Props to all the users who make comments so much fun. As always, we welcome feedback, so please hit us with your questions or comments. You can contact us via email at http://digg.com/contact or follow us on Twitter at http://twitter.com/digg_community
Hasta pronto,
T.J.
Testing at Digg
The Quality Assurance team at Digg is always looking for new ways to break things. We use a combination of unit and automated functional tests to make sure our code is working properly. As code is being developed, so are the unit tests which have 100% coverage. Of course, having machines do work for you is always better than doing it yourself, so we have a heavy focus on functional test automation. We use Selenium because it's open source and has support for lots of different browsers, Operating Systems, and programming languages.
We are an agile environment and practice test-driven development, continuous integration and deployment, and parallel development and testing. This means we are constantly interacting with development, design, and product to understand how features should behave and identify the best ways to test them.
We use really cool tools like Twist from Thoughtworks (to manage our automated suite), Sauce Labs (to run all of our tests on all of our supported browsers in parallel) and Hudson (to manage our continuous integration environment). We've hooked these all together so that when automated tests fail, an embedded video of the failure is already waiting for us in our reports. Look for more QA posts in the future about the specifics of our process.
If any of this sounds interesting to you, we are looking for amazing test engineers. To find out more about joining Digg's rock star QA team, go to digg.com/jobs; we're all business.

Are you a 10? Get Your Rating with Paramount’s “She’s Out of My League” Quiz
Hey everyone,
In anticipation of their upcoming movie, “She’s Out of My League”, Paramount Pictures has created a "get rated" quiz with Digg integration. Check out the quiz.
You may even get to star on a billboard in Times Square by participating in the quiz!
Here is how it works
- Fill out the short personality quiz
- Get rated with a 1 to 10, based on your quiz results
- Connect with Facebook or create an account and then Digg or bury yourself as well as other people
- Get a chance to have your photo and rating on a digital billboard in Times Square!
The movie comes out March 12th! Check out the trailer.
Have a good one,
Matt
New Digg Extensions for Firefox and Chrome
A month ago, we announced new capabilities that let developers create writable applications with the Digg API. We decided to put this to work ourselves and update our original Firefox Extension and also create a new Google Chrome extension at the same time. In both cases, you can now Digg stories as you browse the web, without having to come back to the Digg site each time.

The new Chrome extension for PC packs a lot of basic features in a small footprint:
- The Digg count for any URL is displayed to the right of the browser's address bar. Clicking this reveals the title and comment count for that URL, as well as the button to Digg it.
- There are several simple ways to share any URL, including Twitter, Facebook, and email.
- Note: Currently, Chrome only supports extensions for PCs - Mac support coming soon.
In the case of the Firefox extension, we made a bunch of improvements to the current version:
- To save space, we moved the Digg count and Digg button to the navigation bar, so the toolbar doesn't have to be open for you to Digg stories.
- The toolbar itself is shorter and we've added a keyboard shortcut (Cmd+Shift+E) to make it easy to show and hide it. This makes it really simple to check out the Digg story details for a URL and then close it when you're done.
- We've made notifications of friends and popular story activity less intrusive by providing additional controls in the toolbar settings, such as notification thresholds and a smaller notification box.
- Sharing via Twitter, Facebook, and email is easily done via the toolbar. You can also right-click on any page to generate a Digg short URL.
As always, we're looking for your suggestions and feedback so please use the feedback buttons in the extensions or contact us directly. Enjoy!
- Chris









