CAP, Consistent Hashing, etc.

I’ve been reading up on distributed systems again. For quite a while, my monthly copy of CACM has been my only connection to computer science topics. This time, I followed a few references and came across interesting concepts (most of them familiar from back in university).

Amazon doesn’t just sell books over the internet, they’re building fascinating systems, too. And sometimes they write articles and papers about it, like in January’s edition of CACM (see Eventually consistent). If you’re familiar with Eric Brewer’s CAP theorem, you know that in a distributed system, you can decide for only two design goals out of consistency, availability and partition resilience. In traditional systems, consistency is highly valued, but if you’re willing to trade in consistency for availability, you can build high-performance systems.

One of these systems is Dynamo, a highly available, distributed key-value store that is part of Amazon’s internal infrastructure. Dynamo is write-optimized, writes are almost always possible, even if some replicas aren’t updated. Inconsistencies are resolved during read, typically by business logic inside the application itself.

Dynamo uses Consistent Hashing, which is used in distributed hash tables (DHTs) like Chord. This way, nodes can join and leave the network without having to remap the whole key space. As a friend succinctly put it “Yes, in a non-hostile environment, DHTs really work. They’re not only useful for distributing cam-rips on the Internet”. If you’re ever going to build a cache consisting of more than one caching server, Consistent Hashing is definitely something to check out.

Damn, I miss building stuff like this! It’s a shame that only large operations really need this kind of infrastructure.

This entry was posted in computer science and tagged , , . Bookmark the permalink.

2 Responses to CAP, Consistent Hashing, etc.

  1. Christoph S. says:

    Amen to your last sentence :-) I’m still amazed how little software developers in the trenches have to do with computer science (particularly with those parts of it that are younger than, say, 25 years).

  2. mafr says:

    Yup, me too. But I’m glad that RDBMSs are no longer the answer to all things persistence. That’s a positive development we had over the last three years and it makes things interesting again. Some basic assumptions are re-evaluated, I like that.

    To me personally, things are starting to make sense. Is that why they taught me DHTs at university? Did they see use cases like this or were DHTs just hip? ;-)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s