Berlin Buzzwords Conference 2010

This week I attended Berlin Buzzwords Conference 2010, a two-day event aimed at software developers. The conference offered two tracks, one on search and the other on NoSQL systems. Typical attendees seemed to be MacBook-wielding, twittering lifestyle geeks, often with SQL-induced childhood issues. The hype level was high - a bit too high for my taste - but given the conference's title that was to be expected.

In retrospect, there were a few really interesting talks that I'd like to point out (I mostly attended the NoSQL track, so I may have missed other good ones).

First, Grant Ingersoll's keynote made an important point: with hardware and software turning into commodities, intelligence is what distinguishes your application from others. This is a very important realization, and I think a lot of money will be made around personalization and recommendation features.

Rusty Klophaus did an excellent job introducing Riak and Riak Search. Both talks were at an introductory level but still provided a few interesting technical details (e.g., term-based vs. document-based partitioning in Riak Search). It's a shame that I don't have a use case for Riak right now.
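
For readers who haven't met the distinction, here is a toy sketch of the two partitioning schemes for an inverted index. It is only my own illustration of the general idea, not how Riak Search implements it: the sample documents, the partition count and the CRC32-based `partition_of` helper are all made-up assumptions.

```python
from collections import defaultdict
import zlib

# Hypothetical sample data and cluster size, for illustration only.
DOCS = {
    "d1": "riak is a distributed key value store",
    "d2": "riak search adds full text search",
    "d3": "nutch is a web crawler",
}
NUM_PARTITIONS = 2


def partition_of(key):
    # Stable hash so the layout is identical on every run (an assumed choice).
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_document_partitioned(docs):
    # Each partition indexes all terms, but only for its own subset of documents.
    parts = [defaultdict(set) for _ in range(NUM_PARTITIONS)]
    for doc_id, text in docs.items():
        index = parts[partition_of(doc_id)]
        for term in text.split():
            index[term].add(doc_id)
    return parts


def build_term_partitioned(docs):
    # Each partition holds the complete posting list for its own subset of terms.
    parts = [defaultdict(set) for _ in range(NUM_PARTITIONS)]
    for doc_id, text in docs.items():
        for term in text.split():
            parts[partition_of(term)][term].add(doc_id)
    return parts


def query_document_partitioned(parts, term):
    # A single-term query has to ask every partition and merge the results.
    result = set()
    for index in parts:
        result |= index.get(term, set())
    return result


def query_term_partitioned(parts, term):
    # A single-term query touches exactly one partition.
    return set(parts[partition_of(term)].get(term, set()))


doc_parts = build_document_partitioned(DOCS)
term_parts = build_term_partitioned(DOCS)
print(sorted(query_document_partitioned(doc_parts, "riak")))
print(sorted(query_term_partitioned(term_parts, "riak")))
```

Both layouts return the same answer. The difference is in the work: the document-partitioned query fans out to every partition, while the term-partitioned query goes to one partition but needs writes to touch as many partitions as the document has distinct terms.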

The Nutch talk by Andrzej Bialecki provided a good architectural overview and was packed with interesting information. You could tell how much experience and engineering work went into the product, and I would have loved to hear more about it. Now I know it would be pretty stupid to try to build something like it yourself.

Sean Owen's talk on Collaborative Filtering was my personal conference highlight. I've never seen a better explanation of item-based recommenders. Parallelizing recommender systems is no trivial task, but it is unavoidable when working on substantial amounts of data. I'm glad there are competent people who have done this conceptual heavy lifting for me.
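
To give a rough idea of what "item-based" means, here is a minimal single-machine sketch. It is my own toy version, not the code from the talk and not a parallel implementation: the ratings are invented, and the choice of cosine similarity with a similarity-weighted average is an assumption among several common options.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data, for illustration only.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0},
    "bob": {"A": 4.0, "B": 2.0, "C": 5.0},
    "carol": {"B": 4.0, "C": 1.0, "D": 5.0},
}


def item_vectors(ratings):
    # Invert the data: item -> {user: rating}.
    vectors = {}
    for user, prefs in ratings.items():
        for item, value in prefs.items():
            vectors.setdefault(item, {})[user] = value
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse item vectors.
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    # Score each unseen item by the similarity-weighted average of the
    # user's ratings for the items they have already rated.
    vectors = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate in vectors:
        if candidate in seen:
            continue
        weighted_sum = 0.0
        similarity_sum = 0.0
        for item, value in seen.items():
            sim = cosine(vectors[candidate], vectors[item])
            weighted_sum += sim * value
            similarity_sum += sim
        if similarity_sum > 0:
            scores[candidate] = weighted_sum / similarity_sum
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


print(recommend(RATINGS, "alice"))
```

The item-to-item similarities depend only on the rating data, not on the user asking, so they can be computed ahead of time. That pairwise computation is the expensive part on large data sets.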

What I really missed was a good talk or two on basic concepts. I was already familiar with the most influential papers in the area (Dynamo, GFS, MapReduce, Bigtable, Consistent Hashing, ...), but I think explaining the basic concepts would have helped a lot of people. Building distributed systems (and developing for them) is hard, and developers need a good understanding of what's happening at the infrastructure level.

Well, maybe next year, unless the NoSQL hype has already blown over by then. The classic database community had a bit of a late start, but I'm sure we'll be seeing improvements in scalability and flexibility that will render many NoSQL systems obsolete. New distributed relational databases like VoltDB promise comparable or higher performance with a richer yet more familiar feature set.

Interesting times.
