October 11, 2010 / cohodo

Surge 2010

Having now finally gotten over my jetlag, I’ve had a few minutes to write up my notes from Surge 2010, which was a really great couple of days, perfectly filling its niche. It also had probably the best lineup of speakers at any conference I’ve attended. Aside from the content, the whole thing was brilliantly organised and run by OmniTI, who deserve a massive amount of credit for initiating such an awesome event. Mostly for my own benefit, I’ve collected a few writeups from other folk who attended, and videos & slides from pretty much all of the sessions are due to be published any day now.

The main message coming through was read more, learn more, share more. This theme ran through a number of talks, from John Allspaw & Bryan Cantrill’s opening keynotes to Theo’s closing plenary, where he delivered the 11 Commandments of Scaling. There’s a huge body of literature out there, constantly being produced by the academic and research communities. In general, we in industry are not particularly good at putting it to use and building on top of it – all too often we’re found re-inventing the wheel, making the same mistakes over and over, and then perpetuating this vicious circle by not sharing our experiences with our peers.

Standout sessions for me included Allspaw’s keynote, delivered with customary insight and aplomb, where he talked of the absolute immaturity of Web Operations as a discipline, and of the huge amount that we can learn from more established disciplines like civil & mechanical engineering, and from the aerospace and utilities industries, which have been tackling similar-shaped problems for decades, if not centuries.

Another highlight for me was Basho CTO Justin Sheehy’s session on concurrency in distributed systems. Here, we got right to the nub of the issue – in any complex system, both in the real universe and in computer systems, it’s usually not correct to think of time as a single linear flow of events occurring in lockstep. Any software system, particularly any distributed system, that attempts to hide the underlying asynchronicity that this entails is fundamentally flawed. There are no strong guarantees of consistency in the physical world, and certain domains, banking for example, have long recognised this and built compensating mechanisms into their systems. A great soundbite is that we shouldn’t aim to build reliable systems (i.e. ones that do not fail), but that we should aim to make our systems resilient to the failures that they will inevitably encounter.
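
To make the point about time a little more concrete, here is a minimal sketch of a vector clock, one common way for a distributed system to track which events causally precede which, without pretending there is a single global timeline. This is an illustration only, not something taken from the talk: the function names and the node ids "a" and "b" are hypothetical, and a real system would need more than this (pruning, persistence and so on).

```python
from typing import Dict

# A vector clock maps a node id to the number of events seen from that node.
VectorClock = Dict[str, int]


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` causally precedes `b`: no counter larger, at least one smaller."""
    nodes = a.keys() | b.keys()
    no_larger = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    some_smaller = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_larger and some_smaller


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if each clock is ahead of the other on some node, so neither precedes."""
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    return a_ahead and b_ahead


# Two nodes each record an event without talking to each other.
start: VectorClock = {}
a1 = tick(start, "a")
b1 = tick(start, "b")

# Node "b" later receives a1, merges it, and records another event.
b2 = tick(merge(b1, a1), "b")
```

In this sketch, a1 and b1 are concurrent: there is simply no answer to "which came first", which is exactly the asynchronicity that a system shouldn’t try to hide. Once node "b" has seen a1, the later event b2 is ordered after it, so happened_before(a1, b2) holds.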

There were also some great case studies and war stories, including Artur Bergman’s deep dive into operations at Wikia, Ruslan Belkin’s ‘Going 0 to 60: Scaling LinkedIn’ and Geir Magnusson’s detailed walkthrough of how his team scaled up from a typical n-tier application by building out a loosely coupled, service-oriented back end.

I definitely learned a lot, had a bunch of things reaffirmed, and also found a lot of great validation for the stuff we’re doing on our Platform. Can’t wait for next year.
