What I’m learning at Cloud Connect SC 2012

I’m in Santa Clara this week at Cloud Connect. It’s arguably the leading cloud event in the world (particularly if you ignore those that are pay-to-play) and most definitely the densest gathering of people who are building and planning tomorrow’s on-demand infrastructure.

After a couple of furious days walking and talking with people, a few big trends stand out.

Openwashing is the new Cloudwashing

Everyone is claiming they’re open. Now that all the domains with “cloud” in them are gone, the “open” prefix is the latest brand seasoning. And like salt, once we add too much it’ll be hard to take it out. It’s not clear companies really want “open” for its own sake. Sure, they want portability, and self-determination, and the right to repatriate workloads, but none of those is the same thing as open.

Worse—for vendors at least—it’s not clear that open is a sustainable business model. Once things are open, the only way to compete is on legislation and economy of scale. Which means clouds become more about who’s got the most data (@mccrory’s law) or the best lobbyists.

Moving the machine to the data

This is the biggest, most subtle, most tectonic shift in clouds. It means the words “upload” and “download” change forever. When the data lives elsewhere, I send it code to run, and get back answers.

It’s not clear how this will play out. Will data marketplaces become data wrapped in high-end PaaS environments, where customers don’t retrieve data, but rather send a MapReduce query and get back results?
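
To make the idea concrete, here is a minimal sketch of what “send the code, get back answers” could look like. Everything in it is an assumption for illustration: `DataHost`, `count_by_region`, and `total` are made-up names, the records are toy data, and a real marketplace would sandbox the submitted code and run it across many machines rather than in one loop.

```python
from collections import defaultdict


class DataHost:
    """Hypothetical data marketplace: it keeps the records and only returns results."""

    def __init__(self, records):
        self._records = list(records)  # the data never leaves the host

    def run(self, mapper, reducer):
        # Map phase: apply the customer's mapper to every record held here.
        groups = defaultdict(list)
        for record in self._records:
            for key, value in mapper(record):
                groups[key].append(value)
        # Reduce phase: collapse each group into one answer per key.
        return {key: reducer(key, values) for key, values in groups.items()}


# The customer's side: two small functions, shipped to where the data lives.
def count_by_region(record):
    yield record["region"], 1


def total(key, values):
    return sum(values)


host = DataHost([
    {"region": "us", "amount": 10},
    {"region": "eu", "amount": 7},
    {"region": "us", "amount": 3},
])
result = host.run(count_by_region, total)
print(result)
```

The customer uploads a few lines of code and downloads a small dictionary of counts; the records themselves stay put.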

This is subtle enough that we haven’t really thought it through. What does it mean for WAN acceleration? For data marketplaces? CDNs? Traditional BI?

Up the stack to platforms

There’s no doubt platforms are cool. When you don’t need to meddle with the machines, you can focus on building cool things. And because you’ve given up your opinion about infrastructure, someone else can optimize your workloads.

OpenShift and Cloud Foundry are two big names here in terms of portability, and Microsoft’s Azure joins the ranks of Force.com and App Engine. We’re rushing up the stack, and once enterprises embrace PaaS properly, they’ll take an army of interns and rewrite the millions of lines of FORTRAN and COBOL in their basements.

Private clouds are for effectiveness; public clouds are for efficiency

Some big names are moving workloads in-house. Once you know and can predict your workload, it’s cheaper to own it—but that doesn’t mean you abandon cloud innovation. Instead, you use the cloud to get your cycle time down and change the data center overnight.

Public clouds are an economic issue. Startups can delay infrastructure spending and resist dilution; big companies can rent the spike the same way they rent a car or a seat on an airplane.
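
The “rent the spike” arithmetic can be sketched in a few lines. The numbers below are invented purely for illustration (a 720-hour month, a steady load of 10 servers with a 20-hour spike to 50, owned capacity at 5 cents per server-hour, rented capacity at 10 cents); they are not real prices from any provider.

```python
def monthly_cost_cents(hourly_demand, owned_servers, owned_rate, rented_rate):
    """Cost of owning a fixed fleet and renting whatever demand exceeds it.

    Rates are in cents per server-hour so the arithmetic stays in integers.
    """
    # Owned servers cost money every hour, busy or idle.
    owned = owned_servers * owned_rate * len(hourly_demand)
    # Rented servers are only paid for in the hours demand exceeds the fleet.
    overflow_hours = sum(max(0, demand - owned_servers) for demand in hourly_demand)
    return owned + overflow_hours * rented_rate


# Hypothetical month: 700 quiet hours at 10 servers, 20 spike hours at 50.
demand = [10] * 700 + [50] * 20

own_the_peak = monthly_cost_cents(demand, owned_servers=50, owned_rate=5, rented_rate=10)
own_the_base = monthly_cost_cents(demand, owned_servers=10, owned_rate=5, rented_rate=10)
rent_it_all = monthly_cost_cents(demand, owned_servers=0, owned_rate=5, rented_rate=10)

print(own_the_peak, own_the_base, rent_it_all)
```

Under these made-up rates, owning the predictable baseline and renting the spike comes out cheapest, renting everything is in the middle, and buying enough hardware for the peak is the most expensive. Change the rates or the shape of the spike and the ordering can change too.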

There’s more to this efficiency than just renting instead of buying, though. Amazon has over 25 services, and just one is virtual machines. The rest are things like DynamoDB, a shared, insanely fast data storage system. Google converts millions of documents from one format to another every day. By using these services, companies can access highly optimized pipelines of computing power that are faster than anything they could build on their own.

Put another way: even if you own your hardware, Google can probably convert documents for you more cheaply. Even if you have your own private cloud, much of its computation will be portable agents, running tasks where they make the most sense, then reassembling the results.

Bake it from scratch or go pre-packaged

This is a pretty serious debate between folks that know a lot about clouds, but it’s not obvious to the mainstream yet.

Some cloud designers love the “gold master” model: bake a perfect machine image, and then use that over and over again. When you need to change, change the image. It’s like booting from a CD on a desktop; when you want to change the machine, change the contents of the CD. This is what Adrian Cockcroft at Netflix advocates, in his endless pursuit of “no-ops”.

Other cloud operators prefer teaching the machine to fish, rather than feeding it fish. Each machine has a script that knows how to configure the machine programmatically. This is the much-mentioned, but seldom-understood, “infrastructure as code” espoused by Chef, Puppet, and other DevOps frameworks, and championed by Opscode.
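
The “teach it to fish” approach boils down to declaring a desired state and running a script that closes the gap, safely and repeatedly. Below is a toy sketch of that idea. It is not Chef’s or Puppet’s actual API: `converge`, the `desired` dictionary, and the in-memory `machine` are hypothetical stand-ins for a real package manager and filesystem.

```python
def converge(machine, desired):
    """Bring the machine toward the desired state; return the actions taken."""
    actions = []
    # Install only the packages that are missing.
    for package in desired["packages"]:
        if package not in machine["packages"]:
            machine["packages"].add(package)
            actions.append(f"install {package}")
    # Rewrite only the files whose contents differ from what is declared.
    for path, content in desired["files"].items():
        if machine["files"].get(path) != content:
            machine["files"][path] = content
            actions.append(f"write {path}")
    return actions


# Desired state, declared as data (hypothetical package and file names).
desired = {
    "packages": ["nginx", "ntp"],
    "files": {"/etc/motd": "managed by code\n"},
}

# A freshly booted machine that already happens to have ntp.
machine = {"packages": {"ntp"}, "files": {}}

first_run = converge(machine, desired)
second_run = converge(machine, desired)
print(first_run, second_run)
```

The first run installs what is missing and writes the file; the second run finds nothing to do. The gold master camp gets the same end state by baking it into the image ahead of time instead of converging at boot.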

It’s hard to reconcile these two worlds, and there are arguments for, and weaknesses in, both. When true cloud wonks talk about operations, this usually comes up.