This Word, Intractable…

February 28th, 2013

“This word, intractable… I do not think it means what you think it means.”


For those of you familiar with The Princess Bride, you get the reference. For those who are not, you’re missing out on a cult classic (with a best-ever performance from André the Giant). But what does this have to do with operations management, VMTurbo, and what our products do for enterprise and service provider customers around the world? A lot, really.

Intractable – or intractability, more specifically – is a well-known term in computational complexity circles, like the ones frequented by our founders. Problems that can be solved in theory (as in, given enough time) but which in practice take too long for their solutions to be useful are intractable problems. And if you read that sentence a few times, it actually makes sense. Our belief at VMTurbo is that the challenge of getting a virtualized environment into the “desired state” and keeping it there continuously is an intractable problem in environments that contain more than a few hundred VMs and more than a handful of servers. We refer to this problem, in our lexicon, as the Intelligent Workload Management Problem. It is defined as continuously assuring that workloads (applications) have the resources they need to perform as required while utilizing the underlying physical infrastructure as efficiently as possible.

Sounds simple, right? It isn’t. Doing this properly requires the virtualization administrator to consider a wide array of capacity and business constraints across the data center. To solve for the “desired state” effectively, one has to understand the performance characteristics of VMs, applications, networks, storage arrays, and servers across a shared infrastructure, and the “cause and effect” relationships between them. And the environment is constantly changing – VMs move, new ones arrive, workload demand fluctuates – which is really what drives the intractability. The environment changes faster than anyone can solve for the desired state, so any answer is stale by the time it is found. To effectively (and continuously) reduce the complexity of the problem you need to simplify it via an abstraction model that accurately represents the environment – and then solve for it with an effective heuristic algorithm. As you can probably tell, we have a strong opinion that software is far more capable of doing this than humans in complex virtual data centers. Our founder, Shmuel Kliger, recently wrote about it – calling it Software-Defined Control. It elevates the role of operations from primary actor (tasked with trying to solve the problem) to supervisor (tasked with governing the policies and constraints that define the desired state) – and hopefully gives them more time to find the six-fingered man.
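To make "abstraction model plus heuristic" a little more concrete, here is a minimal Python sketch. It is not VMTurbo's algorithm. It assumes a deliberately crude abstraction (each VM is a single demand number, each host a single capacity number) and uses first-fit decreasing, a textbook bin-packing heuristic, as a stand-in. The names `Host`, `search_space_size`, and `place_first_fit_decreasing` are hypothetical and exist only for this illustration. The first function shows why brute force is hopeless: with 300 VMs and 10 hosts there are 10^300 possible assignments before any constraint is even checked.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical abstraction of a server: one capacity number (say, GHz of CPU)."""
    name: str
    capacity: float
    vms: dict = field(default_factory=dict)  # VM name -> demand placed on this host

    @property
    def used(self) -> float:
        return sum(self.vms.values())


def search_space_size(num_vms: int, num_hosts: int) -> int:
    """Count every way to assign each VM to some host, ignoring capacity."""
    return num_hosts ** num_vms


def place_first_fit_decreasing(demands: dict, hosts: list) -> list:
    """Greedy heuristic: take VMs from largest to smallest demand and put each
    on the first host that still has room. Returns the VMs that did not fit."""
    unplaced = []
    for vm, demand in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        for host in hosts:
            if host.used + demand <= host.capacity:
                host.vms[vm] = demand
                break
        else:
            # No host had enough spare capacity for this VM.
            unplaced.append(vm)
    return unplaced


# Illustrative numbers only, not measurements from any real environment.
hosts = [Host("host-a", 10.0), Host("host-b", 10.0)]
demands = {"vm1": 6.0, "vm2": 5.0, "vm3": 4.0, "vm4": 3.0}
unplaced = place_first_fit_decreasing(demands, hosts)

for host in hosts:
    print(host.name, sorted(host.vms), host.used)
print("digits in brute-force search space:", len(str(search_space_size(300, 10))))
```

The heuristic runs in a fraction of a second and gives a good, not necessarily optimal, placement, so it can be rerun every time demand shifts. A real environment would need many resource dimensions, business constraints, and the cost of moving workloads, which is exactly where the quality of the abstraction model matters.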