Ubuntu Server is No Longer the Best OS for Cloud Computing.
Okay, so now that I got your attention….let me explain.
Over this past year and a half (maybe a little longer), I’ve seen Ubuntu Server explode in number and types of deployments, specifically around areas involving cloud computing, but also in situations involving big data and ARM server deployments. This has all occurred at a time when people and organizations are having to do more with less…less lab space…less power…less people, which of course all leads to the real desire of operating at less financial cost. I’ve come to the conclusion that me saying we should focus Ubuntu Server on being the best OS for cloud computing at the 11.10 UDS was aiming too low. It’s awesome that we’ve essentially done this with our OpenStack integration efforts for Ubuntu Cloud, but we can do more…we can do better. I now believe that for 12.04LTS and beyond, what Ubuntu Server should actually drive towards is being the best OS for scale-out computing.
Scale-Out is Better than Scale-Up
Scale-out computing is the next evolutionary step in enterprise server computing. It used to be that if you needed an enterprise worthy server you had to buy a machine with a bunch of memory, high-end CPU configuration, and a lot of fast storage. You also needed to plan ahead to ensure what you purchased had enough open CPU and memory slots, as well as drive bays, to make sure you could upgrade when demand required it. When the capacity limit (cpu, memory, and/or storage) of this server was hit, you had to replace it with a newer, often more expensive one, again planning for upgrades down the road. Finally, to ensure high availability, you had to have one or two more of these servers with the same configuration. Companies like Google, Amazon, and Facebook then came along and recognized that they could use low-cost, commodity hardware to build “pizza box” servers to do the same job, instead of relying on expensive, mainframe-like servers that needed costly redundancy built into every deployment. These organizations realized that they could rely on a lot of cheap, easy-to-find (and replace) servers to effectively do the job a few scaled-up, high-end (and cost) servers could tackle. More work could be accomplished, with a reduced risk of failure by exploiting the advantages a scale-out solution provided. If a machine were to die in a comparable scale-up configuration, it would be very costly in both time and money to repair or replace it. The scale-out approach allowed them to use only what they needed and quickly/easily replace systems when they went down.
Fast forward to today, and we have an explosion of service and infrastructure applications, like Hadoop, Ceph, and OpenStack, architected and built for scale-out deployments. We even have the Open Compute Project focused on designing servers, racks, and even datacenters to specifically meet the needs of scale-out computing. It’s clear that scale-out computing is overtaking scale-up as the preferred approach to most of today’s computational challenges.
With Great Scale, Comes Great Management Complexity
It’s not all rainbows and unicorns though…scale-out comes with it’s own inherent problems. There’s a great paper published by IBM Research called, Scale-up x Scale-out: A Case Study using Nutch/Lucene, where the researchers set out to measure and compare the performance of a scale-up versus scale-out approach to running a combined Nutch/Lucene workload. Nutch/Lucene is an opensource framework written in Java for implementing search applications consisting of three major components: crawling, indexing, and query. Their results indicated that “scale-out solutions have an indisputable performance and price/performance advantage over scale-up”, and that “even within a scale-up system, it was more effective to adopt a “scale-out-in-a-box” approach than a pure scale-up to utilize its processors efficiently”, i.e use virtualization technologies like KVM. However, they also go on to conclude that
“scale-out systems are still in a significant disadvantage with respect to scale-up when it comes to systems management. Using the traditional concept of management cost being proportional to the number of images, it is clear that a scale-out solution will have a higher management cost than a scale-up one.”
These disadvantages are precisely what I see Ubuntu Server attempting to account for over the next few years. I believe that in Ubuntu Server 12.04LTS, we have already started to address these issues in several specific ways.
One obvious issue with scale-out computing is the need for space to store your servers and provide enough power to run/cool them. We haven’t figured out how to shrink the size of your server through code, so we can’t help with the space constraints. However, we have started to develop solutions that can help administrators use less power to run their deployments. For example, we created PowerNap, which is a configurable daemon that can bring a running server to a lower power state according to a set of configuration preferences and triggers.
As a company, Canonical also began investing in supporting processor technologies that focused on delivering a high rate of operations at low-power consumption rates. ARM has a long-standing history of providing processors that use very little power. The potential for server applications, meant you could drive server processor density up and still keep power consumption relatively low. With this greater density, server manufacturers started to see opportunities for building very high speed interconnects that allow these processors to share data and cooperate quickly and easily. ARM server technology companies such as Calxeda can now build computing grids that won’t require watercooling and an in-house backup generator running when you turn them on. With the Cortex-A9 and Cortex-A15 processors in particular, the performance differential between ARM processors and x86 is starting to shrink significantly. We are getting closer to having full 64-bit support in the coming ARMv8 processors, that will still retain the low power and low cost heritage of the ARM processor. Enterprise server manufacturers are already planning to start putting ARM processors into very low-cost, very dense, and very robust systems to provide the kind of functionality, interconnectivity and compute power that used to only be possible in mainframes. Ubuntu Server 12.04 LTS will support ARM, specifically the hard float compilation configuration (armhf). With our pre-releases already receiving such good performance reviews, we are excited about the possibilities. If you want to know more about what we’ve done with ARM for Ubuntu Server, I recommend you start with a great FAQ posted on our wiki.
Traditional license and subscription support models are built for scale-up solutions, not scale-out. These offerings either price by number of users or number of cores per machine, which are within reason when deploying onto a small number of machines, i.e. under 100…maybe a bit higher depending on the size of the organization. The base price gets you access to security updates and bug fixes, and you have to pay more to get more, i.e. someone on the phone, email support, custom fixes, etc. This is still acceptable to most users in a scale-up model.
However, when the solution is scale-out, i.e. 1000s or more, this pricing gets way out of control. Many of the license and subscription vendors have recently wised up to this, and offer cluster-based pricing, which isn’t necessarily cheap, but certainly much less costly than the per socket/CPU/user approach. The idea is that you pay for the master or head node, and then can add as many slave nodes as you want for free.
Ubuntu Server provides security updates and maintenance for the life of the release…for free. That means for an LTS release of Ubuntu Server, users get five years of free maintenance, if you need someone to call or custom solutions, you can pay Canonical for that…but if you don’t…you pay nothing. It doesn’t matter if you have a few machines or over a 1000, security updates and maintenance for the set of supported packages shipped in Ubuntu is free.
Deploying interconnected services across a scale-out deployment is a PITA. After procuring the necessary hardware and finding lab space, you have to physically set them up, install the OS and required applications, and then configure and connect the various applications on each machine to provide the right desired services. Once you’ve deployed the entire solution, upgrading or replacing the service applications, modifying the connections between them, scaling out to account for higher load, and/or writing custom scripts for re-deployment elsewhere requires even more time…and pain.
Juju is our answer to this problem. It focuses on managing the services you need to deliver a single solution, above simply configuring the machines or cloud instances needed to run them. It was specifically designed, and built from the ground up, for service orchestration. Through the use of charms, Juju provides you with shareable, re-usable, and repeatable expressions of DevOps best practices. You can use them unmodified, or easily change and connect them to fit your needs. Deploying a charm is similar to installing a package on Ubuntu: ask for it and it’s there, remove it and it’s completely gone. We’ve dramatically improved Juju for Ubuntu Server 12.04LTS, from integrating our charm collection into the client (removing the need for bzr branches) to having rolled out a load of new charms for all the services you need…and probably some you didn’t know you wanted. As my good friend Jorge Castro says, the Juju Charm Store Will Change the Way You Use Ubuntu Server.
In terms of deployment, we recognized this hole in our offering last cycle and rolled out Orchestra as first step, to see what the uptake would be. Orchestra wasn’t an actual tool or product, but a meta-package pointing to existing technologies like cobbler, already in our archive. We simply ensured the tools we recommended worked, so that in 11.10 you can deploy Ubuntu Server across a cluster of machines easily.
After 11.10 released, we realized we could extend the idea from simple, multi-node OS install and deployment, to a more complex offering of multi-node service install and deployment. This effort would require us to do more than just integrate existing projects, so we decided to create our own project called MAAS (metal as a service), which would be tied into Juju, our service orchestration tool.
Ubuntu 12.04 LTS will include Canonical’s MAAS solution, making it trivial to deploy services such as OpenStack, Hadoop, and Cloud Foundry on your servers. Nodes can be allocated directly to a managed service, or simply have Ubuntu installed for manual configuration and setup. MAAS lets you treat farms of servers as a malleable resource for allocation to specific problems, and re-allocation on a dynamic basis. Using a pretty slick user interface, administrators can connect, commission and deploy physical servers in record time, re-allocate nodes between services dynamically, and keep them all up to date and in due course.
We’ve Come a Long Way, But
There’s a lot more we need to do. What if the MAAS commissioning process included hardware configuration, for example RAID setup and firmware updates? What if you could deploy and orchestrate your services by mouse click or touch…never touching a keyboard? What if your services were allocated to machines based on power footprint? What if your bare metal deployment could also be aware of the Canonical hardware certification database for systems and components, allowing you to quickly identify systems that are fully certified or might have potentially problematic components? What if your services auto-scaled based on load without you having to be involved? What if you could have a true hybrid cloud solution, bursting up to a public cloud(s) of your choosing without ever having to rewrite or rearchitect your services? These types of questions are just some of the challenges we look to take on over the next few releases, and if any of it interests you…I encourage you to please join us.