X86 Mainframe Update - Creating a Virtualization Beast - Part 2

This is a continuation of Part 1.

Our goal was to move all production systems and just under one hundred test and development devices to a single environment, double the equipment in that environment (for redundancy) in the primary SF-area data center, and add another environment at the DC DR site. A year ago I calculated minimum computational requirements for the firm (for another project) and plugged in the numbers for the entire company. I calculated that between two and four servers (with an x86 mainframe design) could probably cover the entire firm ... that's a consolidation ratio of 100:1 or 150:1, depending on how you calculate it.
We budgeted 2-3x redundancy, i.e. 6 servers.

In January our engineers, Jim, Tom, Ryan, Chris, Rolando and the rest of Albert's team, began working in earnest on the project. As you can guess, the hardware design required a little more than:

Step 1: Pull Dell out of box
Step 2: Install VMware
Step 3: Auto migrate servers.

The X86 Mainframe design required two key changes from traditional virtualization efforts:

  1. The system must be able to sustain resource pipelines (CPU-to-Memory, CPU-to-Bus, Bus-to-Network, Bus-to-Storage Controller, Storage Controller-to-Disks ...) for every concurrent process, and
  2. The largest virtual resource pipeline requirement should consume no more than half (.5x) the resources of the smallest physical constraint (i.e. processing engine, bus, storage controller).
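
The second rule can be sketched in a few lines of Python. This is only one possible reading of it: for each pipeline, the largest single-VM demand is compared against half of the smallest physical component of the same kind. The names (`HEADROOM_FRACTION`, `pipelines_over_limit`) and all of the numbers are hypothetical and exist only for the example.

```python
# Hypothetical sketch of the "half of the smallest physical constraint" rule.
# All names and figures are illustrative assumptions, not measured values.

HEADROOM_FRACTION = 0.5  # rule 2: largest virtual demand <= 0.5x physical limit


def pipelines_over_limit(vm_demands, physical_limits, fraction=HEADROOM_FRACTION):
    """Return the pipelines where the largest single-VM demand exceeds
    `fraction` of the smallest physical component of that kind.

    vm_demands: {vm_name: {pipeline: demand}}
    physical_limits: {pipeline: capacity of the smallest physical component}
    Demands and capacities for a pipeline must use the same unit.
    """
    over = []
    for pipeline, limit in physical_limits.items():
        largest = max(
            (demand.get(pipeline, 0.0) for demand in vm_demands.values()),
            default=0.0,
        )
        if largest > fraction * limit:
            over.append(pipeline)
    return over


# Made-up example: capacities in Gbps for the smallest component of each kind.
physical_limits = {"bus_to_network": 1.0, "bus_to_storage": 4.0}
vm_demands = {
    "mail": {"bus_to_network": 0.3, "bus_to_storage": 2.5},
    "payroll": {"bus_to_network": 0.1, "bus_to_storage": 0.4},
}

print(pipelines_over_limit(vm_demands, physical_limits))
```

In this made-up case the mail VM's storage demand (2.5) is more than half of the smallest storage path (4.0), so that pipeline is flagged and the physical component would need to be larger.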

I will illustrate what the above means in another article, but the key point is that the virtualization environment was to be more capable than any physical environment in our infrastructure. At a minimum, the components of the new environment would be twice as powerful as any server in the old environment.

There were some early challenges as PlateSpin and FusionStorm worked with us initially to determine server requirements. They were still coming up to speed on how to calculate aggregate storage performance requirements, total PCI bus capacity, and in one case ... how to normalize the difference between 50% utilization on a 10-Mbps network card and 50% utilization on a 1-Gbps network card. Within a few months, both FusionStorm and PlateSpin were able to show server consolidation calculations that in part validated our assumptions. I say in part, because their models assumed a maximum physical server configuration with approximately a quarter of the capacity of our intended system.
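
The network card case comes down to converting percentages into absolute throughput before comparing them. A minimal Python sketch of that normalization follows; the function names are hypothetical, and it assumes utilization is reported as a percentage of the nominal link speed.

```python
# Illustrative only: convert percent utilization to absolute throughput so
# links of different speeds can be compared. Function names are hypothetical.

def utilization_to_mbps(utilization_pct, link_speed_mbps):
    """Absolute throughput in Mbps for a given percent utilization."""
    return utilization_pct * link_speed_mbps / 100.0


def utilization_on_target(utilization_pct, source_mbps, target_mbps):
    """Percent utilization the same traffic would represent on another link."""
    return 100.0 * utilization_to_mbps(utilization_pct, source_mbps) / target_mbps


slow = utilization_to_mbps(50, 10)      # 50% of a 10-Mbps card
fast = utilization_to_mbps(50, 1000)    # 50% of a 1-Gbps card
print(slow, fast)
print(utilization_on_target(50, 10, 1000))
```

The same 50% figure means 5 Mbps on the slow card and 500 Mbps on the fast one, so the busy 10-Mbps server would use only about half a percent of a 1-Gbps port.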

Not only did the consultants progress, but so did our network staff. At some point, the entire team began calculating hard-core hardware requirements such as aggregate IOPS.

For the performance tests, which started in May, we purchased the storage and two of the specified servers. Entisys joined the team to build the system, help test the platform and implement a systems management infrastructure.

We opted to purchase storage from NetApp, which has worked with us for years. Despite early concerns, primarily as a result of the NAS image that still follows what is now the leading storage vendor, we received strong commitments that they would meet the storage virtualization requirements and the virtualization-related IO needs. With some trial and error, NetApp built two 400-platter, all-fibre clustered aggregates (15K FC and 10K FC drives) for the production and DR locations. The large aggregates are critical to the design, as are the special large data block sizes on the disks that enable full cache utilization.

During VMworld we were testing 2,400 heavy 64-bit Exchange users alongside other testing that included heavy accounting, payroll and document management processes, in an effort to drive storage utilization over 50%. Since the last optimizations by NetApp, we typically don't exceed 40% utilization. Virtualizing the 64-bit Exchange with VMware actually requires far fewer system resources than virtualizing Exchange 2003. It seems from testing that we will be able to handle as many mail users in a single VMware virtual server as in four RAID 10-equipped physical servers.

The server we chose is the Sun x4600 M3 with 32 cores and 128 GB RAM. Currently we are running the 16-core, 64 GB M2 version. Sun will be able to upgrade our boxes shortly, but we are still waiting on a date for the VMware upgrade that will support that many cores. The Suns were important to the project for two reasons ... PCI bus and AMD. Sun has 2x the PCI bus capacity of anything else we could find from HP or IBM with X86 chips. The bus was needed to drive 4 x 4-Gbps Fibre Channel and 8 x GigE at full throttle.
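
To put a number on "full throttle", the port counts above can be added up. The short Python sketch below uses nominal line rates only and ignores encoding and protocol overhead, so it is an upper bound on what the ports could ask of the bus and not a measurement; the variable and function names are hypothetical.

```python
# Nominal line rates only; real throughput is lower after protocol overhead.
FC_PORTS, FC_GBPS = 4, 4      # 4 x 4-Gbps Fibre Channel
GIGE_PORTS, GIGE_GBPS = 8, 1  # 8 x Gigabit Ethernet


def aggregate_gbps(ports):
    """Sum nominal bandwidth over (port_count, gbps_per_port) pairs."""
    return sum(count * rate for count, rate in ports)


total = aggregate_gbps([(FC_PORTS, FC_GBPS), (GIGE_PORTS, GIGE_GBPS)])
print(total)
```

That works out to 16 Gbps of Fibre Channel plus 8 Gbps of Ethernet, or 24 Gbps of nominal IO in each direction.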

AMD as implemented on the X4600 is really elegant. We've replaced an 8-core Intel physical server with 12 GB RAM with a 4-core, 8 GB virtual server with identical performance.

With Barcelona, we will need to re-examine the capacity of the x4600 bus. Currently the M2s are performing at the 100:1 consolidation level I thought would require the M3. I spent some time with Sun and the new M3 at VMworld, and I now believe the Barcelona M3 can double the M2 from a processing/memory perspective. Unfortunately, if the bus can't handle double the Fibre Channel IO, we may never see the full potential ... ALL on One.

In the mainframe (as we call it), the physical network IO should actually go down with greater consolidation density, because nearly all inter-server communications stay on the bus inside the box.

Entisys has been great with supporting the implementation and setting up the shadow network test environment. We are running tests now to determine how much over capacity we will be with the move from two M2s to four Barcelona M3s.

I guess we will have to consider virtual desktops sooner.