LAS4001 Lab Architecture Session: Cloud and Virtualization Architecture

For those coming here from Jason Boche (thanks Jason!), thanks for stopping by; see my other VMworld 2011 posts for notes on many other sessions.

Speakers = Clair Roberts (Sr. Product Integration Architect), Curtis Pope (Chief Global Architect)

Summary at the top = fun session, Clair is very engaging and has lots of anecdotes/deep dive details about various pieces of the lab infrastructure. Great session to end VMworld on.

  • Redid all the lab stuff this year…like almost everything.
  • Lots of it is written in Python. Clair isn’t a programmer but has written a ton of scripts.
  • Problems back in 2008 — Clair didn’t know Python quite as well…tested with just 1 VM but not with more than 1.
  • 2008 Monday morning 500 people sitting in line.
    • Total Time – 50 hours of Lab Time, 480 Concurrent User Stations, Totals 24,000 Available Lab Hours
    • Hourly Churn – 27 Unique Labs Available On-demand, Each lab averages 11 virtual machines
    • Total Virtual Machine Count – 5,000 Virtual Machines per hour, 50 hours
    • Possibility of 250,000 Virtual Machines in only 4 days.
  • Once you consider the scale, cloud ends up being a totally different way of doing things….it just has to be, due to scale.
  • Really hard to predict which labs are the most popular.
    • Spun up 40 instances of Lab 1, but 70 people on day 1 chose it right away (of the first 300 people in line)
    • First 40 people got it fast, next 30 people waited.
  • VMworld 2010 – 480 lab seats, 44 hours, 21,120 lab seat hours, 15,344 labs completed, 145,097 VMs deployed.
  • ESX installed inside ESX — each Pod takes 48 GB or so of RAM
  • Used UCS blades with 96 GB RAM – 48 GB each last year.
  • 3 Datacenters — SuperNAP in Vegas, Terremark in Miami, something in Amsterdam.
  • vCenter can run 10,000 VMs….but can’t handle starting up 4,000 to 5,000 per hour.
    • So have to do vertical vCenter blocks.
  • LabCloud app — last time using that.
    • using Django, PostgreSQL, couple year old version of Apache (back when LabCloud was written)
    • custom Adobe Message Framework
    • “bridges” – Python code that runs on an Ubuntu server as a standalone daemon, one per vDC; each one is a throttle that keeps too many tasks from being sent to vCenter at one time
  • Finding lots of maximums: blew up View, blew up vCenter.
  • Shipping racks can be fun — last year ‘someone’ drove a forklift into a rack.
  • Challenges with power density in data centers (assuming that doesn’t apply to SuperNAP).
  • Thin clients — bring 15-20% extra thin clients and keyboards, etc. (sometimes thin clients walk off).
  • Dashboard showing # of VMs running, etc., modeled after a car dashboard. Maritz looked at it and liked it a lot but…“what happens when you hit the horn?” Doh….forgot to code anything for that.
  • Asking if we want it available online….everyone does. Willing to pay for it? Some hands go down but not all…
  • Challenge around the internet having more than 480 seats….go figure.
  • Why use NFS? VMFS in vSphere 4 can only have 8 ESX hosts in a cluster with linked clones….they run 28-30 ESX hosts per cluster.
    • Better in vSphere 5.
  • vCD cells are writing 200 MB of log data every 10 minutes….not useful for troubleshooting (HT to @jasonboche).
  • Core vCenter engineers thought they could do it with 1 vCenter….he proved them wrong.
  • Labs at VMworld are biggest vCloud Director customer and also biggest vShield Endpoint customer.
  • Making 4,000+ VMs per hour caused an ARP table overrun: network convergence (UCS blades, etc.) caused more ARPs than the switches could handle.
    • When a switch's ARP table gets overrun, the switch acts like a hub…laughter….ouch.
  • Final stats courtesy of @cgrossmeier
    • VMworld Labs Final Stats – 13,415 labs, 148,138 VMs in 50 Hours! Powered by vSphere, View 5, vCD 1.5 and vShield 5
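
The scale numbers in the notes above are plain multiplication, so here is a quick Python check using only the figures quoted in the session. The variable names are mine, and the "every station starts a fresh lab each hour" upper bound is my assumption about how the roughly 5,000 VMs per hour figure was reached, not something the speakers spelled out.

```python
# Figures quoted in the session notes.
lab_hours_2008 = 50       # total hours the 2008 labs were open
stations = 480            # concurrent user stations (both years)
vms_per_lab = 11          # average VMs per lab
lab_hours_2010 = 44       # hours the 2010 labs were open

# 2008: available lab hours across all stations.
available_lab_hours = lab_hours_2008 * stations

# Assumed upper bound: every station starts a fresh lab each hour.
vms_per_hour = stations * vms_per_lab

# The notes round that to 5,000 VMs/hour before multiplying by 50 hours.
max_vms_quoted = 5_000 * lab_hours_2008

# 2010: lab seat hours.
seat_hours_2010 = stations * lab_hours_2010

print(available_lab_hours, vms_per_hour, max_vms_quoted, seat_hours_2010)
```

The unrounded per-hour figure comes out a little above the 5,000 quoted in the session, which is consistent with the speakers rounding down for the slide.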
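
The “bridges” were described as daemons that throttle how many tasks reach vCenter at once. The session showed no code, so the following is only a minimal sketch of that idea under my own assumptions: the `Bridge` class, its method names, and the task names are all hypothetical, and nothing here talks to vCenter or reflects the actual LabCloud implementation.

```python
from collections import deque


class Bridge:
    """Hypothetical throttle: queues tasks and releases only as many
    as the in-flight limit allows. Not the real LabCloud bridge."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.pending = deque()   # tasks waiting to be sent
        self.in_flight = set()   # tasks sent but not yet finished

    def submit(self, task_id):
        """Queue a task; nothing is sent yet."""
        self.pending.append(task_id)

    def dispatch(self):
        """Return the tasks that may be sent now, oldest first."""
        sent = []
        while self.pending and len(self.in_flight) < self.max_in_flight:
            task_id = self.pending.popleft()
            self.in_flight.add(task_id)
            sent.append(task_id)
        return sent

    def complete(self, task_id):
        """Mark a task finished, freeing one slot."""
        self.in_flight.discard(task_id)


bridge = Bridge(max_in_flight=2)
for name in ["deploy-a", "deploy-b", "deploy-c"]:
    bridge.submit(name)

first = bridge.dispatch()     # only two tasks are released
bridge.complete("deploy-a")   # one slot frees up
second = bridge.dispatch()    # the third task can now go
print(first, second)
```

A real daemon would loop on `dispatch()` and call `complete()` as the backend reports tasks finished; the point of the sketch is just that excess requests wait in the queue instead of piling onto the server.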
