“CPU Ready and CPU Usage, what are they and what are they good for?”
My colleagues asked me this question, and even after I answered it with the official VMware explanation, they still didn't quite get it.
Actually, after some thought, without any background information on the subject, I probably wouldn’t have understood it either.
The following visualization helped explain what CPU Ready and CPU Usage are and how they relate to each other.
CPU Ready = % of time there is work to be done for a VM, but no physical CPU available to do it on. In other words, all host CPUs are busy serving other VMs. The higher the ready time, the slower the virtual machine performs.
CPU Usage = raw, absolute amount of CPU used by the corresponding VM at a given moment.
At one point VMware recommended that anything over 5% ready time per virtual CPU is something to monitor. For anything over 10% for extended periods of time, you should plan on taking action.
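To make those two thresholds concrete, here is a minimal Python sketch. The function name, the labels it returns, and the idea that "extended periods" means three consecutive samples are all my own assumptions for illustration; only the 5% and 10% figures come from the recommendation above.

```python
from typing import Sequence

MONITOR_PCT = 5.0   # per-vCPU ready % worth keeping an eye on
ACTION_PCT = 10.0   # per-vCPU ready % that calls for action if sustained


def classify_cpu_ready(samples: Sequence[float], sustained: int = 3) -> str:
    """Classify per-vCPU CPU Ready percentages, oldest sample first.

    `sustained` is a hypothetical stand-in for "extended periods of time":
    the number of most recent samples that must all exceed ACTION_PCT.
    """
    if not samples:
        return "no data"
    recent = samples[-sustained:]
    # Only call for action when we have enough samples and all are high.
    if len(recent) == sustained and all(s > ACTION_PCT for s in recent):
        return "take action"
    # Otherwise judge by the latest sample alone.
    if samples[-1] > MONITOR_PCT:
        return "monitor"
    return "ok"


if __name__ == "__main__":
    print(classify_cpu_ready([2.0, 3.5, 4.0]))
    print(classify_cpu_ready([4.0, 6.0]))
    print(classify_cpu_ready([11.0, 12.0, 13.0]))
```

A single spike above 10% only lands in "monitor" here; it takes a run of high samples to reach "take action". Tune `sustained` to whatever "extended" means in your environment.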
VMware has a very good paper on this subject here: http://www.vmware.com/pdf/esx3_ready_time.pdf
The following sample chart displaying CPU Ready and CPU Usage can be found in the vSphere Client by selecting the VM, clicking the Performance tab, and then clicking the advanced “Chart Options” link. In Chart Options, under Objects, select only 0. Under Counters, select Ready and Used, then click OK.
Another place to get CPU Ready is from the command line, using esxtop. Simply SSH into the ESX host’s shell and type esxtop. The CPU view shows ready time in the %RDY column.
Understanding the data you are looking at, and what it actually means, is critical to making the right decisions about what is happening in a virtualized environment. CPU Ready time in particular requires a good understanding of what the value is showing and how it relates to the configuration of the VM, the other VMs on the host, and the physical host resources. If you are looking at summation data for CPU Ready time, converting it to a CPU Ready percentage is what gives the data its meaning and tells you whether or not it is actually a problem.
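As a rough sketch of that conversion: the summation counter reports ready time in milliseconds accumulated over the chart's sampling interval, so the percentage is the ready time divided by the interval length. The function and parameter names below are my own. The 20-second default assumes the real-time chart. Dividing by the vCPU count assumes you are reading the VM-level value summed across all vCPUs and want a per-vCPU average; leave it at 1 if you are looking at a single vCPU object.

```python
def ready_summation_to_percent(ready_ms: float,
                               interval_s: float = 20.0,
                               vcpus: int = 1) -> float:
    """Convert a CPU Ready summation value (ms) to a percentage.

    ready_ms   -- summation value read from the chart, in milliseconds
    interval_s -- chart sampling interval in seconds (20 for real-time;
                  assumed default, longer for historical rollups)
    vcpus      -- number of vCPUs the summation covers (assumption: the
                  VM-level value is the sum across its vCPUs)
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 1000 ms of ready time in a 20 s real-time sample on one vCPU.
    print(ready_summation_to_percent(1000))           # 5.0
    # 4000 ms summed over a 4-vCPU VM in the same interval.
    print(ready_summation_to_percent(4000, vcpus=4))  # 5.0
```

The same raw number means very different things at different intervals: 1000 ms is 5% on the 20-second real-time chart, but a tiny fraction of a percent on a chart that rolls up over many minutes.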