Tuesday, December 22, 2015

Understanding NUMA and Virtual NUMA (vNUMA) in vSphere

While working with a customer recently, we had the experience of designing a solution involving a number of very large (on average 12-16 vCPU) machines. In order to maximize the performance of these VMs, we needed to fully understand the intricacies of the server resource management technologies NUMA and vNUMA. Failing to understand how they work could have cost the customer the performance gains that these technologies offer.

So what are NUMA and vNUMA, exactly? And how does the proper understanding of them benefit an administrator’s virtual environment?

So, what is NUMA?
“NUMA,” which is short for “Non-Uniform Memory Access,” describes a system with more than one system bus. CPU resources and memory resources are grouped together into a “NUMA node.”

The memory in a NUMA node is thus much more easily accessed by an associated CPU. A CPU that needs to access memory in a different node (“Remote Access”) will experience much higher latency, and thus reduced application performance. 
NUMA is, in short, an alternative approach to server architecture that links several small, high-performing nodes together inside a single server case.


So, why NUMA?
So long as the memory and CPU being used falls within the bounds of the NUMA node, local communication within a NUMA node allows a CPU much faster access to memory than in an ordinary system layout. This is especially important in the multi-GHz era of today, when CPUs operate significantly faster than the memory they are using. NUMA helps keep CPUs from entering a stalled state, waiting for data, by allowing the fastest possible access to memory.

How do I determine the size of my NUMA nodes?
According to Microsoft, “In most cases you can determine your NUMA node boundaries by dividing the amount of physical RAM by the number of logical processors (cores).” This can be considered a very loose guideline. Further information on determining the specific setup for your NUMA nodes can be found here.
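As a rough worked example (the numbers are hypothetical): a two-socket host with 2 x 12 cores and 256 GB of RAM will typically present two NUMA nodes of 12 cores and 128 GB each, so a VM sized at or below 12 vCPUs and 128 GB of memory can be scheduled entirely within one node. On a reasonably recent ESXi host you can also confirm how many NUMA nodes are exposed with, for instance:
esxcli hardware memory get
which reports, among other values, the physical memory size and the NUMA node count.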
Virtual NUMA
ESX has been NUMA-aware since at least 2002, with VMware ESX Server 1.5 introducing memory management features to improve locality on NUMA hardware. This has worked well for placement of VMs and memory locality for resources being used by that virtual machine, particularly for virtual machines that are smaller than the NUMA node. Large VMs, however, could benefit from extra help when it comes to scheduling their applications.  
When enabled, vNUMA exposes a VM operating system to the physical NUMA topology. This allows for performance improvements within the VM by allowing the operating system and applications to take advantage of NUMA optimizations. This allows VMs to benefit from NUMA, even if the VM itself is larger than the physical size of the NUMA nodes.
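If you want to verify what the guest is actually being shown, one quick check (a sketch; it assumes a Linux guest with the numactl package installed) is:
numactl --hardware
If a virtual NUMA topology is exposed and the VM spans more than one virtual node, the output lists multiple nodes, each with its own CPUs and memory; otherwise everything appears under a single node 0.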
A few quick points:

• An administrator can adjust, enable, or disable vNUMA on VMs using advanced vNUMA controls (a hedged example of these settings follows below).
• If a VM has more than eight vCPUs, vNUMA is automatically enabled.
• If you enable CPU HotAdd, vNUMA is disabled.
See section 14, “Using NUMA Systems with ESXi” in the vSphere Resource Management Guide for more details.
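As an illustrative sketch (not a recommendation), the guide documents several per-VM advanced settings that influence vNUMA behavior; the values shown here are the commonly documented defaults:
numa.vcpu.min = "9"
numa.autosize = "FALSE"
numa.vcpu.maxPerVirtualNode = "8"
numa.vcpu.min is the minimum number of vCPUs required before a virtual NUMA topology is generated (its default of 9 is where the eight-vCPU rule above comes from); numa.autosize, when set to TRUE, sizes virtual nodes to match the number of cores per physical NUMA node; and numa.vcpu.maxPerVirtualNode caps how many vCPUs are placed into a single virtual node.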


What happens regarding vNUMA during a vMotion between hosts?
A VM’s vNUMA topology mimics the topology of the host on which it is initially placed; this topology does not adjust if the VM moves to a different host unless the VM is restarted. This is another excellent argument for keeping your hardware consistent within an ESXi cluster, as moving the VM to an ESXi host with a different NUMA topology could result in lost CPU/memory locality and reduced performance.

For more information, consult the "Guest Operating Systems" section of the VMware Performance Best Practices guide.


In conclusion
Regardless of your virtualization platform, NUMA plays an important part in understanding performance within a virtual environment. VMware, in ESXi versions 5.0 and beyond, has extended the capabilities of large VMs through intelligent NUMA scheduling and VM-level optimization with vNUMA. It is important to understand both your NUMA and vNUMA topologies when sizing your virtual machines.

Monday, December 14, 2015

ESXTOP

This page is solely dedicated to one of the best tools in the world for ESX: esxtop.

Intro

I am a huge fan of esxtop! I read a couple of pages of the esxtop bible every day before I go to bed. Something I always struggle with, however, is the “thresholds” of specific metrics. I fully understand that it is not black and white; in the end, performance is the perception of the user.
There must be a certain threshold, however. For instance, it is safe to say that when %RDY constantly exceeds a value of 20 it is very likely that the VM responds sluggishly. I want to use this article to “define” these thresholds, but I need your help. There are many people reading these articles; together we must know at least a dozen metrics, so let's collect and document them, with possible causes where known.
Please keep in mind that these should only be used as a guideline when doing performance troubleshooting! Also be aware that some metrics are not part of the default view. You can add fields to an esxtop view by pressing “f” followed by the corresponding character.
I used VMworld presentations, VMware whitepapers, VMware documentation, VMTN topics and of course my own experience as sources, and these are the metrics and thresholds I came up with so far. Please comment and help build the main source for esxtop thresholds.

Metrics and Thresholds

Display | Metric | Threshold | Explanation
CPU | %RDY | 10 | Overprovisioning of vCPUs, excessive usage of vSMP, or a limit has been set (check %MLMTD). See Jason's explanation for vSMP VMs (and the worked example below the table).
CPU | %CSTP | 3 | Excessive usage of vSMP. Decrease the number of vCPUs for this particular VM. This should lead to increased scheduling opportunities.
CPU | %SYS | 20 | The percentage of time spent by system services on behalf of the world. Most likely caused by a high-IO VM. Check other metrics and the VM for a possible root cause.
CPU | %MLMTD | 0 | The percentage of time the vCPU was ready to run but deliberately wasn't scheduled because that would violate the “CPU limit” settings. If larger than 0, the world is being throttled due to the limit on CPU.
CPU | %SWPWT | 5 | VM waiting on swapped pages to be read from disk. Possible cause: memory overcommitment.
MEM | MCTLSZ | 1 | If larger than 0, the host is forcing VMs to inflate the balloon driver to reclaim memory as the host is overcommitted.
MEM | SWCUR | 1 | If larger than 0, the host has swapped memory pages in the past. Possible cause: overcommitment.
MEM | SWR/s | 1 | If larger than 0, the host is actively reading from swap (vswp). Possible cause: excessive memory overcommitment.
MEM | SWW/s | 1 | If larger than 0, the host is actively writing to swap (vswp). Possible cause: excessive memory overcommitment.
MEM | CACHEUSD | 0 | If larger than 0, the host has compressed memory. Possible cause: memory overcommitment.
MEM | ZIP/s | 0 | If larger than 0, the host is actively compressing memory. Possible cause: memory overcommitment.
MEM | UNZIP/s | 0 | If larger than 0, the host is accessing compressed memory. Possible cause: the host was previously overcommitted on memory.
MEM | N%L | 80 | If less than 80, the VM experiences poor NUMA locality. If a VM has a memory size greater than the amount of memory local to each processor, the ESX scheduler does not attempt to use NUMA optimizations for that VM and “remotely” uses memory via the “interconnect”. Check “GST_ND(X)” to find out which NUMA nodes are used.
NETWORK | %DRPTX | 1 | Dropped transmitted packets, hardware overworked. Possible cause: very high network utilization.
NETWORK | %DRPRX | 1 | Dropped received packets, hardware overworked. Possible cause: very high network utilization.
DISK | GAVG | 25 | Look at “DAVG” and “KAVG” as the sum of both is GAVG.
DISK | DAVG | 25 | Disk latency most likely caused by the array.
DISK | KAVG | 2 | Disk latency caused by the VMkernel; high KAVG usually means queuing. Check “QUED”.
DISK | QUED | 1 | Queue maxed out. Possibly queue depth set too low. Check with the array vendor for the optimal queue depth value.
DISK | ABRTS/s | 1 | Aborts issued by the guest (VM) because storage is not responding. For Windows VMs this happens after 60 seconds by default. Can be caused, for instance, when paths fail or the array is not accepting any IO for whatever reason.
DISK | RESETS/s | 1 | The number of commands reset per second.
DISK | CONS/s | 20 | SCSI reservation conflicts per second. If many SCSI reservation conflicts occur, performance could be degraded due to the lock on the VMFS.
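A worked example of how to read the CPU thresholds (the VM size is hypothetical): the %RDY value shown for a group is the sum across all of its worlds, so a 4-vCPU VM reporting a group-level %RDY of 40 is averaging roughly 10 per vCPU. The threshold of 10 above is meant per vCPU, which is why for wide vSMP VMs the group value should be divided by the number of vCPUs before comparing it against the threshold.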

Running esxtop

Although understanding all the metrics esxtop provides may seem impossible, using esxtop is fairly simple. When you get the hang of it you will notice yourself staring at the metrics/thresholds more often than ever. The following keys are the ones I use the most.
Open a console session or SSH to the ESX(i) host and type:
esxtop
By default the screen is refreshed every 5 seconds; change this by typing:
s 2
Changing views is easy; type the following keys for the associated views:
c = cpu
m = memory
n = network
i = interrupts
d = disk adapter
u = disk device (includes NFS as of 4.0 Update 2)
v = disk VM
p = power states

V = only show virtual machine worlds
e = Expand/Rollup CPU statistics, show details of all worlds associated with group (GID)
k = kill world, for tech support purposes only!
l  = limit display to a single group (GID), enables you to focus on one VM
# = limiting the number of entities, for instance the top 5

2 = highlight a row, moving down
8 = highlight a row, moving up
4 = remove selected row from view
e = statistics broken down per world
6 = statistics broken down per world
Add/Remove fields:
f
<type appropriate character>
Changing the order:
o
<move field by typing appropriate character uppercase = left, lowercase = right>
Saving all the settings you’ve changed:
W
Keep in mind that if you don’t change the file name, it will be saved and used as the default settings.
Help:
?
In very large environments esxtop can cause high CPU utilization due to the amount of data that needs to be gathered and the calculations that need to be done. If the CPU appears to be highly utilized due to the number of entities (VMs / LUNs etc.), a command line option can be used which locks specific entities and keeps esxtop from gathering specific info, to limit the amount of CPU power needed:
esxtop -l
More info about this command line option can be found here.

Capturing esxtop results

First things first: make sure you only capture relevant info. Ditch the metrics you don’t need. In other words, run esxtop and remove/add (f) fields so that only the ones you actually need are shown. When you are finished, make sure to write (W) the configuration to disk. You can either write it to the default config file (esxtop4rc) or write the configuration to a new file.
Now that you have configured esxtop as needed run it in batch mode and save the results to a .csv file:
esxtop -b -d 2 -n 100 > esxtopcapture.csv
Where “-b” stands for batch mode, “-d 2” is a delay of 2 seconds and “-n 100” are 100 iterations. In this specific case esxtop will log all metrics for 200 seconds. If you want to record all metrics make sure to add “-a” to your string.
Or what about directly zipping the output as well? These .csv files can grow fast, and by zipping them a lot of precious disk space can be saved!
esxtop -b -a -d 2 -n 100 | gzip -9c > esxtopoutput.csv.gz
Please note that when a new VM is powered on, a VM is vMotioned to the host, or a new world is created, it will not show up within esxtop when “-b” is used, as the entities are locked! This behavior is similar to starting esxtop with “-l”.

Analyzing results

You can use multiple tools to analyze the captured data.
  1. VisualEsxtop
  2. perfmon
  3. excel
  4. esxplot
What is VisualEsxtop? It is a relatively new tool (published 1st of July 2013).
VisualEsxtop is an enhanced version of resxtop and esxtop. VisualEsxtop can connect to VMware vCenter Server or ESX hosts, and display ESX server stats with a better user interface and more advanced features.
That sounds nice, right? Let’s have a look at how it works; this is what I did to get it up and running:
  • Go to “http://labs.vmware.com/flings/visualesxtop” and click “download”
  • Unzip “VisualEsxtop.zip” into the folder where you want to store the tool
  • Go to the folder
  • Double click “visualesxtop.bat” when running Windows (Or follow William’s tip for the Mac)
  • Click “File” and “Connect to Live Server”
  • Enter the “Hostname”, “Username” and “Password” and hit “Connect”
  • That is it…
Now some simple tips:
  • By default the refresh interval is set to 5 seconds. You can change this by hitting “Configuration” and then “Change Interval”
  • You can also load Batch Output; this might come in handy when, for instance, you are a consultant and a customer sends you captured data. You can do this under: File -> Load Batch Output
  • You can filter output, very useful if you are looking for info on a specific virtual machine / world! See the filter section.
  • When you click “Charts”  and double click “Object Types” you will see a list of metrics that you can create a chart with. Just unfold the ones you need and double click them to add them to the right pane
There are a bunch of other cool features in there, like color-coding of important metrics. The fact that you can show multiple windows at the same time is also useful, and of course the tooltips provide a description of each counter! If you ask me, this is a tool everyone should download and check out.
Let’s continue with my second favorite tool, perfmon. I’ve used perfmon (part of Windows, also known as “Performance Monitor”) multiple times, and it’s probably the easiest as many people are already familiar with it. You can import a CSV as follows:
  1. Run: perfmon
  2. Right click on the graph and select “Properties”.
  3. Select the “Source” tab.
  4. Select the “Log files:” radio button from the “Data source” section.
  5. Click the “Add” button.
  6. Select the CSV file created by esxtop and click “OK”.
  7. Click the “Apply” button.
  8. Optionally: reduce the range of time over which the data will be displayed by using the sliders under the “Time Range” button.
  9. Select the “Data” tab.
  10. Remove all Counters.
  11. Click “Add” and select appropriate counters.
  12. Click “OK”.
  13. Click “OK”.
The result of the above would be:
Imported ESXTOP data
With MS Excel it is also possible to import the data as a CSV. Keep in mind though that the amount of captured data is insane, so you might want to limit it by first importing it into perfmon, selecting the correct timeframe and counters, and exporting that to a CSV. When you have done so you can import the CSV as follows:
  1. Run: excel
  2. Click on “Data”
  3. Click “Import External Data” and click “Import Data”
  4. Select “Text files” as “Files of Type”
  5. Select file and click “Open”
  6. Make sure “Delimited” is selected and click “Next”
  7. Deselect “Tab” and select “Comma”
  8. Click “Next” and “Finish”
All data should be imported and can be shaped / modelled / diagrammed as needed.
Another option is to use a tool called “esxplot”. It hasn’t been updated in a while, and I am not sure what the state of the tool is. You can download the latest version here, but personally I would recommend using VisualEsxtop instead of esxplot, just because it is more recent.
  1. Run: esxplot
  2. Click File -> Import -> Dataset
  3. Select file and click “Open”
  4. Double click host name and click on metric
Using ESXPLOT for ESXTOP data
As you can clearly see in the screenshot above, the legend (right of the graph) is too long. You can modify that as follows:
  1. Click on “File” -> preferences
  2. Select “Abbreviated legends”
  3. Enter appropriate value
For those using a Mac, esxplot uses specific libraries which are only available in the 32-bit version of Python. In order for esxplot to function correctly, set the following environment variable:
export VERSIONER_PYTHON_PREFER_32_BIT=yes

Limiting your view

In environments with a very high consolidation ratio (a high number of VMs per host) it can happen that the VM you need performance counters for isn’t shown on your screen. This is purely because the height of the screen limits what can be displayed. Unfortunately there is currently no command line option for esxtop to specify which VMs need to be displayed. However, you can export the current list of worlds and import it again to limit the number of VMs shown.
esxtop -export-entity filename
Now you should be able to edit your file and comment out specific worlds that are not needed to be displayed.
esxtop -import-entity filename
I figured that there should be a way to get the info through the command line as well, and this is what I came up with. Please note that <virtualmachinename> needs to be replaced with the name of the virtual machine that you need the GID for.
VMWID=`vm-support -x | grep <virtualmachinename> |awk '{gsub("wid=", "");print $1}'`
VMXCARTEL=`vsish -e cat /vm/$VMWID/vmxCartelID`
vsish -e cat /sched/memClients/$VMXCARTEL/SchedGroupID
Now you can use the outcome within esxtop to limit (l) your view to that single GID. William Lam wrote an article a couple of days after I added the GID section; the following is a lot simpler than what I came up with, thanks William!
VM_NAME=STA202G ;grep "${VM_NAME}" /proc/vmware/sched/drm-stats  | awk '{print $1}'


Troubleshooting Storage Performance in vSphere

I frequently present at the various VMware User Group (VMUG) meetings, VMworld and partner conferences. If you have ever attended one of my talks, you will know it is like trying to drink from a fire hose; it is hard to cover everything in just a 45-minute session. Therefore I will take the time here to write a few blogs that go over the concepts discussed in these talks in more detail (or at least more slowly). One of the most popular yet very fast-paced talks I present is Troubleshooting Storage Performance in vSphere. I’ll slow things down a bit and discuss each topic here; this might be just a review for some of you, but hopefully as we get into more detail there will be some new nuggets of VMware-specific information that can help even the more advanced storage folks.
Today’s post is just the basics.  What is bad storage performance and where do I measure it?
Poor storage performance is generally the result of high I/O latency. vCenter or esxtop will report the various latencies at each level in the storage stack from the VM down to the storage hardware.  vCenter cannot provide information for the actual latency seen by the application since that includes the latency at the Guest OS and the application itself, and these items are not visible to vCenter. vCenter can report on the following storage stack I/O latencies in vSphere.
 Storage Stack Components in a vSphere environment
GAVG (Guest Average Latency): the total latency as seen from vSphere.
KAVG (Kernel Average Latency): the time an I/O request spent waiting inside the vSphere storage stack.
QAVG (Queue Average Latency): the time spent waiting in a queue inside the vSphere storage stack.
DAVG (Device Average Latency): the latency coming from the physical hardware, HBA and storage device.


To provide some rough guidance: for most application workloads (typically 8 KB I/O size, 80% random, 80% read) we generally say anything greater than 20 to 30 ms of I/O latency may be a performance concern. Of course, as with all things performance related, some applications are more sensitive to I/O latency than others, so the 20-30 ms figure is rough guidance rather than a hard rule. So we expect that GAVG, the total latency as seen from vCenter, should be less than 20 to 30 ms. As seen in the picture, GAVG is made up of KAVG and DAVG. Ideally we would like all our I/O to get out onto the wire quickly and thus spend no significant amount of time just sitting in the vSphere storage stack, so we would ideally like to see KAVG very low. As a rough guideline, KAVG should usually be 0 ms, and anything greater than 2 ms may be an indicator of a performance issue.
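A quick worked example (the numbers are hypothetical): since GAVG is essentially DAVG plus KAVG, a device latency of 18 ms with 1 ms of kernel latency gives a GAVG of roughly 19 ms, which is still within the guidance above; a DAVG of 25 ms combined with a KAVG of 8 ms gives roughly 33 ms and points at both a busy array and queuing inside the host.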
So what are the rule-of-thumb indicators of bad storage performance?
• High device latency: Device Average Latency (DAVG) consistently greater than 20 to 30 ms may cause a performance problem for your typical application.
• High kernel latency: Kernel Average Latency (KAVG) should usually be 0 in an ideal environment, but anything greater than 2 ms may be a performance problem.
So what can cause bad storage performance and how to address it, well that is for next time…
Poor storage performance is generally the result of high I/O latency, but what can cause that high latency and how do you address it? There are a lot of things that can cause poor storage performance…
– Under sized storage arrays/devices unable to provide the needed performance
– I/O Stack Queue congestion
– I/O Bandwidth saturation, Link/Pipe Saturation
– Host CPU Saturation
– Guest Level Driver and Queuing Interactions
– Incorrectly Tuned Applications
– Under sized storage arrays (Did I say that twice!)
As I mentioned in the previous post, the key storage performance indicators to look out for are 1. high device latency (DAVG consistently greater than 20 to 30 ms) and 2. high kernel latency (KAVG greater than 2 ms). Once you have identified that you have high latency, you can proceed to understanding why the latency is high and what is causing the poor storage performance. In this post, we will look at the top reason for high device latency.
The top reason for high device latency is simply not having enough storage hardware to meet your application’s needs (yes, I have said it a third time now); that is a surefire way to have storage performance issues. It may seem basic, but too often administrators size their storage only on the capacity they need to support their environment and not on the performance (IOPS/latency/throughput) they need. When sizing your environment you really should consult your application and storage vendors’ best practices and sizing guidelines to understand what storage performance your application will need and what your storage hardware can deliver.
How you configure your storage hardware, the type of drives you use, the raid configuration, the number of disk spindles in the array, etc… will all affect the maximum storage performance your hardware will be able to deliver.  Your storage vendor will be able to provide you the most accurate model and advice for the particular storage product you own, but if you need some rough guidance you can use the guidance provided in the chart below.
[Chart: rough IOPS and read/write throughput per spindle by RAID configuration and drive type]

The chart shows the general IOPS and read & write throughput you can expect per spindle depending on the RAID configuration and/or drive type you have in your array. I am also frequently asked what the typical I/O profile for a VM is; the answer varies greatly depending on the applications running in your environment, but a “typical” I/O workload for a VM would roughly be 8 KB I/O size, 80% random, 80% read. Storage-intensive applications like databases, mail servers and media streaming have their own I/O profiles that may differ greatly from this “typical” profile.
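As a back-of-the-envelope sketch (all numbers are hypothetical, and the write penalties are the commonly quoted rules of thumb rather than vendor figures): suppose a workload needs 2,000 IOPS with the “typical” 80% read / 20% write mix on 10K RPM drives rated at roughly 140 IOPS each. RAID 10 carries a write penalty of about 2 and RAID 5 about 4, so:
Front-end IOPS required = 2,000
Back-end IOPS (RAID 10) = (2,000 x 0.8) + (2,000 x 0.2 x 2) = 2,400  ->  roughly 18 drives
Back-end IOPS (RAID 5)  = (2,000 x 0.8) + (2,000 x 0.2 x 4) = 3,200  ->  roughly 23 drives
Calculations like this are only a starting point; your storage vendor’s sizing tools will account for cache, tiering and the actual drive models.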
One good way to make sure your storage is able to handle the demands of your datacenter is to benchmark it. There are several free and open source tools, like Iometer, that can be used to stress test and benchmark your storage. If you haven’t already taken a look at the I/O Analyzer tool delivered as a VMware Fling, you might want to take a peek at it. I/O Analyzer is a virtual appliance tool that provides a simple and standardized approach to storage performance analysis in VMware vSphere virtualized environments ( http://labs.vmware.com/flings/io-analyzer ).
Also, when sizing your storage, make sure your storage workloads are balanced “appropriately” across the paths in the environment, across the controllers and storage processors in the array, and spread across the appropriate number of spindles in the array. I’ll talk a bit more about “appropriately” balanced later on in this series, as it varies depending on your storage array and your particular goals/needs.
Simply sizing your storage correctly for the expected workload, in terms of both capacity and performance, will go very far toward making sure you don’t run into storage performance problems and that your device latency (DAVG) stays below that 20-30 ms guidance. There are other things to consider, which we will see in future posts, but sizing your storage is key.

Troubleshooting Storage Performance in vSphere (Part 3) – SSD Performance

While presenting the storage performance talks, I frequently get asked about Solid State Device (SSD) performance in a virtualized environment. Obviously, SSDs or EFDs (Enterprise Flash Disks) are great for performance, especially if you have storage-intensive workloads. As seen in the previous post in this series, SSDs can provide significantly more IOPS and significantly lower latencies. But the two big questions when using SSDs in a virtualized environment are “how much of a gain might I expect” and “how much SSD storage do I need to achieve that gain”.
There are two studies that do a great job at painting a good picture for the performance of SSDs in a virtualized environment, and answering those two questions. Both studies use VMware’s VMmark benchmark, a virtualization platform benchmark for x86-based computers used mostly by our hardware partners to determine the performance of their hardware platforms when running in a virtualized environment.
The first study answers the question of how much of a gain you might be able to achieve from using SSDs in your environment. As with all performance data your mileage may vary, but by using VMmark, which simulates a collection of different workloads typically seen in a vSphere environment, we can form a general idea of the impact of SSDs in a “typical” virtualized environment.
The results of the study showed that the average improvement in score for the SSD configuration was approximately 25% when compared to traditional rotating storage. SSDs also allowed for more consolidation; the traditional storage couldn’t support the level of consolidation at the high end (while meeting the QoS required by the VMmark benchmark). The SSD configuration could not only support the higher consolidation of six VMmark workload tiles while meeting the QoS requirements, but it also improved the overall VMmark score slightly while supporting the heavier consolidation load.
The second study provides guidance on the question of how many SSDs you need. Again using the VMmark benchmark as the workload, our VMmark performance team studied the performance impact of SSDs using the auto-tiering capabilities of the storage array. Unlike the previous test, where ALL the traditional storage was replaced with SSDs, this study only had SSD capacity for approximately 8% of the storage footprint of the test. So only 8% of the storage required for the workload could fit on the SSDs, and the rest had to use the traditional rotating storage. The storage array, using its auto-tiering capability, intelligently detected which storage blocks were hot and promoted those hot blocks into the SSD storage.
The results were that, with only 8% of the workload’s storage footprint able to fit in the faster tier of SSD storage, the VMmark workload was still able to achieve the 25%-plus improvement seen in the previous study where all the storage was replaced with SSDs. A 90/10 rule is observed here: 90% of your typical workload’s IOPS are generated from just 10% of that workload’s storage footprint.
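To put rough, purely hypothetical numbers on that rule: an environment with a 10 TB active storage footprint would need on the order of 1 TB of SSD in the fast tier to absorb roughly 90% of its IOPS, assuming the workload actually follows the 90/10 pattern described above.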
Again these studies just provide some guidance which greatly depends on your workloads, but the two studies help answer those two big questions of “how much better” and “how many do I need”.


esxtop and resxtop


These two tools are used to monitor and gather performance data from an ESXi host.
  • esxtop. This gives real time CPU, memory, disk and network data for hosts and virtual machines. You can run esxtop from a direct connection to a host’s CLI
  • resxtop. This is a remote version of esxtop. It is included as part of vCLI and is present on the vMA (vSphere Management Assistant). resxtop has three modes of operation including Interactive, Batch, and Replay.
Example resxtop output
 7:30:37pm up  1:38, 206 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.01, 0.01, 0.00
PCPU USED(%): 0.9 1.2 0.5 0.6 AVG: 0.8
PCPU UTIL(%): 1.0 1.6 0.6 0.7 AVG: 1.0

      ID      GID NAME             NWLD   %USED    %RUN    %SYS   %WAIT %VMWAIT    %RDY   %IDLE  %OVRLP   %CSTP  %MLMTD  %SWPWT
       1        1 idle                4  397.21  400.00    0.00    0.00       -  400.00    0.00    0.94    0.00    0.00    0.00
     817      817 hostd.2861         13    1.61    1.70    0.01 1298.25       -    1.33    0.00    0.00    0.00    0.00    0.00
       8        8 helper             82    0.28    0.28    0.00 8200.00       -    0.09    0.00    0.01    0.00    0.00    0.00
     995      995 vpxa.3060          18    0.24    0.24    0.00 1800.00       -    0.35    0.00    0.00    0.00    0.00    0.00
     639      639 vmkiscsid.2670      2    0.05    0.05    0.00  200.00       -    0.01    0.00    0.00    0.00    0.00    0.00
     607      607 vmsyslogd.2625      3    0.02    0.03    0.00  299.87       -    0.37    0.00    0.01    0.00    0.00    0.00
       2        2 system              9    0.01    0.01    0.00  900.00       -    0.02    0.00    0.00    0.00    0.00    0.00
       9        9 drivers            11    0.01    0.01    0.00 1100.00       -    0.01    0.00    0.00    0.00    0.00    0.00
    1029     1029 vmware-usbarbit     2    0.01    0.01    0.00  200.00       -    0.01    0.00    0.00    0.00    0.00    0.00
     859      859 dcbd.2906           1    0.01    0.01    0.00  100.00       -    0.00    0.00    0.00    0.00    0.00    0.00
     702      702 storageRM.2745      2    0.01    0.01    0.00  200.00       -    0.00    0.00    0.00    0.00    0.00    0.00
Most of the examples you will see here are run from a vMA. So long as fastpass is configured correctly, there is no need to authenticate each time the tool is run.
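As a minimal sketch of connecting without fastpass (the hostname and username below are placeholders), resxtop can also be pointed at a host directly:
resxtop --server 192.168.88.134 --username root
It prompts for the password and then behaves like the local esxtop session shown above; if --server points at a vCenter Server instead, add --vihost to name the ESXi host you want to monitor.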

How to Configure esxtop/resxtop custom profiles

There are a lot of options available to you when using esxtop. When running the tool, pressing ‘h’ will show you what options are available:
Interactive commands are:

fF      Add or remove fields
oO      Change the order of displayed fields
s       Set the delay in seconds between updates
#       Set the number of instances to display
W       Write configuration file ~/.esxtop50rc
e       Expand/Rollup Cpu Statistics
V       View only VM instances
L       Change the length of the NAME field
l       Limit display to a single group

Sort by:
        U:%USED         R:%RDY          N:GID
Switch display:
        c:cpu           i:interrupt     m:memory        n:network
        d:disk adapter  u:disk device   v:disk VM       p:power mgmt
So, as an example, you can press ‘d’ to access stats around the disk adapters:
7:44:11pm up  1:51, 205 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.01, 0.01, 0.01

 ADAPTR PATH                 NPTH   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
 vmhba0 -                       0     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
 vmhba1 -                       3     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba32 -                       1     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
vmhba33 -                       2     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
From that screen you can press ‘f’ to select what fields to wish to view:
Current Field order: ABCdEfGhijkl

* A:  ADAPTR = Adapter Name
* B:  PATH = Path Name
* C:  NPATHS = Num Paths
  D:  QSTATS = Queue Stats
* E:  IOSTATS = I/O Stats
  F:  RESVSTATS = Reserve Stats
* G:  LATSTATS/cmd = Overall Latency Stats (ms)
  H:  LATSTATS/rd = Read Latency Stats (ms)
  I:  LATSTATS/wr = Write Latency Stats (ms)
  J:  ERRSTATS/s = Error Stats
  K:  PAESTATS/s = PAE Stats
  L:  SPLTSTATS/s = SPLIT Stats
Once you have set it up to display the values you are interested in you can press ‘W’ to save the config to a file. You can accept the default or type your own path as necessary.
Save config file to (Default to : /home/vi-admin/.esxtop50rc): /home/vi-admin/.esxtopdiskadrc
Next time you need to view those specific stats in resxtop you can open it, referring to the file you saved:
vi-admin@vma:~[192.168.88.134]> resxtop -c .esxtopdiskadrc

 8:16:58pm up  2:24, 206 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.01, 0.00, 0.00

 ADAPTR PATH                   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s
 vmhba0 -                        0.00     0.00     0.00     0.00     0.00
 vmhba1 -                        0.00     0.00     0.00     0.00     0.00
vmhba32 -                        0.00     0.00     0.00     0.00     0.00
vmhba33 -                        0.00     0.00     0.00     0.00     0.00

Determine use cases for and apply esxtop/resxtop Interactive, Batch and Replay modes

esxtop/resxtop has three modes in which it can be run:
  • Interactive. This is the default mode, and the one used in the example above. By default statistics are collected at a 5 second interval, although this can be changed.
  • Batch. This mode is used for collecting statistics for a period of time, for later analysis. Once the stats have been collected, they can be analysed using Excel, Perfmon or ESXplot amongst other tools.
  • Replay. This mode allows you to replay data collected using the vm-support tool. You cannot replay data collected in batch mode interactively.

Collecting Data using Batch Mode

There are a few options to set when doing a batch mode collection – the various resxtop options are shown here:
vi-admin@vma:~> resxtop --help
usage: resxtop [-h] [-v] [-b] [-s] [-a] [-c config file] [-d delay] [-n iterations]
               [--server server-name [--vihost host-name]] [--portnumber socket-port] [--username user-name]
              -h prints this help menu.
              -v prints version.
              -b enables batch mode.
              -s enables secure mode.
              -a show all statistics.
              -c sets the esxtop configuration file, which by default is .esxtop50rc
              -d sets the delay between updates in seconds.
              -n runs resxtop for only n iterations.
              --server      remote server name.
              --vihost      esx host name, if --server specifies vc server.
              --portnumber  socket port, default is 443.
              --username    user name on the remote server.
The example below, run from a vMA, will collect stats at 5 second intervals, 20 times. The -a switch indicates that you want to collect all stats:
vi-admin@vma:~[192.168.88.134]> resxtop -b -a -d 5 -n 20 > output.csv
Rather than collect all stats, you can specify a configuration file such as the one I used earlier. Batch mode will then only capture the stats specified in the file.
vi-admin@vma:~[192.168.88.134]> resxtop -b -d 5 -n 20 -c .esxtopdiskadrc > output2.csv
Be aware that the size of the capture files can grow quickly!
-rw------- 1 vi-admin root 2.6M Oct 24 14:49 output.csv
With this in mind, it’s possible to compress the output file as it is collected:
vi-admin@vma:~[192.168.88.134]> resxtop -b -a -d 5 -n 20 | gzip -9c > output.csv.gz
The zipped file is significantly smaller than the unzipped version:
-rw------- 1 vi-admin root 114K Oct 24 15:01 output.csv.gz

Replaying Performance Data using ESXTOP

You can also capture performance data using the vm-support tool. vm-support is a tool most commonly used to capture log and configuration data to send to VMware, however it can also be used to capture performance data. There are a number of options that can be specified when running vm-support:
~ # vm-support -h
Usage: vm-support [options]

Options:
  -h, --help            show this help message and exit
  -g GROUPS, --groups=GROUPS
                        Gather data from listed groups
  -a MANIFESTS, --manifests=MANIFESTS
                        Gather from listed manifests
  -e EXCLUDEMANIFESTS, --excludemanifests=EXCLUDEMANIFESTS
                        Exclude the listed manifests
  --listmanifests       List available manifests
  -G, --listgroups      List available manifest groups
  -t, --listtags        List available manifest tags
  -p, --performance     Gather performance data
  -d DURATION, --duration=DURATION
                        Duration of performance monitoring (in seconds)
  -i INTERVAL, --interval=INTERVAL
                        Interval between performance snapshots
  -v VM, --vm=VM        Gather detailed information about this specific VM (ie
                        --vm )
  -V, --listvms         List currently registered VMs
  -w WORKINGDIR, --workingdir=WORKINGDIR
                        Directory to create .tgz in
  -D, --dryrun          Prints out the data that would have been gathered
  -s, --stream          stream data to stdout
  -q, --quiet           Output only the location of the bundle
  -E ERRORFILE, --errorfile=ERRORFILE
                        Prints (non-fatal) errors to specified file (overrides
                        --quiet and --stream)
  --loglevel=LOGLEVEL   Set logging to specified level: 0-50 (0=most verbose)
  --version             Display the vm-support version
  -l, --listfiles       List all files gathered by vm-support
~ #
To capture performance data only, the -p switch is used. As with batch mode for esxtop we have to set the length of the capture and the interval. The following command will capture performance statistics for 60 seconds at 5 second intervals, writing the capture to a VMFS datastore:
/var/log #  vm-support -p -d 60 -i 5 -w /vmfs/volumes/datastore1/
15:45:47: Creating /vmfs/volumes/datastore1/esx-esxi1.vmlab.local-2012-08-10--15.45.tgz
15:48:39: Done.
Please attach this file when submitting an incident report.
To file a support incident, go to http://www.vmware.com/support/sr/sr_login.jsp
To see the files collected, run: tar -tzf '/vmfs/volumes/datastore1/esx-esxi1.vmlab.local-2012-08-10--15.45.tgz'
The collected data will be in a zipped file, in order to save space. To work with it we will need to extract the contents (note that the tar -tzf command suggested in the output only lists the files; use -xzf to extract them):
tar -xzf '/vmfs/volumes/datastore1/esx-esxi1.vmlab.local-2012-08-10--15.45.tgz'
Once extracted, it may be necessary to run the reconstruct.sh script, which can be found in the extracted contents. This will be necessary if you receive an error stating ‘all vm-support snapshots have been used’.
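As a sketch (assuming the bundle extracted into a directory named after the archive, which is the usual layout):
cd /vmfs/volumes/datastore1/esx-esxi1.vmlab.local-2012-08-10--15.45/
./reconstruct.sh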
Once the files have been extracted, you can run esxtop in replay mode by running the following:
/var/log # esxtop -R esx-esxi1.vmlab.local-2012-08-10--15.23
The data will be replayed in esxtop:
 3:25:41pm up  1:05, 294 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.14, 0.07, 0.03

 ADAPTR PATH                 NPTH   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s DAVG/cmd KAVG/cmd GAVG/cmd QAVG/cmd
 vmhba0 -                       0     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
 vmhba1 -                       3    14.52     4.25     8.20     0.26     1.37     1.18     0.01     1.19     0.29
vmhba32 -                       1     0.49     0.00     0.00     0.00     0.00     0.28     0.02     0.30     0.01
vmhba33 -                       2     2.17     1.38     0.00     0.10     0.00     9.06     0.01     9.07     0.00
*** Read stats from esx-esxi1.vmlab.local-2012-08-10--15.23/snapshot-3/commands/vsi_traverse_-s.txt ***
Notice the last line of the output, which indicates that esxtop is being run in replay mode, with the performance stats being read from a file.