Monday, March 12, 2012

vCPU Sizing Considerations






Challenges and considerations in determining how many vCPU’s to allocate to virtual machines.
 
By VKernel's David Davis and Alex Rosemblat (http://www.vkernel.com/):

When sizing virtual machines, a virtualization administrator must select the number of vCPU’s, the size of the virtual disk, the number of vNICs, and the amount of memory. Of all those variables, the two most difficult to determine are the number of vCPU’s and the amount of memory to allocate. This is because CPU and memory are the most constrained resources a server has, and they are also the resources most dynamically demanded by the guest OS in each VM. We covered virtual machine memory sizing in our VKernel whitepaper (see VM Memory (vRAM) Sizing Considerations); now let’s cover proper vCPU sizing in your virtual machines.

Virtualization administrators should avoid over-allocating vCPU’s because doing so wastes expensive server resources and reduces the ROI on that infrastructure. In fact, over-allocating vCPU’s to some VMs will actually cause performance problems for those VMs and for other VMs. On the other hand, the business-critical applications running in virtual machines also need to maintain high performance, and they need processing power to do so. The last thing a virtualization administrator wants is performance complaints from end users. VM administrators face difficulty in balancing the need to maximize the ROI of the server hardware against the requirement for applications to perform optimally.

Fortunately, with the right tools in place and the guidance from this whitepaper, virtualization administrators will gain a deeper understanding and be able to easily make the right decisions related to vCPU sizing.

By reading this whitepaper, VM administrators will learn:

• How to correctly size VMs in an environment
• How to approach appropriate sizing of new VMs that need to be deployed
• How to institute a regular CPU sizing process for the data center
• How CPU usage works at different levels of the virtualization “stack”
• How to screen for CPU-based VM performance issues before attempting to optimize an environment
• What VM performance metrics can provide insights on CPU usage and how to access them

CPU Usage through Different Levels of the Virtualization Stack
With the traditional physical server (or desktop PC), you have an entire CPU or multiple CPUs (each with multiple cores) dedicated to the OS and applications running on it. The virtualization hypervisor adds an additional layer between OS and the physical CPU, allowing multiple virtual machines to share the hardware. Instead of the CPU requests from applications going to the OS and then the OS scheduling them on the physical CPU, the OS in the VMs talks to virtual CPUs (which it thinks are real physical CPUs). Requests from the multiple virtual CPUs are scheduled, by the hypervisor, across the multiple physical CPU cores. Just like with memory sharing in virtualization, with CPU sharing in virtualization, there is the traditional OS CPU scheduler and the hypervisor CPU scheduler.

All of this enables greater utilization and massive sharing of the server’s physical CPUs.

Differences in CPU Usage in Physical and Virtual Servers
To summarize, the difference in CPU usage with physical and virtual servers is this:

• In the physical world, applications are scheduled by the OS onto the physical CPUs
• In the virtual world, applications running on each OS, in each VM, make requests of virtual CPUs that are then scheduled by the hypervisor

Single and Multi-Threading in CPUs
Not only do today’s servers have multiple CPU cores in each CPU, but many also support multi-threading (“Hyper-Threading” on Intel CPUs). This means that each CPU core can execute more than one “thread” at a time. You can think of a thread as an independent stream of work within a process, so envision a CPU core being able to make progress on multiple streams in parallel. However, having more than one CPU will only benefit applications that are multi-threaded. In other words, servers dedicated to a single task (database servers, web servers, etc.) that run a single-threaded application will see no performance benefit from adding more CPUs.

To know if an application is single-threaded or multi-threaded, you have two options:

1. Ask the software manufacturer if it is multi-threaded and supports SMP (symmetric multi-processing)
2. See if, when you have more than one processor or core on a server, the program’s multiple processes are using more than one processor or core at a time. For example, on a quad-core CPU, watch the application as its CPU demand increases. If the application can use only 25% of total CPU capacity (1 of the 4 cores), then it is a single-threaded application that can’t use more than one core. On the other hand, if the application uses 50, 75, or 100% of the total CPU capacity (2, 3, or 4 of the 4 cores), then it is multi-threaded.

Whether you are running an application on a physical server or a virtual machine, you only want to have multiple CPUs available if the application running on that host can take advantage of those CPUs with multiple threads.
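
The second option above can be roughed out in a few lines of Python. This is only a sketch: the function names and the tolerance value are illustrative assumptions, and the result is only meaningful if the peak was observed while the application was under heavy load and was the main consumer of CPU on the server.

```python
def cores_in_use(peak_total_cpu_percent: float, core_count: int) -> float:
    """Convert a whole-server CPU percentage into an equivalent number of busy cores."""
    return peak_total_cpu_percent / 100.0 * core_count


def looks_single_threaded(peak_total_cpu_percent: float, core_count: int,
                          tolerance_cores: float = 0.1) -> bool:
    """Heuristic: the application never used noticeably more than one core's worth of CPU.

    tolerance_cores is an assumed allowance for background OS activity.
    """
    return cores_in_use(peak_total_cpu_percent, core_count) <= 1.0 + tolerance_cores


if __name__ == "__main__":
    # Quad-core server, application peaks at 25% of total CPU: one core's worth.
    print(looks_single_threaded(25.0, 4))
    # Same server, application peaks at 75% of total CPU: three cores' worth.
    print(looks_single_threaded(75.0, 4))
```

A result of "looks single-threaded" is a hint to confirm with the software manufacturer, not proof on its own.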

Performance Impacts of CPU Command Processing With One or Multiple vCPU’s
On a physical server if you have multiple CPUs available for the server’s primary application but that app isn’t multi-threaded, you are wasting the cost of those CPUs. However, in a virtual infrastructure, if you over-allocate multiple vCPU’s to a VM and the primary app on that VM isn’t multi-threaded, you could actually cause performance issues for that VM and others. The reason for this is that, with multiple vCPU’s, the hypervisor’s CPU scheduler must wait for multiple physical CPU time slots to become available before it can process requests from the multi-vCPU VM.

In other words:

• With one vCPU, CPU requests are quickly processed (or they are waiting on pCPU if no pCPU is available)
• With multiple vCPU’s, the hypervisor CPU scheduler must wait for multiple pCPUs to be available
• Having multiple vCPU’s when not needed will slow down VMs
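
To see why the points above matter, here is a deliberately simplified toy model in Python. It assumes the strict behavior described in this section, where a VM can only run when as many pCPUs are free at once as it has vCPU’s. Real hypervisor schedulers are more sophisticated than this, and the function name and the sample trace are made up for illustration.

```python
from typing import Optional, Sequence


def slots_waited(free_pcpus_per_slot: Sequence[int], vcpus: int) -> Optional[int]:
    """Return how many time slots pass before `vcpus` pCPUs are free at the same time.

    free_pcpus_per_slot is a hypothetical trace: the number of idle pCPUs in each
    successive scheduling slot. Returns None if no slot in the trace has enough.
    """
    for waited, free in enumerate(free_pcpus_per_slot):
        if free >= vcpus:
            return waited
    return None


if __name__ == "__main__":
    trace = [1, 1, 2, 1, 4]  # idle pCPUs in five consecutive slots (made-up numbers)
    for n in (1, 2, 4):
        print(n, "vCPU(s) wait", slots_waited(trace, n), "slot(s)")
```

On the same busy host, the single-vCPU VM runs immediately while the 4-vCPU VM waits until the fifth slot, even if its application would only ever have used one of those vCPU’s.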


Detecting Virtual Host and VM Performance Problems before Analyzing vCPU Allocations
Deciding the proper number of vCPU’s for a VM should be a long-term performance planning exercise which is periodically performed. Prior to starting that vCPU planning exercise, you should first determine if there is already CPU over-utilization within your virtual machines or on your virtual host.

Typically, this is done by checking the following:

• CPU Ready (cpu.ready.summation) in milliseconds (ms)
This per virtual machine and per vCPU statistic is the number of milliseconds that the VM was ready to execute requests on the virtual host’s CPU but there wasn’t pCPU available to do so. An increasing CPU Ready value for a VM indicates that there is an ongoing lack of CPU cores on the virtual host.

• CPU Usage (cpu.usage.average) in %
This CPU usage statistic is available per host, per VM, or per resource pool. It shows us what % of the time, over the time range selected, that the CPU was utilized. Note that the total CPU includes all vCPU’s on a VM or all pCPUs on a host. If you already have > 80% CPU utilization on a VM or host, over a 1 hour time period, you should find ways to solve that problem. Those solutions could include reducing the number of vCPU’s on your VMs, migrating active VMs to another host, adding more pCPU to a host, or dealing with misbehaving applications.
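
Because CPU Ready is reported in milliseconds, it is often easier to reason about as a percentage of the sample interval. The Python sketch below shows that conversion plus the 80% usage check described above. The function names are illustrative, and the default of 20 seconds assumes you are reading real-time chart samples; pass the actual interval length when working from historical rollups.

```python
from typing import Sequence


def cpu_ready_percent(ready_ms: float, interval_seconds: float = 20.0) -> float:
    """Express a cpu.ready.summation value as a percentage of its sample interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return ready_ms * 100.0 / (interval_seconds * 1000.0)


def sustained_high_usage(usage_samples_percent: Sequence[float],
                         threshold_percent: float = 80.0) -> bool:
    """True if average cpu.usage.average over the supplied samples exceeds the threshold.

    The caller is expected to pass samples covering the period of interest
    (for example, one hour), as suggested in the text above.
    """
    if not usage_samples_percent:
        raise ValueError("need at least one sample")
    average = sum(usage_samples_percent) / len(usage_samples_percent)
    return average > threshold_percent


if __name__ == "__main__":
    # 1,000 ms of ready time in a 20-second sample is 5% of that interval.
    print(cpu_ready_percent(1000.0))
    print(sustained_high_usage([85.0, 90.0, 82.0]))
```

What counts as an acceptable CPU Ready percentage depends on the workload, so this sketch converts the value and leaves the judgment to you.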

If your virtual hosts don’t have the CPU capacity needed for today’s application demands, you should solve that first. Note that one potential solution for high virtual host CPU utilization is to reduce the number of vCPU’s allocated to VMs if they were dramatically over-allocated.

VMware Performance Metrics to use for Virtual Host pCPU and VM vCPU Usage Analysis
At this point, you’ve determined that your physical server isn’t over-utilized and it’s time to move on to a structured analysis of vCPU allocation on one or across all virtual machines. We’ll provide a detailed process below but first, let’s cover what you need to know before you start that process:

• How to Measure CPU Usage Maximum (cpu.usage.maximum) in percent
This shows the maximum amount of CPU that was used at any one time, as a percentage. Keep in mind that if you have 2 vCPU’s, a 50% utilization value could be 100% of one vCPU and 0% of the other, or 50% of one and 50% of the other. This value will be used to determine if we hit the maximum amount of CPU already allocated to that VM during the time range.

• How to Measure CPU Usage Average (cpu.usage.average) in percent
This value shows the average CPU utilization over the time range. Note that this is based on the total number of vCPU’s such that if you have 2 vCPU’s, a 100% utilization over the time range is 100% of both vCPU’s.

• Understand Time Ranges and Sample Periods
To be able to accurately understand statistics that you are viewing, you must fully understand that these stats are measured over the default sample periods or whatever sample period you specify. For example, if you measure real-time or one minute CPU usage, it is very likely that you will see periodic CPU utilization peaks of 100%. Short periodic CPU usage at 100% isn’t cause for alarm or changes to CPU configuration. Instead, I recommend that you look at CPU statistics over a slightly longer sample time to ensure that you are seeing trends instead of instantaneous usage.

Some number of those statistical samples are then pulled out and stored in a historical database and used when you run a performance report over a specific time period. Understanding both the sample time and the time range are important when it comes to interpreting performance graphs.

• Know How to View and Modify vCPU’s for VMs
Viewing and modifying vCPU configurations on your VMs isn’t something VMware Admins need to do every day. However, you do need to know where to do it and when you can make changes. You can view VM vCPU configuration by clicking on a VM and looking at the Summary tab. You can edit VM vCPU’s by going to a VM’s properties and then clicking on the CPU in the hardware tab. You can also view vCPU configurations across all VMs by going to a higher level in the virtual infrastructure (from a VM, go up to the ESXi server or up to the resource pool or cluster level) and then clicking on the Virtual Machines tab.
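
Two of the points in the list above are easy to misread: VM-level usage is an aggregate across all vCPU’s, and the sample period changes what you see. The Python sketch below illustrates both with made-up numbers. The function names are illustrative and are not part of any VMware tool.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def vm_usage_percent(per_vcpu_percent: Sequence[float]) -> float:
    """VM-level usage is the average across all of the VM's vCPUs."""
    return mean(per_vcpu_percent)


def rollup(samples: Sequence[float], samples_per_period: int) -> List[Tuple[float, float]]:
    """Collapse fine-grained samples into (average, maximum) pairs per longer period."""
    if samples_per_period < 1:
        raise ValueError("samples_per_period must be at least 1")
    periods = []
    for start in range(0, len(samples), samples_per_period):
        chunk = samples[start:start + samples_per_period]
        periods.append((mean(chunk), max(chunk)))
    return periods


if __name__ == "__main__":
    # Two very different 2-vCPU situations that both report 50% at the VM level.
    print(vm_usage_percent([100.0, 0.0]), vm_usage_percent([50.0, 50.0]))
    # One brief 100% spike among short samples: the average hides it, the maximum keeps it.
    print(rollup([10.0, 100.0, 10.0, 10.0, 20.0, 30.0], 3))
```

This is why the average and the maximum should be read together, and why a single short spike in a real-time chart is not, by itself, a reason to change a VM’s vCPU configuration.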

Additional Factors to Keep in Mind when Allocating VM CPUs
Limits

By configuring a limit on a virtual machine, an artificial cap is placed on the maximum amount of CPU that the VM can use. Without a limit, the maximum CPU that can be used is the MHz of the pCPUs on the server, per vCPU allocated to a VM. Keep in mind that while a VM has full access to the total MHz of a pCPU (per vCPU allocated), those pCPUs are still shared between that VM and all others on that virtual host. Note that limits configured in the hypervisor aren’t visible to the guest OS, which can cause unexpected application performance issues.
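
The ceiling described above is simple arithmetic, sketched here in Python. The function name and the 2,600 MHz core speed are illustrative assumptions, and the result is an upper bound only, since the pCPUs are shared with other VMs.

```python
from typing import Optional


def max_usable_mhz(vcpus: int, pcpu_core_mhz: float,
                   limit_mhz: Optional[float] = None) -> float:
    """Upper bound on the CPU a VM can consume, with or without a configured limit."""
    unlimited = vcpus * pcpu_core_mhz
    if limit_mhz is None:
        return unlimited
    return min(unlimited, limit_mhz)


if __name__ == "__main__":
    # A 2-vCPU VM on hypothetical 2,600 MHz cores, first without and then with a limit.
    print(max_usable_mhz(2, 2600.0))
    print(max_usable_mhz(2, 2600.0, limit_mhz=2000.0))
```

With the limit in place, the guest OS still sees two full-speed vCPU’s even though the VM can never get more than 2,000 MHz in total, which is the mismatch the paragraph above warns about.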

While functionality exists to configure a MHz limit on a virtual machine's vCPU, it rarely makes sense. Instead, it makes more sense to set a CPU MHz limit across all VMs in a particular resource pool.

Many times, VM CPU limits are put in place by an administrator who did not fully understand the poor performance and wasted resources that limits can create. When looking at CPU utilization, CPU limits skew performance metrics and can cause confusion while troubleshooting. In other cases, VM CPU limits have also been put in place by the VMware Admin to limit physical resource consumption by applications (unbeknown to the application owner).

Reservations

CPU reservations artificially set the minimum amount of CPU that a virtual machine (or resource pool) has access to. Even though those CPU resources may not be needed by the VM now, a reservation pulls them away from other virtual machines that may need them. Like limits, reservations are better set on resource pools instead of individual virtual machines, as they can hurt the performance of other virtual machines and skew CPU metrics due to the artificial requirements being put in place.

Knowing an Application’s Needs

Taking the time to look inside the virtual machine and analyze the CPU resources that an application uses can yield a great deal of information as to that VM’s CPU needs. When evaluating a Windows operating system, Resource Monitor and Performance Monitor can be run to expose which processes use the most CPU and how it varies.

Also, speaking to the business owner for an application helps determine when this application is used, who uses it, what would happen if it were unavailable, and how the use of the application is growing.

By taking time to understand these factors, VM administrators can draw further insights into properly sizing vCPU for the VM, configuring resource controls (if needed), and understanding the priority of the application as compared to other apps.

Special Considerations When Changing vCPU Configurations

When you determine that the number of vCPU’s that a VM has configured needs to be changed, there are a few considerations:

• When going from one vCPU to many OR from many to one vCPU, kernel changes may be required in the VM guest operating system. For example, with Windows Server 2003 you need to make a change to the HAL (see VMware KB article 1003978 for more information). However, with Windows Server 2008 you can switch between single and multiple CPUs without making any HAL changes.
• Virtual machines need to be shut down to remove vCPU’s, and most VMs need to be shut down to add vCPU’s unless the vSphere “hot plug CPU” feature was enabled before boot AND the VM meets the operating system requirements to use that feature.

Instituting a VM CPU Sizing Process
Not only is a virtual environment dynamic, but the usage of the applications in the VMs is also in constant flux. For a VMware Admin working on a critical production virtual infrastructure with hundreds or thousands of constantly changing virtual machines, there must be a formal process for proper vCPU sizing rather than just a "rule of thumb".

This process, ideally, involves the application admins and will have to be undertaken BOTH when a new VM is being created for an application as well as on a periodic basis.

The workflow diagram below introduces a best practice for how to execute this process. While this process may need slight modification for certain companies, it will work as-is for most VMware Administrators.

Below, this vCPU sizing process is presented step by step:

Determining Time Frame for Analysis
As I mentioned earlier in this guide, when performing your analysis, you need to take the perspective of the time frame into consideration. Statistics can easily be interpreted incorrectly and improper vCPU configurations can be made just by looking at a poorly selected time range.

While it is difficult to pick one timeframe that is perfect for all situations, I generally recommend a timeframe that is no less than 1 week. However, in some cases, that time frame may be as much as a year (such as a retail company that has high utilization around the holiday season or a university that has high utilization around registration). On the other hand, if you have a common application that you know is relatively predictable, you can safely look at time frames between 1 week and 1 month.

Finding the Peak Value
So what peak value do you choose? “Maximum” and “Peak” values are interesting because they show if the vCPU’s were ever 100% utilized during your time period. However, you need to 1) investigate WHEN that was (was it during backups?) and 2) weigh that against the average utilization. This matters because an instantaneous value of 100% CPU utilization for a single-vCPU VM over a week means little if it only happened once during the backup window and the average utilization is just 25%.

In other words, don't just look at the current value and use that. Use the tips above in determining a time frame to make sure that you identify the true peak.
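
One way to put a peak in context is to compare the overall maximum with the maximum seen outside known maintenance windows such as backups. The Python sketch below does that with made-up samples. The function names, the timestamps, and the 01:00 to 04:00 backup window are illustrative assumptions, and whether a given window should really be ignored remains a judgment call for the administrator and the application owner.

```python
from datetime import datetime, time
from typing import Iterable, List, Optional, Tuple

Window = Tuple[time, time]  # (start, end) as times of day


def in_window(moment: datetime, window: Window) -> bool:
    """True if the sample's time of day falls inside the window (which may cross midnight)."""
    start, end = window
    t = moment.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def peak_outside_windows(samples: Iterable[Tuple[datetime, float]],
                         excluded: List[Window]) -> Optional[float]:
    """Highest usage among samples that do not fall inside any excluded window."""
    kept = [usage for moment, usage in samples
            if not any(in_window(moment, w) for w in excluded)]
    return max(kept) if kept else None


if __name__ == "__main__":
    samples = [
        (datetime(2012, 3, 5, 2, 30), 100.0),  # hypothetical backup-time spike
        (datetime(2012, 3, 5, 14, 0), 60.0),
        (datetime(2012, 3, 6, 9, 15), 35.0),
    ]
    backup_window = (time(1, 0), time(4, 0))
    print(peak_outside_windows(samples, []), peak_outside_windows(samples, [backup_window]))
```

A large gap between the two numbers suggests the 100% reading came from a scheduled job and not from everyday demand.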

Comparing the Peak Usage Value to Allocated CPU
From there, the true peak value observed, if deemed relevant in context, can be compared to the number of vCPU’s configured for the virtual machine. The process should be:

• Look at the average and peak (maximum) percent of vCPU utilized during your sample
• Compare that to the number of vCPU’s that are allocated to that VM
• If the average is less than 38% and the peak is less than 45% then you should consider downsizing vCPU’s
• If the average is greater than 75%, and the peak is above 90%, then you should consider adding vCPU’s.

For example, if the VM is configured with 2 vCPU’s and the maximum utilization is 100%, you are maxing out both vCPU’s at some point. If the average utilization is also consistently > 80%, then consider adding vCPU’s.

On the other side of this, if after the comparison we find that the vCPU’s never hit their maximum, average utilization is low, and the number of vCPU’s is > 1, then consider migrating the VM down to 1 vCPU (or reducing the VM’s vCPU count by one).
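
The comparison above can be captured as a small decision function. This Python sketch uses the thresholds given in this section (38% and 45% for downsizing, 75% and 90% for upsizing); the function name and the wording of the returned recommendations are illustrative.

```python
def sizing_recommendation(avg_percent: float, peak_percent: float, vcpus: int) -> str:
    """Apply this section's average/peak thresholds to a VM's observed CPU usage."""
    if avg_percent > 75.0 and peak_percent > 90.0:
        return "consider adding vCPUs"
    if vcpus > 1 and avg_percent < 38.0 and peak_percent < 45.0:
        return "consider removing a vCPU"
    return "leave as is"


if __name__ == "__main__":
    print(sizing_recommendation(avg_percent=85.0, peak_percent=100.0, vcpus=2))
    print(sizing_recommendation(avg_percent=20.0, peak_percent=40.0, vcpus=4))
    print(sizing_recommendation(avg_percent=25.0, peak_percent=100.0, vcpus=1))
```

The third call mirrors the backup-window example: a 100% peak with a 25% average does not, on its own, trigger a change.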

Forecasting Usage Changes until Next VM CPU Allocation Review
Before deciding to downsize a VM's vCPU configuration, it should be noted that the vCPU demands of the VM (and, more specifically, its applications) could increase between now and the next time that this vCPU sizing process is conducted.

Has the application owner been contacted? Was an increasing trend in vCPU demand noted based on historical analysis? It is important to ensure that sufficient vCPU for both current usage and forecasted future usage growth is provisioned or a VM will face vCPU-related performance issues.
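
If historical averages are available, a simple straight-line trend is one way to check whether demand is rising. The Python sketch below fits an ordinary least-squares line to evenly spaced historical values and extends it forward. Using a linear trend is an assumption made for illustration, not a method prescribed by this paper, and it is no substitute for talking to the application owner.

```python
from typing import Sequence


def linear_forecast(history: Sequence[float], periods_ahead: int) -> float:
    """Fit a least-squares line to evenly spaced history and extrapolate it forward."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical values")
    x_mean = (n - 1) / 2.0
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + periods_ahead)


if __name__ == "__main__":
    # Hypothetical monthly average CPU usage (%) for one VM, oldest first.
    monthly_avg = [40.0, 45.0, 50.0, 55.0]
    print(linear_forecast(monthly_avg, periods_ahead=1))
```

A forecast that crosses the upsizing thresholds before the next review is a reason to hold off on downsizing.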

Factor in a Buffer to determine the Right Allocation
Besides factoring in an expected growth rate, adding a buffer is important to ensure that the vCPU allocation is accurate. While historical peaks have likely been taken into account, it is still good to factor in a buffer to ensure that the virtual machine's vCPU has headroom to grow into and does not max out. For less critical applications, perhaps this is not necessary based on the application’s needs, and can be avoided to retain more CPU capacity for other VMs. However, for multi-threaded critical applications whose CPU utilization fluctuates, adding an additional vCPU is recommended. This way, in the event that an unexpected, business critical demand on CPU happens, end-user application performance won’t suffer.
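
The buffer rule above reduces to a one-line decision, sketched here in Python. The function and parameter names are illustrative, and deciding whether an application counts as critical, multi-threaded, and fluctuating is left to the administrator.

```python
def vcpus_with_buffer(sized_vcpus: int, critical: bool, multi_threaded: bool,
                      fluctuating_usage: bool) -> int:
    """Add one vCPU of headroom only for critical, multi-threaded apps with fluctuating usage."""
    if critical and multi_threaded and fluctuating_usage:
        return sized_vcpus + 1
    return sized_vcpus


if __name__ == "__main__":
    print(vcpus_with_buffer(2, critical=True, multi_threaded=True, fluctuating_usage=True))
    print(vcpus_with_buffer(2, critical=False, multi_threaded=True, fluctuating_usage=True))
```

The multi-threaded check matters because an extra vCPU gives a single-threaded application nothing to grow into.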

Change the VM Sizing Allocation, Kernel-Level Changes and Document Change
Now that the timeframe for analysis has been determined, peak utilization found and compared to the allocated amount, and future vCPU usage forecast, it is time to take action: the virtual machine's number of vCPU’s can be changed.

Unless the correct guest OS is installed and CPU hot plug is enabled (assuming you need to add additional vCPU), the guest OS must be shut down to add or remove vCPU from a virtual machine.

Once the guest OS is shut down, a virtual machine's vCPU can be resized by right-clicking on the VM (in the hosts and clusters inventory) and clicking on Properties. At that point, you'll see the window below where you can resize the VM’s CPUs on the Hardware tab.

Once the VM's vCPU has been resized, you need to take Guest OS Kernel / HAL changes into account.

Typically, the servers that need HAL / Kernel changes when going from one to multi-CPU or from multi to one are the older Windows Server operating systems like Windows Server 2003. Without the proper changes, Windows Server 2003 systems will perform slowly or not run at all. Windows Server 2008 and Linux-based systems don’t require any HAL/Kernel changes when their quantity of virtual CPUs changes.

With Kernel / HAL changes done, you can document the change to the VM’s vCPU and power the VM back on.

Setting Up a Regular CPU Sizing Review Based on an Appropriate Time Frame
Because application usage changes, vCPU must be continuously evaluated to ensure that performance will not be impacted due to additional changes in the dynamic virtualized infrastructure.

A regular vCPU review process should be set at appropriate intervals. It is this paper’s recommendation that this process be performed once per month. However, some organizations that are growing quickly may want to do this more frequently. Once a VM sizing process has been conducted a few times on the virtual machines in an environment, the steps will become more familiar and administrators will be able to build on this experience to streamline the process.

Provisioning a New VM for Appropriate vCPU Allocations
Now, let's look at the vCPU sizing workflow from the beginning and take the alternate path: let's say that a new virtual machine is being provisioned. There won't be any historical performance data for this new virtual machine. Because of this, the vCPU configuration made for this VM could be much less accurate and the VM will need to be monitored more closely.

Still, following the sizing procedure below with frequent monitoring during the time shortly after the VM is created will provide the necessary insights into the number of vCPU’s that VM should have.

Initially Allocate the Minimum vCPU’s Possible
In many cases, there will be some idea of how many vCPU’s a new VM should have based on the same application (or similar app) running on another VM. Even if this information is available, the process of monitoring and reviewing described below should still be followed.

On the other hand, if a brand new VM is being created for an application or use case that has not been previously deployed, and there is no indication of how many vCPU’s should be allocated for that app and its users, you could just go with the recommendations as if this app were being installed on a physical server. Or, you could start with just 1 vCPU and see how it goes from there.

While it may be tempting to go with a "gut feel" or use the same number of vCPU’s that a physical server is configured with (e.g., just give all Windows servers 2 x vCPU’s OR “the SharePoint admin said he’d like his VM to have 6 vCPU’s”), it's likely that the allocation will grossly overshoot the number of vCPU’s needed or, even worse, underestimate the number of vCPU’s required by the application and its users.

Monitor CPU Metrics Intensely
Once a new VM has been deployed with a set number of vCPU’s, CPU usage should be monitored intensely. Just how frequent and in-depth this monitoring is depends on the criticality of that VM. Administrators may also be able to rely on vSphere alarms to alert if and when the new VM is hitting high CPU utilization.

When monitoring, look both at the vSphere VM CPU usage level and inside the guest OS to see how much CPU is in use and by what application. Is the VM maxing out the base quantity of vCPU’s you configured? If so, then more vCPU’s are likely needed.

Add More vCPU’s One at a Time
If the new VM is running low on CPU capacity, vCPU’s should be added one at a time. Remember, you want to find the right number of vCPU’s for the VM, not just throw a bunch of resources at it and potentially cause performance problems later. After each addition, monitor the VM very carefully before adding another.
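
The add-one-and-monitor loop above can be sketched in Python. Here `measure_peak_percent` is a hypothetical callback standing in for a full monitoring period at a given vCPU count, the 90% threshold is borrowed from the upsizing guidance earlier in this paper, and `max_vcpus` is an arbitrary safety cap. In practice each iteration is a change window followed by days of observation, not a function call.

```python
from typing import Callable


def grow_until_sufficient(measure_peak_percent: Callable[[int], float],
                          start_vcpus: int = 1,
                          max_vcpus: int = 8,
                          peak_threshold: float = 90.0) -> int:
    """Add one vCPU at a time until observed peak usage drops to the threshold or below."""
    vcpus = start_vcpus
    while vcpus < max_vcpus and measure_peak_percent(vcpus) > peak_threshold:
        vcpus += 1  # one at a time, then measure again
    return vcpus


if __name__ == "__main__":
    # Stand-in workload that needs about 2.5 cores' worth of CPU at its peak.
    def simulated_peak(vcpus: int) -> float:
        return min(100.0, 250.0 / vcpus)

    print(grow_until_sufficient(simulated_peak))
```

Starting low and stepping up finds the smallest count that meets demand, which is the point of this step.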

Add the New VM to the Regular CPU Sizing Review Process
This new VM should be included in the periodic vCPU review process discussed above once its vCPU usage has stabilized and it has an appropriately sized quantity of vCPU’s.

Additional Considerations
Assessing SLA Requirements
As administrators, we strive to give applications what they need for peak performance. However, in the real world, due to service level agreements or cost models, this isn’t always possible. In these cases, we have to intentionally give fewer vCPU’s than would be ideal or use resource controls like limits to only give a VM (and its underlying CPU-hungry application) what should be allocated for business reasons.

Thus, keep service level agreements in mind when analyzing vCPU allocations and realize that there are cases where we have to allocate a vCPU value other than what is optimum.

Conclusion
Besides memory, CPU is the most finite computing resource that a virtual infrastructure has. By leveraging the information presented in this whitepaper and implementing the workflow detailed above, virtualization administrators can more efficiently use their computing resources. This procedure will allow data centers to reclaim CPU resources for other VMs, which results in increased VM density and will help defer purchases of hardware. Additionally, regular monitoring of CPU usage will allow VM administrators to proactively spot problem areas in VMs that are underprovisioned and may experience performance issues. Ultimately, accurately sizing vCPU in virtual machines can result in a better return on investment for a virtualization initiative, application owners that are confident in the performance that a virtualization endeavor provides their applications, and end users that are content with their usage of a firm’s IT resources.

About the Author

David Davis is the author of the best-selling VMware vSphere video training library from Train Signal. He has written hundreds of virtualization articles on the Web, is a vExpert, VCP, VCAP-DCA, and CCIE #9369 with more than 18 years of enterprise IT experience. His personal Website is VMwareVideos.com.
