Initial Experiences with XenServer 6.5 and NVIDIA GRID using HP DL380 Gen9 Hardware: Guest Blog Post by Richard Hoffman
I have mentioned a colleague of mine, Richard Hoffman, a couple of times now, including a previous article where I talked about XenServer 6.2 Dynamic Memory Control and some Blue Screen of Death (BSOD) events that can occur. I’ve also written a follow-up blog post noting that this has been resolved in XenServer 6.5, so vGPU should now work without issue alongside DMC. Late last month I convinced Richard to share some of his key findings from a Citrix / NVIDIA GRID project he has been involved in for the last several months. You can find that blog post here: NVIDIA GRID vGPU Deep Dive! Lessons Learned from the Trenches: Guest Blog Post by Richard Hoffman.
Well, it seems Richard has been bitten by the blogging bug and is ready to share some more experiences from the field! In this blog post Richard will be sharing his initial experiences with XenServer 6.5 and NVIDIA GRID using HP DL380 Gen9 hardware. I’m a big fan of these new servers: they feature Intel’s new E5 v3 Haswell-EP processors and DDR4 memory, and they can hold two K1 or K2 cards. With XenServer 6.5 and NVIDIA GRID powered virtual desktops, they really scream! If you have content you would like to contribute and be a guest blogger (or regular) on itvce.com, feel free to reach out to me at dane@itvce.com or on Twitter @youngtech so we can discuss. Without further ado, below is the guest blog post by Richard Hoffman; you can find him on LinkedIn or on Twitter.
As a follow-up to my previous work with XenServer 6.2’s Dynamic Memory Control and NVIDIA GRID vGPU, I will share my initial experiences with XenServer 6.5 and NVIDIA GRID using HP DL380 Gen9 Hardware. First, I will highlight the differences between setting up XenServer 6.2 and XenServer 6.5 for NVIDIA GRID. After comparing these differences, I will run through a step-by-step setup in the second section of this article.
I am testing with XenDesktop 7.6 on an HP DL380 Gen9. This server currently holds two GRID K2 cards. Alternative configurations can instead hold up to two GRID K1 cards and require a slightly different BOM (GRID K1 SKUs instead of K2).
If you are interested in working with Entisys or your HP team to build a similar field validated configuration, the Bill of Materials (BOM) for this dual NVIDIA GRID K2 server is shown below for your reference:
HP Part Number | Quantity | Description |
719064-B21 | 1 | HP DL380 GEN9 8SFF CTO SERVER |
652497-B21 | 1 | HP ETHERNET 1GB 2P 361T ADPTR |
665243-B21 | 1 | HP ETHERNET 10GB 2P 560FLR-SFP+ ADPTR |
719064-B21 #ABA | 1 | HP DL380 GEN9 8-SFF CTO SERVER |
719073-B21 | 1 | HP DL380 GEN9 SECONDARY 3 SLOT RISER KIT |
719076-B21 | 1 | HP DL380 GEN9 PRIMARY 2 SLOT RISER KIT |
719079-B21 | 1 | HP DL380 GEN9 HIGH PERF TEMP FAN KIT |
719082-B21 | 1 | HP DL380 GEN9 GRAPHICS ENABLEMENT KIT |
720620-B21 | 2 | HP 1400W FS PLAT PL HT PLG PWR SPPLY KIT |
726719-B21 | 16 | HP 16GB 2RX4 PC4-2133P-R KIT |
733660-B21 | 1 | HP 2U SFF EASY INSTALL RAIL KIT |
734360-B21 | 2 | HP 80GB 6G SATA VE 2.5IN SC EB SSD |
749974-B21 | 1 | HP SMART ARRAY P440AR/2G FIO CONTROLLER |
753958-B21 | 2 | NVIDIA GRID K2 RAF PCIE GPU KIT |
762768-B21 | 1 | HP DL380 GEN9 E5-2687WV3 KIT |
762768-L21 | 1 | HP DL380 GEN9 E5-2687WV3 FIO KIT |
455883-B21 | 2 | HP BLC 10GB SR SFP+ OPT |
Update!
The DL380 Gen9s are now on all four relevant HCLs for a GRID deployment. The four HCLs are below:
NVIDIA GRID HCL:
http://www.nvidia.com/object/enterprise-virtualization-where-to-buy.html
XenServer HCL:
http://hcl.xenserver.org/BrowsableServerList.aspx
XenServer vGPU HCL:
http://hcl.xenserver.org/vGPUDeviceList.aspx
XenServer GPU Passthrough HCL:
http://hcl.xenserver.org/GPUPass-throughDeviceList.aspx
First, I will start with some compatibility differences that I found.
Compatibility:
XenDesktop 7.1 is not officially supported with XenServer 6.5. I did test to see if it would work; however, the creation of the host connection from XenDesktop 7.1 to the XenServer 6.5 host failed, and I made no further attempts to make it work. XenDesktop 7.5 and 7.6 officially support XenServer 6.5. As I noted above, I am testing with XenDesktop 7.6.
On an HP-centric note, the DL380 Gen9s were designed for 64-bit operating systems only. Consequently, HP is not offering RAID driver support for XenServer 6.2 (which has a 32-bit Dom0) on a Gen9 server. XenServer 6.5 is supported on the DL380 Gen9 servers and works with the embedded RAID controller drivers.
XenServer 6.5 requires the latest Citrix license server, 11.12.1. XenServer 6.5’s licensing window has slightly different license options than in 6.2. Notice the new “XenServer Desktop” and “XenServer Desktop+” options in the below screenshot.
“XenServer Standard” and the free version do not give you vGPU functionality but the other options (Desktop, Desktop+, and Enterprise) do. Thomas Poppelgaard has a good explanation of the new license model here: http://www.poppelgaard.com/citrix-xenserver-6-5
You can also reference the XenServer 6.5 Licensing FAQ. http://support.citrix.com/article/CTX141511
Connecting XenCenter to a host with GRID installed normally displays a “GPU” tab as shown below. If that is unexpectedly missing, check the XenServer licensing. If the XenServer does not have sufficient licensing to support vGPU, this tab will be absent.
Another symptom of having insufficient licensing is shown in the below screenshot. In the properties of a VM, the GPU type only shows two generic options.
Properly licensed, all GPU profiles are present as shown below.
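If you prefer to confirm the licensing from the command line, here is a quick check of my own, assuming the standard xe CLI in Dom0 (the exact edition names may differ from what XenCenter displays):
xe host-list params=name-label,edition
(Shows the license edition assigned to each connected host; as noted above, vGPU requires Desktop, Desktop+, or Enterprise)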
UEFI Not Supported:
XenServer 6.5 does not support a UEFI BIOS. The corresponding excerpt from the 6.5 release notes is below (http://support.citrix.com/servlet/KbServlet/download/38334-102-714582/XenServer-6.5.0-releasenotes.pdf)
“UEFI boot is currently not supported in XenServer. Customers should ensure that their XenServer hosts are configured to boot in Legacy BIOS mode. Consult your hardware vendor for detailed instructions.”
HP’s compatibility matrix for XenServer confirms the same below. http://www.hp.com/go/citrixcert
The two screenshots below show the change from UEFI to Legacy BIOS Mode on the DL380 Gen9.
Power Settings in XenServer 6.5:
Take note of the following article, which details changes in CPU management in XenServer 6.5: http://support.citrix.com/article/CTX200390 In 6.2, the conventional wisdom for high-performance graphics was to set the XenServer frequency governor to performance mode, which keeps the CPU p-states (performance states) at zero, the highest and most active level. Given that some BIOSes don’t have an option to let the OS (XenServer) control the p-states, I normally also set the CPUs to high-performance mode via the BIOS. The idea behind keeping the p-states high is that the CPU is driving graphics commands to the GPU; therefore, you don’t want a lag while the CPU ramps up its p-state or comes out of a c-state (sleep state).
In 6.5, the frequency governor is set to performance mode by default. Therefore, you no longer need to modify this at the command line.
In XenServer 6.5, by default, all available c-states are used regardless of the BIOS settings. This behavior can be changed at the command line should you want it to behave like XenServer 6.2; doing so allows you to cap the c-states at zero for workloads that require low-latency wake-ups.
This document also advises on Intel TurboBoost options. TurboBoost reaches its highest frequencies only when some cores are inactive; it is therefore much less likely to reach its highest frequencies if all the cores are active. In other words, c-states should be enabled if you want TurboBoost to function at its highest levels. The higher the c-state number, the deeper the sleep level and the longer it takes to return to an active state. You will need to decide on the right balance of performance, power savings, and processor longevity. Do you want consistent high performance and no wake-up lag (c-states at zero), or do you want the ability to burst to high CPU frequencies (TurboBoost)? I welcome the posting of any real-world metrics comparing performance across c-state levels. I recommend taking a close look at this document to make the right choice for your environment!
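For completeness, below is a minimal sketch of capping the c-states from the Dom0 command line, using the same xen-cmdline tool shown later for the frequency governor. I am assuming Xen’s max_cstate boot parameter here; verify the exact parameter and values against CTX200390 before applying it, and note that a host reboot is required for the change to take effect.
/opt/xensource/libexec/xen-cmdline --set-xen max_cstate=0
(Restricts the CPUs to C0, i.e. no sleep states; use max_cstate=1 to allow only C1)
/opt/xensource/libexec/xen-cmdline --get-xen max_cstate
(Queries the current setting)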
Above 4G Decoding:
In XenServer 6.2, vGPU required that “Above 4G Decoding” or “PCI Express 64-Bit BAR Support” be disabled. This was due to the limitations of the 32-bit Dom0. 6.5’s 64-bit Dom0 allows you to leave this setting enabled, which allows the connection of more PCI devices. A screenshot of this setting on the DL380 Gen9 is below.
VDA Patch:
For virtual desktops running on XenServer 6.5 with the corresponding NVIDIA GRID release (versions 340.57/341.08), the hotfix below should be applied to the virtual desktop OSes.
http://support.citrix.com/article/CTX140263
Update NVIDIA Driver in Virtual Desktops:
For virtual desktops running on XenServer 6.5 with the corresponding NVIDIA GRID release (versions 340.57/341.08) installed on the XenServer hypervisor, the virtual desktops must have the corresponding NVIDIA driver installed. NVIDIA GRID drivers can be downloaded here: http://www.nvidia.com/Download/index.aspx?lang=en-us
Accurate vCPU Configuration in XenCenter:
In XenServer 6.2, hosts with two or more CPU sockets could not have the vCPU count of a VM accurately set in XenCenter. Setting the number of vCPUs would appear to take effect in XenCenter; however, running Task Manager or Device Manager in the guest OS would show otherwise. You could, though, set the vCPU count via the command line. With XenServer 6.5, these settings can now be accurately set through XenCenter by modifying the VM properties:
Previous Commands Needed to Accurately Set vCPU Count in XenServer 6.2:
xe vm-param-set platform:cores-per-socket=X uuid=<VM UUID>
xe vm-param-set VCPUs-max=<Maximum number of cores> uuid=<VM-UUID>
xe vm-param-set VCPUs-at-startup=<Number of VCPUs> uuid=<VM-UUID>
For example, to configure a virtual machine with 6 vCPUs on a host with 2 CPU sockets, run the following commands:
xe vm-param-set platform:cores-per-socket=3 uuid=698dc430-d9eb-8c59-ea32-783010dde169
(There are two sockets on the hardware and 3 cores from each socket will be used)
xe vm-param-set VCPUs-max=6 uuid=698dc430-d9eb-8c59-ea32-783010dde169
(The maximum number of vCPUs that the VM can see)
xe vm-param-set VCPUs-at-startup=6 uuid=698dc430-d9eb-8c59-ea32-783010dde169
(This is the number of vCPUs that the VM will be presented with)
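As a quick sanity check of my own, assuming the standard xe map-parameter syntax, you can read the values back before starting the VM:
xe vm-param-get uuid=698dc430-d9eb-8c59-ea32-783010dde169 param-name=platform param-key=cores-per-socket
(Should return 3 in this example)
xe vm-param-get uuid=698dc430-d9eb-8c59-ea32-783010dde169 param-name=VCPUs-at-startup
(Should return 6 in this example)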
Dynamic Memory Control:
In XenServer 6.2, enabling Dynamic Memory Control could make vGPU-enabled VMs blue-screen. This is fixed in XenServer 6.5, so feel free to enable it. For more detail, see Dane’s two articles below:
Previous article regarding the incompatibility of Dynamic Memory Control with vGPU on XenServer 6.2:
http://blog.itvce.com/2015/01/02/xenserver-dynamic-memory-and-nvidia-grid-vgpu-dont-do-it/
Previous article announcing that Dynamic Memory Control is compatible with vGPU on XenServer 6.5:
http://blog.itvce.com/2015/01/13/xenserver-6-5-dynamic-memory-and-nvidia-grid-vgpu-now-fixed-in-6-5-go-for-it/
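Dynamic Memory Control is normally enabled from the VM’s memory settings in XenCenter, but for reference here is a minimal command-line sketch, assuming the standard xe vm-memory-dynamic-range-set command and a hypothetical 2 GiB to 4 GiB range:
xe vm-memory-dynamic-range-set uuid=<VM-UUID> min=2GiB max=4GiB
(Sets the dynamic minimum and maximum memory for the VM; XenServer balloons the VM between these values as needed)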
Step by Step Installation and Configuration Instructions
Below are the step-by-step screenshots to get XenServer 6.5 up and running on a DL380 Gen9.
-While viewing the server console, in my case using HP’s iLO, press F9 to enter the System Utilities screen. Select “System Configuration” and press “Enter.”
-Then select “BIOS/Platform Configuration (RBSU),” as shown below.
-Select “System Options” and press “Enter.”
-Select “Processor Options” and press “Enter.”
-Enable “Intel® Hyperthreading Options” if desired.
-Navigate back to the below menu, select “Virtualization Options” and press “Enter.”
-Ensure “Virtualization Technology” is enabled.
-Ensure “Intel® VT-d” is enabled.
-Navigate back to the below menu, select “Power Management” and press “Enter.”
-Select “HP Power Profile” and set to “Maximum Performance.” As I mentioned above, consult this document to make the right choices for your environment. http://support.citrix.com/article/CTX200390
-After setting the HP Power Profile to Maximum Performance, all power and performance settings are set to maximum performance. Should you want to tune these individual settings based on the maximum performance profile, change the HP Power Profile to “Custom.” This keeps all settings at maximum performance but allows modification of individual settings.
-The c-states (sleep states) can be modified here should you choose. Press Esc to return to the previous menu.
-In the below menu, select “Performance Options.”
-Enable “Intel® Turbo Boost Technology” if desired. Press Esc to return to the previous menu.
-On the below menu, press “ctrl” + “a” to enter the hidden Service Options menu.
***The following step does not need to be done on XenServer 6.5, as it has a 64-bit Dom0 OS. However, I wanted to show it because this setting is not accessible without entering “ctrl + a.”
-On the Service Options menu, select “PCI Express 64-Bit BAR Support” and set to “Disabled.” (Leave this Enabled in 6.5)
-Press “F10” to save changes and then press “Y” to confirm.
-Press Esc to return to the previous menu.
-As noted above, XenServer 6.5 does not support a UEFI BIOS. Enable “Legacy BIOS Mode” as shown in the following three screenshots.
-I will spare you the screenshots for configuring the RAID controller as that is really off-topic. The RAID drivers for the Smart Array P440ar were on the XenServer 6.5 media, which is always appreciated.
Attach Virtual Media:
-Navigate to the “Virtual Drives” menu at the top of iLO and select “Image File CD-ROM/DVD”
-Windows Explorer will open. Navigate to the XenServer 6.5 installation ISO on your computer or file share.
-Reboot and choose to boot from the optical drive upon boot.
-The XenServer installation wizard is the same as it was in XenServer 6.2. The self-explanatory steps are below:
Select Accept EULA and press Enter:
Enable Thin Provisioning if desired, select OK, and press Enter:
Select Local Media and select OK and press Enter:
Select Skip verification and press Enter:
Enter a password twice, select OK and press Enter:
Select the appropriate Ethernet adapter, select OK and press Enter:
Select Static configuration, enter an IP Address, Subnet Mask, and Gateway, select OK and press Enter:
Enter a hostname and DNS Server addresses, select OK and press Enter:
Select your applicable region. Select OK and press Enter:
Select the applicable time zone within your region. Select OK and press Enter:
Select Using NTP, select OK and press Enter:
Configure the appropriate NTP Servers. Select OK and press Enter:
Select Install XenServer and press Enter:
When completed, select OK and press Enter:
Enable SSH:
-Should your security policy allow it, enable SSH to XenServer as follows.
-Launch the iLO Remote Console.
-Select “Remote Service Configuration” (screenshot below)
-Select “Enable/Disable Remote Shell,” enter the root password, and then select “Enable.”
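Alternatively, if you already have shell access to Dom0 via the console, SSH can be enabled from the command line. This is a minimal sketch assuming the standard service tools of the CentOS-based Dom0:
service sshd start
(Starts the SSH daemon immediately)
chkconfig sshd on
(Makes sshd start automatically on subsequent boots)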
Set XenServer Frequency Governor to Performance mode:
XenServer 6.5 has the frequency governor set to performance mode by default, so the commands below are mainly for reference. XenServer 6.2’s frequency governor is not set to performance mode by default. Changes in this area for XenServer 6.5 are documented here: http://support.citrix.com/article/CTX200390
Command to set the frequency governor to performance mode:
/opt/xensource/libexec/xen-cmdline --set-xen cpufreq=xen:performance
Command to query the current setting of the frequency governor:
/opt/xensource/libexec/xen-cmdline --get-xen cpufreq
License XenServer:
-In XenCenter, go to Tools > License Manager, check the host to be licensed, and click “Assign License.”
-XenServer 6.5 requires Citrix license server at or above version 11.12.1.
-Choose Desktop, Desktop+, or Enterprise to allow vGPU functionality.
-Point to a license server with the corresponding licenses installed.
Patch XenServer:
-In XenServer 6.2, patches could only be applied using XenCenter if XenServer was licensed; otherwise, the command line was the only way (a command-line sketch follows these steps). I understand that in XenServer 6.5, even the free version can have patches applied via XenCenter. There were no patches available to test with at the time of writing.
-In XenCenter, right-click on the top level in the left-pane and choose “Add”
-Enter the IP or FQDN of the new XenServer host.
-Once the new host is added, go to Tools>Check for Updates.
-Available updates for all hosts that XenCenter is connected to will be presented. Look for the new server under the “Applies To” column. For the desired updates for the new host, right-click on that update and choose “Download & Install.”
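For reference, here is a minimal sketch of applying a hotfix from the command line instead, assuming the xe patch commands carry over from 6.2 and using a hypothetical hotfix file name:
xe patch-upload file-name=/install/XS65E001.xsupdate
(Uploads the hotfix and returns its UUID)
xe patch-pool-apply uuid=<patch-UUID>
(Applies the uploaded hotfix to all hosts in the pool; check the hotfix release notes for any post-install reboot requirement)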
VDA Patch:
For virtual desktops running on XenServer 6.5 with the corresponding NVIDIA GRID release (versions 340.57/341.08), the hotfix below should be applied to the virtual desktop OSes.
http://support.citrix.com/article/CTX140263
Update NVIDIA Driver in Virtual Desktops:
For virtual desktops running on XenServer 6.5 with the corresponding NVIDIA GRID release (versions 340.57/341.08) installed on the XenServer hypervisor, the virtual desktops must have the corresponding NVIDIA driver installed. NVIDIA GRID drivers can be downloaded here: http://www.nvidia.com/Download/index.aspx?lang=en-us
Install GRID Manager:
-Ensure that the GRID installation on XenServer is done after all XenServer patches are applied, as some XenServer patches require the reinstallation of GRID. (At least that was the case in XenServer 6.2, so I would plan for similar behavior in 6.5.)
-Connect to the XenServer with an SCP application such as WinSCP.
-Copy the .rpm GRID installer to a desired folder on the XenServer.
-Connect to the XenServer with an SSH application such as PuTTY.
-Run this command to install GRID: rpm -iv /<yourDesiredFolder>/<yourGRIDinstallerPackage>.rpm For instance:
rpm -iv /install/NVIDIA-vgx-xenserver-6.5-340.57.x86_64.rpm
-Reboot XenServer: shutdown -r now
-Verify the GRID package installed by running the below command:
rpm -q NVIDIA-vgx-xenserver
-Verify the GRID installation by running the below command:
lsmod | grep nvidia
-Verify that the NVIDIA kernel driver can successfully communicate with the GRID physical GPUs in your system by running the nvidia-smi command, which should produce a listing of the GPUs in your platform:
nvidia-smi
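As an optional extra check of my own, assuming the standard xe vGPU commands introduced alongside GRID support, you can also confirm that XenServer now exposes the GRID profiles:
xe vgpu-type-list
(Lists the vGPU types, i.e. the GRID profiles, the host can offer)
xe gpu-group-list
(Lists the GPU groups created for the physical GPUs)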
*If you need to uninstall a previous version of the GRID Manager on XenServer, use the below commands. When updating the NVIDIA driver, you may need to uninstall and then reinstall to avoid conflicts.
-First find the name of the NVIDIA RPM package currently installed by running the below command:
rpm -q NVIDIA-vgx-xenserver
-Use the output of the above command to insert as the “packageName” variable in the below command:
rpm -e <packageName> (“e” stands for “erase”)
For example:
rpm -e NVIDIA-vgx-xenserver-6.2-331.59.01
The rest of the process to get virtual desktops provisioned is largely the same as for a non-GPU virtual desktop in XenDesktop. Therefore, I will summarize those steps.
-Create a Host Connection in Citrix Studio to connect XenDesktop to XenServer. The only thing that differs from the normal process of creating a Host Connection in XenDesktop is the need to define your desired GPU/vGPU profile in the Host Connection creation wizard.
-In Citrix Studio, create a Machine Catalog from your golden master that has the NVIDIA driver installed in the guest OS. Ensure that the Properties of this VM have your desired GPU profile assigned. When creating the Machine Catalog, be sure to select a Host Connection that has a matching GPU profile type as the GPU profile selected on your golden image.
-Create a Delivery Group for access to your Machine Catalog.
If you encounter any other differences in the configuration process between 6.2 and 6.5 on NVIDIA GRID, please use the comments section at the bottom of this blog.
Good luck and enjoy!
Richard
Good afternoon, I am currently trying to set up an HP ProLiant DL380 Gen9 with two NVIDIA GRID K2 cards. I have performed all the steps mentioned above for XenServer 6.5. However, when I try to execute nvidia-smi I get this error: Unable to determine the device handle for GPU 0000:0A:00.0: Unable to communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are
attached, or the attached cables are not seated properly.
Any idea why I am getting this message?
Looking forward to your comments, and many thanks for your excellent blog, Richard.
Regards,
Nils
I want to let everyone know that NVIDIA just released new GRID drivers yesterday (2/24/2015). You can download them here: http://www.nvidia.com/download/driverResults.aspx/82250/en-us
-Richard