NVIDIA GRID vGPU Deep Dive! Lessons Learned from the Trenches: Guest Blog Post by Richard Hoffman
I have mentioned a colleague of mine, Richard Hoffman, in a previous article where I talked about XenServer 6.2 Dynamic Memory Control and some Blue Screen of Death (BSOD) events that can occur (which he was instrumental in discovering). I’ve also written a new blog post noting that this has been resolved in XenServer 6.5, so vGPU should now work without issue with DMC. With a little encouragement, I have convinced Richard to share some of his key findings from a Citrix / NVIDIA GRID project he has been involved in for the last several months. If you have content you would like to contribute as a guest (or regular) blogger on itvce.com, feel free to reach out to me at dane @ itvce.com or on Twitter @youngtech so we can discuss. Without further ado, below is the guest blog post by Richard Hoffman. You can find him on LinkedIn or on Twitter.
I want to share some information for any of you that are getting up to speed on an NVIDIA GRID vGPU project. There are lots of good guides and articles out there and I won’t try to replicate all that information. What I have included is information that was either not documented well or not documented at all.
Here are the subjects that I am covering:
- How are GPUs shared?
- Where does GPU virtualization occur?
- Comparing vGPU and passthrough specifications
- Dynamic Memory Control incompatibility with vGPU in XenServer 6.2
- Direction-specific fans in the GRID cards
How Are GPUs Shared?
We have two ways of presenting the virtual desktop with GPU resources. “Pass-through” presents the entire physical GPU to the virtual desktop, giving you a 1:1 relationship. “Virtual GPU” allows multiple VMs to access the same physical GPU.
Virtual GPUs give each VM a dedicated portion of video RAM. In other words, the vGPUs do not share RAM in any way: the video RAM on the card is divided up and each portion is dedicated to a particular VM. The GPU cores, however, are time-sliced, similar to how a CPU is time-sliced on a hypervisor. So if a GRID GPU has 768 GPU cores, each virtual desktop gets all 768 cores for a split second, then the next virtual desktop gets them for a split second, and so on. If there is no contention for GPU cores, one virtual desktop gets full access to the cores for the duration of the session. If there is contention, each desktop gets a time-slice of the GPU cores.
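To make the time-slicing idea concrete, here is a minimal Python sketch of a round-robin scheduler. It is only a toy model of the behavior described above, not NVIDIA's actual scheduler; the function name, the VM names, and the one-millisecond slice are all assumptions made for illustration.

```python
from collections import deque

def time_slice(demands, slice_ms=1):
    """Toy round-robin model (hypothetical, not NVIDIA's real scheduler).

    demands maps a VM name to the milliseconds of GPU work it has queued.
    Each VM with pending work gets every core for one slice, then goes to
    the back of the line. Returns the order in which VMs held the GPU.
    """
    queue = deque((vm, ms) for vm, ms in demands.items() if ms > 0)
    timeline = []
    while queue:
        vm, remaining = queue.popleft()
        timeline.append(vm)            # this VM owns all cores for one slice
        remaining -= slice_ms
        if remaining > 0:
            queue.append((vm, remaining))  # still busy: wait for the next turn
    return timeline

# No contention: the only busy desktop keeps the GPU slice after slice.
print(time_slice({"vm1": 3, "vm2": 0}))  # ['vm1', 'vm1', 'vm1']

# Contention: two busy desktops alternate.
print(time_slice({"vm1": 2, "vm2": 2}))  # ['vm1', 'vm2', 'vm1', 'vm2']
```

Note that the sketch models only the cores. Video RAM is not scheduled at all, because each vGPU keeps its own fixed portion.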
Where does GPU virtualization occur?
The answer is not documented anywhere that I could find, and I was given conflicting information while searching for it. I was first told that the scheduling of the GPU cores occurs within the hypervisor, but this is incorrect. The virtualization occurs at the hardware level and the technology is proprietary to NVIDIA. Scheduling is handled by the scheduler in the GPU chip itself.
The NVIDIA GRID Manager, installed on XenServer, communicates with the physical GPUs to determine where a VM can be placed. Once the VM is assigned to a physical GPU, the GRID Manager steps out of the way and communication from the NVIDIA driver in the guest OS is direct to the GPU.
I have only seen the diagram below from one presentation and nowhere else online. This slide is unique in that it shows the NVIDIA Kernel Driver residing in Dom0 of XenServer. This driver allows the NVIDIA GRID Manager to communicate with the GRID board and GPUs in order to assign VMs and monitor ongoing usage of the GPUs. This driver is not responsible for graphics delivery between the VMs and the GPUs.
Graphics delivery occurs between the NVIDIA driver installed within the virtual desktop guest OS and the physical GPUs. That is the benefit of the GRID solution over VMware’s current vSGA, which uses a translation or “shim” driver installed on the hypervisor. This direct communication from the NVIDIA driver to the GPU gives better fidelity and less overhead than vSGA.
Comparing vGPU and passthrough specifications
I also found that NVIDIA’s documentation compares vGPU profiles to each other but does not compare vGPU profiles to passthrough. The chart below shows a comparison of the vGPU profiles, but no passthrough specs are included.
The chart below shows the specs for the GRID cards, which can be used to calculate the specs of the passthrough GPUs.
The “Total Memory Size” is listed for the K1 and K2 cards as 16GB and 8GB, respectively. This is the memory for the entire card, not the memory allocated to a passthrough GPU. For instance, both a passthrough K1 GPU and a passthrough K2 GPU get 4GB of video RAM. The 16GB of video RAM on the K1 card is divided between its four physical GPUs. The 8GB of video RAM on the K2 card is divided between its two physical GPUs.
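The per-GPU arithmetic above can be written out as a short Python sketch. The dictionary and function names are illustrative only; the figures are the ones quoted in the paragraph above.

```python
# Card-level figures from the text: total video RAM and physical GPU count.
GRID_CARDS = {
    "K1": {"total_vram_gb": 16, "physical_gpus": 4},
    "K2": {"total_vram_gb": 8, "physical_gpus": 2},
}

def passthrough_vram_gb(card):
    """Video RAM seen by one passthrough GPU: card total divided by GPU count."""
    spec = GRID_CARDS[card]
    return spec["total_vram_gb"] // spec["physical_gpus"]

for card in GRID_CARDS:
    print(card, passthrough_vram_gb(card), "GB per passthrough GPU")
# K1 4 GB per passthrough GPU
# K2 4 GB per passthrough GPU
```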
To assist discussing this with clients and end-users, I have combined the above two charts and also added specs for two NVIDIA Quadro GPU cards for physical workstations. See the chart below.
|GPU Board||GPU Profile||GPU Cores||Video RAM||Max Displays Per User||Max Resolution Per Display|
|K1||Passthrough||192 (Dedicated)||4 GB||4||2560 x 1600|
|K1||K180Q||192 (Time Slice)||4 GB||4||2560 x 1600|
|K1||K160Q||192 (Time Slice)||2 GB||4||2560 x 1600|
|K1||K140Q||192 (Time Slice)||1 GB||2||2560 x 1600|
|K1||K120Q||192 (Time Slice)||512 MB||2||2560 x 1600|
|K1||K100||192 (Time Slice)||256 MB||2||2560 x 1600|
|K2||Passthrough||1536 (Dedicated)||4 GB||4||2560 x 1600|
|K2||K280Q||1536 (Time Slice)||4 GB||4||2560 x 1600|
|K2||K260Q||1536 (Time Slice)||2 GB||4||2560 x 1600|
|K2||K240Q||1536 (Time Slice)||1 GB||2||2560 x 1600|
|K2||K220Q||1536 (Time Slice)||512 MB||2||2560 x 1600|
|K2||K200||1536 (Time Slice)||256 MB||2||2560 x 1600|
|Quadro K600 (for physical workstations)||Same core count as K1 Passthrough. Video RAM is less.||192||1 GB||2||DP 1.2: 3840 × 2160; DVI-I DL: 2560 × 1600; DVI-I SL: 1920 × 1200; VGA: 2048 × 1536|
|Quadro K5000 (for physical workstations)||Same core count and video RAM as K2 Passthrough.||1536||4 GB||4||DP 1.2: 3840 × 2160; DVI-I DL: 2560 × 1600|
It’s important to note that the comparison of the Quadro K600 and K5000 cards to the GRID GPUs is really about core count. The video RAM on a K600 card is 1GB, while a K1 passthrough GPU gets 4GB of video RAM. This is shown in the above chart.
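When sizing desktops with clients, it can help to treat the vGPU rows of the chart as a lookup table. The Python sketch below copies the video RAM and display limits from the chart and picks the smallest profile that meets a requirement. The `Profile` type and the `smallest_profile` helper are hypothetical names for illustration, and the selection rule (smallest video RAM that fits) is just one reasonable assumption.

```python
from typing import NamedTuple, Optional

class Profile(NamedTuple):
    board: str
    name: str
    vram_mb: int
    max_displays: int

# vGPU rows copied from the chart above (passthrough rows left out).
PROFILES = [
    Profile("K1", "K180Q", 4096, 4),
    Profile("K1", "K160Q", 2048, 4),
    Profile("K1", "K140Q", 1024, 2),
    Profile("K1", "K120Q", 512, 2),
    Profile("K1", "K100", 256, 2),
    Profile("K2", "K280Q", 4096, 4),
    Profile("K2", "K260Q", 2048, 4),
    Profile("K2", "K240Q", 1024, 2),
    Profile("K2", "K220Q", 512, 2),
    Profile("K2", "K200", 256, 2),
]

def smallest_profile(board: str, min_vram_mb: int, displays: int) -> Optional[Profile]:
    """Return the profile with the least video RAM that still fits, or None."""
    candidates = [
        p for p in PROFILES
        if p.board == board
        and p.vram_mb >= min_vram_mb
        and p.max_displays >= displays
    ]
    return min(candidates, key=lambda p: p.vram_mb, default=None)

# A user who needs 512 MB but four displays lands on K160Q, not K120Q,
# because the smaller K1 profiles top out at two displays.
print(smallest_profile("K1", 512, 4).name)  # K160Q
```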
Another chart that will help your discussions is below. It shows how the K600 and K5000, used in the chart above, compare with the entire line of NVIDIA Quadro GPU cards. The “K” that precedes the card model number stands for “Kepler.” That is NVIDIA’s current GPU architecture. The cards listed at the bottom of the chart, without a “K,” use NVIDIA’s older architecture, called “Fermi.”
Dynamic Memory Control incompatibility with vGPU in XenServer 6.2
Aside from a post on the GRID forums that I started and the subsequent article that Dane wrote, there is no documentation online that Dynamic Memory Control is incompatible with vGPU in XenServer 6.2. In short, it causes the VMs to blue screen. I understand that this is fixed in XenServer 6.5.
By default, vSphere lets you overcommit RAM to virtual machines. The default behavior in XenServer is to dedicate RAM to each VM. It may be tempting to turn on Dynamic Memory Control to get better user density, but it should not be done in XenServer 6.2. Dane’s write-up on this issue is below.
If you run into an issue where the virtual desktops fail to start and XenServer gives an error that the “vgpu exited unexpectedly,” check if ECC is enabled on your cards. ECC (Error Correcting Code) in the video RAM will cause this error to occur. ECC is not an option on the K1 cards but is on the K2 cards. I encountered some hosts that had one of the three K2 cards with ECC turned on. The cards came like this direct from the OEM. I recommend adding this check to your build steps to ensure ECC is turned off before workloads are put on these cards.
If you run the “nvidia-smi” command on the XenServer host, the far-right column, under “ECC,” will say “N/A” for K1 cards. That confirms that ECC is not applicable to them.
Running nvidia-smi on the K2 cards shows that it is appropriately set to zero.
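This is the command to turn off ECC, where <ID> is the ID that nvidia-smi reports for each GPU. The IDs start at zero and go up one by one.
nvidia-smi -i <ID> -e 0
The command confirms the change with a line like this one:
Disabled ECC support for GPU 0000:06:00.0.
An ECC mode change made through nvidia-smi takes effect after the next reboot of the host.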
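The ECC check recommended above as a build step could be scripted along these lines. This is a sketch under a few assumptions: that the installed nvidia-smi supports the `--query-gpu` CSV interface with the `ecc.mode.current` field, that ECC is reported as the word "Enabled" when it is on, and that Python 3.7 or later is available wherever the script runs. The sample text is made up for illustration and is not output captured from a real host.

```python
import csv
import io
import subprocess

# Assumed query; verify the field names with "nvidia-smi --help-query-gpu".
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,ecc.mode.current",
    "--format=csv,noheader",
]

def gpus_with_ecc_enabled(csv_text):
    """Return (index, name) for every GPU whose current ECC mode is Enabled."""
    flagged = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or unexpected lines
        index, name, ecc = (field.strip() for field in row)
        if ecc.lower() == "enabled":
            flagged.append((index, name))
    return flagged

def check_host():
    """Run nvidia-smi on the local host and report GPUs that still have ECC on."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return gpus_with_ecc_enabled(result.stdout)

# Hypothetical sample in the assumed CSV shape, not real output.
SAMPLE = """0, GRID K2, Disabled
1, GRID K2, Enabled
2, GRID K1, N/A
"""
print(gpus_with_ecc_enabled(SAMPLE))  # [('1', 'GRID K2')]
```

Any GPU the script flags would then get the `nvidia-smi -i <ID> -e 0` treatment before workloads are placed on it.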
Direction-specific fans in the GRID cards
NVIDIA GRID K1 cards are not particularly sensitive to airflow direction. The information below applies primarily to K2 cards.
GRID cards have two variations in the direction that their fans blow. If you are installing or swapping out the cards be sure to note which is which or the cards could overheat if placed in the wrong position in the server. The form factor of the cards is identical so they could easily be confused. Fortunately, there are two ways to determine airflow direction: The white arrow on the front of the card, and the part numbers that are shown below. The white arrow is shown in the image below:
The part numbers show you how to tell which is which. At the time of my project, the regular part numbers were not printed on the cards, but the “PCB part number” was. I have cross-referenced the two different part numbers below. The airflow directions, right-to-left (R2L) and left-to-right (L2R), are noted too.
PCB part number 699-52055-0552-311 = Regular part number 900-52055-0020-000 (R2L)
PCB part number 699-52055-0550-311 = Regular part number 900-52055-0010-000 (L2R)
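If you are recording cards during a build, the cross-reference above fits in a small lookup. The Python sketch below uses only the two part-number pairs listed above; the function names and the idea of checking a chassis against an expected airflow direction are assumptions for illustration.

```python
# PCB part number -> (regular part number, airflow direction), from the list above.
AIRFLOW_BY_PCB = {
    "699-52055-0552-311": ("900-52055-0020-000", "R2L"),
    "699-52055-0550-311": ("900-52055-0010-000", "L2R"),
}

def airflow_for(pcb_part_number):
    """Return (regular part number, direction), or None for an unknown PCB number."""
    return AIRFLOW_BY_PCB.get(pcb_part_number.strip())

def mismatched_cards(pcb_numbers, expected_direction):
    """List the PCB numbers that are unknown or blow the wrong way for this chassis."""
    bad = []
    for pcb in pcb_numbers:
        entry = airflow_for(pcb)
        if entry is None or entry[1] != expected_direction:
            bad.append(pcb)
    return bad

print(airflow_for("699-52055-0552-311"))
# ('900-52055-0020-000', 'R2L')
print(mismatched_cards(["699-52055-0552-311", "699-52055-0550-311"], "R2L"))
# ['699-52055-0550-311']
```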
You cannot retrieve either of the part numbers by running the nvidia-smi CLI tool. The PCB part number is printed on the circuit board, shown circled below.
Hopefully this will help make a smooth vGPU project for you!
I’d like to thank Richard for taking the time to put this content together. Well done pal, glad we could encourage you to write your first blog post! In our little world of Information Technology and virtualization, little nuggets of knowledge like this can go a long way to help fellow brothers and sisters in arms. If you have content you would like to contribute and be a guest (or regular) blogger on itvce.com, feel free to reach out to me at dane @ itvce.com or on Twitter @youngtech so we can discuss. Otherwise, feel free to leave comments, questions, or any feedback for Richard.