Citrix Provisioning Services 7.6 Issues and Lessons Learned from the Trenches: Guest Blog Post by John Meek
Over the last several years, I’ve had the honor of working with a Citrix and VMware virtualization focused engineer named John Meek. John has been focused on a large customer deployment over the last several months and has run into several interesting issues deploying the latest version of Citrix Provisioning Services in this customer’s environment.
John has graciously volunteered to share his experiences with the extended virtualization community, in the hope of sparing others the troubleshooting he has recently gone through. If you have content you would like to contribute and be a guest blogger (or regular) on itvce.com, feel free to reach out to me at email@example.com or on Twitter @youngtech so we can discuss. Without further ado, below is the guest blog post by John Meek. You can find him on LinkedIn or on Twitter.
Virtual Machine Stuck at Windows Splash Screen
Recently I was working with a client and we implemented Provisioning Services 7.6 with the newest updates for ESXi 5.5u2. A major issue (headache) we ran into was that if the Master image was VMware hardware version 10, then when we rebooted to get the PVS vDisk mounted, the virtual server would hang at the Windows splash screen indefinitely. We were also using Windows 2008 R2 Service Pack 1 for our VDA PVS Targets.
This issue would also occur when the Master image was set to boot from hard disk in Provisioning Services, with a vDisk from our PVS store mounted as a secondary drive. Because of this we could not complete the Imaging Wizard, and BNImage failed after a reboot as well.
Some testing and results:
- Provisioning Services 7.1 – At this client we were also building a PVS 7.1 Hotfix 3 environment in parallel for testing purposes in case we ran into issues with 7.6. They had 7.1 deployed in other areas and it worked well for them. Provisioning Services 7.1 did not experience this issue; we could use VMware hardware version 10 without any problems.
- Changed the VMware NIC type to E1000 – We could get the vDisk to load and get past the splash screen if we changed the NIC type to E1000 while using VM hardware 10, but other sources online mentioned there may be issues registering to the DDC once you change the network adapter back to VMXNET3.
- Changed the virtual machine version to 8 – To do this we edited the .vmx file to set the parameter virtualHW.version = "8". After this we removed the virtual machine from inventory and then added it back. We still had the same issue.
- Deployed a new virtual machine using VMware hardware version 8 and mounted the vDisk – This worked using the VMXNET3 adapter.
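For anyone who wants to repeat the .vmx edit from the list above, here is a minimal Python sketch of that one change. The function name and the sample file contents are made up for illustration, and it only rewrites text that you pass in; it assumes you have powered off the virtual machine and copied the .vmx file yourself. Keep in mind that in our case this edit did not fix the hang. Only a freshly built hardware version 8 virtual machine did.

```python
import re

# Hypothetical helper (not part of any VMware tooling): returns the .vmx text
# with the virtualHW.version parameter set to the requested hardware version.
def set_vmx_hardware_version(vmx_text, version):
    new_line = 'virtualHW.version = "{}"'.format(version)
    # Match the existing parameter line, allowing for leading spaces or tabs.
    pattern = re.compile(r'^[ \t]*virtualHW\.version[ \t]*=.*$',
                         re.IGNORECASE | re.MULTILINE)
    if pattern.search(vmx_text):
        # Replace the existing value in place.
        return pattern.sub(new_line, vmx_text, count=1)
    # The parameter is missing, so append it on its own line.
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return vmx_text + separator + new_line + "\n"


if __name__ == "__main__":
    # Made-up sample contents; a real .vmx file has many more entries.
    sample = '.encoding = "UTF-8"\nvirtualHW.version = "10"\nmemsize = "4096"\n'
    print(set_vmx_hardware_version(sample, 8))
```

After writing the result back to the .vmx file, remove the virtual machine from inventory and add it back, as described above, so that vCenter re-reads the file.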
In summary, we could get the vDisk to load if we changed the NIC type to E1000 on VM hardware 10. However, given the potential issues registering to the DDC and the client not wanting to use E1000 adapters, we decided to try other solutions.
The next thing we tested was creating a new virtual machine with version 8 of the VMware hardware, and using the VMXNET3 NIC. This worked perfectly and we are now able to mount the PVS vDisk and create our images. The client ultimately determined that they did not need the new hardware features of hardware version 10, so for now we are using this solution. You can verify the additional features here.
I have heard this does not happen for everyone with PVS 7.6 and VMware 5.5u2, and I certainly did not experience it in my lab. In this case we are using Cisco UCS, so perhaps it is related to their converged networking, LOM cards, drivers, or some other hardware-related issue. Once we had a working solution, though, we did not have time to isolate it further.
This was a pretty challenging issue, so I thought I would share it. Hopefully this saves someone a few hours if they run into the same problem.
Logging into the Provisioning Services Console generates the error “This domain/user does not have access to the farm.”
At my current client, when a user other than the PVS service account tries to access the PVS console, they get the error “This domain/user does not have access to the farm.” If you use the PVS service account you can log in successfully, though we intermittently got the error with that account as well. We worked with Citrix support for about a week until we finally found a resolution. I did not have this issue in my lab, but we did experience it in their production environment.
Here are resolutions we tested…
- If the PVS service account is a member of Account Operators or higher (such as Domain Admins) in AD, then users can log into the PVS console without the error
  - This client’s infosec team will not allow this to be a permanent solution
  - Once you remove the service account from this group it breaks again
- Granted the Domain Users group access to the PVS security groups, after which we could log in successfully
- Set the service account’s SPN for PVSSoap services in Active Directory
- Verified there were no related authentication errors in the SQL logs and no packets were being dropped at the firewall
- Made sure there were no spaces in the Distinguished Name of the user account as per this article: http://citrixgeeks.com/2014/11/19/pvs-7-6-bug/
- This was part of a domain migration project, so we created a PVS service account and PVS Administrators group that were new and therefore not migrated from the original domain
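The Distinguished Name check in the list above is easy to script if you have many accounts to look at. Below is a minimal Python sketch under a few assumptions: the function name and the example DNs are hypothetical, the DN is handled as a plain string that you have already exported from Active Directory, and the split on commas is deliberately naive (it only skips commas escaped with a backslash). It simply reports which components of a DN contain a space.

```python
import re
from typing import List

# Hypothetical helper: lists the components (RDNs) of a Distinguished Name
# that contain a space. Splits on commas that are not backslash-escaped.
def rdns_with_spaces(dn: str) -> List[str]:
    components = [part.strip() for part in re.split(r'(?<!\\),', dn)]
    return [part for part in components if " " in part]


if __name__ == "__main__":
    # Made-up example DNs for illustration only.
    for dn in ("CN=svc-pvs,OU=ServiceAccounts,DC=example,DC=com",
               "CN=John Meek,OU=Service Accounts,DC=example,DC=com"):
        flagged = rdns_with_spaces(dn)
        print(dn, "->", flagged if flagged else "no spaces found")
```

An empty result only means this one check passed. In our environment the console error turned out to have a different cause.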
After working with Citrix for quite some time, they determined that the actual fix was to make the PVS service account a member of the Account Operators group in Active Directory. This was unacceptable to my client’s infosec team, since these rights are way too elevated for a service account. Citrix even had a support article with this as the recommended fix. They mentioned they were going to update the article after we found a better resolution; however, I can no longer find the Citrix support article online. I am assuming it is in the process of being updated with the resolution below.
The new method we ended up finding to resolve this issue was to give the PVS service account configured in the PVS Configuration Wizard read/write permissions on the Provisioning Services Admin group we had created in Active Directory. This was acceptable to my client’s infosec team, and after applying this setting all users in the Provisioning Services Admin group were able to log in to the console without generating this error.
Hopefully this blog post has been beneficial to you, and keeps you from several weeks of troubleshooting. If you have any comments, questions, or just want to leave feedback, please do so in the comments section below.