We had a large customer who fell off their chair when they saw their Broadcom bill and wanted off the platform ASAP. They had a full vCenter/ESXi environment loaded with GPUs, and it hosted a VDI that ran AutoCAD. There are several ways to accomplish this, but we deployed a Dell stack and Azure Local.
What was the design?
- 2-node switchless cluster with 25 GbE Mellanox cards
- Azure Virtual Desktop (AVD) session hosts (Windows 11/10 Enterprise multi-session*) running on Azure Local. I stopped deploying Windows 10 a while ago; however, it's still an option for another few weeks, or longer if you have ESU.
- GPU-P (partitioning) from NVIDIA L4 GPUs assigned to VMs (recommended for multi-session density). This is what we designed; we partitioned each card at 16 GB.
- Optional: DDA (Discrete Device Assignment) for exclusive, near-bare-metal access to a full L4. We didn't recommend this; I'll cover DDA in its own post.
Prerequisites (Once per Cluster)
- Azure Stack HCI 23H2/25H1 with Hyper-V and Failover Clustering; (Windows Admin Center optional).
- NVIDIA vGPU host driver installed on each node with an L4 (guest driver will be installed inside VMs later).
- Decide licensing: NVIDIA License System (CLS cloud) or on-prem DLS (token-based).
- AVD Host Pools created and session host images ready.
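Before touching the GPUs, I like to sanity-check the cluster against the prerequisites above. A minimal sketch, assuming Server-based nodes reachable over WinRM and the FailoverClusters module on the node you run it from:
# Quick per-node check: roles installed and the NVIDIA card visible
Invoke-Command -ComputerName (Get-ClusterNode).Name -ScriptBlock {
    Get-WindowsFeature -Name Hyper-V, Failover-Clustering | Select-Object Name, InstallState
    Get-PnpDevice -PresentOnly -Class Display | Where-Object { $_.FriendlyName -like '*NVIDIA*' } |
        Select-Object FriendlyName, Status
}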
1) Verify L4 is Recognized for GPU-P (Per Node)
Run on each HCI node that has an L4:
# List partitionable GPUs (modern Hyper-V cmdlet)
Get-VMHostPartitionableGpu | Format-List Name, PartitionCount, ValidPartitionCounts
If nothing returns, verify BIOS/firmware and that the NVIDIA vGPU host driver is installed correctly.
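A quick host-side check I use before digging into firmware (a sketch; it assumes the vGPU host driver exposes nvidia-smi on the node):
# Confirm the host driver sees the card and the display device is healthy
nvidia-smi -L
Get-PnpDevice -PresentOnly -Class Display | Select-Object FriendlyName, Status, Class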
2) Choose Partition Count Per L4 (Per Node)
Pick a value from ValidPartitionCounts and set it. Keep counts consistent across nodes for smooth failover.
$gpus = Get-VMHostPartitionableGpu
$gpus | ForEach-Object {
Set-VMHostPartitionableGpu -HostPartitionableGpu $_ -PartitionCount 8
}
Get-VMHostPartitionableGpu | Format-List Name, PartitionCount, ValidPartitionCounts
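To keep the count identical everywhere, you can push the same setting to every node at once. A sketch, assuming all nodes are reachable over WinRM and 8 appears in ValidPartitionCounts on each card:
# Apply the same partition count on every cluster node, then confirm
Invoke-Command -ComputerName (Get-ClusterNode).Name -ScriptBlock {
    Get-VMHostPartitionableGpu | ForEach-Object {
        Set-VMHostPartitionableGpu -HostPartitionableGpu $_ -PartitionCount 8
    }
    Get-VMHostPartitionableGpu | Format-List Name, PartitionCount
}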
3) Attach a GPU-P Partition to a VM (Owner Node)
Run on the owner node of the VM (or remote-invoke to that node). VM must be Off for attach.
$vm = "AVD-Session-01"
Stop-VM -Name $vm
# Add GPU-P adapter (generic). If multiple GPUs exist and you want to target one, see InstancePath below.
Add-VMGpuPartitionAdapter -VMName $vm
# Optional: target a specific GPU by InstancePath
# Get-VMHostPartitionableGpu | Format-List Name, InstancePath
# Add-VMGpuPartitionAdapter -VMName $vm -InstancePath "<instance-path>"
# Pin VRAM allocation for the partition (values in bytes; 'GB' suffix supported)
Set-VMGpuPartitionAdapter -VMName $vm `
-MinPartitionVRAM 8GB -OptimalPartitionVRAM 8GB -MaxPartitionVRAM 8GB
Start-VM -Name $vm
Get-VMGpuPartitionAdapter -VMName $vm
You can also tune encode/decode/compute with the -Min/Max/OptimalPartition{Encode|Decode|Compute} switches if needed.
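If you want to see what you're working with before changing anything, the current values sit on the adapter object (a read-only sketch):
# Inspect the current encode/decode/compute values on the partition adapter
Get-VMGpuPartitionAdapter -VMName $vm |
    Format-List MinPartitionEncode, OptimalPartitionEncode, MaxPartitionEncode,
                MinPartitionDecode, OptimalPartitionDecode, MaxPartitionDecode,
                MinPartitionCompute, OptimalPartitionCompute, MaxPartitionCompute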
4) Inside the VM: Install NVIDIA vGPU Guest Driver
Run inside the AVD session host VM (Windows 11/10 EVD). Use the guest driver that matches the host vGPU release.
# Silent install inside the VM
Start-Process -FilePath "C:\Temp\NVIDIA-vGPU-Windows-Driver.exe" -ArgumentList "-s" -Wait
# Reboot recommended, then verify:
nvidia-smi
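If you'd rather not RDP into each session host, you can push the installer over PowerShell Direct from the owner node. A sketch, assuming the installer has already been copied to C:\Temp inside the VM and you have local admin credentials for it:
# Install the guest driver via PowerShell Direct, then reboot the VM
Invoke-Command -VMName $vm -Credential (Get-Credential) -ScriptBlock {
    Start-Process -FilePath 'C:\Temp\NVIDIA-vGPU-Windows-Driver.exe' -ArgumentList '-s' -Wait
    Restart-Computer -Force
}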
5) NVIDIA Licensing (CLS or On-Prem DLS)
Modern vGPU uses the NVIDIA License System (NLS). Clients use a token, not the legacy 7070 server/port. Open:
- From VM to CLS/DLS: TCP 443 (acquire/renew/release) and TCP 80 (Windows shutdown release).
- For on-prem DLS HA, also open the intra-cluster ports NVIDIA documents.
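A quick connectivity check from a session host saves chasing licensing errors later (a sketch; licensing.example.com is a placeholder for your CLS/DLS endpoint):
# Verify the VM can reach the license service on both ports
Test-NetConnection -ComputerName 'licensing.example.com' -Port 443
Test-NetConnection -ComputerName 'licensing.example.com' -Port 80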
5a) Generate Client Token
In the NVIDIA Licensing Portal, bind licenses to your CLS/DLS instance and generate a .tok client configuration token.
5b) Place Token & Registry (Inside Each AVD VM)
$tokDest = 'C:\ProgramData\NVIDIA Corporation\Licensing\clientConfigToken'
New-Item -ItemType Directory -Force -Path $tokDest | Out-Null
Copy-Item 'C:\Temp\client_configuration_token.tok' (Join-Path $tokDest 'client_configuration_token.tok') -Force
$regPath = 'HKLM:\SYSTEM\CurrentControlSet\Services\nvlddmkm\Global\GridLicensing'
New-Item -Path $regPath -Force | Out-Null
New-ItemProperty -Path $regPath -Name 'ClientConfigTokenPath' `
-Value (Join-Path $tokDest 'client_configuration_token.tok') -PropertyType String -Force | Out-Null
# Optional: check out license on user login (helpful during imaging/sysprep)
New-ItemProperty -Path $regPath -Name 'EnableLicenseOnLogin' -Value 1 -PropertyType DWord -Force | Out-Null
# Refresh NVIDIA service or reboot
$s = Get-Service -Name "NVDisplay.ContainerLocalSystem" -ErrorAction SilentlyContinue
if ($s) { Restart-Service $s -Force }
5c) Verify License
nvidia-smi -q | Select-String License
6) Optional: Host Guardrail for Allowed vGPU Series
If you must restrict which vGPU profile series (A/Q/B) can be created on a host GPU (mixed environments), set a host registry switch:
$gpu = Get-PnpDevice | Where-Object { $_.InstanceId -like 'PCI\VEN_10DE*' } | Select-Object -First 1
$driverKey = (Get-PnpDeviceProperty -InstanceId $gpu.InstanceId |
Where-Object {$_.KeyName -eq 'DEVPKEY_Device_Driver'}).Data
$hk = "HKLM:\SYSTEM\ControlSet001\Control\Class\$driverKey"
# Values: Q=1, A=2, B=3 (example: allow A-series profiles)
New-ItemProperty -Path $hk -Name 'GridGpupProfileType' -PropertyType DWord -Value 2 -Force | Out-Null
I changed my mind while writing this. I included the alternative just to be thorough and to give all the options. There may be a specific niche need for AVD with DDA, but most implementations will partition the GPUs. Use it only if needed; most AVD deployments don't require it.
7) Alternative: DDA (Full Passthrough) for Exclusive Access
DDA gives one VM the full L4 (no sharing). Use when you need near bare-metal performance. VM must be Off for attach.
7a) Host (Owner Node)
$VM = "AVD-Render-01"
# Find the GPU's LocationPath via PnP (Get-VMHostAssignableDevice only lists devices already dismounted)
$pnpGpu = Get-PnpDevice -PresentOnly -Class Display | Where-Object { $_.FriendlyName -like "*NVIDIA*" } | Select-Object -First 1
if (-not $pnpGpu) { throw "No NVIDIA GPU found on this host." }
$locationPath = (Get-PnpDeviceProperty -InstanceId $pnpGpu.InstanceId -KeyName 'DEVPKEY_Device_LocationPaths').Data[0]
Stop-VM -Name $VM -Force
# Prepare the VM for DDA: turn-off stop action and MMIO space per Microsoft's DDA guidance (tune for your GPU)
Set-VM -Name $VM -AutomaticStopAction TurnOff -GuestControlledCacheTypes $true -LowMemoryMappedIoSpace 3GB -HighMemoryMappedIoSpace 33280MB
# Disable the device on the host, then remove from host & attach to VM
Disable-PnpDevice -InstanceId $pnpGpu.InstanceId -Confirm:$false
Dismount-VMHostAssignableDevice -LocationPath $locationPath -Force
Add-VMAssignableDevice -LocationPath $locationPath -VMName $VM
Start-VM -Name $VM
7b) Guest (Inside the VM)
Install the standard NVIDIA Windows driver (not the vGPU guest driver). Licensing still uses an NLS token. Some entitlements use the FeatureType registry value:
$RegPath = "HKLM:\SYSTEM\CurrentControlSet\Services\nvlddmkm\Global\GridLicensing"
New-Item -Path $RegPath -Force | Out-Null
# Example: 2 = RTX vWS (confirm with your entitlement)
New-ItemProperty -Path $RegPath -Name 'FeatureType' -Value 2 -PropertyType DWord -Force | Out-Null
8) Cluster-Aware Attach Helper (Run Anywhere)
This script finds each VM’s owner node, remotes to it, and adds a GPU-P partition with your VRAM baseline.
param(
    [Parameter(Mandatory)] [string[]] $VMNames,
    [int] $VramGB = 8
)
foreach ($vm in $VMNames) {
    $owner = (Get-ClusterGroup -Name $vm -ErrorAction Stop).OwnerNode.Name
    Write-Host "Processing $vm on owner node $owner ..."
    Invoke-Command -ComputerName $owner -ScriptBlock {
        param($VmName, $VramGB)
        Stop-VM -Name $VmName -Force
        if (-not (Get-VMGpuPartitionAdapter -VMName $VmName -ErrorAction SilentlyContinue)) {
            Add-VMGpuPartitionAdapter -VMName $VmName
        }
        Set-VMGpuPartitionAdapter -VMName $VmName `
            -MinPartitionVRAM ($VramGB*1GB) `
            -OptimalPartitionVRAM ($VramGB*1GB) `
            -MaxPartitionVRAM ($VramGB*1GB)
        Start-VM -Name $VmName
        Get-VMGpuPartitionAdapter -VMName $VmName | Format-List *
    } -ArgumentList $vm, $VramGB
}
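For example, saved as Add-GpuPartition.ps1 (the file name is just an example), it runs like this:
# Attach 8 GB GPU-P partitions to two session hosts, wherever they currently live
.\Add-GpuPartition.ps1 -VMNames 'AVD-Session-01','AVD-Session-02' -VramGB 8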
Validation Checklist
- Host: Get-VMHostPartitionableGpu returns the L4 with the intended PartitionCount.
- VM: nvidia-smi shows the vGPU; nvidia-smi -q shows Licensed once the NLS token is in place.
- AVD: Session host properties show the GPU present; graphics policies (AVC/H.264, optional HEVC/H.265) enabled; RDP Shortpath (if applicable).
Quick Host Script (GPU-P)
$gpus = Get-VMHostPartitionableGpu
if (-not $gpus) { throw "No partitionable GPUs found." }
$partitionCount = 8
$gpus | ForEach-Object { Set-VMHostPartitionableGpu -HostPartitionableGpu $_ -PartitionCount $partitionCount }
$vm = "AVD-Session-01"
Stop-VM -Name $vm -Force
Add-VMGpuPartitionAdapter -VMName $vm
Set-VMGpuPartitionAdapter -VMName $vm -MinPartitionVRAM 8GB -OptimalPartitionVRAM 8GB -MaxPartitionVRAM 8GB
Start-VM -Name $vm
Sizing Tips for NVIDIA L4 with GPU-P
- Start around 8 GB VRAM per session host and load-test; heavy 3D/ML may need 12–16 GB.
- Use a consistent PartitionCount across nodes for easier failover.
- Monitor with nvidia-smi (or vendor telemetry) and adjust density vs. performance.
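For the monitoring piece, a simple loop inside a session host goes a long way (a sketch; some fields may report N/A on vGPU guests):
# Sample GPU utilization and memory every 5 seconds inside the VM
nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 5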