How to Fix Azure VMware Solution Issues

Microsoft Fix | Intermediate | 14 min read | Official Docs Grounded | Updated April 20, 2026

Why Azure VMware Solution Issues Are So Frustrating

I've seen this exact situation play out on dozens of enterprise deployments: you've gotten approval, allocated the budget, and you're ready to extend your VMware workloads into Azure. You fire up the Azure Portal, start the private cloud provisioning wizard, and something breaks. Maybe the deployment stalls at 70% and never finishes. Maybe the VMware HCX Connector refuses to pair. Maybe your VMs can't talk to Azure services even though the VNet connection looks fine on paper. You stare at a red error banner with no useful message and wonder what you paid for.

Here's the root of the problem. Azure VMware Solution isn't a simple IaaS product. It provisions private clouds containing VMware vSphere clusters on dedicated bare-metal Azure infrastructure, and the minimum initial deployment alone is three hosts. That means there are at least four distinct layers where things can go wrong: Azure resource provisioning, VMware vCenter/vSAN/NSX configuration, network connectivity between your private cloud and on-premises or Azure VNets, and the HCX migration stack on top of all that.

Microsoft manages the private cloud infrastructure and software; you're not supposed to touch the underlying hardware or most of the vSphere host configuration. That's actually good news for reliability, but it means that when something breaks, the error surfaces in a way that feels abstract. You get a generic Azure Resource Manager error instead of a vSphere event log entry. Or you get an NSX segment that won't come up but shows no fault in the Azure Portal.

The official docs are thorough, but they assume a clean-slate environment. Real enterprise environments have host quota limits that haven't been raised, existing ExpressRoute circuits with conflicting address spaces, DNS forwarders that need updating, and on-premises HCX deployments running older software versions. Any one of these can cause your Azure VMware Solution deployment to fail silently or partially.

The issues I see most often fall into five buckets: host quota not requested before deployment, private cloud provisioning failures due to address space conflicts, VMware HCX Connector installation and pairing errors, NSX network segment misconfiguration, and DNS/DHCP problems that kill workload connectivity after migration. All of them are fixable. Let's go through each one.


The Quick Fix: Try This First

If your Azure VMware Solution private cloud is stuck in a "Provisioning" or "Failed" state, the single fastest thing to check is whether your subscription has an approved host quota for the region you're deploying into. This trips up more teams than any other single issue, including seasoned Azure architects who assume the quota is automatic.

Here's what you do. Open the Azure Portal and navigate to Help + Support > New Support Request. Set Issue type to "Service and subscription limits (quotas)", set Quota type to "Azure VMware Solution", and then select your subscription and target region. In the request details, specify the number of hosts you need. Remember that the minimum initial deployment is three hosts per cluster and the maximum is 16 per cluster. Submit the request.

Microsoft typically processes host quota requests within 3–5 business days for standard regions. For Azure Government regions, allow extra time. Once your quota is approved you'll get an email notification and the quota will reflect in your subscription's Usage + Quotas blade under the provider Microsoft.AVS.

After quota is confirmed, if your private cloud deployment is stuck, go to your private cloud resource in the Portal, open the Connectivity blade, and verify that the management network CIDR block (a /22 block) doesn't overlap with your on-premises address space or your Azure VNet address ranges. An overlapping /22 is the second most common cause of silent provisioning failure.

If both of those check out, delete the failed private cloud resource, wait 10 minutes for the ARM resource state to clear, and redeploy. Yes, it's frustrating, but failed ARM deployments sometimes leave orphaned resource locks that block re-provisioning until the state fully clears.

Pro Tip
Request your host quota at least two weeks before you plan to deploy, not the day of. Even if you're just prototyping, the quota request is non-negotiable and there's no way to bypass it programmatically. Teams that plan the quota request in parallel with their network planning checklist never miss this.
1. Request and Verify Azure VMware Solution Host Quota

Before a single host provisions, Azure checks whether your subscription has approved capacity for Azure VMware Solution in your target region. Without it, the deployment will fail almost immediately, often with a generic QuotaExceeded error that doesn't make the VMware context obvious.

Navigate to Azure Portal > Help + Support > New support request. On the Basics tab, set Issue type to Service and subscription limits (quotas) and Quota type to Azure VMware Solution. Select your Subscription from the dropdown, then click Next: Solutions and proceed to Next: Details.

In the Details step, enter your target Azure region and the number of hosts you're requesting. For a production private cloud, you typically request at least six hosts to support two clusters with redundancy. If you're running vSAN stretched clusters, the hosts are split across two availability zones in a single region, so the quota for that one region has to cover both zones. If your disaster recovery design uses a second private cloud in another region, submit both regional quota requests at the same time.
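To turn a sizing estimate into a quota request, it can help to see how a total host count splits into clusters. The sketch below is only an illustration under the limits quoted in this guide (3 hosts minimum, 16 hosts maximum per cluster); the function name `plan_clusters` and the even-spread strategy are assumptions of this example, not an Azure API or an official sizing method.

```python
import math

MIN_HOSTS_PER_CLUSTER = 3   # minimum per cluster, as stated in this guide
MAX_HOSTS_PER_CLUSTER = 16  # maximum per cluster, as stated in this guide


def plan_clusters(total_hosts: int) -> list[int]:
    """Split a host count into the fewest clusters of 3 to 16 hosts each."""
    if total_hosts < MIN_HOSTS_PER_CLUSTER:
        raise ValueError("a private cloud needs at least 3 hosts")
    clusters = math.ceil(total_hosts / MAX_HOSTS_PER_CLUSTER)
    base, extra = divmod(total_hosts, clusters)
    # Spread hosts evenly; the first `extra` clusters take one more host.
    return [base + 1 if i < extra else base for i in range(clusters)]
```

For example, `plan_clusters(20)` yields two clusters of 10 hosts, so the quota request would be for 20 hosts in that region.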

To check your current approved quota after the request is processed:

az vmware location check-quota-availability \
  --location eastus \
  --subscription <your-subscription-id>

Once quota is approved, you'll also see the Microsoft.AVS resource provider listed under Subscriptions > Resource Providers with status Registered. If it shows NotRegistered, register it manually:

az provider register --namespace Microsoft.AVS

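The register command returns before registration finishes. To confirm, run az provider show --namespace Microsoft.AVS --query registrationState until it returns "Registered", which usually happens within a few minutes. That's your green light to proceed with deployment.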

2. Fix Private Cloud Deployment Failures in Azure Portal

Private cloud provisioning failures almost always come down to one of three things: address space conflicts, missing permissions, or a partially completed prior attempt that left resource locks in place. I know this is frustrating, especially when the Portal gives you a red banner with just "Deployment failed" and no detail. Here's how to dig deeper.

First, check your CIDR blocks. The management network requires a /22 address range that does not overlap with anything in your on-premises network or your Azure VNets. Each workload segment you create later also needs its own non-overlapping CIDR. Common mistakes include reusing the 10.0.0.0/22 block that an ExpressRoute gateway is already using, or picking a range that technically doesn't conflict today but will when a new spoke VNet is added next quarter.
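The overlap check described above is easy to automate before you open the Portal. This is a minimal sketch using Python's standard `ipaddress` module; the function name `find_overlaps` and the sample ranges in the usage note are illustrative assumptions, and the list of existing ranges has to come from your own network inventory.

```python
import ipaddress


def find_overlaps(management_cidr: str, existing_cidrs: list[str]) -> list[str]:
    """Return every existing range that overlaps the proposed management block."""
    # strict=True rejects a block whose address is not aligned to its prefix.
    management = ipaddress.ip_network(management_cidr, strict=True)
    overlapping = []
    for cidr in existing_cidrs:
        existing = ipaddress.ip_network(cidr, strict=False)
        if management.overlaps(existing):
            overlapping.append(cidr)
    return overlapping
```

Calling `find_overlaps("10.0.0.0/22", ["10.0.0.0/16", "192.168.0.0/24"])` returns `["10.0.0.0/16"]`, which means that management block needs to change before you deploy. An empty list means no conflict with the ranges you supplied.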

Go to your failed deployment in Subscriptions > Deployments and click the failed deployment entry. Expand the error details. The ARM error code will usually be one of InvalidParameter, AddressSpaceConflict, or ResourceGroupDeploymentFailed. Each has a different resolution path.

For AddressSpaceConflict, you must change your management CIDR before retrying. For InvalidParameter, double-check that your private cloud name is lowercase, alphanumeric, and under 15 characters. The resource name validation is stricter than the Portal's inline hints suggest.
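If you generate private cloud names from a script or pipeline, a pre-flight check can catch a bad name before ARM does. The sketch below encodes only the rule as described in this guide (lowercase letters and digits, fewer than 15 characters); it is an assumption-based helper, not the service's own validation logic, and `is_valid_private_cloud_name` is a hypothetical name.

```python
import re

# Rule as described in this guide: lowercase alphanumeric, 1 to 14 characters.
_NAME_PATTERN = re.compile(r"[a-z0-9]{1,14}")


def is_valid_private_cloud_name(name: str) -> bool:
    """Check a candidate name against the guide's stated naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

Under this rule, `avsprod01` passes while `AVS-Prod-EastUS` fails on case, the hyphens, and length.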

If a prior deployment attempt is blocking you, use the following to remove a stuck resource lock:

az lock delete \
  --name <lock-name> \
  --resource-group <your-rg> \
  --resource-type Microsoft.AVS/privateClouds \
  --resource <private-cloud-name>

After clearing the lock, wait 10 minutes before attempting a fresh deployment. When the new private cloud status shows Succeeded in the Portal's Overview blade, you're through this step.

3. Resolve VMware HCX Connector Installation and Pairing Errors

VMware HCX is what makes Azure VMware Solution actually useful for migration: it's the layer that handles workload mobility, network extension, and disaster recovery between your on-premises VMware environment and your Azure private cloud. When HCX refuses to install or pair, migrations stall completely. I've seen teams lose weeks here because the error messages from HCX are notoriously unhelpful.

The install sequence, per official guidance, is: first install VMware HCX in your Azure VMware Solution private cloud (done through the Portal's Add-ons > Migration using HCX blade), then install and activate the on-premises VMware HCX Connector appliance, then configure the HCX site pairing and service mesh.

The most common pairing failure I see is a firewall blocking the required outbound ports from the on-premises environment to the Azure side. The on-premises HCX Connector needs outbound TCP 443 (HTTPS) to reach HCX Cloud Manager in the private cloud, and the Interconnect and Network Extension appliances need UDP 4500 (plus UDP 500 on older HCX releases) to build their IPsec tunnels. TCP 9443 is the HCX appliance management interface, which you reach locally during activation rather than across the WAN. Check your on-premises perimeter firewall rules first, because these ports are often blocked by default corporate policy.
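Before digging into HCX itself, you can rule out the firewall for the TCP side by attempting a connection from a machine on the same network as the on-premises HCX Connector. This is a minimal sketch using only the standard library; `tcp_port_open` is a hypothetical helper, and the hostname in the usage comment is a placeholder. It cannot verify UDP ports, since UDP has no connection handshake to observe.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (placeholder hostname, replace with your HCX Cloud Manager address):
#   tcp_port_open("hcx-cloud-manager.example.internal", 443)
```

A `False` result from the Connector's network combined with `True` from an unrestricted network points at the perimeter firewall rather than at HCX.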

During HCX Connector activation, if you see error HCX-E-0019 or a generic "License key invalid" message, the activation key has expired. Keys generated from the Azure Portal have a limited validity window. Generate a fresh activation key from the Portal, or create a new HCX Enterprise site with the CLI, which issues a new key in its output:

az vmware hcx-enterprise-site create \
  --name <site-name> \
  --private-cloud <private-cloud-name> \
  --resource-group <rg-name>

If the HCX Network Extension shows as "Disconnected" after pairing appears successful, the issue is almost always an MTU mismatch. Set the HCX Uplink Network Profile MTU to 1350 to account for IPSec encapsulation overhead over ExpressRoute or VPN. Navigate to HCX Manager > Infrastructure > Interconnect > Network Profiles and update the MTU value there. After saving, restart the HCX-IX appliance pair from the same interface. Reconnection typically takes 2–4 minutes.

4. Fix NSX Network Segment and Connectivity Problems

Azure VMware Solution uses VMware NSX for all software-defined networking inside the private cloud. Every VM you deploy needs to be connected to an NSX network segment, and creating or configuring those segments is where Azure VMware Solution beginners most often hit a wall. The Portal exposes a simplified interface for creating NSX segments, but it doesn't expose every configuration option, which creates gaps.

To create an NSX network segment through the Azure Portal, go to your private cloud resource, select Workload Networking > Segments, then click Add. You'll need to specify the segment name, the connected gateway (typically your Tier-1 NSX gateway, which is pre-provisioned by Microsoft), the gateway CIDR (e.g., 192.168.100.1/24), and optionally a DHCP range within that subnet.
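A segment's gateway CIDR and DHCP range are easy to get subtly wrong, for instance a range that includes the gateway address or spills outside the subnet. The sketch below checks both with the standard `ipaddress` module. The function name `validate_segment` and the specific checks are assumptions for illustration; NSX applies its own validation when the segment is created.

```python
import ipaddress


def validate_segment(gateway_cidr: str, dhcp_start: str, dhcp_end: str):
    """Validate a DHCP range against a gateway CIDR and return the segment subnet."""
    gateway = ipaddress.ip_interface(gateway_cidr)  # e.g. 192.168.100.1/24
    network = gateway.network                       # e.g. 192.168.100.0/24
    start = ipaddress.ip_address(dhcp_start)
    end = ipaddress.ip_address(dhcp_end)
    if start not in network or end not in network:
        raise ValueError("DHCP range must sit inside the segment subnet")
    if start > end:
        raise ValueError("DHCP range start must not be after its end")
    if start <= gateway.ip <= end:
        raise ValueError("DHCP range must not include the gateway address")
    return network
```

With the example values from this step, `validate_segment("192.168.100.1/24", "192.168.100.50", "192.168.100.200")` returns the `192.168.100.0/24` network, while a range starting at `192.168.100.1` raises an error.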

If VMs connected to an NSX segment can't reach the internet or Azure services, check whether your Tier-1 gateway has a default route advertised. In the Azure Portal under Workload Networking > Gateways, select your Tier-1 gateway and verify that Internet connectivity is enabled if that's your intended design. Per official design guidance, internet connectivity from Azure VMware Solution can be provided either through an Azure Virtual WAN or through a default route originating from an NVA in a connected Azure VNet. Make sure your internet connectivity design matches what's actually configured.

For NSX segments that fail to create with "Error 400: Gateway not found", the issue is usually that you're referencing a Tier-1 gateway ID from a different NSX environment. Run this to list available gateways in your private cloud:

az vmware workload-network gateway list \
  --private-cloud <private-cloud-name> \
  --resource-group <rg-name>

Use the id value from that output when creating segments programmatically. When the segment creation succeeds and VMs connected to it can ping the gateway IP (192.168.100.1 in the example above), this step is complete.

5. Troubleshoot DNS and DHCP Configuration Failures

After you've got your NSX segments working, the next thing that breaks workload connectivity is almost always DNS. Azure VMware Solution VMs need to resolve both internal hostnames (for vCenter, domain controllers, app servers) and external hostnames (for Azure services, internet access). Getting DNS right requires configuring a DNS forwarder in NSX and, if you're in a hybrid scenario, pointing it at the right upstream resolvers.

Azure VMware Solution private clouds support two DHCP modes: running a native DHCP server within NSX, or setting up an NSX DHCP relay to forward DHCP requests to an external DHCP server (like a Windows DHCP server on-premises). In the Azure Portal, go to Workload Networking > DHCP and click Add. Choose DHCP Server if you want NSX to handle leases natively, or DHCP Relay if you're forwarding to an existing server. The relay option requires line-of-sight connectivity to your DHCP server via ExpressRoute or VPN. If that connection isn't up yet, stick with the native DHCP Server option first.

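For DNS, navigate to Workload Networking > DNS > DNS Zones and add a DNS zone. For a default zone (catch-all), leave the domain name as . (a single dot) and point it at your internal DNS server IP. For Azure Private DNS zones, point the forwarder at a resolver that lives inside an Azure VNet, such as an Azure DNS Private Resolver inbound endpoint or a DNS server VM that forwards to 168.63.129.16. Azure's internal resolver address 168.63.129.16 is only reachable from inside a VNet, so private cloud workloads can't query it directly.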
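When a lookup goes to the wrong upstream server, it helps to reason about which zone a name should match. The sketch below is a simplified model of conditional forwarding (the most specific matching zone wins, otherwise the default zone's server is used). It is an assumption-based illustration, not NSX's implementation; the zone names and server IPs in the usage note are hypothetical.

```python
def pick_forwarder(fqdn: str, zones: dict[str, str], default_server: str) -> str:
    """Return the upstream DNS server for fqdn using longest-suffix zone matching."""
    name = fqdn.rstrip(".").lower()
    best_zone, best_server = "", default_server
    for zone, server in zones.items():
        suffix = zone.strip(".").lower()
        matches = name == suffix or name.endswith("." + suffix)
        # Prefer the longest (most specific) zone that matches the name.
        if suffix and matches and len(suffix) > len(best_zone):
            best_zone, best_server = suffix, server
    return best_server
```

For instance, with zones `{"internal.corp": "10.1.0.4"}` and a default server of `10.9.9.9`, a query for `myapp.internal.corp` goes to `10.1.0.4` and anything else falls through to `10.9.9.9`.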

If VMs are getting DHCP leases but DNS resolution is failing, run this from a VM in the private cloud to test the forwarder directly:

# Replace <dns-service-ip> with the DNS service IP shown under Workload Networking > DNS
Resolve-DnsName -Name "myapp.internal.corp" `
  -Server <dns-service-ip> `
  -Type A

If that times out, the NSX Tier-1 gateway firewall policy is likely blocking UDP/TCP 53 between the segment and the DNS forwarder IP. Add a Gateway Firewall rule in NSX Manager to explicitly allow DNS traffic. When Resolve-DnsName returns a valid IP address, your DNS path is clean.

Advanced Troubleshooting for Azure VMware Solution

If the step-by-step fixes above haven't resolved your issue, you're likely dealing with a more nuanced infrastructure or enterprise-domain scenario. Here's where I go when the basics don't cut it.

Azure Monitor Alerts and Metrics. Azure VMware Solution exposes metrics directly in Azure Monitor, including vSAN capacity utilization, CPU readiness, and disk latency. Go to your private cloud resource and select Monitoring > Metrics. The metrics namespace is Microsoft.AVS/privateClouds. Set up an alert rule on DiskUsedPercentage exceeding 70%. Microsoft's own guidance flags vSAN at 75% as a critical threshold that can cause provisioning failures for new VMs. I've seen teams wonder why new VM deployments fail with no obvious error, only to discover the vSAN datastore is at 78% full.

Syslog Collection for Deep Diagnostics. The Azure Portal doesn't expose vSphere host-level logs directly. To get them, forward VMware syslogs to an external collector. In the Portal, open your private cloud, go to Monitoring > Diagnostic settings, add a diagnostic setting, and select the VMware syslog category with a Log Analytics workspace, storage account, or event hub as the destination. You can also forward logs to a log management solution via Azure Logic Apps; the official docs describe this path under "Send syslogs to log management solutions via Azure Logic Apps." Once logs are in Log Analytics, adjust the table and column names in the query below to match how your logs were ingested (it assumes the standard Syslog table), then run:

Syslog
| where Computer contains "esx"
| where SeverityLevel == "err"
| order by TimeGenerated desc
| take 100

ExpressRoute and Hub-and-Spoke Connectivity Gaps. In hub-and-spoke Azure network topologies, Azure VMware Solution private clouds attach to the hub VNet via an ExpressRoute circuit. A very common issue is that spoke VNets can't reach the private cloud even though the hub can. This happens because ExpressRoute gateways don't automatically propagate routes to spoke VNets that are connected via VNet peering. You need to enable Allow gateway transit on the hub side of the peering and Use remote gateways on the spoke side, or use Azure Virtual WAN to handle route propagation automatically. Check your effective routes on a spoke VNet NIC using:

az network nic show-effective-route-table \
  --name <nic-name> \
  --resource-group <rg-name> \
  --output table

If you don't see the Azure VMware Solution management CIDR (/22) in the effective route table, the peering configuration is the culprit.
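Scanning a long route table by eye is error-prone, so the same check can be scripted against the command's JSON output (run it with `--output json` instead of `--output table`). The sketch below assumes the output has a top-level `value` list whose entries carry `addressPrefix`, `nextHopType`, and `state` fields, and that gateway-learned routes show a next hop type of `VirtualNetworkGateway`. Verify both assumptions against your own output; `gateway_route_covers` is a hypothetical helper name.

```python
import ipaddress
import json


def gateway_route_covers(route_table_json: str, target_cidr: str) -> bool:
    """Return True if an active gateway-learned route covers target_cidr."""
    target = ipaddress.ip_network(target_cidr)
    # Assumed shape: {"value": [{"addressPrefix": [...], "nextHopType": ..., "state": ...}]}
    for route in json.loads(route_table_json).get("value", []):
        if route.get("state") != "Active":
            continue
        if route.get("nextHopType") != "VirtualNetworkGateway":
            continue
        for prefix in route.get("addressPrefix", []):
            network = ipaddress.ip_network(prefix, strict=False)
            if network.version == target.version and target.subnet_of(network):
                return True
    return False
```

Pass the saved JSON text and your management /22 to the function. A `False` result for a spoke NIC, alongside `True` for a hub NIC, is consistent with the peering problem described above.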

Rotating cloudadmin Credentials. If you're locked out of vCenter or NSX because credentials were rotated manually outside the Azure Portal (a common mistake on shared admin accounts), use the Portal's built-in credential rotation feature under Manage > Identity (labeled VMware credentials in newer Portal versions). You can rotate the cloudadmin password from there without opening a support ticket. Never change the cloudadmin password directly inside vCenter; doing so breaks the Azure control plane's ability to manage your private cloud.

When to Call Microsoft Support
Escalate to support when your private cloud is stuck in "Provisioning" for more than 4 hours with no ARM error detail, when vSAN shows a health alarm you cannot clear through the Portal, or when the ExpressRoute circuit between your private cloud and Azure VNet shows "Not Connected" despite correct configuration on both ends. These scenarios require Microsoft to inspect the underlying bare-metal infrastructure; you cannot resolve them from the Portal or CLI. Open a ticket at Microsoft Support and classify it as Severity B if it's blocking workloads, or Severity A if it's a complete production outage.

Prevention & Best Practices for Azure VMware Solution

Most of the issues I've described are preventable with upfront planning. Azure VMware Solution is not a platform you want to troubleshoot reactively in production, because the blast radius of a misconfigured NSX topology or a full vSAN datastore is significant. Here's what actually works.

Complete the network planning checklist before you deploy anything. Microsoft publishes a dedicated network planning checklist for Azure VMware Solution deployments. Work through it line by line and document every CIDR range, every DNS server IP, and every firewall rule before you touch the Portal. Address space conflicts are the number-one preventable deployment failure and they're easy to catch on a spreadsheet before they cost you a day of troubleshooting.

Use assessments to size your private cloud correctly. Before purchasing host capacity, run a formal Azure VMware Solution assessment using Azure Migrate. Go to Azure Migrate > Servers > Assess and select Azure VMware Solution (AVS) as the target. The assessment will calculate how many hosts you need based on your actual on-premises vSphere inventory. This prevents both under-provisioning (hitting the 16-host-per-cluster ceiling unexpectedly) and over-provisioning (wasting money on idle bare-metal nodes).

Set up Azure Backup Server for VMs from day one. Don't wait for a data loss event to think about backup. Microsoft's official guidance recommends deploying Azure Backup Server (MABS) to protect Azure VMware Solution VMs. The Backup Server is deployed as a VM in your private cloud and integrates with Azure Backup for offsite retention. Configure it during initial setup, not after you've migrated 200 VMs.

Monitor vSAN capacity proactively. Set Azure Monitor alert rules the day your private cloud goes live. Alert on DiskUsedPercentage at 60% (warning) and 70% (critical). This gives you enough runway to request additional hosts or clean up data before vSAN becomes a bottleneck. A full vSAN datastore causes VM provisioning failures and can trigger storage-policy violations that are painful to remediate.
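If you also pull the metric into your own tooling (a dashboard or a scheduled report), the same two thresholds can be expressed as a small classifier. This is only a sketch of the 60% and 70% levels suggested above; `vsan_alert_level` is a hypothetical helper, and the actual alerting should still be done with Azure Monitor alert rules.

```python
WARNING_PERCENT = 60.0   # warning threshold suggested in this guide
CRITICAL_PERCENT = 70.0  # critical threshold suggested in this guide


def vsan_alert_level(disk_used_percent: float) -> str:
    """Map a DiskUsedPercentage reading to 'ok', 'warning', or 'critical'."""
    if not 0 <= disk_used_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if disk_used_percent >= CRITICAL_PERCENT:
        return "critical"
    if disk_used_percent >= WARNING_PERCENT:
        return "warning"
    return "ok"
```

A reading of 78, like the one in the earlier anecdote, classifies as `critical` well before new VM deployments start failing.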

Quick Wins
  • Request host quota 2+ weeks before your planned deployment date; never assume it's automatic
  • Document all IP address ranges (management CIDR, workload segments, on-premises, Azure VNets) in a single spreadsheet before provisioning
  • Enable Azure Monitor alerts on vSAN capacity, CPU readiness, and private cloud health on day one
  • Rotate the cloudadmin password only through the Azure Portal, never directly in vCenter or NSX Manager

Frequently Asked Questions

What is Azure VMware Solution and how is it different from just running VMware VMs in Azure?

Azure VMware Solution provisions private clouds made of dedicated bare-metal Azure hosts running VMware vSphere, vSAN, and NSX; it's not shared infrastructure. You get a full VMware stack (vCenter Server, vSAN datastore, NSX networking) that Microsoft manages and maintains, while you control your workloads. Regular Azure VMs run on Microsoft's Hyper-V hypervisor and use Azure-native networking. Azure VMware Solution is specifically for organizations that want to run existing VMware workloads in Azure without refactoring them: same VMware tools, same operational model, different physical location.

How many hosts do I need to deploy Azure VMware Solution, and what's the maximum per cluster?

The minimum initial deployment is three hosts; you cannot deploy a single-host or two-host private cloud. Three hosts is the minimum needed for vSAN to maintain the required Failures to Tolerate (FTT) policy. The maximum is 16 hosts per cluster, though you can have multiple clusters within a single private cloud. If you need more than 16 hosts in one logical group, you add a second cluster, and each cluster can independently scale to 16. Plan your host count based on a formal Azure Migrate assessment rather than guessing.

Why is my Azure VMware Solution private cloud stuck in "Provisioning" for hours?

The most common causes are: unapproved host quota in the target region, a management CIDR block that overlaps with an existing Azure VNet or on-premises address space, or a lingering resource lock from a failed prior deployment attempt. Check your quota approval status first, then verify your /22 management CIDR doesn't conflict with anything in your network inventory. If the Portal shows no error detail after 4 hours, open a Microsoft Support ticket. This scenario sometimes requires Microsoft to inspect the underlying bare-metal allocation on the back end.

How do I connect my Azure VMware Solution private cloud to my on-premises environment?

Microsoft supports two primary connectivity paths: ExpressRoute and VPN. ExpressRoute is the recommended path for production: you connect your private cloud's managed ExpressRoute circuit to an ExpressRoute Gateway in an Azure VNet, which in turn connects to your on-premises environment via a second ExpressRoute circuit (ExpressRoute Global Reach can bridge the two circuits directly). VPN is supported as a secondary option but introduces higher latency and lower bandwidth. For VMware HCX migrations specifically, both paths work, but ExpressRoute delivers significantly better throughput for large-scale VM migrations.

Can I use VMware HCX for disaster recovery with Azure VMware Solution, or is it only for migration?

HCX supports both. For migration, you use HCX Bulk Migration or vMotion to move VMs from on-premises to your Azure VMware Solution private cloud. For disaster recovery, HCX Disaster Recovery (HCX-DR) provides continuous replication and failover orchestration between sites. Beyond HCX, Azure VMware Solution also officially supports VMware Site Recovery Manager (SRM), Zerto, and JetStream DR as disaster recovery solutions. Each has different RTO/RPO characteristics, so the right choice depends on your specific recovery targets.

How do I plan the deployment of Azure VMware Solution without breaking existing Azure networking?

Start with the official network planning checklist before you deploy anything. The key design decisions are: internet connectivity model (directly via Azure VMware Solution or through an NVA in a hub VNet), DNS resolution path (NSX DNS forwarder pointing to internal resolvers), and whether you're using hub-and-spoke or Virtual WAN topology. For hub-and-spoke, make sure route propagation from the ExpressRoute Gateway is enabled so spoke VNets can reach the private cloud. The official design documentation on internet connectivity and network design considerations covers each option with specific trade-offs. Read it before you click "Create" in the Portal.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.