Azure Virtual Network: Fix Setup & Configuration Errors
Why Azure Virtual Network Problems Keep Biting You
Here's the scenario I see constantly: you've provisioned an Azure Virtual Network, deployed a couple of virtual machines inside it, and now they just won't talk to each other. Or maybe you set up virtual network peering between two VNets in different regions and the connection shows as "Connected" in the portal, but your pings still time out. Or you're trying to delete a VNet you no longer need and Azure throws an opaque error blocking you. I've been there, and I know exactly how maddening it is when you can't tell whether the problem is your NSG rules, your subnet layout, your routing table, or something buried three layers deep in a service endpoint configuration.
Azure Virtual Network is the foundational private networking layer inside Azure. Every VM, App Service Environment, AKS cluster, and Virtual Machine Scale Set you deploy lives inside one. It's the thing that determines whether your resources can reach each other, reach the internet, and reach your on-premises datacenters. When it breaks, or when it was never configured quite right to begin with, everything downstream breaks with it.
The hard truth is that Microsoft's error messages for Azure virtual network configuration errors are often spectacularly unhelpful. "Conflict" errors, generic 400 Bad Request responses in the portal, and peering states that flip between "Initiated" and "Disconnected" without explanation are all common. The Azure portal UI makes some things look simpler than they are, which means it's easy to skip a step that turns out to be mandatory.
What actually causes most Azure VNet setup problems? There are four main culprits: Network Security Group (NSG) rules that silently block traffic, address space overlaps that prevent peering, misconfigured route tables that send packets to the wrong next hop, and DNS resolution failures inside the VNet. Azure also has some non-obvious behaviors around outbound connectivity. For instance, if your VMs sit behind an internal Standard Load Balancer without a NAT gateway or public IP, they won't have outbound internet access at all. That one catches a lot of people off guard.
This guide walks through every common Azure virtual network issue in a logical sequence, starting with the fastest fix and moving toward the deeper enterprise-level troubleshooting that domain-joined and hybrid cloud environments need.
One more thing before we start: Azure Virtual Network sits within Microsoft's broader Network Foundations category alongside Azure DNS and Azure Private Link. Problems that look like VNet issues sometimes turn out to be DNS or Private Link misconfigurations, so keep those services in mind as we go through the steps.
The Quick Fix: Try This First
If you're seeing VMs that can't communicate inside the same Azure Virtual Network, the fastest thing to check, before anything else, is your Network Security Groups. NSGs are the invisible gatekeepers that most people forget to inspect when things stop working. I'd estimate that 60–70% of "my Azure VMs can't talk to each other" tickets I've seen come down to an NSG rule blocking the port in question.
Here's exactly what you do. Open the Azure portal, go to your Virtual Machine, and in the left sidebar click Networking. You'll see a table showing all the NSG rules applied to that VM's network interface, both inbound and outbound. Look at the effective security rules, not just the ones you think you added. Azure evaluates subnet-level NSGs and NIC-level NSGs in sequence, and traffic has to be allowed by both. A deny at the subnet level that you set up six months ago will quietly defeat a NIC-level allow rule you created today.
If you need to quickly test connectivity between two VMs in the same VNet without modifying NSG rules permanently, you can run the Azure Network Watcher IP Flow Verify tool:
az network watcher test-ip-flow \
--direction Inbound \
--local 10.0.0.4:80 \
--protocol TCP \
--remote '10.0.0.5:*' \
--vm myVM \
--resource-group myResourceGroup \
--nic myNIC
This tells you definitively whether a specific flow is being allowed or denied, and which NSG rule is responsible. No guessing required.
If IP Flow Verify shows the traffic is allowed but VMs still can't connect, move on to checking whether your Virtual Network actually has the correct address space assigned and that your subnets aren't exhausted. You can do this in the portal under Virtual networks → [Your VNet] → Address space.
Step 1: Verify Address Space and Subnet Capacity
Before touching NSG rules or peering settings, confirm that your Azure Virtual Network address space is actually correct and that your subnets have enough available IP addresses. This sounds basic, but Azure VNet subnet configuration mistakes are more common than you'd think, especially in environments where multiple teams are provisioning resources.
In the Azure portal, navigate to Virtual networks → [Your VNet] → Subnets. Look at the "Available IPs" column for each subnet. If a subnet shows zero or very few available addresses, new resources will fail to deploy into it with a misleading error that doesn't always mention IP exhaustion.
Azure reserves five IP addresses in every subnet: the network address, the broadcast address, the default gateway, and two addresses Azure uses internally for DNS and platform services. So a /29 subnet (8 total addresses) gives you exactly 3 usable host addresses. Plan accordingly.
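If you'd rather not do the subnet arithmetic in your head, here is a minimal Python sketch of it. It assumes IPv4 subnets and the five-address reservation described above; the function name `usable_addresses` and the sample prefixes are my own illustration, not anything Azure provides.

```python
import ipaddress

# Azure holds back five addresses per subnet: network, default gateway,
# two addresses for Azure DNS, and broadcast.
AZURE_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return how many addresses in an Azure subnet can be assigned to resources."""
    subnet = ipaddress.ip_network(cidr, strict=True)
    return max(subnet.num_addresses - AZURE_RESERVED_PER_SUBNET, 0)


if __name__ == "__main__":
    # Hypothetical subnets, chosen only to show the effect of prefix length.
    for cidr in ("10.0.0.0/29", "10.0.1.0/27", "10.0.2.0/24"):
        print(cidr, usable_addresses(cidr))
```

Running a planned prefix through a check like this before you create the subnet is a cheap way to catch a range that is too small for the workload.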
If you need to resize a subnet, you can expand an existing subnet's address range in the portal: go to Virtual networks → Subnets → [Subnet name] and edit the address range. This works as long as the new range doesn't overlap with other subnets in the VNet. You cannot shrink a subnet that has resources deployed in it.
For address space issues affecting VNet peering, the address spaces of two peered virtual networks must not overlap at all. If they do, peering creation fails with an overlapping-address-space error (the code is typically VnetAddressSpacesOverlap). Check both VNets' address spaces under Virtual networks → Address space and confirm they're completely non-overlapping before attempting to peer them.
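Overlaps are easy to miss by eye when a VNet has several prefixes, so here is a rough Python sketch of the check. It assumes you have copied each VNet's IPv4 prefixes into a list; `overlapping_prefixes` and the sample ranges are hypothetical names and values used for illustration.

```python
import ipaddress
from itertools import product


def overlapping_prefixes(vnet_a: list[str], vnet_b: list[str]) -> list[tuple[str, str]]:
    """Return every pair of prefixes (one from each VNet) that share addresses."""
    conflicts = []
    for a, b in product(vnet_a, vnet_b):
        if ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b)):
            conflicts.append((a, b))
    return conflicts


if __name__ == "__main__":
    # Hypothetical address spaces for two VNets you intend to peer.
    hub = ["10.0.0.0/16"]
    spoke = ["10.0.128.0/17", "10.1.0.0/16"]
    print(overlapping_prefixes(hub, spoke))
```

An empty result means the two address spaces are safe to peer, at least as far as overlap is concerned.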
If everything looks right here (ample available IPs, no overlapping ranges), move to Step 2.
Step 2: Inspect Network Security Group Rules
This is where most Azure virtual network traffic filtering problems actually live. NSGs are stateful firewalls attached to a subnet, a network interface, or both. When you apply an NSG at both levels, Azure evaluates them in sequence: the subnet NSG first for inbound traffic, the NIC NSG first for outbound. Traffic must be allowed at both levels, so the most restrictive combination wins.
To see the effective rules applied to a specific VM's NIC, run this Azure CLI command:
az network nic list-effective-nsg \
--name myNICName \
--resource-group myResourceGroup
Or in PowerShell:
Get-AzEffectiveNetworkSecurityGroup `
-NetworkInterfaceName "myNICName" `
-ResourceGroupName "myResourceGroup"
Look through the output for any Deny rules that match your traffic's source IP, destination IP, port, and protocol. Pay special attention to the default rules that Azure puts in place. They have priority numbers in the 65000–65500 range, and the catch-all DenyAllInBound and DenyAllOutBound rules sit at 65500. Any custom allow rule you create must have a lower priority number (lower number = higher precedence) to take effect before those defaults.
A common mistake is creating an allow rule for port 443 but forgetting that the traffic is actually hitting port 8443 on the application. Use the IP Flow Verify tool from the quick fix section to confirm exactly which port and protocol are in play before editing rules.
When adding a new inbound rule, navigate to Network security groups → [Your NSG] → Inbound security rules → Add. Set Source, Source port ranges, Destination, Destination port ranges, Protocol, and Action. Assign a priority between 100 and 4096; lower numbers are evaluated first. After saving, give Azure about 30–60 seconds to propagate the rule before retesting.
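To make the priority behavior concrete, here is a deliberately simplified Python model of first-match evaluation. It assumes rules match on destination port only (real NSG rules also match source, destination, and protocol), and the `Rule` class, `evaluate` function, and sample rules are hypothetical names I'm using for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int          # lower number is evaluated first
    port: Optional[int]    # None stands for "any port"
    action: str            # "Allow" or "Deny"


def evaluate(rules: list[Rule], port: int) -> Rule:
    """Return the first rule, in ascending priority order, that matches the port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule
    raise LookupError("no rule matched; a real NSG always ends with a DenyAll default")


if __name__ == "__main__":
    rules = [
        Rule("AllowHttps", 300, 443, "Allow"),
        Rule("DenyAllInBound", 65500, None, "Deny"),
    ]
    for port in (443, 8443):
        hit = evaluate(rules, port)
        print(port, hit.name, hit.action)
```

In this model, traffic to 8443 falls straight through to the catch-all deny, which is the same symptom you see when the allow rule names the wrong port.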
If you're working with application security groups (ASGs) to tag groups of VMs instead of using individual IPs in your NSG rules, confirm the VM's NIC is actually associated with the correct ASG under Networking → Application security groups.
Step 3: Fix Virtual Network Peering
Virtual network peering is one of the most powerful features in Azure networking. It lets resources in two separate VNets communicate using private IP addresses, whether those VNets are in the same region or different Azure regions. But peering has a common failure mode that drives people crazy: both sides show "Connected" in the portal, yet traffic between the VNets still doesn't flow.
The first thing to check is whether peering was created on both sides. Azure VNet peering is not automatically bidirectional. You must create a peering link from VNet-A to VNet-B, and a separate peering link from VNet-B to VNet-A. If you only created one side, that link will show "Initiated" instead of "Connected" until the reverse link exists. Go to Virtual networks → [VNet-A] → Peerings and then Virtual networks → [VNet-B] → Peerings and confirm both show "Connected."
Even with both sides connected, there are two settings on each peering link that are often misconfigured:
# Check peering properties via CLI
az network vnet peering show \
--name myPeeringLink \
--resource-group myResourceGroup \
--vnet-name myVNet
Look for allowVirtualNetworkAccess. This must be true on both peering links. If it's false, traffic between the VNets is blocked at the peering layer regardless of NSG rules. You can update it with:
az network vnet peering update \
--name myPeeringLink \
--resource-group myResourceGroup \
--vnet-name myVNet \
--set allowVirtualNetworkAccess=true
Also check allowGatewayTransit and useRemoteGateways if you're trying to route traffic from a peered VNet through a VPN gateway in the hub VNet. The hub's peering link must have allowGatewayTransit=true and the spoke's link must have useRemoteGateways=true. If both links are set the same way, gateway transit won't work.
After updating peering settings, verify the NSG rules on both VNets also permit the traffic you're testing. Peering being "Connected" doesn't bypass NSG evaluation.
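Because these flags live on two separate links, it helps to check them as a pair. Below is a rough Python sketch that compares the JSON for both links, as you would get it from az network vnet peering show. The function name `peering_problems` and the sample JSON are hypothetical; the property names are the ones discussed above plus peeringState.

```python
import json


def peering_problems(local: dict, remote: dict) -> list[str]:
    """Compare the two peering links of one VNet pair and list likely misconfigurations."""
    problems = []
    for side, link in (("local", local), ("remote", remote)):
        if link.get("peeringState") != "Connected":
            problems.append(f"{side} link state is {link.get('peeringState')!r}")
        if not link.get("allowVirtualNetworkAccess", False):
            problems.append(f"{side} link has allowVirtualNetworkAccess disabled")
    # Gateway transit needs complementary flags: one side offers, the other consumes.
    for offer, use, label in ((local, remote, "remote"), (remote, local, "local")):
        if use.get("useRemoteGateways") and not offer.get("allowGatewayTransit"):
            problems.append(
                f"{label} link uses remote gateways but the other side does not allow gateway transit"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical CLI output for the hub-side and spoke-side links.
    hub = json.loads(
        '{"peeringState": "Connected", "allowVirtualNetworkAccess": true,'
        ' "allowGatewayTransit": false, "useRemoteGateways": false}'
    )
    spoke = json.loads(
        '{"peeringState": "Connected", "allowVirtualNetworkAccess": true,'
        ' "allowGatewayTransit": false, "useRemoteGateways": true}'
    )
    for problem in peering_problems(hub, spoke):
        print(problem)
```

This is only a consistency check on the settings; it cannot tell you whether the hub actually has a working gateway.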
Step 4: Resolve DNS Problems Inside the VNet
DNS inside Azure Virtual Networks trips people up more than almost any other configuration issue. By default, Azure provides built-in DNS for VMs inside a VNet: VMs get DNS server addresses automatically via DHCP, and they can resolve other VM hostnames within the same VNet using Azure's internal DNS. But the moment you try to resolve hostnames across peered VNets, or when you deploy custom DNS servers, things get complicated fast.
First, check what DNS servers your VNet is actually configured to use. Go to Virtual networks → [Your VNet] → DNS servers. If it's set to "Azure-provided," VMs will use Azure's default DNS (168.63.129.16, more on this in a moment). If you've specified custom DNS server IPs, make sure those servers are reachable from within the VNet and are actually functioning.
The IP address 168.63.129.16 is Azure's magic DNS and platform health probe address. It's a special virtual IP that exists in every Azure VNet and provides name resolution for Azure resources. If you're seeing DNS failures and your VMs use custom DNS servers, confirm those custom servers can forward unresolved queries to 168.63.129.16 as an upstream resolver. Blocking this IP in an NSG is a common mistake that breaks DNS, Windows activation, and Azure health extension communication all at once.
To test DNS resolution from inside an Azure VM:
# On Windows VM (PowerShell)
Resolve-DnsName -Name myothervm.internal.cloudapp.net
# On Linux VM
nslookup myothervm.internal.cloudapp.net 168.63.129.16
If cross-VNet hostname resolution is failing even with peering configured correctly, this is expected behavior: Azure's default DNS does not automatically resolve hostnames across peered VNets. You'll need to either deploy Azure Private DNS zones and link them to both VNets, or set up a custom DNS forwarder VM that both VNets can reach. The Azure Private DNS approach is the cleaner long-term solution for Azure VNet DNS resolution across peered networks.
After changing DNS server settings on a VNet, existing VMs won't pick up the change until their DHCP lease renews or you manually run ipconfig /renew (Windows) or sudo dhclient -r && sudo dhclient (Linux).
Step 5: Restore Outbound Internet Connectivity
Here's a scenario that confuses a lot of people new to Azure virtual network outbound connectivity: you deploy a VM, it's inside a VNet, and you try to curl an external website, and it just hangs. No response, no error, just timeout. What's going on?
Historically, VMs in an Azure VNet got outbound internet access through Azure's default outbound access, an implicit NAT path assigned by the platform. Microsoft is retiring that behavior for new deployments, so you should no longer rely on implicit outbound connectivity. If your VM was deployed without a public IP, without a NAT gateway on the subnet, and without a public load balancer handling outbound rules, it may have no outbound path at all.
The recommended approach for outbound internet access from Azure VMs is to deploy an Azure NAT Gateway on the subnet:
# Create a public IP for NAT Gateway
az network public-ip create \
--name myNATGatewayIP \
--resource-group myResourceGroup \
--sku Standard \
--allocation-method Static
# Create the NAT Gateway
az network nat gateway create \
--name myNATGateway \
--resource-group myResourceGroup \
--public-ip-addresses myNATGatewayIP \
--idle-timeout 10
# Associate NAT Gateway with your subnet
az network vnet subnet update \
--name mySubnet \
--resource-group myResourceGroup \
--vnet-name myVNet \
--nat-gateway myNATGateway
Once the NAT gateway is attached to the subnet, all VMs in that subnet will use it for outbound internet traffic. You should see outbound connectivity restored within a minute or two.
If you're using a Standard Internal Load Balancer in front of your VMs, outbound connectivity is explicitly not available until you either add a NAT gateway or configure outbound rules on a public load balancer. This is by design per Azure's official documentation. Don't let the "all resources communicate outbound by default" statement mislead you; that default has conditions and limitations that matter in practice.
Also check your custom route tables. If you have a User Defined Route (UDR) with a 0.0.0.0/0 next hop pointing to a Network Virtual Appliance (NVA) or VPN gateway, all your internet traffic is being forced through that path. If the NVA is down or misconfigured, outbound connectivity fails silently.
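As a quick audit aid, the explicit outbound options from this section can be written down as a small checklist function. This is only a sketch: the function name `explicit_outbound_methods` and its three boolean inputs are my own illustration, and you would fill them in by inspecting the VM, its subnet, and its load balancer yourself.

```python
def explicit_outbound_methods(
    nat_gateway_on_subnet: bool,
    instance_public_ip: bool,
    public_lb_outbound_rules: bool,
) -> list[str]:
    """List the explicit outbound methods a VM has; an empty list means none."""
    methods = []
    if nat_gateway_on_subnet:
        methods.append("NAT gateway")
    if instance_public_ip:
        methods.append("instance-level public IP")
    if public_lb_outbound_rules:
        methods.append("public load balancer outbound rules")
    return methods


if __name__ == "__main__":
    # A VM behind only an internal Standard Load Balancer has none of the three.
    found = explicit_outbound_methods(False, False, False)
    print(found or "no explicit outbound path; do not count on default outbound access")
```

An empty list is the case to worry about, because it means the VM depends entirely on implicit behavior.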
Advanced Troubleshooting for Azure Virtual Network
If the steps above haven't resolved your Azure Virtual Network configuration issue, it's time to go deeper. These are the techniques I pull out for enterprise scenarios: multi-VNet hub-spoke topologies, on-premises hybrid connectivity, and domain-joined environments where Group Policy interacts with Azure networking in unexpected ways.
Using Azure Network Watcher for Deep Packet Analysis
Azure Network Watcher is the single most underused diagnostic tool in Azure networking. Beyond IP Flow Verify (covered earlier) and Connection Troubleshoot, it has packet capture capabilities that let you record actual traffic at the VM NIC level without logging in to the VM. It does rely on the Network Watcher Agent VM extension, which the portal typically installs for you when you create a capture. Enable it from Monitor → Network Watcher → Packet Capture → Add. Select your VM, set filters for source/destination IP and port, and let it run while you reproduce the issue. Download the .cap file and open it in Wireshark to see exactly what's happening at the packet level.
Route Table and BGP Troubleshooting
Custom route tables (User Defined Routes) are a major source of Azure VNet routing problems. To see the effective routes on a specific VM NIC:
az network nic show-effective-route-table \
--name myNICName \
--resource-group myResourceGroup \
--output table
Look for any 0.0.0.0/0 or specific-prefix routes with next hops pointing to VirtualAppliance. These override Azure's default system routes and can silently blackhole traffic if the appliance IP is wrong or the VM at that IP is down.
For hybrid Azure VNet connectivity using ExpressRoute or Site-to-Site VPN, BGP route propagation issues are common. If on-premises routes aren't appearing in your effective route table, check whether gateway route propagation is enabled on the route table associated with the subnet your VM sits in. It's a single checkbox under Route tables → [Your table] → Configuration → Propagate gateway routes, and it's easy to accidentally disable during a routine update.
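Why does a user-defined route win? Azure picks the route with the longest matching prefix, and when prefixes tie, a user-defined route beats a BGP route, which beats a system route. Here is a simplified Python model of that selection. The `Route` class, `select_route` function, and sample table are hypothetical, and the model ignores details such as routes that Azure marks invalid.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

# Tie-break order when two routes have the same prefix length.
SOURCE_PREFERENCE = {"User": 0, "VirtualNetworkGateway": 1, "Default": 2}


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop_type: str
    source: str  # "User", "VirtualNetworkGateway" (BGP), or "Default" (system)


def select_route(routes: list[Route], destination: str) -> Optional[Route]:
    """Pick the route for a destination: longest prefix first, then source preference."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, SOURCE_PREFERENCE[r.source]),
    )


if __name__ == "__main__":
    # Hypothetical effective routes for one NIC.
    table = [
        Route("10.0.0.0/16", "VnetLocal", "Default"),
        Route("0.0.0.0/0", "Internet", "Default"),
        Route("0.0.0.0/0", "VirtualAppliance", "User"),
    ]
    for destination in ("10.0.1.4", "8.8.8.8"):
        print(destination, select_route(table, destination))
```

In this sample table, internet-bound traffic goes to the appliance while intra-VNet traffic still uses the more specific system route, which is why a broken NVA only breaks some destinations.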
On-Premises Connectivity: VPN vs ExpressRoute
Azure offers three ways to connect your Azure Virtual Network to on-premises resources: Point-to-Site VPN (one machine at a time, good for developers), Site-to-Site VPN (connects your entire on-premises network through an encrypted tunnel over the internet), and ExpressRoute (a private dedicated circuit through an ExpressRoute partner, so traffic never touches the public internet).
VPN gateway connection failures typically show up as IKE failed or Connection timeout in the VPN gateway diagnostics. Check that your on-premises VPN device's shared key matches exactly what's configured in Azure, that IKE version and algorithm settings are compatible, and that UDP ports 500 and 4500 are not blocked at your on-premises firewall.
Event Viewer and Diagnostic Logs
For Windows VMs inside an Azure VNet that are losing connectivity intermittently, check the Windows Event Log under Event Viewer → Windows Logs → System for Event ID 4199 (TCP/IP duplicate IP detection), Event ID 1001 (DHCP lease failure), or any Tcpip errors. These often surface Azure-side DHCP or IP conflicts that the Azure portal won't explicitly flag.
Enable Azure VNet diagnostic logs by going to Virtual networks → [Your VNet] → Diagnostic settings → Add diagnostic setting. Send logs to a Log Analytics workspace and query them with KQL to find patterns across time.
Prevention & Best Practices for Azure Virtual Networks
I know it feels like "best practices" articles are just padding, but the ones I'm listing here are things I've seen teams skip and then pay for dearly. These are not generic advice. They're the specific Azure Virtual Network configuration choices that prevent the most common support escalations.
Plan your IP address space before you deploy anything. This sounds obvious, but I've seen enterprise customers get six months into an Azure migration before realizing their VNet address ranges overlap with their on-premises network, making VPN and ExpressRoute connectivity impossible without re-IPing everything. Use RFC 1918 ranges (10.x.x.x, 172.16.x.x–172.31.x.x, 192.168.x.x) and document your allocation scheme centrally. For large deployments, use /16 blocks for major regions and /24 blocks for individual workload subnets. Leave gaps, because you will add subnets you didn't anticipate.
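One way to keep that allocation scheme honest is to generate it instead of typing it by hand. The sketch below carves workload subnets out of a regional block with Python's standard ipaddress module. The function name `carve_subnets` and the 10.10.0.0/16 block are hypothetical choices for illustration.

```python
import ipaddress
from itertools import islice


def carve_subnets(region_block: str, new_prefix: int, count: int) -> list[str]:
    """Return the first `count` subnets of size /new_prefix inside a regional block."""
    block = ipaddress.ip_network(region_block)
    return [str(subnet) for subnet in islice(block.subnets(new_prefix=new_prefix), count)]


if __name__ == "__main__":
    # Hypothetical plan: one /16 per region, /24 per workload subnet.
    for subnet in carve_subnets("10.10.0.0/16", 24, 3):
        print(subnet)
```

Keeping the generated list in version control gives every team the same source of truth, and the unallocated remainder of the block is your gap for future subnets.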
Use a hub-spoke network topology. Microsoft's official guidance recommends a hub VNet containing shared services (VPN gateway, firewall, DNS servers, Azure Bastion) peered to spoke VNets for each workload. This topology means you manage central connectivity in one place instead of creating a full mesh of peerings that grows quadratically as you add VNets. It also makes it far easier to route all traffic through a central Azure Firewall for inspection and logging.
Turn on flow logs from day one. NSG flow logs record every accepted and denied flow through your security groups, stored in Azure Storage or streamed to Log Analytics. Enabling them costs very little but gives you an audit trail for security investigations and a diagnostic history when something breaks. Go to Network security groups → [NSG] → NSG flow logs → Create. Microsoft has been steering new deployments toward virtual network flow logs, so use whichever flow log type your subscription still lets you create.
Never put a gateway subnet in the same /24 as your VM subnets. The GatewaySubnet (yes, it must be named exactly that) should be its own dedicated small subnet, typically a /27 or larger. Mixing it with general VM workloads causes routing complications and is explicitly against Azure's architectural guidance.
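To put numbers on the peering count, here is a tiny Python sketch. It counts peering relationships (each one still needs a link on both sides), and the function names are my own.

```python
def full_mesh_peerings(vnets: int) -> int:
    """Peering relationships needed so every VNet peers with every other VNet."""
    return vnets * (vnets - 1) // 2


def hub_spoke_peerings(spokes: int) -> int:
    """Peering relationships needed when every spoke peers only with the hub."""
    return spokes


if __name__ == "__main__":
    # Compare ten VNets in a full mesh with one hub plus nine spokes.
    print("full mesh:", full_mesh_peerings(10))
    print("hub-spoke:", hub_spoke_peerings(9))
```

The gap widens quickly as you add VNets, which is the practical argument for the hub-spoke layout.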
Test your VPN and ExpressRoute failover paths before you need them. Hybrid Azure virtual network connectivity setups that include both a VPN gateway and ExpressRoute for redundancy need to be tested under simulated failure conditions. Don't find out your failover path doesn't work during an actual outage.
- Enable Azure Network Watcher in every region where you have VNets; enabling it is free and you'll need it when things break
- Tag every NSG rule with a description field explaining why it exists and who requested it
- Lock production VNets with an Azure Resource Manager delete lock to prevent accidental deletion (Resource group → Locks → Add)
- Use Azure Policy to enforce naming conventions and mandatory tags on all Virtual Network resources from the start
Frequently Asked Questions
What is Azure Virtual Network and why do I actually need it?
Azure Virtual Network is Microsoft's private networking layer inside Azure. Think of it as your own isolated section of the Azure datacenter where your VMs, containers, and services live and communicate. Every Azure VM must live inside a VNet. You need it because without it, your Azure resources have no controlled way to talk to each other privately, no way to connect back to your on-premises network, and no layer to apply traffic filtering rules to. It's not optional; it's the foundation everything else builds on. Even if you're just running a single VM for a personal project, that VM is inside a VNet (Azure creates a default one if you don't specify one yourself).
Why can't I delete my Azure Virtual Network even though it looks empty?
Azure won't let you delete a Virtual Network if any dependent resources are still attached to it, even if those resources look inactive. Common culprits are network interfaces that belong to deleted VMs but weren't cleaned up, VPN gateways or Application Gateways sitting in their own subnets, Azure Bastion hosts, and Private Endpoints. Go to Virtual networks → [Your VNet] → Connected devices to see everything still attached. Delete or detach each resource first, then attempt to delete the VNet again. If the portal still blocks you with a vague "Conflict" error, check the Activity Log for more specific detail about what's still holding a reference.
How do I connect two Azure Virtual Networks so VMs in each can communicate?
The standard approach is Virtual Network Peering. You create a peering link from VNet-A pointing to VNet-B, and another link from VNet-B pointing to VNet-A. Both sides must be created for it to work. Once peered, VMs in either VNet communicate using their private IP addresses as if they're on the same network. Peering works across Azure regions (called global VNet peering) and even across Azure subscriptions, as long as you have the appropriate permissions on both VNets. Traffic across peerings stays on Microsoft's backbone network and doesn't traverse the public internet. After setting up peering, still verify your NSG rules allow the traffic, because peering doesn't bypass security groups.
What's the difference between a Site-to-Site VPN and Azure ExpressRoute for connecting to on-premises?
Site-to-Site VPN creates an encrypted tunnel between your on-premises VPN appliance and an Azure VPN Gateway, so traffic travels over the public internet in encrypted form. It's fast to set up, affordable, and good for smaller workloads or as a backup path. ExpressRoute is a private dedicated circuit through a Microsoft networking partner, so your traffic never touches the public internet at all. ExpressRoute offers higher bandwidth options (up to 100 Gbps), more predictable latency, and is the right choice for large-scale data transfers or compliance requirements that prohibit data traversing the public internet. ExpressRoute is significantly more expensive and takes weeks to provision, while a VPN gateway can be up in an hour. Many enterprises run both: ExpressRoute for primary traffic and a VPN gateway as an automatic failover path.
Why does 168.63.129.16 keep showing up in my Azure network logs?
168.63.129.16 is a special virtual IP address that Microsoft maintains inside every Azure Virtual Network. It's not a real machine; it's a platform service address that handles DNS resolution for Azure resources, DHCP lease distribution, health probe traffic for Azure Load Balancers, and communication with the Azure VM agent and platform extensions. You should never block this IP in your NSG rules. If you do, your VMs will lose DNS resolution, Windows licensing/activation may break, and Azure monitoring extensions stop reporting health data. If you're seeing unexpected traffic to this address in flow logs, that's normal. It's the platform keeping your VMs healthy and connected to Azure services.
My VMs had internet access yesterday but now they don't. What changed?
A few things commonly cause sudden Azure VM outbound connectivity loss. First, check whether someone added or modified a route table on the subnet. A new 0.0.0.0/0 UDR pointing to a broken NVA or VPN gateway will silently blackhole all outbound traffic. Second, check if a new NSG rule was added that blocks outbound traffic; look at the outbound rules tab, not just inbound. Third, if your VMs were relying on Azure's default outbound internet access (no NAT gateway, no public IP, no load balancer with outbound rules), that access may have been removed as part of a subnet or VM configuration change. Deploy a NAT Gateway on the subnet to restore reliable, explicit outbound connectivity. Finally, check the Azure Service Health dashboard for any platform-level incidents in your region that might be affecting networking.