Azure Virtual Machines: Fix Setup & Config Errors

Microsoft Fix · Intermediate · 14 min read · Official Docs Grounded · Updated April 20, 2026

Why This Is Happening

Here's a scenario I've seen play out dozens of times: you spin up an Azure virtual machine, everything looks fine in the portal, and then nothing works. You can't SSH in, you're getting allocation failures, your costs are way higher than expected, or the VM simply won't start. Azure's error messages (things like AllocationFailed, OperationNotAllowed, or the cryptic InternalExecutionError) tell you almost nothing about what actually went wrong.

I know this is frustrating, especially when it's blocking a deployment or a demo that needed to be live yesterday.

The root causes almost always fall into one of four buckets. First, region and capacity issues: Azure resources are region-specific, and some VM sizes simply aren't available in every region. If you're trying to spin up a Dasv7-series VM in a region that doesn't offer it, the request will fail, but the portal error won't always make that obvious. Second, resource configuration mismatches: Azure VMs don't exist in isolation. When you create one, Azure automatically provisions a virtual network, a Network Interface Card (NIC), a private IP address, a Network Security Group (NSG), and an OS disk. If any of those supporting resources hits a quota limit or a misconfiguration, your entire VM deployment falls apart.

Third, networking and NSG rules: I've seen engineers spend hours trying to figure out why they can't connect to a freshly deployed VM, only to realize the NSG never had port 22 (SSH) or port 3389 (RDP) opened. Azure creates a default NSG with relatively locked-down rules, and if you don't know to configure it, you're locked out immediately. Fourth, sizing and billing confusion: Azure charges an hourly rate based on VM size and operating system. For partial hours, you only pay for the minutes used, but storage for your OS disk is billed separately at managed disk rates, and that surprises people who expected a simple all-in price.

Azure virtual machines are one of the most powerful tools in the Azure ecosystem, giving you full control over the compute environment without touching physical hardware. But that flexibility comes with responsibility. You have to configure, patch, and maintain everything that runs on the VM. Understanding why things go wrong is half the battle.

Whether you're hitting deployment failures, allocation errors, connectivity blocks, or unexpected costs, this guide walks through each fix methodically.

The Quick Fix: Try This First

If your Azure VM deployment just failed and you need it up fast, start here. The single most common reason deployments fail, especially for first-timers, is an allocation failure caused by VM size unavailability in the selected region.

Here's what you do. In the Azure portal, navigate to Virtual machines → Create → Azure virtual machine. On the Basics tab, look at the Region field and the Size field. These two settings have to be compatible. Select See all sizes next to the Size field, and the portal will show you which sizes are available in your current region. If your chosen size isn't there, either change the region or change the size.

If you're deploying via Azure CLI and hitting an AllocationFailed error, run this command first to see which sizes your target region offers before you try again:

az vm list-sizes --location eastus --output table

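Replace eastus with your actual target location. The az account list-locations command gives you the full list of valid location names if you're unsure of the exact string. Keep in mind that list-sizes shows what the region offers in general; it doesn't account for current capacity or for subscription-level restrictions (az vm list-skus reports the latter).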

For connectivity issues (you can't SSH or RDP into a VM that deployed successfully), go to your VM in the portal, click Networking in the left sidebar, then click Add inbound port rule. For Linux VMs, add port 22 (SSH). For Windows VMs, add port 3389 (RDP). Set the source to your own IP address, not "Any," for security. The change usually takes effect within about 30 seconds, so try connecting again shortly after.

Pro Tip
Azure virtual machines in a new subscription often hit default quota limits faster than you'd expect, especially for vCPU counts per region. Before any significant deployment, run az vm list-usage --location <your-region> --output table to see your current vCPU usage versus your quota ceiling. A quota request to Microsoft takes 1–3 business days to process, so check this before you need it, not after.

1. Plan Your VM Configuration Before You Click Create

The single biggest mistake people make with Azure virtual machines is clicking straight into the creation wizard without thinking through the decisions that are hard to change later. Microsoft's own documentation is explicit about this: before you create a VM, you need to settle on the resource names, the location, the size, the operating system, and all the related resources it will need.

Resource names matter more than they seem. Your VM name feeds into the DNS name, the OS hostname, and the names of all supporting resources Azure creates. Names can't be changed after deployment without redeployment. Use a consistent naming convention from day one: something like vm-prod-eastus-web-01 beats MyTestVM every time.
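One caveat to check any naming convention against: Azure limits Windows VM names to 15 characters, while Linux VM names can run to 64. Here's a minimal sketch of a pre-flight length check. The vm_name_ok helper is hypothetical, and it checks length only, not which characters are allowed.

```shell
# vm_name_ok NAME OS
# Hypothetical helper: succeeds when NAME fits Azure's VM name length limit
# for the given OS ("windows": 15 characters, "linux": 64 characters).
# It checks length only, not the allowed character set.
vm_name_ok() {
  local name="$1" os="$2" max
  case "$os" in
    windows) max=15 ;;
    linux)   max=64 ;;
    *) echo "unknown OS: $os" >&2; return 2 ;;
  esac
  [ "${#name}" -ge 1 ] && [ "${#name}" -le "$max" ]
}

# The example name from this section is 21 characters long:
vm_name_ok vm-prod-eastus-web-01 linux   && echo "ok for Linux"
vm_name_ok vm-prod-eastus-web-01 windows || echo "too long for Windows"
```

If you run Windows VMs, a shorter pattern is needed so the full name stays within 15 characters.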

Location selection is about more than geography. Azure stores your virtual hard disks in the region you pick. If your on-premises datacenter or user base is in Europe but you deploy to West US, you're adding latency to every single operation. Check regional availability for your specific VM size before committing to a location. Use the Azure portal's virtual machine selector tool or run:

az vm list-skus --location westeurope --resource-type virtualMachines --output table

VM size determines processing power, memory, storage capacity, and network bandwidth; all of it flows from this one choice. The Dsv6 series is good for general-purpose workloads with balanced CPU and memory. The Fasv7 series is compute-optimized for workloads that are CPU-intensive. The Dasv7 series handles general-purpose workloads with high memory-to-CPU ratios. Pick based on workload, not habit.

If you've done this planning and the deployment still fails, check the VM's Activity Log in the portal (left sidebar → Activity log). The log shows the exact error code and timestamp for every failed operation, far more useful than the generic error banner on the overview page.

2. Fix Network Security Group Rules Blocking Connectivity

You deployed your Azure virtual machine successfully, you have the IP address, you open your SSH client, and nothing. The connection just times out. Before you assume the VM is broken, check the NSG. Nine times out of ten, this is an NSG rule issue.

In the Azure portal, open your VM, then click Networking in the left sidebar. You'll see two sections: Inbound port rules and Outbound port rules. Azure creates a default NSG that blocks most inbound traffic. For SSH access to Linux VMs, you need TCP port 22 open. For RDP on Windows, you need TCP port 3389. For web servers, you'll need port 80 (HTTP) and 443 (HTTPS).

Click Add inbound port rule and fill in these fields exactly:

Source: My IP address (or IP Addresses for a specific range)
Source port ranges: *
Destination: Any
Destination port ranges: 22
Protocol: TCP
Action: Allow
Priority: 300
Name: Allow-SSH

Lower priority numbers win when rules conflict. A priority of 300 is safe because it is evaluated well before Azure's default rules (which start at 65000). Select Add and wait about 30 seconds for the rule to propagate.
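To make the priority ordering concrete, here's a simplified sketch of first-match-wins evaluation. The effective_action helper and its "priority action port" input format are made up for illustration; real NSG evaluation also matches on source, destination, and protocol.

```shell
# effective_action PORT
# Reads simplified rules from stdin, one per line: "<priority> <action> <port|*>".
# Rules are checked in ascending priority order and the first match decides.
effective_action() {
  sort -n | awk -v port="$1" '$3 == port || $3 == "*" { print $2; exit }'
}

# Allow-SSH at priority 300 plus Azure's default DenyAllInBound rule at 65500.
rules='65500 Deny *
300 Allow 22'

printf '%s\n' "$rules" | effective_action 22     # Allow (priority 300 matches first)
printf '%s\n' "$rules" | effective_action 3389   # Deny  (only the catch-all matches)
```

The same reasoning explains why a custom Deny rule with a lower number than your Allow rule would still lock you out.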

If you're managing this at scale across multiple VMs, use Azure CLI to apply NSG rules programmatically:

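# Allow inbound SSH only from your own public IP (replace the placeholder address)
az network nsg rule create \
  --resource-group MyResourceGroup \
  --nsg-name MyNSG \
  --name Allow-SSH \
  --direction Inbound \
  --protocol Tcp \
  --priority 300 \
  --source-address-prefixes 203.0.113.10/32 \
  --destination-port-ranges 22 \
  --access Allow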

After adding the rule, try connecting again. If you still can't reach the VM, verify that the NIC is actually associated with the NSG: in the Networking tab, the NIC name is shown alongside the NSG name. A NIC without an NSG association, or a NIC associated with the wrong NSG, is a frequent misconfiguration in cloned or manually built environments.

3. Resolve Allocation Failures and Quota Errors

The AllocationFailed error is one of the most common Azure virtual machine errors, and it's also one of the most misunderstood. It does not always mean you've hit a billing limit. It often means Azure doesn't have enough capacity of your requested VM size in your selected region at that moment, or you've hit a vCPU quota.

First, check your regional vCPU quota. In the portal, go to Subscriptions → [your subscription] → Usage + quotas. Filter by your region. Look for entries like "Total Regional vCPUs" and specific quota lines for the VM family you're trying to deploy (e.g., "Standard Dasv7 Family vCPUs"). If your current usage is at or near the limit, you'll need to either request a quota increase or choose a VM family with remaining quota.

To request a quota increase, click Request Increase on the Usage + quotas page and fill in your justification. Microsoft approves most reasonable requests within 1–3 business days for standard VM families.
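The quota check itself is simple arithmetic, and it's easy to script before a deployment. The sketch below uses a hypothetical has_quota_headroom helper. The commented az query at the end is one possible way to feed it, and it assumes the regional total is reported under the name value 'cores' in your CLI output.

```shell
# has_quota_headroom CURRENT LIMIT NEEDED
# Succeeds when NEEDED additional vCPUs still fit under LIMIT.
has_quota_headroom() {
  local current="$1" limit="$2" needed="$3"
  [ $(( current + needed )) -le "$limit" ]
}

# Example: 6 of 10 regional vCPUs in use, and a 4-vCPU VM is planned.
if has_quota_headroom 6 10 4; then
  echo "fits within quota"
else
  echo "request a quota increase first"
fi

# Possible wiring (assumption: the regional total appears as name.value 'cores'):
#   az vm list-usage --location eastus \
#     --query "[?name.value=='cores'].[currentValue, limit]" --output tsv
```

Remember that the per-family quota has to pass the same check as the regional total.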

If quota isn't the issue, the capacity shortage in your region might be temporary. Try these alternatives in order:

# Option 1: Try a different zone in the same region
az vm create --resource-group MyRG --name MyVM \
  --image Ubuntu2204 --size Standard_D4s_v6 \
  --zone 2 --location eastus

# Option 2: Try a different region entirely
az vm create --resource-group MyRG --name MyVM \
  --image Ubuntu2204 --size Standard_D4s_v6 \
  --location eastus2

# Option 3: List the sizes that are restricted for your subscription in the target region
az vm list-skus --location eastus \
  --resource-type virtualMachines \
  --query "[?restrictions[?reasonCode=='NotAvailableForSubscription']]" \
  --output table

The last command is particularly useful: it shows you which VM sizes are explicitly restricted in your subscription for your chosen region, so you can stop guessing and pick something that will actually work.

4. Configure Availability Zones for High Availability

If you're running production workloads on Azure virtual machines, single-instance deployments are a reliability gamble you shouldn't take. Azure's availability zones are physically separated datacenters within a single region, each with its own power, cooling, and networking. Microsoft guarantees VM connectivity to at least one instance 99.99% of the time when you deploy two or more instances across two or more availability zones in the same region. That's the SLA you want for anything business-critical.
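It helps to translate an SLA percentage into a downtime budget. Here's a small sketch that does the arithmetic; the allowed_downtime_minutes helper is hypothetical and the 30-day month is an assumption.

```shell
# allowed_downtime_minutes SLA_PERCENT [DAYS]
# Prints the downtime budget, in minutes, that an SLA leaves over DAYS days (default 30).
allowed_downtime_minutes() {
  awk -v sla="$1" -v days="${2:-30}" \
    'BEGIN { printf "%.2f\n", days * 24 * 60 * (100 - sla) / 100 }'
}

allowed_downtime_minutes 99.99   # 4.32
allowed_downtime_minutes 99.95   # 21.60
```

In other words, a 99.99% commitment leaves a little over four minutes of budget per month, which is about a fifth of what 99.95% allows.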

Here's what many people get wrong: availability zones are not the same as availability sets. Availability sets protect against hardware failures within a single datacenter. Availability zones protect against entire datacenter failures. For most production workloads in 2026, you want zones, not sets.

To enable availability zones during VM creation in the portal, on the Basics tab look for the Availability options dropdown. Select Availability zone, then choose zone 1, 2, or 3 for your primary VM. Deploy your secondary VM in a different zone. That's it.

Via Azure CLI:

# Primary VM in zone 1
az vm create --resource-group ProdRG --name WebVM-01 \
  --image Win2022Datacenter --size Standard_D4s_v6 \
  --zone 1 --location eastus

# Secondary VM in zone 2
az vm create --resource-group ProdRG --name WebVM-02 \
  --image Win2022Datacenter --size Standard_D4s_v6 \
  --zone 2 --location eastus

If you need auto-scaling on top of high availability, Virtual Machine Scale Sets are the right tool. Scale sets let you define a minimum and maximum instance count, and Azure adds or removes VMs automatically based on CPU load, memory pressure, or a custom schedule. Scale set VMs can also span multiple availability zones for maximum resilience. The tradeoff is complexity: scale sets work best when your application is stateless or uses external session management.

After deploying across zones, validate the configuration by going to each VM's Overview page and checking the Availability zone field. If it shows "–" or "Not applicable," the zone assignment didn't take and you'll need to redeploy.

5. Manage Disks, Storage Costs, and OS Disk Configuration

Azure virtual machine billing regularly surprises people because the VM's hourly compute cost and the storage cost are completely separate line items. The compute charge is based on VM size and OS, billed per minute for partial hours. But the OS disk, typically 127 GiB unless you're using a smaller image, is billed at managed disk rates even when the VM is deallocated (stopped). That's a common source of unexpected charges: people stop their VM thinking costs stopped too, but the disk keeps accruing charges.
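A quick worked example shows why the bill doesn't drop to zero. This is a sketch with placeholder prices, not real Azure rates; the month_cost helper is hypothetical, and it assumes the disk exists for the whole month.

```shell
# month_cost HOURLY_RATE MINUTES_RUN DISK_MONTHLY
# Compute is prorated per minute; the disk fee applies whether or not the VM runs.
month_cost() {
  awk -v rate="$1" -v mins="$2" -v disk="$3" \
    'BEGIN { printf "%.2f\n", rate * mins / 60 + disk }'
}

# Placeholder prices: 0.20 per hour for compute, 10.00 per month for the OS disk.
month_cost 0.20 90 10.00   # 10.30: 90 minutes of compute (0.30) plus the disk
month_cost 0.20 0  10.00   # 10.00: deallocated all month, the disk is still billed
```

Plug in the rates from the Azure pricing calculator for your own VM size and disk tier to get realistic figures.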

The architecture recommendation from Microsoft is to keep OS and data on separate disks. If a VM fails, you can detach the data disk and attach it to a new VM without losing your data. This is not just theory; I've seen it save teams hours of data recovery work after a failed OS update.

To add a data disk to an existing VM, go to your VM in the portal, click Disks in the left sidebar, then Add data disk. Choose between Premium SSD (for I/O-intensive workloads), Standard SSD (balanced cost and performance), and Standard HDD (archive or low-priority data). Premium SSDs have higher per-GB costs but dramatically lower latency. For databases or active application data, use Premium.

Via Azure CLI, attaching a new managed data disk looks like this:

az vm disk attach \
  --resource-group MyRG \
  --vm-name MyVM \
  --name MyDataDisk \
  --new \
  --size-gb 256 \
  --sku Premium_LRS

Trusted Launch as Default (TLaD), currently in preview, makes new Generation 2 VMs default to Trusted Launch with secure boot and vTPM enabled, which changes the boot configuration. If you're working with existing automation scripts that create VMs without specifying --security-type, those scripts may behave differently once TLaD rolls out of preview. Test your deployment scripts against Gen2 images now to avoid surprises. You can explicitly set the security type during creation:

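# The VM generation comes from the image, not from a separate flag.
# The Ubuntu2204 alias is assumed here to resolve to a Generation 2 image,
# which Trusted Launch requires.
az vm create --resource-group MyRG --name SecureVM \
  --image Ubuntu2204 \
  --security-type TrustedLaunch \
  --enable-secure-boot true \
  --enable-vtpm true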

Check your Azure Cost Management dashboard monthly and filter by resource type "Microsoft.Compute/disks" to see exactly what your disk storage is costing across all VMs. Unattached disks from deleted VMs are a silent cost driver that many teams don't notice until the bill arrives.
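You can also find unattached disks from the command line. The sketch below wraps the query in a hypothetical list_unattached_disks function; it assumes the Azure CLI is installed, you have run az login, and your CLI version exposes the diskState property on managed disks.

```shell
# list_unattached_disks
# Lists managed disks in the current subscription that are not attached to any VM.
# Requires the Azure CLI and an active `az login` session.
list_unattached_disks() {
  az disk list \
    --query "[?diskState=='Unattached'].{name:name, resourceGroup:resourceGroup, sku:sku.name}" \
    --output table
}
```

Run list_unattached_disks and review each result before deleting anything, since an unattached disk may be a deliberate backup.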

Advanced Troubleshooting

When the standard fixes don't work, you need to dig into the platform-level diagnostics. Azure has several tools built specifically for this, and most people don't know they exist.

Boot Diagnostics is your first stop for VMs that deploy successfully but won't start properly, or where the OS is crashing at startup. In the portal, go to your VM → Boot diagnostics in the left sidebar. You'll see a screenshot of the VM's console at its last known state, which immediately tells you if the OS is panicking, stuck at a GRUB prompt, or hitting a BSOD. You can also view the serial console log directly in the portal. Enable boot diagnostics on every VM from day one by specifying a storage account during creation, or let Azure manage it automatically with managed storage.
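The same checks are available from the Azure CLI. Here's a minimal sketch with two hypothetical wrapper functions. It assumes a recent CLI version, where omitting a storage account means managed storage is used.

```shell
# enable_boot_diagnostics RESOURCE_GROUP VM_NAME
# Turns on boot diagnostics. No storage account is passed, which recent
# Azure CLI versions treat as "use managed storage" (assumption: your
# CLI version supports this).
enable_boot_diagnostics() {
  az vm boot-diagnostics enable --resource-group "$1" --name "$2"
}

# fetch_boot_log RESOURCE_GROUP VM_NAME
# Prints the serial log captured by boot diagnostics.
fetch_boot_log() {
  az vm boot-diagnostics get-boot-log --resource-group "$1" --name "$2"
}
```

For example, fetch_boot_log MyRG MyVM prints the log to your terminal, where you can search it for kernel panics or failed services.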

Azure Monitor and VM Insights handle performance-level issues such as CPU spikes, memory pressure, disk I/O saturation, or network bottlenecks that don't show up in the portal's basic metrics. Go to your VM → Insights → Enable. Once the Log Analytics workspace is connected, you get performance graphs with 1-minute granularity, top-process breakdowns, and dependency mapping. For VMs that are slow but not failing, this is where you find the answer.

For enterprise environments with domain-joined VMs, Active Directory connectivity issues are a major source of problems. If users can't authenticate to a domain-joined Azure VM despite the NSG rules being correct, check whether the VM can reach your domain controllers, either in Azure or on-premises via a VPN gateway or ExpressRoute. Use the Run Command feature in the portal (VM → Run command → RunPowerShellScript) to run network diagnostics without needing RDP access:

Test-NetConnection -ComputerName yourdomain.local -Port 389
nslookup yourdomain.local
nltest /sc_verify:yourdomain.local

For deeper networking issues (asymmetric routing, packet drops, MTU mismatches), use Azure Network Watcher. Go to Network Watcher → IP flow verify to test whether a specific traffic flow is being allowed or blocked by NSG rules, without having to manually trace through every rule. Connection troubleshoot goes further and tests end-to-end connectivity between a source VM and a destination, showing you exactly where the path breaks.
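IP flow verify can be run from the CLI as well. The sketch below wraps it in a hypothetical check_inbound_ssh function; it assumes Network Watcher is enabled in the VM's region, and the remote source port 50000 is an arbitrary placeholder.

```shell
# check_inbound_ssh RESOURCE_GROUP VM_NAME VM_PRIVATE_IP CLIENT_IP
# Asks Network Watcher whether inbound TCP traffic from CLIENT_IP to port 22
# on the VM would be allowed or denied by the effective NSG rules.
check_inbound_ssh() {
  az network watcher test-ip-flow \
    --resource-group "$1" --vm "$2" \
    --direction Inbound --protocol TCP \
    --local "$3:22" --remote "$4:50000"
}
```

A call such as check_inbound_ssh MyRG MyVM 10.0.0.4 203.0.113.10 reports the access decision along with the name of the rule that produced it.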

If you're seeing intermittent failures that correlate with time-of-day patterns, check Azure's Service Health. Go to the portal's search bar, type "Service Health," and look at the health history for your region and VM service type. Azure outages are not common, but they happen, and the service health dashboard is the authoritative source, faster than Twitter, more accurate than third-party status sites.

For repeated OperationNotAllowed errors when you know you have correct permissions, check your subscription's Azure Policy assignments. Policies can block VM creation, restrict VM sizes, require specific tags, or enforce naming conventions. Go to Policy → Compliance and filter by your subscription. Non-compliant policies show exactly which rule is blocking your operation.

When to Call Microsoft Support
Some Azure virtual machine issues genuinely require Microsoft's involvement: persistent allocation failures that don't resolve after changing regions or sizes (which points to a platform-level issue with your subscription), VMs that show as running in the portal but are completely unreachable even though boot diagnostics look normal (rare but real host-level failures), unexpected charges that don't match any resources you can identify (a potential billing platform bug), and quota increase requests for specialized VM families such as HPC or GPU instances (HBv4, NCv4) that require business justification review. Open a support ticket with Microsoft Support. For Severity A (production down) issues, response times are typically under one hour on paid support plans.

Prevention & Best Practices

Most Azure virtual machine problems are preventable. The teams I've seen operate Azure environments well aren't smarter than everyone else; they just set things up correctly from the start and don't have to fight fires constantly.

Design for availability from day one. Going back and adding availability zones to a production VM that wasn't deployed into a zone originally means redeployment, because you can't add a zone assignment after the fact. Decide your availability strategy before you click Create, not after your first incident. For any workload that matters, two or more VMs across two or more availability zones is the baseline.

Separate OS and data disks on every VM. Not just for databases, for everything. When (not if) you need to resize, reimage, or recover a VM, having data on a separate managed disk means you just detach and reattach. It takes minutes instead of hours.

Tag every resource. Azure lets you attach key-value tags to VMs, disks, NICs, and NSGs. Tags like environment:production, team:backend, and costcenter:engineering let you filter costs in Azure Cost Management and find resources fast. Untagged resources in a large subscription become a maintenance nightmare within six months.
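Tagging is easy to script. Here's a minimal sketch using a hypothetical tag_vm function, which looks up the VM's resource ID and merges new tags into the existing ones. It assumes the Azure CLI is installed and you are logged in.

```shell
# tag_vm RESOURCE_GROUP VM_NAME KEY=VALUE [KEY=VALUE...]
# Merges the given tags into the VM's existing tags instead of replacing them.
tag_vm() {
  local rg="$1" vm="$2" vm_id
  shift 2
  vm_id="$(az vm show --resource-group "$rg" --name "$vm" --query id --output tsv)" || return 1
  az tag update --resource-id "$vm_id" --operation Merge --tags "$@"
}
```

For example, tag_vm ProdRG WebVM-01 environment=production team=backend adds both tags while leaving any others in place.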

Use Azure Hybrid Benefit if you have existing Windows Server licenses. For Windows VMs, the OS license cost is a significant part of the hourly rate. If your organization has Software Assurance or subscription licenses for Windows Server, Azure Hybrid Benefit can cut that licensing cost substantially, sometimes by 40% or more on Windows VM pricing. It's toggled per-VM and can be applied to existing VMs without redeployment.

Deallocate rather than just stopping VMs. Shutting a VM down from inside the guest OS, or with az vm stop, powers it off but leaves it allocated, so it still holds its compute allocation and compute charges continue. Stopping it from the portal, or running az vm deallocate, releases the allocation. Deallocated VMs only pay for disk storage, which is the correct behavior for dev/test VMs that don't need to run overnight.
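To check which state a VM is really in, you can read its power state and map it to a billing outcome. The sketch below uses a hypothetical compute_billed helper, which conservatively treats every state other than deallocated or deallocating as billable.

```shell
# compute_billed POWER_STATE
# Succeeds when the power state is one that keeps compute charges running.
# POWER_STATE is the display string returned by:
#   az vm show --show-details --resource-group <rg> --name <vm> \
#     --query powerState --output tsv
compute_billed() {
  case "$1" in
    "VM deallocated"|"VM deallocating") return 1 ;;
    *) return 0 ;;   # conservative default: assume compute is billed
  esac
}

for state in "VM running" "VM stopped" "VM deallocated"; do
  if compute_billed "$state"; then
    echo "$state: compute charges continue"
  else
    echo "$state: disk storage only"
  fi
done
```

If a dev/test VM reports "VM stopped", run az vm deallocate against it to release the allocation.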

Quick Wins
  • Run az vm list-usage --location <region> before major deployments to catch quota limits before they stop you
  • Enable boot diagnostics on every VM at creation time; retroactive enablement requires a VM restart
  • Use Azure Policy to enforce tagging, approved VM sizes, and required availability zone configuration across your subscription
  • Review unattached managed disks monthly in Azure Cost Management; orphaned disks from deleted VMs silently accumulate charges

Frequently Asked Questions

What do I need to think about before creating an Azure virtual machine?

The key decisions are: resource naming (hard to change post-deploy), region/location (where your VHDs are stored, which also affects latency), VM size (determines CPU, memory, network bandwidth, and cost), operating system, and which supporting resources you'll need: virtual network, NIC, NSG, OS disk, and possibly a public IP. Think through availability requirements too: if this VM needs high uptime, which availability zone should it go into? Getting these right upfront avoids the painful process of redeploying later. The Azure portal's VM creation wizard walks you through all of these, but it won't stop you from making choices that don't fit your workload.

Why does my Azure VM show "Running" but I still can't connect to it?

"Running" in the portal means the VM is powered on; it says nothing about whether your network path to it is open. The most common cause is an NSG that doesn't have an inbound rule allowing your traffic on the right port (22 for SSH, 3389 for RDP). Go to your VM → Networking and check the inbound port rules. Also verify you're connecting to the correct IP: Azure assigns a private IP (for internal vNet communication) and optionally a public IP (for internet access), and connecting to the private IP from outside the vNet won't work. If NSG rules look correct, use Azure Network Watcher's IP flow verify tool to trace exactly where the block is.

Why am I getting an AllocationFailed error when creating an Azure VM?

AllocationFailed usually means Azure doesn't have enough capacity of your requested VM size in your chosen region and zone, or your subscription has hit its vCPU quota for that VM family. First, check your quota under Subscriptions → Usage + quotas filtered by region. If quota is fine, try a different availability zone within the same region, or a different region entirely. Run az vm list-sizes --location <region> to confirm the region offers the size. For quota increases, submit a support request from the Usage + quotas page; standard VM family increases are usually approved within 1–3 business days.

How much does an Azure virtual machine actually cost, and what am I being charged for that I didn't expect?

Azure VM billing has two completely separate components: the compute charge (hourly rate based on VM size and OS, billed per minute for partial hours) and the storage charge (your OS disk billed at managed disk rates, your data disks billed separately). The compute charge stops when you deallocate the VM, but disk charges continue until you delete the disks. Common surprise charges include: OS disks from deleted VMs that weren't explicitly deleted, stopped-but-not-deallocated VMs still holding compute allocation, public IP address charges, and outbound data transfer charges. Use Azure Cost Management filtered by resource type to see your exact breakdown.

What's the difference between stopping and deallocating an Azure VM?

Stopping a VM shuts down the OS but can leave the VM allocated on Azure's hardware, in which case compute charges continue. This is what happens when you shut down from inside the guest OS or run az vm stop. Deallocating releases the VM from its physical host and stops compute charges entirely (you only pay for disk storage); the portal's Stop button and az vm deallocate both do this. Deallocate unless you have a specific reason to hold the allocation. Note that a deallocated VM may get a different public IP address when restarted unless you've assigned a static public IP, which costs a small additional fee.

How do I make my Azure virtual machines highly available so they don't go down during maintenance?

The right approach depends on your workload. For the strongest protection, deploy two or more VMs across two or more availability zones in the same Azure region; this gets you a 99.99% SLA on VM connectivity. Availability zones are physically separate datacenters with independent power and networking, so a zone failure doesn't take down your other instances. If you also need auto-scaling, Virtual Machine Scale Sets let Azure automatically add or remove instances based on load metrics, and scale set VMs can also be distributed across availability zones. For legacy workloads where you can't easily run multiple instances, at minimum use an availability set to protect against hardware failures within a single datacenter, though the SLA there (99.95%) is lower than zone-spread deployments.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.