Azure Arc: The Complete Setup and Troubleshooting Guide
If you're managing infrastructure spread across on-premises data centers, multiple clouds, and edge locations, you already know the pain: different tools for every environment, inconsistent governance, and operational models that don't scale. Azure Arc is Microsoft's answer to that chaos, and this guide walks you through everything you need to know to get it working, fix common issues, and keep it running smoothly.
What Is Azure Arc and Why Do You Need It?
Azure Arc is a bridge: a centralized management layer that extends Azure's governance, security, and management capabilities to resources that live outside Azure. Whether your servers are sitting in a colocation facility, your Kubernetes clusters are running on a rival cloud provider, or your SQL Server instances are humming away in your basement server room, Azure Arc brings them all under one roof: the Azure portal and Azure Resource Manager.
Think of it this way. Before Arc, managing a hybrid environment meant jumping between your VMware vCenter console, a separate Kubernetes dashboard, your cloud provider's native portal, and Azure itself. Every tool had its own access model, its own alert system, its own policy engine. Keeping everything consistent was practically a full-time job, and the bigger your environment, the worse it got.
With Azure Arc, you project those non-Azure resources into Azure Resource Manager. They show up in the portal like native Azure resources. You can apply Azure Policy to them. You can use Microsoft Defender for Cloud to monitor their security posture. You can tag them, group them into resource groups, and manage role-based access control (RBAC), all from the same place you manage your Azure VMs.
Azure Arc currently supports the following resource types outside of Azure:
- Servers and virtual machines: Windows and Linux physical servers and VMs, including those on Azure Local, VMware vCenter, and System Center Virtual Machine Manager (SCVMM)
- Kubernetes clusters: any conformant Kubernetes distribution, running anywhere
- Azure data services: SQL Managed Instance running on-premises or at the edge via Kubernetes
- SQL Server: Azure services extended to SQL Server instances hosted outside Azure
In this guide, we're going to cover the full picture: how Arc works under the hood, how to connect your first resources, how to configure Arc resource bridge for private cloud environments like VMware vSphere and SCVMM, and how to fix the most common problems you'll run into.
How Azure Arc Works: The Architecture You Need to Understand
Before you touch a command line, it helps to understand what's actually happening when you "Arc-enable" a resource. This understanding will save you hours of debugging later.
At the core of Arc-enabled servers is the Azure Connected Machine agent (also called the Arc agent). This lightweight agent gets installed on your server, Windows or Linux, and establishes an outbound HTTPS connection to Azure. That connection is the channel through which Azure can project the machine into Resource Manager, push policies, collect logs, and run extensions.
For Kubernetes, the mechanism is different. You install the Arc agents as Kubernetes deployments inside your cluster. These agents handle the connectivity back to Azure, enable GitOps-based configuration management, and allow you to install cluster extensions like Azure Monitor, Defender for containers, or Azure App Service.
For private cloud environments (VMware vSphere and SCVMM), the architecture gets more interesting. You deploy an Arc resource bridge, a lightweight on-premises appliance (in practice, a virtual machine) that acts as a proxy between your private cloud management plane and Azure. This is what allows Azure not just to monitor your VMs, but to provision, resize, and delete them.
The resource bridge is a critical component to understand because it has its own lifecycle, its own versioning, and its own support window. We'll come back to that when we talk about maintenance and troubleshooting.
Prerequisites: What You Need Before You Start
Getting Arc set up requires a bit of preparation. Skipping this step is the number-one reason people run into problems during onboarding.
You need an active Azure subscription. For onboarding servers, you need the Azure Connected Machine Onboarding role at minimum. For full management (including resource group creation), you need Contributor or higher. For Arc resource bridge deployments, you'll typically need Owner or a custom role with specific permissions on the subscription.
The Arc agent communicates outbound over HTTPS (port 443). Your servers need to be able to reach specific Azure endpoints. If you're running through a proxy or firewall, you'll need to allow the required URLs for your region. Microsoft publishes the full list in the Azure Arc network requirements documentation. Overlooked firewall rules are one of the most common causes of setup failure.
Before you can onboard resources, you need to register the necessary resource providers in your Azure subscription. At minimum, you need Microsoft.HybridCompute and Microsoft.GuestConfiguration for servers. For Kubernetes, add Microsoft.Kubernetes and Microsoft.KubernetesConfiguration. You can do this from the Azure portal under Subscription > Resource providers, or via the Azure CLI.
Run az provider list --query "[?registrationState=='NotRegistered']" --output table to see which providers aren't yet registered in your subscription. Register everything you'll need up front rather than discovering missing providers mid-deployment.
For servers, Arc supports Windows Server 2008 R2 SP1 and later, plus most mainstream Linux distributions including RHEL, Ubuntu, SUSE, Debian, and Amazon Linux. For Kubernetes, you need a CNCF-conformant distribution running Kubernetes 1.21 or later. Check the current compatibility matrix in the docs before you start, because these minimums do get updated.
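As a rough sketch of the CLI route, the helper below registers the four namespaces named above and waits for each one to finish. The function name register_arc_providers is made up for illustration and is not part of the Azure CLI. It assumes you've already run az login and selected the right subscription.

```shell
# Register the resource providers this guide names for Arc-enabled servers
# and Arc-enabled Kubernetes. register_arc_providers is a hypothetical helper.
register_arc_providers() {
  for ns in \
    Microsoft.HybridCompute \
    Microsoft.GuestConfiguration \
    Microsoft.Kubernetes \
    Microsoft.KubernetesConfiguration
  do
    echo "Registering $ns"
    # --wait blocks until the registration has completed
    az provider register --namespace "$ns" --wait || return 1
  done
}

# Usage (after 'az login'):
#   register_arc_providers
```

Registration is idempotent, so re-running this against a subscription where some providers are already registered should be harmless.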
Step-by-Step: Connecting Your First Server to Azure Arc
Let's walk through the most common onboarding scenario: connecting an existing Windows or Linux server to Azure Arc. This is the foundation everything else builds on.
Log into the Azure portal and search for "Azure Arc." Click on Azure Arc, then navigate to Infrastructure > Servers. Click Add, then choose Add a single server for a one-off connection, or Add servers at scale if you're onboarding multiple machines using a service principal.
Fill in the subscription, resource group, region, operating system, and optionally tags. The portal generates a script for you. For a single server, this is a PowerShell script (for Windows) or a bash script (for Linux). Download or copy it.
On Windows, open PowerShell as an Administrator and run the script. On Linux, run it as root or with sudo. The script downloads and installs the Azure Connected Machine agent, then registers the machine with Azure using the credentials embedded in the script (or prompts you to authenticate interactively).
The installation typically takes two to five minutes. You'll see output indicating the agent download, installation, and registration steps.
After the script completes, go back to Azure Arc > Servers in the portal. Your machine should appear with a status of Connected. If it shows Disconnected, the agent installed but isn't maintaining its heartbeat, which usually points to a network issue. If the machine doesn't appear at all, the registration step failed.
Now that your server is connected, you can install extensions from the portal: the Azure Monitor agent for monitoring (it replaces the Log Analytics agent, which Microsoft retired in August 2024), the Dependency agent for service map, the Custom Script Extension for automation, or the Microsoft Defender for Endpoint extension for security. Navigate to your Arc server resource, click Extensions, and add what you need.
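If you'd rather check from a terminal than the portal, something like the following may work. It's a sketch, not an official procedure: both function names are invented here, the first needs the connectedmachine CLI extension, and it assumes the machine's status is exposed as a top-level status field in the CLI output.

```shell
# Print the Arc connection status of one machine as Azure sees it.
# $1 = machine name, $2 = resource group (both placeholders you supply).
# Assumption: the CLI output exposes a top-level 'status' field.
arc_machine_status() {
  az connectedmachine show \
    --name "$1" \
    --resource-group "$2" \
    --query status \
    --output tsv
}

# Map the three outcomes described above to a next step.
explain_arc_status() {
  case "$1" in
    Connected)    echo "OK: agent is reporting heartbeats" ;;
    Disconnected) echo "Check network: agent installed but heartbeat lost" ;;
    "")           echo "Not found: registration likely failed" ;;
    *)            echo "Unexpected status: $1" ;;
  esac
}

# Usage:
#   explain_arc_status "$(arc_machine_status MyServer MyResourceGroup)"
```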
You can also now apply Azure Policy assignments to this server, enroll it in Microsoft Defender for Cloud, and manage it through Azure Update Manager.
Step-by-Step: Connecting a Kubernetes Cluster to Azure Arc
Arc-enabled Kubernetes opens up GitOps-based configuration management, Azure Monitor for containers, Defender for containers, and the ability to run Azure services like App Service and Machine Learning directly on your cluster. Here's how to connect one.
You'll need the Azure CLI with the connectedk8s and k8s-configuration extensions installed. Run:
az extension add --name connectedk8s
az extension add --name k8s-configuration
Make sure your kubectl context is pointed at the cluster you want to connect before proceeding.
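A small guard like the one below can make that context check explicit before you run anything that modifies a cluster. The function name confirm_context is hypothetical, and the context name you pass in is whatever your own kubeconfig calls the target cluster.

```shell
# Fail early if kubectl is not pointed at the cluster you expect.
# $1 = the context name you intend to connect (a placeholder you supply).
confirm_context() {
  current=$(kubectl config current-context) || return 1
  if [ "$current" != "$1" ]; then
    echo "kubectl context is '$current', expected '$1'" >&2
    return 1
  fi
  echo "kubectl context OK: $current"
}

# Usage:
#   confirm_context my-onprem-cluster && echo "safe to proceed"
```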
Run the connection command, substituting your own values:
az connectedk8s connect --name MyCluster --resource-group MyResourceGroup --location eastus
This command deploys Arc agents into a new azure-arc namespace on your cluster. The agents establish the outbound connection to Azure and register the cluster as an Azure resource. Depending on your cluster's image pull speed and network conditions, this takes three to ten minutes.
Check that all Arc agent pods are running:
kubectl get pods -n azure-arc
You should see several pods in Running state, including clusteridentityoperator, config-agent, controller-manager, flux-logs-agent, metrics-agent, and others. Any pods in a CrashLoopBackOff or Pending state indicate a problem that needs investigation.
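To avoid eyeballing the list, you could filter it. This sketch reads the pod listing on standard input and prints only the pods whose STATUS column isn't Running or Completed. The helper name unhealthy_pods is made up, and it relies on the default kubectl column order (NAME, READY, STATUS).

```shell
# Read 'kubectl get pods --no-headers' output on stdin and print the name
# and status of every pod whose STATUS column is not Running or Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage:
#   kubectl get pods -n azure-arc --no-headers | unhealthy_pods
```

Empty output means every Arc agent pod is in a healthy state.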
One of the most powerful Arc-enabled Kubernetes features is GitOps-based configuration management using Flux. You can apply configurations to clusters directly from a Git repository. Create a configuration with:
az k8s-configuration flux create --resource-group MyResourceGroup --cluster-name MyCluster --cluster-type connectedClusters --name my-config --namespace my-namespace --scope cluster --url https://github.com/my-org/my-repo --branch main --kustomization name=my-kustomization path=./manifests prune=true
Flux will continuously reconcile the cluster state with your Git repository, giving you declarative, auditable configuration management.
Deploying Arc Resource Bridge for VMware vSphere and SCVMM
If you want to manage VMs in VMware vSphere or System Center Virtual Machine Manager directly from Azure (not just monitor them, but provision, resize, and delete them), you need to deploy the Arc resource bridge.
The Arc resource bridge is a pre-packaged appliance VM that you deploy into your on-premises environment. It runs a lightweight Kubernetes cluster internally and handles the communication between your private cloud management APIs and Azure Resource Manager.
The resource bridge requires a dedicated VM with specific CPU, memory, and storage requirements (typically 4 vCPUs, 8 GB RAM, 100 GB disk). It needs network access to both your private cloud management plane (vCenter or SCVMM) and to Azure endpoints. Review the current system requirements in the official documentation before starting, because they can change between releases.
Microsoft provides a PowerShell-based deployment script for both VMware and SCVMM scenarios. For VMware, you'll need your vCenter credentials, the name of your datacenter, the resource pool, datastore, and network to use for the appliance VM. The script handles the VM deployment, initial configuration, and Azure registration.
For SCVMM, you need your SCVMM server credentials and the management network details. The process is conceptually similar but uses SCVMM-specific parameters.
After the resource bridge is deployed and connected, you install the appropriate private cloud extension. For VMware, this is the Arc-enabled VMware vSphere extension; for SCVMM, it's the Arc-enabled SCVMM extension. These extensions extend the resource bridge's capabilities to understand and manage the specific private cloud platform.
Once the extension is installed, Azure can discover your existing VMs, virtual networks, datastores, and templates from vCenter or SCVMM. You can then selectively Arc-enable specific VMs to bring them under Azure management, or use the self-service experience to provision new VMs from Azure directly.
Advanced Troubleshooting: Fixing the Most Common Azure Arc Problems
Even with careful preparation, things go wrong. Here are the issues you're most likely to encounter and exactly how to fix them.
Problem 1: Server shows as Disconnected
A Disconnected status means the Arc agent is installed but isn't successfully phoning home to Azure. This is almost always a network connectivity issue.
Diagnosis: On the affected server, check the agent status with azcmagent show (run as admin/root). Look at the Last status change and Error details fields. Then run azcmagent check to test connectivity to all required Azure endpoints. It will tell you specifically which endpoints are unreachable.
Fix: Work with your network team to allow outbound HTTPS to the required Azure endpoints. If you're going through a proxy, configure the Arc agent to use it with azcmagent config set proxy.url http://your-proxy:port. If your servers can't send traffic over the public internet at all, you can route agent traffic through Azure Private Link over a VPN or ExpressRoute connection instead, though be aware that Arc resource bridge does not currently support Private Link.
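The proxy fix and the follow-up check can be chained, roughly like this. The wrapper name configure_agent_proxy is hypothetical and the proxy URL is a placeholder. Run it as root or Administrator on the affected server.

```shell
# Point the Connected Machine agent at a proxy, then re-test endpoint
# reachability. $1 = proxy URL, for example http://your-proxy:port.
configure_agent_proxy() {
  azcmagent config set proxy.url "$1" || return 1
  azcmagent check
}

# Usage:
#   configure_agent_proxy "http://your-proxy:port"
```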
Problem 2: Kubernetes Arc agents in CrashLoopBackOff
If Arc agent pods are crash-looping, the most common culprits are insufficient permissions, network issues, or a misconfigured kubeconfig.
Diagnosis: Run kubectl describe pod [pod-name] -n azure-arc and kubectl logs [pod-name] -n azure-arc --previous to get the actual error messages. Look specifically at the clusteridentityoperator and config-agent pods, as these handle the core Azure connectivity.
Fix: If you see authentication errors, verify that the service principal or managed identity being used has the correct RBAC permissions on the cluster's Azure resource. If you see network errors, check that the cluster's nodes can reach Azure endpoints. If you see certificate errors, the cluster's certificate might have expired, in which case you'll need to reconnect the cluster.
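When several pods are failing at once, gathering the describe output and previous logs in one pass can save time. The loop below is one way to do it. The function name diagnose_arc_pods is invented for this sketch, and it assumes the agents live in the azure-arc namespace as described earlier.

```shell
# For every pod in azure-arc that is not Running, print its events and the
# logs of its previous (crashed) container instances.
diagnose_arc_pods() {
  kubectl get pods -n azure-arc --no-headers |
    awk '$3 != "Running" { print $1 }' |
    while read -r pod; do
      echo "===== $pod: describe ====="
      kubectl describe pod "$pod" -n azure-arc </dev/null
      echo "===== $pod: previous logs ====="
      # --previous fails if the container never restarted; keep going anyway
      kubectl logs "$pod" -n azure-arc --previous --all-containers </dev/null || true
    done
}

# Usage:
#   diagnose_arc_pods > arc-diagnostics.txt
```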
Problem 3: Arc resource bridge upgrade failures
Resource bridge upgrades can fail for several reasons: insufficient resources on the target host, network timeouts during the image pull, or stale credentials.
Diagnosis: Run az arcappliance show --resource-group [rg] --name [name] to check the current status. If the upgrade is stuck or failed, check the upgrade logs on the appliance host.
Fix: For failed upgrades, you can attempt to re-run the upgrade command. If the appliance is in an unrecoverable state, you may need to redeploy it, which is why Microsoft recommends proactive upgrades on a schedule rather than waiting for failures. Always have your deployment configuration files backed up so you can redeploy quickly if needed.
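For a quick health snapshot before and after an upgrade attempt, you could narrow the show output to a few fields. This is a sketch under assumptions: the helper name bridge_summary is made up, it needs the arcappliance CLI extension, and the field names status, version, and provisioningState are what I'd expect in the output rather than something this guide confirms.

```shell
# Summarize the resource bridge's state.
# $1 = resource group, $2 = appliance name (placeholders you supply).
# Assumption: the output exposes status, version, and provisioningState.
bridge_summary() {
  az arcappliance show \
    --resource-group "$1" \
    --name "$2" \
    --query "{status:status, version:version, provisioning:provisioningState}" \
    --output table
}

# Usage:
#   bridge_summary MyResourceGroup MyResourceBridge
```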
Problem 4: GitOps configurations not applying
If your Flux configurations aren't reconciling, the issue is usually in the Git repository URL, branch name, credentials, or kustomization path.
Diagnosis: Check the configuration status with az k8s-configuration flux show --resource-group [rg] --cluster-name [name] --cluster-type connectedClusters --name [config-name]. Look at the complianceState and statuses fields. Also run kubectl get gitrepository,kustomization -n [namespace] to see Flux's own status view.
Fix: Common fixes include correcting the branch name, updating credentials if you've rotated Git tokens, fixing the kustomization path if it doesn't match your repository structure, or resolving YAML syntax errors in your manifests. Flux is strict about YAML validity: even a misplaced space can cause a reconciliation failure.
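The two views from the diagnosis step (Azure's and Flux's own) can be wrapped into a pair of helpers, sketched below. The function names are hypothetical. The complianceState field is the one mentioned above, and all argument values are placeholders.

```shell
# Azure's view: print only the overall compliance state of a Flux config.
# $1 = resource group, $2 = cluster name, $3 = configuration name.
flux_config_state() {
  az k8s-configuration flux show \
    --resource-group "$1" \
    --cluster-name "$2" \
    --cluster-type connectedClusters \
    --name "$3" \
    --query complianceState \
    --output tsv
}

# Flux's own view: list the source and kustomization objects.
# $1 = the namespace the configuration was created in.
flux_objects() {
  kubectl get gitrepository,kustomization -n "$1"
}

# Usage:
#   flux_config_state MyResourceGroup MyCluster my-config
#   flux_objects my-namespace
```

If the two views disagree, Flux's own objects usually carry the more specific error message.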
Problem 5: Extensions failing to install
Extension installation failures on Arc-enabled servers or Kubernetes clusters are usually caused by permission issues, network problems, or missing prerequisites.
Diagnosis: Check extension status in the portal under your Arc resource's Extensions blade. For more detail, look at the extension instance view, which shows the provisioning state and any error messages from the extension handler.
Fix: Most extension installation failures can be resolved by ensuring the Arc agent has the right permissions, verifying network connectivity (extensions often need to download packages from additional endpoints), and checking that any prerequisite extensions are installed first. For example, the Dependency agent extension requires a monitoring agent to be installed first (historically the Log Analytics agent, now the Azure Monitor agent).
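For Arc-enabled servers, the same extension list is available from the CLI, which is handy when you're checking many machines. A minimal sketch, assuming the connectedmachine CLI extension is installed. The wrapper name list_arc_extensions is made up.

```shell
# List the extensions installed on one Arc-enabled server.
# $1 = machine name, $2 = resource group (placeholders you supply).
list_arc_extensions() {
  az connectedmachine extension list \
    --machine-name "$1" \
    --resource-group "$2" \
    --output table
}

# Usage:
#   list_arc_extensions MyServer MyResourceGroup
```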
Prevention: Best Practices for a Healthy Azure Arc Environment
Running a stable Arc environment long-term requires proactive maintenance, not just reactive troubleshooting. Here's what experienced Arc administrators do to stay ahead of problems.
Keep the Arc agent updated
The Azure Connected Machine agent receives regular updates with bug fixes, security patches, and new features. On Windows, you can configure Windows Update to handle Arc agent updates automatically. On Linux, update through your package manager (apt upgrade azcmagent or yum update azcmagent depending on your distro). Consider using Azure Update Manager, which supports Arc-enabled servers, to manage this centrally.
Set up Azure Monitor for hybrid health
Don't wait for a user to report a disconnected server. Configure Azure Monitor alerts on your Arc resources: alert when a server's heartbeat stops, when an Arc agent goes into an error state, or when a Kubernetes cluster's agent pods aren't healthy. This gives you proactive notification instead of reactive discovery.
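Alert rules are the right long-term answer. As a stopgap, or as a cross-check, you can also poll for unhealthy servers with an Azure Resource Graph query. Note that this is a different technique from an Azure Monitor alert: it's an on-demand query, not a notification. The sketch assumes the resource-graph CLI extension is installed, and the function name is invented.

```shell
# List Arc-enabled servers whose reported status is anything but Connected.
# This polls Azure Resource Graph; it is not an Azure Monitor alert rule.
list_unhealthy_arc_servers() {
  az graph query -q "
    resources
    | where type =~ 'microsoft.hybridcompute/machines'
    | where properties.status != 'Connected'
    | project name, resourceGroup,
        status = tostring(properties.status),
        lastStatusChange = tostring(properties.lastStatusChange)"
}

# Usage:
#   list_unhealthy_arc_servers
```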
Manage the resource bridge lifecycle proactively
The resource bridge has its own lifecycle, as noted in the architecture section: it must be upgraded at least once every six months, regardless of where it falls in the n-3 version support window. The risk of letting it go stale is severe. Expired internal certificates can completely break your private cloud management capability, with no easy recovery path short of redeployment. Treat resource bridge upgrades like certificate renewals: schedule them in advance and don't let them slip.
Use custom locations thoughtfully
Custom locations are an abstraction layer on top of Arc-enabled Kubernetes clusters that allow you to deploy Azure PaaS services (like App Service or Machine Learning) to your on-premises infrastructure. They're powerful, but they add complexity. Document which cluster extensions and service instances are associated with each custom location, and establish a clear process for upgrading and managing those extensions.
Tag everything consistently
One of Arc's biggest benefits is unified governance, but that only works if your resources are organized consistently. Establish a tagging strategy before you start onboarding at scale. Tag by environment (production, staging, dev), by location (datacenter, city, or region), by owner, and by any other dimensions important to your organization. Azure Policy can enforce tagging requirements on Arc resources just like it does on native Azure resources.
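Applying tags from a script keeps them consistent across hundreds of machines. Here's one possible shape for it. The helper name tag_arc_resource and the example tag keys are illustrative only, so substitute your own scheme.

```shell
# Merge one or more key=value tags onto an Arc resource.
# $1 = full Azure resource ID, remaining arguments = key=value pairs.
tag_arc_resource() {
  resource_id=$1
  shift
  # Merge adds or updates the given tags and leaves other tags untouched
  az tag update --resource-id "$resource_id" --operation Merge --tags "$@"
}

# Usage (tag keys and values are examples):
#   id=$(az connectedmachine show --name MyServer \
#          --resource-group MyResourceGroup --query id --output tsv)
#   tag_arc_resource "$id" environment=production owner=platform-team
```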
Audit connectivity regularly
Run the azcmagent check command periodically (or script it and pipe the output to your monitoring system) to verify that all required Azure endpoints remain reachable. Firewall rules change. Proxy configurations shift. Network topology evolves. Regular connectivity audits catch drift before it becomes an outage.
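A scheduled job could wrap the check so each output line carries a timestamp your monitoring system can ingest. This sketch makes no assumption about the format of the check's output and simply prefixes whatever it prints. The function name and log path are hypothetical.

```shell
# Run the agent's connectivity check and prefix every output line with a
# UTC timestamp, so the result can be appended to a log or shipped elsewhere.
audit_arc_connectivity() {
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  azcmagent check 2>&1 | sed "s/^/$ts /"
}

# Usage (for example from cron, as root):
#   audit_arc_connectivity >> /var/log/arc-connectivity.log
```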
Frequently Asked Questions
Does Azure Arc work without internet connectivity?
Partially air-gapped scenarios are supported for some Arc workloads, but full offline operation is limited. Arc-enabled servers and Kubernetes clusters require outbound HTTPS connectivity to Azure endpoints; there's no fully offline mode for the management plane. For Arc data services (SQL Managed Instance), the "directly connected" mode requires connectivity, but there's an "indirectly connected" mode designed for environments with restricted or intermittent connectivity. Note that Arc resource bridge does not currently support Azure Private Link, so if your environment requires all traffic to traverse private endpoints, resource bridge deployments for VMware and SCVMM won't work in that configuration.
Can I use Azure Arc with AWS or Google Cloud VMs?
Yes, and this is one of Arc's most compelling use cases. You can install the Azure Connected Machine agent on EC2 instances, GCE VMs, or any other cloud's virtual machines. They'll appear in the Azure portal as Arc-enabled servers, and you can apply Azure Policy, use Defender for Cloud, manage updates, and run extensions on them just like any other Arc server. Additionally, the Multicloud connector feature (enabled by Azure Arc) specifically helps you connect non-Azure public cloud resources to centralize governance in Azure.
How much does Azure Arc cost?
Azure Arc itself, the basic connectivity and projection into Azure Resource Manager, is free for servers, Kubernetes clusters, and SQL Server. You start paying when you use specific services on top of Arc: Defender for Cloud's enhanced security plans, Azure Monitor agent data ingestion, Arc-enabled data services (SQL Managed Instance), and some extension functionality. Check the Azure Arc pricing page for current rates, as they vary by service and are updated periodically. For Windows Server 2012 machines specifically, Arc enables Extended Security Updates (ESU) which have their own licensing model.
What Kubernetes distributions are supported by Azure Arc?
Azure Arc supports any CNCF-conformant Kubernetes distribution. This includes, but isn't limited to, AKS on Azure Local, K3s, Rancher RKE, OpenShift (with some caveats), GKE, EKS, and upstream kubeadm clusters. The key requirement is CNCF conformance and Kubernetes version 1.21 or later. Check the current compatibility list in the official documentation, as supported distributions and minimum versions do change with Arc agent releases.
Can Azure Arc manage VMs that aren't in VMware or SCVMM?
It depends on what you mean by "manage." Arc can monitor and govern VMs running on any hypervisor: you just install the Azure Connected Machine agent directly on the guest OS, and the VM appears in Azure as an Arc-enabled server. What you can't do without the resource bridge is perform hypervisor-level operations (provisioning, resizing, deleting VMs from Azure) on arbitrary hypervisors. That lifecycle management capability currently requires VMware vSphere, SCVMM, or Azure Local. For other hypervisors, you're limited to guest OS-level management via the Arc agent.
How do I onboard hundreds of servers to Arc at scale?
For large-scale onboarding, you don't want to use interactive authentication or run scripts machine by machine. The recommended approach is to use a service principal: create one with the Azure Connected Machine Onboarding role, generate an onboarding script that uses its credentials, and deploy that script through your existing infrastructure automation, such as Group Policy, Ansible, Puppet, Chef, Configuration Manager, or whatever tool you already use to push software to servers. You can also use Azure Arc's at-scale onboarding wizard in the portal, which generates service-principal-authenticated scripts and provides guidance on deployment methods. For cloud environments, some providers have native integrations that simplify bulk onboarding further.
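The portal-generated script is the safest starting point, but the core of a service-principal onboarding boils down to a single azcmagent connect call. Below is a minimal sketch of that step for a machine where the agent is already installed. The function name and the ARC_SP_ID, ARC_SP_SECRET, and ARC_TENANT_ID environment variables are conventions invented here so the secret stays out of the script file.

```shell
# Register this machine with Azure Arc using a service principal.
# $1 = subscription ID, $2 = resource group, $3 = Azure region.
# Credentials come from environment variables (names are hypothetical).
onboard_with_service_principal() {
  if [ -z "${ARC_SP_ID:-}" ] || [ -z "${ARC_SP_SECRET:-}" ] || [ -z "${ARC_TENANT_ID:-}" ]; then
    echo "Set ARC_SP_ID, ARC_SP_SECRET and ARC_TENANT_ID first" >&2
    return 1
  fi
  azcmagent connect \
    --service-principal-id "$ARC_SP_ID" \
    --service-principal-secret "$ARC_SP_SECRET" \
    --tenant-id "$ARC_TENANT_ID" \
    --subscription-id "$1" \
    --resource-group "$2" \
    --location "$3"
}

# Usage (as root/Administrator, with the variables injected by your
# automation tool's secret store):
#   onboard_with_service_principal "<subscription-id>" MyResourceGroup eastus
```

Whichever tool pushes this out, inject the secret at run time rather than baking it into the script, and scope the service principal to the onboarding role only.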
What happens if the Arc resource bridge goes down?
If your Arc resource bridge becomes unavailable, the existing Arc-enabled VMs that were projected into Azure continue to function normally as workloads. The bridge going down doesn't affect the VMs themselves. What you lose is the management plane: you can't provision new VMs from Azure, you can't resize or delete existing VMs through Azure, and the connection between Azure and your private cloud management tools is broken. Any previously Arc-enabled server VMs that also have the Connected Machine agent installed will continue their direct Arc connectivity (since that goes directly from the guest OS to Azure, not through the resource bridge). Restoring the resource bridge should be your priority, and having your deployment configuration files backed up means you can redeploy it relatively quickly if the appliance itself is unrecoverable.