Azure Stack Hub Operator Guide: Fix AZS 2601 Issues

Updated April 20, 2026

What's in This Guide

Why This Is Happening
The Quick Fix: Try This First
Step-by-Step Solution
Advanced Troubleshooting
Prevention & Best Practices
FAQ

Why This Is Happening

If you're an Azure Stack Hub operator staring at a broken AZS 2601 deployment (maybe the administrator portal won't load, the update package refuses to apply, or your tenant users suddenly can't provision VMs), I want you to know this is one of the most common pain points for on-premises Azure infrastructure teams. You're not alone, and the fix is almost always traceable to a handful of root causes.

Azure Stack Hub is Microsoft's on-premises extension of Azure. It's specifically designed for scenarios where you can't send everything to the public cloud: factory floors, disconnected mine shafts, cruise ships, heavily regulated financial environments. The AZS 2601 build is one of the regular update releases that operators apply during routine maintenance windows. The number 2601 follows Microsoft's two-digit-year, two-digit-month (YYMM) versioning, so this is the January 2026 release package. Updates ship regularly, and falling more than a couple of builds behind can leave your scale unit in an unsupported state.
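
To make the version scheme concrete, here's a minimal sketch that splits a build label into its year and month. The function name ConvertFrom-AzsBuildLabel is my own invention for illustration, not part of any Microsoft module, and it assumes every label is exactly four digits in YYMM form.

```powershell
# Hypothetical helper (not a Microsoft cmdlet): split a four-digit
# build label such as "2601" into a year and a month.
function ConvertFrom-AzsBuildLabel {
    param(
        [Parameter(Mandatory)]
        [ValidatePattern('^\d{4}$')]
        [string]$Label
    )

    # First two digits are the year offset from 2000, last two the month.
    $year  = 2000 + [int]$Label.Substring(0, 2)
    $month = [int]$Label.Substring(2, 2)
    if ($month -lt 1 -or $month -gt 12) {
        throw "Build label '$Label' does not end in a valid month (01-12)."
    }

    [pscustomobject]@{
        Label = $Label
        Year  = $year
        Month = $month
    }
}

ConvertFrom-AzsBuildLabel -Label '2601'
```

A label like 2613 throws instead of returning a nonsense month, which is handy when you're parsing build numbers out of a ticket or spreadsheet.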

The Azure Stack Hub architecture is not a typical server cluster. Your integrated system is a rack of 4 to 16 servers, called a scale unit, delivered by a hardware partner and tightly coupled to the Azure Stack Hub software stack. Because every component (compute, storage, networking, the software-defined fabric) is managed as one unit, a misconfiguration in one layer cascades fast. Microsoft's error messages in the administrator portal tend to be terse and infrastructure-focused. They tell you what failed, rarely why.

Common root causes I see with AZS 2601 deployment and management problems:

Identity provider misconfiguration: Azure Stack Hub uses either Microsoft Entra ID (formerly Azure AD) for connected deployments or Active Directory Federation Services (AD FS) for disconnected ones. A certificate expiry or a broken trust relationship between AD FS and the Azure Stack Hub internal Active Directory instance is responsible for a large percentage of operator login failures.
Update ring conflicts: If a previous update (say, 2508 or 2511) was only partially applied or got stuck mid-installation, AZS 2601 will refuse to install on top of it. The update health check pipeline is strict by design.
Border connectivity gaps: For hybrid connected deployments, Azure Stack Hub needs reliable outbound connectivity to Azure endpoints. A firewall rule change, a proxy reconfiguration, or an expired certificate on the border device will break cloud management features silently.
Resource provider registration failures: When operators try to offer new services (MySQL, SQL Server, App Service), the resource provider deployment process can fail if the service account quotas or subscription limits weren't pre-configured in the plan.

The reason Microsoft's built-in alerts don't always surface these clearly is that Azure Stack Hub is designed with a separation of concerns between the hardware partner layer and the Microsoft software layer. When something breaks at the seam between those two layers, the telemetry pipeline itself can be impaired, which means you may see no alerts at all, just a portal that quietly stops working.

The Quick Fix: Try This First

Before you spend two hours in PowerShell, do this one thing: open the Azure Stack Hub administrator portal and navigate to Dashboard > Region Management > Updates. Look at the update history list. Is there any update showing a status of "Failed", "Preparing", or stuck on a percentage for more than 90 minutes? That's your problem right there.

If you see a stuck or failed update, here's what to do immediately:

In the administrator portal, click the stuck update entry to open its blade.
Look for the "Resume" button. If it's greyed out, the update engine has locked the update in an unrecoverable state from the UI alone.
Open the Azure Stack Hub tools. You should have these downloaded from GitHub and available on your hardware lifecycle host (HLH) or admin workstation. If not, that's step one of the full fix below.
Connect to the privileged endpoint (PEP) via PowerShell and run the following:

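# Prompt for the account you use to reach the privileged endpoint
$credential = Get-Credential
Enter-PSSession -ComputerName <PEP_IP_ADDRESS> -ConfigurationName PrivilegedEndpoint -Credential $credential

# Inside the PEP session, ask the update engine for its real state
Get-AzureStackUpdateStatus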

This command returns the actual update state machine, not just what the portal shows you. Look for any step that shows Failed or ActionRequired in the output. That step name is what you'll reference when opening a support case or looking up the specific remediation KB article from Microsoft.

If the portal itself won't load at all and you can't even get to the Updates blade, go directly to PowerShell management. The Azure Stack Hub portals are each backed by separate instances of Azure Resource Manager, so a portal failure doesn't necessarily mean the underlying infrastructure is broken. Sometimes it's just the front-end ARM instance that needs a service restart through the PEP.

Pro Tip

Always connect to the privileged endpoint from the hardware lifecycle host or a jump box on the same management VLAN as your scale unit. Trying to reach the PEP from across a VPN or over a routed segment with high latency will cause PSSession timeouts that look exactly like a broken PEP, when the PEP itself is perfectly healthy.

Step-by-Step Solution

Download and Install Azure Stack Hub Tools from GitHub

Everything begins here. The Azure Stack Hub tools repository on GitHub contains the PowerShell modules you need to manage, troubleshoot, and update your Azure Stack Hub environment. Without these tools, you're flying blind.

On the hardware lifecycle host (HLH) or your designated admin workstation, open PowerShell as Administrator and run:

# Set execution policy
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser -Force

# Install Azure Stack Hub PowerShell modules
Install-Module -Name Az.BootStrapper -Force -AllowPrerelease
Install-AzProfile -Profile 2020-09-01-hybrid -Force
Install-Module -Name AzureStack -RequiredVersion 2.4.0 -Force

After the modules install, download the tools directly:

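# Download the tools repository as a zip archive.
# The "az" branch matches the Az modules installed above;
# the master branch targets the older AzureRM modules.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
Invoke-WebRequest `
  -Uri https://github.com/Azure/AzureStack-Tools/archive/az.zip `
  -OutFile AzureStack-Tools.zip

# Extract the archive and move into the tools folder
Expand-Archive AzureStack-Tools.zip -DestinationPath . -Force
cd AzureStack-Tools-az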

Once installed, import the Connect module and test connectivity to your Azure Stack Hub endpoint:

Import-Module .\Connect\AzureStack.Connect.psm1
Add-AzEnvironment -Name "AzureStackAdmin" `
  -ArmEndpoint "https://adminmanagement.<region>.<fqdn>"

If this command completes without error and returns an environment entry, your management endpoint is reachable and the ARM layer is responding. That's the green light to proceed with further diagnostics. If it times out, your border connectivity or internal DNS is the problem; skip ahead to the Advanced Troubleshooting section.
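
When the endpoint times out, it helps to separate "DNS can't resolve the name" from "the name resolves but port 443 is closed". The sketch below is one way to do that. Both function names are hypothetical helpers I made up for this guide, the region and domain values are placeholders, and the check itself relies on the built-in Windows cmdlets Resolve-DnsName and Test-NetConnection.

```powershell
# Hypothetical helper: build the operator-facing host names from the
# region name and external domain chosen at deployment time.
function Get-AzsAdminEndpointHost {
    param(
        [Parameter(Mandatory)][string]$Region,
        [Parameter(Mandatory)][string]$Fqdn
    )
    foreach ($prefix in 'adminmanagement', 'adminportal') {
        "$prefix.$Region.$Fqdn"
    }
}

# Hypothetical helper: resolve each host, then try a TCP connection on 443.
function Test-AzsAdminEndpoint {
    param([Parameter(Mandatory)][string[]]$HostName)
    foreach ($name in $HostName) {
        $dns = Resolve-DnsName -Name $name -ErrorAction SilentlyContinue
        $tcp = Test-NetConnection -ComputerName $name -Port 443 -WarningAction SilentlyContinue
        [pscustomobject]@{
            HostName    = $name
            DnsResolved = [bool]$dns
            Port443Open = $tcp.TcpTestSucceeded
        }
    }
}

# Placeholder values only; substitute your own region and domain.
$hosts = Get-AzsAdminEndpointHost -Region 'local' -Fqdn 'azurestack.external'
$hosts

# Run the live check from the HLH or a jump box on the management network:
# Test-AzsAdminEndpoint -HostName $hosts
```

If DnsResolved comes back False, look at internal DNS first. If it resolves but Port443Open is False, the border device or a firewall rule is the more likely suspect.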

Verify and Reconnect Your Identity Provider

Azure Stack Hub's identity story is one of the most common sources of operator headaches. The platform supports two identity providers: Microsoft Entra ID for internet-connected deployments, and AD FS for disconnected or air-gapped ones. The key thing to understand is that you can't switch identity providers after deployment. This decision is made at install time and is permanent for the life of that scale unit.

For Entra ID (connected deployments), the most frequent failure mode is an expired or revoked service principal. Check this by running:

Connect-AzAccount -Environment AzureStackAdmin
Get-AzADServicePrincipal -DisplayName "Azure Stack"

If you get an authentication error or the service principal shows as disabled, you'll need to re-consent the Azure Stack Hub app registration in your Entra ID tenant. Go to the Azure portal (global), navigate to Entra ID > App Registrations > All Applications, search for "Azure Stack", and verify the app status and permissions.

For AD FS (disconnected deployments), certificate expiry is the #1 killer. Azure Stack Hub ships with its own internal Active Directory instance, and the trust relationship between that internal AD and your external AD FS server relies on certificates that have a finite lifetime. Run this from the PEP to check certificate status:

Test-AzureStack -Include AzsInfrastructure -DetailedResults

Look for any result with FAIL or WARN in the certificate-related test groups. The output includes the specific certificate thumbprint and expiry date. A certificate within 30 days of expiry will trigger warnings; an already-expired certificate causes silent authentication failures that look like portal hangs or user login errors with no obvious error code. The test names that -Include accepts can vary by build; if a name is rejected, run Test-AzureStack without -Include to run the default set of tests.

When Test-AzureStack finishes without critical failures, you'll see a summary line confirming the infrastructure health state. That's your confirmation that identity is healthy before you proceed.
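
If you export certificate expiry dates into your own tracking script, the 30-day rule above is easy to encode. This is a minimal sketch, not how Test-AzureStack itself evaluates certificates: the function name Get-CertificateExpiryState and its state labels are my own, and the dates are made-up examples.

```powershell
# Hypothetical helper: bucket a certificate by the days left until NotAfter.
function Get-CertificateExpiryState {
    param(
        [Parameter(Mandatory)][datetime]$NotAfter,
        [datetime]$Now = (Get-Date),
        [int]$WarningDays = 30   # threshold taken from the guide's 30-day rule
    )

    # Whole days remaining; negative once the certificate has expired.
    $daysLeft = [int][math]::Floor(($NotAfter - $Now).TotalDays)

    if ($daysLeft -lt 0) { $state = 'Expired' }
    elseif ($daysLeft -le $WarningDays) { $state = 'Warning' }
    else { $state = 'Healthy' }

    [pscustomobject]@{ DaysLeft = $daysLeft; State = $state }
}

# Example dates only.
$now = [datetime]'2026-04-20'
Get-CertificateExpiryState -NotAfter ([datetime]'2026-05-05') -Now $now
Get-CertificateExpiryState -NotAfter ([datetime]'2026-04-01') -Now $now
```

For certificates you can read directly, such as those in a Windows certificate store on your AD FS server, the NotAfter property of each certificate object can be passed straight in.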

Apply the AZS 2601 Update Package Correctly

Applying the AZS 2601 update through the administrator portal looks deceptively simple, but there's a right sequence that avoids the majority of failed update states I've seen teams struggle with.

First, confirm your current build version. In the administrator portal go to Dashboard > Region Management > [Your Region] > Properties. The Current Version field shows your installed build. AZS 2601 requires a minimum baseline: you generally cannot skip more than one major update release. If you're on 2506 or earlier, you may need to apply an intermediate package first.
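
One way to reason about the update path is to count how many releases sit between your installed build and the target. The sketch below does only that. The function name is hypothetical, and the release list is a placeholder assembled from the build numbers this guide happens to mention, so replace it with the actual release sequence from Microsoft's servicing documentation before relying on the answer.

```powershell
# Hypothetical helper: count the releases between the installed build
# and the target build in an ordered release list.
function Get-SkippedReleaseCount {
    param(
        [Parameter(Mandatory)][string[]]$ReleaseOrder,   # oldest to newest
        [Parameter(Mandatory)][string]$Current,
        [Parameter(Mandatory)][string]$Target
    )
    $from = [array]::IndexOf($ReleaseOrder, $Current)
    $to   = [array]::IndexOf($ReleaseOrder, $Target)
    if ($from -lt 0 -or $to -lt 0) { throw 'Both builds must appear in ReleaseOrder.' }
    if ($to -le $from) { throw 'Target must be newer than the current build.' }

    # Releases strictly between the two positions.
    $to - $from - 1
}

# Placeholder sequence, NOT an authoritative release history.
$releases = '2506', '2508', '2511', '2601'
Get-SkippedReleaseCount -ReleaseOrder $releases -Current '2511' -Target '2601'
Get-SkippedReleaseCount -ReleaseOrder $releases -Current '2506' -Target '2601'
```

With this placeholder list the first call returns 0 and the second returns 2, which is the kind of gap that signals an intermediate package is probably needed.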

Once you've confirmed the path is clear, navigate to Dashboard > Region Management > Updates. If the 2601 package appears in the list with status "Available", select it and click Update Now. The update preparation phase runs first; it validates hardware health, extension host readiness, and available storage. This phase alone can take 45-60 minutes on a 12-node scale unit.

During the update, monitor progress from PowerShell in parallel (don't rely solely on the portal UI; it can lag or time out during long update runs):

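# Open a reusable session to the privileged endpoint
$pepCredential = Get-Credential
$session = New-PSSession -ComputerName <PEP_IP> `
  -ConfigurationName PrivilegedEndpoint `
  -Credential $pepCredential

# The full status comes back as XML; list the steps currently in progress
[xml]$updateStatus = Invoke-Command -Session $session -ScriptBlock {
  Get-AzureStackUpdateStatus
}
$updateStatus.SelectNodes("//Step[@Status='InProgress']")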

A healthy update shows sequential step completions with no retries exceeding 3 attempts per step. If any step hits 3 retries and then fails, stop monitoring and pull the full update log for that step before deciding whether to resume or roll back.
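
Once you have the status saved as XML, you can filter it offline for the steps that need attention. The sketch below runs against a tiny hand-written sample, not real output: the step names are invented, the real document carries more attributes and nesting, and the Failed and ActionRequired values are taken from this guide's wording, so adjust the filter to the status values your build actually emits.

```powershell
# Simplified, hand-written sample for illustration only.
# Real update status XML is larger and may use different status values.
[xml]$updateStatus = @'
<Action Status="Failed">
  <Steps>
    <Step Name="Prepare update" Status="Success" />
    <Step Name="Update storage hosts" Status="Failed" />
    <Step Name="Update infrastructure roles" Status="NotStarted" />
  </Steps>
</Action>
'@

# Select every step whose Status is Failed or ActionRequired.
$xpath = "//Step[@Status='Failed' or @Status='ActionRequired']"
$problemSteps = $updateStatus.SelectNodes($xpath)

foreach ($step in $problemSteps) {
    # GetAttribute avoids confusion with the XML node's own Name property.
    '{0}: {1}' -f $step.GetAttribute('Name'), $step.GetAttribute('Status')
}
```

The step name this prints is the value to quote in a support case or to search for in the update log.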

Create and Configure Offers for Tenant Users

After your infrastructure is stable and updated, the next most common operator question is: why can't my users provision resources? The answer almost always lives in the offers and plans configuration, not in the underlying infrastructure.

Azure Stack Hub uses a layered model: Services roll up into Plans, plans combine into Offers, and users subscribe to Offers. If any layer in that chain is misconfigured (a quota set too low, a plan not added to an offer, a subscription not assigned), users hit errors that look like infrastructure problems but are actually administrative gaps.

In the administrator portal, go to Offers > + Add. Give the offer a display name and select your subscription. Under Base Plans, add the plan that contains the services you want to expose. If you want to offer virtual machines, your base plan must include the Microsoft.Compute, Microsoft.Network, and Microsoft.Storage services with appropriate quotas.

Set VM core quotas under Plans > [Your Plan] > Quotas > Compute. A common mistake I see is leaving the core quota at the default of 0, which silently blocks all VM creation attempts with a generic quota exceeded error that confuses tenant users.

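# Verify quota settings via PowerShell
# Looking up a plan by name also needs its resource group (placeholder below)
$plan = Get-AzsPlan -Name "YourPlanName" -ResourceGroupName "YourPlanResourceGroup"
Get-AzsComputeQuota -Location $plan.Location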

Once your offer is configured, set its state to Public so tenants can discover and subscribe to it. An offer left in Private state is invisible to tenant users in the self-service user portal. That's another common "why can't I see any offers?" support call that's a two-second fix.
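
The offer, plan, and quota checks from this section can be walked in one pass. Here's a minimal sketch that does it against a hand-typed description of an offer. Test-OfferChain and the hashtable shape are hypothetical and exist only for this example; nothing here reads from a live Azure Stack Hub, so you'd fill in the values from the portal or your own documentation.

```powershell
# Hypothetical helper: walk the offer -> plan -> quota chain and list
# everything that would stop a tenant from creating a VM.
function Test-OfferChain {
    param([Parameter(Mandatory)][hashtable]$Offer)

    $required = 'Microsoft.Compute', 'Microsoft.Network', 'Microsoft.Storage'
    $problems = @()

    if ($Offer.State -ne 'Public') {
        $problems += "Offer state is '$($Offer.State)', so tenants cannot see it."
    }
    if (-not $Offer.BasePlans -or $Offer.BasePlans.Count -eq 0) {
        $problems += 'Offer has no base plan.'
    }
    foreach ($plan in $Offer.BasePlans) {
        foreach ($service in $required) {
            if ($plan.Services -notcontains $service) {
                $problems += "Plan '$($plan.Name)' is missing $service."
            }
        }
        if ($plan.CoreQuota -le 0) {
            $problems += "Plan '$($plan.Name)' has a core quota of $($plan.CoreQuota)."
        }
    }
    $problems
}

# Example data typed in by hand; not read from a live system.
$offer = @{
    State     = 'Private'
    BasePlans = @(
        @{ Name = 'vm-plan'; Services = @('Microsoft.Compute', 'Microsoft.Storage'); CoreQuota = 0 }
    )
}
Test-OfferChain -Offer $offer
```

For this sample the function reports three findings: the Private state, the missing Microsoft.Network service, and the zero core quota. An empty result means the chain looks complete.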

Monitor Infrastructure Health and Respond to Alerts

Ongoing operations as an Azure Stack Hub operator means knowing the health of your scale unit before your users tell you something is broken. The administrator portal surfaces health information in the Dashboard > Region Management > Alerts blade, and this should be your first stop every morning.

Alerts are categorized as Critical or Warning. Critical alerts require immediate action; they typically indicate a failed infrastructure role, a full storage pool, or a network connectivity loss between nodes. Warning alerts are indicators of degraded state that isn't yet causing user impact but will if left unresolved.

For more granular monitoring, connect Azure Monitor in global Azure to your Azure Stack Hub deployment. This works in hybrid (connected) scenarios and lets you route alerts to your existing operations dashboards, PagerDuty integrations, or email notification channels. Navigate to Administrator Portal > Virtual Machines > [Select an Infrastructure VM] > Extensions and add the Azure Monitor agent extension.

You can also collect diagnostic logs directly from PowerShell:

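# $pepSession is an open PEP session created with New-PSSession.
# Local variables must be passed into the remote script block with $using:
$shareCredential = Get-Credential
$fromDate = (Get-Date).AddHours(-4)

Invoke-Command -Session $pepSession -ScriptBlock {
  Get-AzureStackLog -OutputSharePath "\\<share>\AzureStackLogs" `
    -OutputShareCredential $using:shareCredential `
    -FilterByRole Storage,Compute,Network `
    -FromDate $using:fromDate
}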

This pulls structured diagnostic logs from the past 4 hours for the storage, compute, and network roles, which are the three most common sources of user-impacting failures. Review these logs for ERROR or CRITICAL level entries before escalating to your hardware partner or Microsoft Support. Having these logs ready cuts support case resolution time dramatically.
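
For the text-based files in a log collection, a quick keyword scan saves a lot of scrolling. The sketch below is a rough first pass, assuming plain-text logs that contain the words ERROR or CRITICAL. Collected archives also hold other formats this won't read, the *.log filter is my assumption, and Find-LogProblem is a hypothetical helper name.

```powershell
# Hypothetical helper: list lines containing ERROR or CRITICAL
# in plain-text log files under a folder.
function Find-LogProblem {
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Filter = '*.log'   # assumed extension; adjust to your files
    )
    Get-ChildItem -Path $Path -Filter $Filter -Recurse -File |
        Select-String -Pattern '\b(ERROR|CRITICAL)\b' -CaseSensitive |
        Select-Object Path, LineNumber, Line
}

# Self-contained demo against a throwaway file in the temp folder.
$demoDir = Join-Path ([IO.Path]::GetTempPath()) 'azs-log-demo'
New-Item -ItemType Directory -Path $demoDir -Force | Out-Null
Set-Content -Path (Join-Path $demoDir 'sample.log') -Value @(
    'INFO  storage role healthy'
    'ERROR compute role restart failed'
    'WARN  network latency high'
)
Find-LogProblem -Path $demoDir
```

Point -Path at the extracted log folder on your share instead of the demo directory when you use it for real.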

Advanced Troubleshooting

Disconnected Deployments: AZS 2601 in Air-Gapped Environments

If your Azure Stack Hub runs disconnected from the internet, with no connectivity to global Azure, the AZS 2601 update process is fundamentally different. You can't pull the update package from Microsoft's update servers directly. Instead, your solution provider or your own team must download the update package on an internet-connected machine and transfer it to the scale unit via the hardware lifecycle host.

The update package must be placed in a specific local share path that the Azure Stack Hub update service polls. Your hardware partner documentation specifies this path, but the general pattern is a UNC share accessible from the ERCS (Emergency Recovery Console Service) VMs inside the scale unit. Once the package is in place, it appears in the Updates blade automatically within 15 minutes as the polling interval catches it.

In a fully disconnected deployment, the identity provider is always AD FS. Remember that Azure Stack Hub ships with its own internal Active Directory instance. This internal AD powers infrastructure service accounts, not your tenant users. Your external AD FS federation trust connects that internal AD to your corporate identity. When the trust breaks, both operator logins and tenant logins fail simultaneously. Run Test-AzureStack -Include AzsInfrastructure from the PEP to isolate which specific trust or certificate has failed.

Data Residency and the Disconnected Model

One thing the Azure Stack Hub architecture gets right is that in a fully disconnected deployment, no data stored on the appliance is transmitted to Microsoft. The customer owns and controls the appliance entirely. This is why regulated industries (healthcare, defense, financial services) adopt Azure Stack Hub over public cloud-only architectures. When you're troubleshooting in these environments, be aware that Microsoft's standard telemetry-based support capabilities are reduced. You'll rely more heavily on local log collection and the PEP diagnostic commands described above.

Event Log Analysis for Persistent Failures

On the hardware lifecycle host, check the Windows Event Viewer under Applications and Services Logs > Microsoft > AzureStack. Event ID 1001 indicates an update preparation failure. Event ID 2003 typically maps to an ARM endpoint registration problem. Event ID 3010 is the infrastructure health check failing pre-update validation.

For network-level Azure Stack Hub operator problems, check the border device logs for dropped traffic to the management VIP range. Azure Stack Hub management traffic depends on specific ports that must be open outbound: TCP 443 (HTTPS), TCP 80 (HTTP for certificate revocation), and UDP 123 (NTP). A border ACL blocking NTP causes clock drift that cascades into Kerberos authentication failures across the entire platform within hours.
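
To see why blocked NTP turns into authentication failures, compare a node's clock against a trusted reference. Kerberos rejects requests once the clocks disagree by more than its tolerance, which is five minutes by default. The sketch below only does that comparison. Test-ClockSkew is a hypothetical helper, and the timestamps are made-up examples that you'd replace with readings from your own time source and nodes.

```powershell
# Hypothetical helper: compare two clock readings against a
# Kerberos-style skew tolerance.
function Test-ClockSkew {
    param(
        [Parameter(Mandatory)][datetime]$ReferenceTime,
        [Parameter(Mandatory)][datetime]$NodeTime,
        [int]$MaxSkewSeconds = 300   # five minutes, the usual Kerberos default
    )
    # Absolute difference, so a fast clock and a slow clock are treated alike.
    $skew = [math]::Abs(($NodeTime - $ReferenceTime).TotalSeconds)
    [pscustomobject]@{
        SkewSeconds     = [int]$skew
        WithinTolerance = ($skew -le $MaxSkewSeconds)
    }
}

# Example timestamps only.
$reference = [datetime]'2026-04-20T08:00:00'
Test-ClockSkew -ReferenceTime $reference -NodeTime ($reference.AddSeconds(90))
Test-ClockSkew -ReferenceTime $reference -NodeTime ($reference.AddMinutes(-7))
```

The first reading is 90 seconds off and still inside tolerance. The second is seven minutes behind, which is the point where logins start failing for no obvious reason.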

When to Call Microsoft Support

If Test-AzureStack returns failures in the AzsInfraCapacity or AzsSFRoleSummary test groups, or if your update has been stuck on the same step for more than 3 hours with repeated retries, stop and escalate. Continuing a broken update can leave your scale unit in a partially upgraded state that requires a full infrastructure redeployment to recover from. Open a case at Microsoft Support with your Get-AzureStackLog output attached. Premium support customers can expect a callback within 2 hours for severity A cases.

Prevention & Best Practices

The operators I've seen run Azure Stack Hub smoothly at scale share a few habits that the ones who are constantly fighting fires don't have. None of this is rocket science; it's discipline around the basics.

Stay current on updates. Azure Stack Hub follows a monthly-to-quarterly update release cadence. Microsoft supports the current version and two versions back. Fall behind that window and you're on your own for bug fixes and security patches. Build a maintenance window into your operational calendar every 60-90 days specifically for update packages. The AZS 2601 update, like all others, includes security patches that address vulnerabilities in the underlying Windows Server components powering the infrastructure VMs. Skipping these isn't just an operations risk; it's a security exposure.

Test your PEP access quarterly. The privileged endpoint is your emergency back door when the portal fails. If you only test PEP connectivity the first time something breaks, you'll discover your jump box lost access months ago and you're now troubleshooting two problems instead of one. Schedule a quarterly PEP connection test as part of your ops runbook.

Document your connection model and identity provider choice. It sounds obvious, but I've seen teams where the original deployment engineer left and nobody remaining knew whether the environment was connected or disconnected, or whether it was Entra ID or AD FS. This information is in the administrator portal under Region Management > Properties. Screenshot it and store it somewhere that survives staff turnover.

Monitor certificate expiry proactively. Set calendar reminders at 90, 60, and 30 days before any certificate in your Azure Stack Hub trust chain expires. This includes the AD FS token signing certificate, the external certificate used for the portal endpoints, and the internal infrastructure certificates surfaced by Test-AzureStack.
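
The 90, 60, and 30 day reminders are simple date arithmetic, so it's worth generating them instead of counting on a calendar by hand. A minimal sketch follows. Get-CertificateReminderDate is a hypothetical helper and the expiry date is an example, so feed it the real NotAfter value of each certificate in your trust chain.

```powershell
# Hypothetical helper: turn a certificate expiry date into reminder dates.
function Get-CertificateReminderDate {
    param(
        [Parameter(Mandatory)][datetime]$NotAfter,
        [int[]]$DaysBefore = @(90, 60, 30)   # intervals suggested in this guide
    )
    foreach ($days in $DaysBefore) {
        [pscustomobject]@{
            DaysBefore = $days
            RemindOn   = $NotAfter.AddDays(-$days).Date
        }
    }
}

# Example expiry date only.
Get-CertificateReminderDate -NotAfter ([datetime]'2026-12-31')
```

For a certificate expiring on December 31, 2026, this yields reminders on October 2, November 1, and December 1.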

Quick Wins

Run Test-AzureStack -Include AzsInfrastructure weekly and review the output to catch certificate and health issues before they become incidents
Set up an automated email alert from the administrator portal's alert engine to your ops distribution list so Critical alerts don't sit unread over weekends
Keep a local copy of the Azure Stack Hub tools repository on your HLH so you're not dependent on GitHub access during an incident
Document your offers, plans, and quota settings in a spreadsheet; when tenant users report provisioning failures, this reference saves you 30 minutes of portal archaeology every time

Frequently Asked Questions

What exactly is Azure Stack Hub and how is it different from regular Azure?

Azure Stack Hub is an on-premises extension of Microsoft Azure: it lets you run Azure services from hardware in your own datacenter. Unlike public Azure where Microsoft owns and operates the physical infrastructure, with Azure Stack Hub your organization owns the physical appliance (a rack of 4–16 servers) and controls everything on it. The key value is that you get a consistent Azure API surface and DevOps experience on-premises, which means you can write apps that run identically in both environments without code changes. It's specifically designed for edge locations, disconnected scenarios like ships or factories, and regulated industries where data can't leave the building.

Can I run Azure Stack Hub completely offline with no internet connection?

Yes, this is one of the core design goals of Azure Stack Hub. A fully disconnected deployment has zero data flowing from your appliance to Microsoft or to the internet. You own and control everything: the hardware, the software, and every byte of data stored on it. The trade-off is that you must use AD FS as your identity provider (not Entra ID, which requires cloud connectivity), and you need to manually transfer update packages to the environment instead of having them delivered automatically. Real-world examples where this matters include mine shafts, classified government facilities, cruise ships, and manufacturing floors with no reliable WAN link.

How do I manage Azure Stack Hub as an operator? Do I need to learn new tools?

If you're already comfortable with Azure management, the learning curve is shorter than you'd expect. Azure Stack Hub uses the same operations model as public Azure: there's an administrator portal (separate from the tenant user portal) that looks and behaves like the Azure portal, and you manage it with the same Azure PowerShell modules you'd use for global Azure subscriptions. The main additions are the Azure Stack Hub-specific PowerShell modules available from GitHub, and the privileged endpoint (PEP) for deep infrastructure operations that don't surface through the normal portal. As an operator, you also manage the service catalog (creating plans, quotas, and offers), which is an Azure Stack Hub-specific concept without a direct public Azure equivalent.

My Azure Stack Hub update is stuck and won't resume. What do I do?

First, connect to the privileged endpoint via PowerShell and run Get-AzureStackUpdateStatus to see the actual state machine, which is more detailed than what the portal shows. Look for the specific step that has Failed or ActionRequired status; that step name is your diagnostic starting point. If the update has been stuck on the same step for more than 3 hours with repeated retries, stop trying to resume it through the portal and open a support case with Microsoft, providing the full output of Get-AzureStackLog. Continuing a broken update run can leave the scale unit in a partially upgraded state that's much harder to recover from than the original failure.

Why can't my tenant users see any offers or create virtual machines?

This is almost always a configuration issue, not an infrastructure problem. The most common causes: the offer is set to Private instead of Public state, the base plan doesn't include the required compute, network, and storage services, or the quota for VM cores is set to zero (the default). Check your offer state first: in the administrator portal under Offers, the state column should show "Public" for any offer you want tenants to see. Then verify the plan's compute quota under Plans > [Your Plan] > Quotas > Compute and confirm the vCPU limit is set to a number greater than zero.

Do I need Microsoft Entra ID or can I use my own Active Directory with Azure Stack Hub?

You have two options, and the choice is made at deployment time; you cannot change it later. For internet-connected deployments, Microsoft Entra ID (formerly Azure AD) is the recommended and most capable option, giving you multitenant identity with full hybrid cloud management features. For disconnected deployments with no internet access, you must use AD FS (Active Directory Federation Services) federated to your existing on-premises Windows Server Active Directory. Both identity providers are fully supported by all Azure Stack Hub resource providers and apps, and the experience is equivalent for most scenarios. Just remember that Azure Stack Hub also includes its own internal Active Directory instance for infrastructure service accounts, separate from whichever external identity provider you choose.

Sai Kiran Pandrala