Azure Application Gateway: Fix Setup & Config Errors Fast

Microsoft Fix | Advanced | 15 min read | Updated April 20, 2026

Why Azure Application Gateway Errors Keep Biting Engineers

You've set up your Azure Application Gateway, wired everything together, and hit deploy. Now traffic isn't routing, backends are showing as "Unhealthy," or your SSL listener refuses to start. I've seen this exact scenario play out across dozens of Azure environments, from small startups running a handful of VMs to enterprise setups with hundreds of backend instances behind WAF. The frustration is real, and it's compounded by the fact that the Azure portal's error messages are often spectacularly unhelpful. "Backend health: Unknown." Thanks, that narrows it down.

Here's the thing: Azure Application Gateway is genuinely one of the more complex Azure services to configure correctly. Unlike a basic load balancer that just looks at IP addresses and port numbers, Application Gateway makes decisions at OSI Layer 7: it reads your HTTP request attributes, your URL paths, your host headers. That intelligence is exactly what makes it so powerful for routing /images traffic to image-optimized servers while sending /api requests somewhere else entirely. But that same intelligence means there are more configuration surfaces where things can go wrong.

The most common Azure Application Gateway configuration errors I see break down into four buckets. First, backend health probe failures: your gateway can't reach the backend pool members, usually because a Network Security Group (NSG) rule is blocking probe traffic, or because the health probe path returns a non-200 HTTP status. Second, SSL/TLS certificate problems: wrong format, expired cert, or a mismatch between the certificate's Common Name and the hostname the listener is configured for. Third, routing rule conflicts: overlapping path patterns, missing default rules, or listener configurations that fight each other. And fourth, the increasingly urgent issue of running on Application Gateway v1 when Microsoft has set a hard retirement date of April 28, 2026 for the v1 SKU.

That last one deserves special attention right now. If you're reading this in 2026 and you're still on v1, you need to act immediately. The deprecation was announced back in April 2023, and Microsoft is not extending the deadline. After April 28, 2026, v1 is unsupported. No patches, no SLA, nothing.

The reason Microsoft's built-in error messages don't help much is that Application Gateway sits at the intersection of networking (NSGs, virtual networks, subnets), compute (your backend VMs or App Service instances), and security (WAF rules, SSL policies). A single misconfigured NSG rule can look identical in the portal to a dead backend VM. This guide cuts through that ambiguity with specific diagnostic steps.


The Quick Fix: Check Backend Health First

Before you touch a single configuration setting, go straight to the Backend Health blade. This one view resolves the majority of Azure Application Gateway troubleshooting cases because it tells you exactly which backend pool members the gateway can or cannot reach, and it gives you an HTTP status code to work with. Here's how to get there fast.

Open the Azure portal and navigate to your Application Gateway resource. In the left-hand menu, scroll down to the Monitoring section and click Backend health. Give it 20–30 seconds to refresh; it's a live probe, not a cached result.

You'll see a table listing each backend pool and the HTTP settings associated with it, with a status of Healthy, Unhealthy, or Unknown. If you see Unhealthy, click the status to expand it. Look for the "Reason" column: this is where Application Gateway tells you the actual HTTP response code it got back from your backend, or whether the TCP connection failed entirely.

Common status codes you'll see here:

  • Connection timed out: An NSG is blocking probe traffic from the Application Gateway subnet to the backend port, or the backend is down.
  • Status code 404: Your health probe path doesn't exist on the backend. Change the probe path to one that returns 200.
  • Status code 401 or 403: The backend requires authentication that the probe doesn't send. Use a custom health probe with the correct path, or configure the backend to allow unauthenticated health checks on a specific endpoint.
  • Status code 502: The backend pool is empty or all members are unreachable. This is the classic Azure Application Gateway 502 bad gateway error that shows up for end users.

If all your backends show Healthy but you're still getting errors, the problem is almost certainly in your listeners or routing rules; jump to Step 3.
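
As a rough triage aid, the mapping from the Backend health "Reason" text to a likely cause can be sketched in a few lines of Python. This is only an illustration of the list above: the function name and the returned strings are hypothetical, and the matching is a plain substring check, not anything the portal or an Azure SDK provides.

```python
def diagnose_probe_reason(reason: str) -> str:
    """Map a Backend health 'Reason' string to the likely cause listed above."""
    text = reason.lower()
    if "timed out" in text:
        return "NSG blocking probe traffic, or backend down"
    if "404" in text:
        return "probe path does not exist on the backend"
    if "401" in text or "403" in text:
        return "backend requires authentication the probe does not send"
    if "502" in text:
        return "backend pool empty or all members unreachable"
    return "unrecognized: inspect backend and gateway logs"


# Example reasons, written the way they might appear in the portal.
for reason in ["Connection timed out", "Status code: 404", "Status code: 403"]:
    print(f"{reason!r} -> {diagnose_probe_reason(reason)}")
```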

Pro Tip
Application Gateway health probes originate from the private IP addresses of the gateway instances in your Application Gateway subnet, not from a fixed Microsoft IP range. If you're using NSGs to restrict traffic, make sure the backend's inbound rule allows TCP traffic on the backend port from the entire Application Gateway subnet CIDR, not just the gateway's frontend IP. Blocking probe traffic is the single most common cause of false "Unhealthy" status in Azure Application Gateway backend health checks.

Step 1: Audit Your NSG Rules to Unblock Health Probes

This is the fix I reach for first when backend health shows "Connection timed out" with no HTTP response code. Your Network Security Group is silently dropping the gateway's probe packets, and the portal gives you zero indication that this is why your backends look dead.

Navigate to Azure Portal > Virtual Networks > [your VNet] > Subnets, find the subnet your Application Gateway is deployed in, and click through to the associated NSG. Under Inbound security rules, check that you have a rule allowing the following:

Source: GatewayManager (Service Tag)
Source port ranges: *
Destination: Any
Destination port ranges: 65200-65535
Protocol: TCP
Action: Allow
Priority: Must be lower number than any Deny rules

That port range 65200–65535 is non-negotiable for Application Gateway v2. Microsoft uses those ports for infrastructure communication between the gateway instances and the Azure control plane. If you block them, your gateway will either fail to provision or will intermittently lose backend connectivity, and the error messages won't point you here.

For the backend subnet NSG (the subnet where your VMs or App Service Environment live), add an inbound rule allowing traffic from your Application Gateway subnet on whatever port your application uses (80, 443, or a custom port). Use the Application Gateway subnet CIDR as the source, not a generic "Any."

After saving the NSG changes, go back to Backend health and click the refresh button. Healthy backends should flip to green within 60 seconds. If they don't, the issue is on the backend itself: check that your web server is actually running and responding on the expected port.
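
The priority point is easy to get wrong, so here is a minimal Python sketch of how NSG rules resolve: the lowest priority number that matches wins. It is a simplified model under stated assumptions. The Rule class and both helper functions are hypothetical, sources are compared as plain strings (no CIDR or service-tag expansion), and only the two ends of the 65200–65535 range are checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number is evaluated first
    source: str     # service tag name or "*" (simplified: no CIDR matching)
    port_low: int
    port_high: int
    action: str     # "Allow" or "Deny"


def first_match(rules, source, port):
    """Return the action of the first rule, by priority, matching source and port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        source_ok = rule.source in ("*", source)
        if source_ok and rule.port_low <= port <= rule.port_high:
            return rule.action
    return "Deny"  # assumption: stands in for the implicit deny-all inbound rule


def gateway_manager_allowed(rules):
    """True only if both ends of 65200-65535 resolve to Allow for GatewayManager."""
    return all(first_match(rules, "GatewayManager", p) == "Allow" for p in (65200, 65535))


# A Deny with a lower priority number shadows the Allow rule.
broken = [Rule(100, "*", 0, 65535, "Deny"), Rule(200, "GatewayManager", 65200, 65535, "Allow")]
fixed = [Rule(100, "GatewayManager", 65200, 65535, "Allow"), Rule(200, "*", 0, 65535, "Deny")]
print(gateway_manager_allowed(broken), gateway_manager_allowed(fixed))
```

In the first rule set the Allow rule exists but never takes effect, which is exactly the situation the Priority line in the rule table warns about.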

Step 2: Fix SSL/TLS Certificate Errors on Your HTTPS Listener

SSL problems in Azure Application Gateway show up in a few distinct ways: the listener fails to save at all, HTTPS traffic returns a generic TLS handshake error, or (the sneaky one) the gateway is up and running but end-to-end SSL to your backend is silently failing. I know how frustrating this is when you've already spent an hour formatting certificates.

Application Gateway requires certificates in .pfx (PKCS#12) format for listeners. It will not accept .pem, .cer, or .crt files for frontend listeners. If you're uploading a cert exported from another system, make sure it includes the private key. To convert your cert to .pfx with OpenSSL:

openssl pkcs12 -export \
  -out certificate.pfx \
  -inkey private.key \
  -in certificate.crt \
  -certfile chain.crt \
  -password pass:YourPasswordHere

For end-to-end SSL (encrypting traffic between the gateway and your backend), you need to upload the backend server's root certificate to the Application Gateway HTTP Settings as a .cer file (the public key only, no private key). This is a separate step from the frontend listener certificate and one that engineers frequently miss.

To check which SSL policy your gateway is running, go to Azure Portal > Application Gateway > [your gateway] > Listeners > [listener name]. If you're seeing TLS 1.0 or 1.1 negotiation errors in your backend logs, your SSL policy may be too permissive. Switch to the AppGwSslPolicy20220101S predefined policy, which enforces TLS 1.2 minimum and removes older cipher suites. Do this under Application Gateway > SSL settings (Preview) > SSL policies for gateway-wide enforcement.

After updating SSL settings, go to Application Gateway > Listeners, confirm each HTTPS listener shows a green checkmark next to its certificate, and send a test HTTPS request. ERR_SSL_PROTOCOL_ERROR or SSL_ERROR_RX_RECORD_TOO_LONG in a browser usually means a TLS client is talking to an endpoint that answers in plain HTTP. Check that the listener on that port is set to HTTPS, and that the protocol in your HTTP Settings (HTTP or HTTPS) matches what the backend server actually serves on the backend port.
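
Two of the certificate problems named earlier, an expired cert and a name that doesn't match the listener hostname, can be checked before you upload anything. The Python sketch below assumes you have already read the expiry string and subject name from the certificate yourself (for example from the notAfter line printed by openssl x509 -noout -enddate). Both helper names are hypothetical, and the wildcard matching is deliberately simplified to a single leading "*." label.

```python
import ssl


def days_until_expiry(not_after: str, now_epoch: float) -> int:
    """Whole days between now_epoch and a notAfter string like 'Jan  1 00:00:00 2030 GMT'."""
    return int((ssl.cert_time_to_seconds(not_after) - now_epoch) // 86400)


def hostname_matches(cert_name: str, hostname: str) -> bool:
    """Simplified match: exact name, or a leading '*.' wildcard covering one label."""
    cert_name, hostname = cert_name.lower(), hostname.lower()
    if cert_name.startswith("*."):
        _first_label, _, rest = hostname.partition(".")
        return rest == cert_name[2:]
    return cert_name == hostname


# Illustrative values only; substitute the fields from your own certificate.
now = ssl.cert_time_to_seconds("Jan  1 00:00:00 2030 GMT")
print(days_until_expiry("Jan 11 00:00:00 2030 GMT", now))
print(hostname_matches("*.example.com", "www.example.com"))
```

A negative day count means the certificate has already expired. Real hostname validation also consults Subject Alternative Names, which this sketch ignores.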

Step 3: Untangle Routing Rules and Path-Based Listener Conflicts

Azure Application Gateway routing rules are where a huge number of "it's just not routing correctly" bugs live. On v2, the gateway evaluates rules by their Priority value (lowest number first); on v1, it evaluates them in the order listed. Either way, an overly broad rule that is evaluated early will swallow traffic that was meant for a more specific rule. Here's how to diagnose and fix this.

Go to Application Gateway > Rules. You'll see two types: Basic rules (one listener, one backend pool) and Path-based rules (one listener, multiple path conditions routing to different backend pools). The most common mistake I see is a Basic rule on a catch-all listener being evaluated ahead of a more specific rule on the same port: the Basic rule matches first and all traffic goes to its backend pool, so your host or path conditions never get a chance to apply.

For path-based routing, click into your rule and examine the Path map. Each path entry needs a leading slash and should use a wildcard for prefix matching. For example:

Path: /images/*   → Backend Pool: images-pool
Path: /api/*      → Backend Pool: api-pool
Default target    → Backend Pool: default-pool  (used when no path matches, required)

The default target is not optional. Requests that match no listed path fall through to it, so if it points at an empty or unhealthy pool, those requests return a 502. I've seen this bite teams who route /api/* perfectly but then wonder why their root domain returns a 502; it's almost always the default target.
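
To see why unmatched requests depend entirely on the default, here is a small Python model of a path map. It is a sketch under simplifying assumptions, not the gateway's actual matching engine: patterns are checked in the order listed, a trailing * is treated as a prefix match, comparison is case-insensitive, and the function and pool names are illustrative.

```python
def route(path_map, default_pool, request_path):
    """Return the pool of the first listed pattern that matches, else the default."""
    path = request_path.lower()  # treated as case-insensitive in this model
    for pattern, pool in path_map:
        pattern = pattern.lower()
        if pattern.endswith("*"):
            if path.startswith(pattern[:-1]):
                return pool
        elif path == pattern:
            return pool
    return default_pool


path_map = [("/images/*", "images-pool"), ("/api/*", "api-pool")]
for p in ["/images/logo.png", "/api/v1/users", "/", "/about"]:
    print(p, "->", route(path_map, "default-pool", p))
```

Both / and /about fall through to the default pool here, so a broken default takes out the root of the site even when every explicit path works.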

For multi-site hosting (multiple hostnames on one gateway), each hostname needs its own listener with the correct Host name field populated. Listeners without a hostname configured will match all host headers, which again creates ordering conflicts. After any rule or listener change, allow 2–3 minutes for the configuration to propagate across all gateway instances before testing.

Step 4: Configure Autoscaling and Fix Capacity-Related 503 Errors

If you're seeing intermittent 503 errors under load, especially during traffic spikes, your Application Gateway v2 instance may be hitting its capacity ceiling before autoscaling kicks in. This is more common than people expect because there's a warm-up lag between when traffic spikes and when new gateway instances come online.

Navigate to Application Gateway > Configuration. In the Autoscaling section, you'll see two fields: Minimum instance count and Maximum instance count. A minimum of 0 means the gateway holds no reserved capacity during idle periods. That sounds great for cost, but scaling out to absorb a spike can take several minutes, and requests can fail or slow down while new instances come online. For any production workload, set your minimum to at least 2.

Setting a minimum of 2 also lets you benefit from the zone redundancy that Application Gateway v2 is designed for: instances are spread across availability zones by default, so one zone failing doesn't take down your gateway. With a minimum instance count of 0 or 1, you're not actually getting that protection.

For the maximum instance count, the v2 SKU supports up to 125 instances. For most workloads, setting the maximum to 10–20 is sufficient. Going higher is fine but raises your cost ceiling: Application Gateway v2 bills per capacity unit and per hour. To view your current capacity unit consumption, go to Application Gateway > Metrics and add the Capacity Units metric. Watch this over 24–48 hours to understand your traffic patterns before setting limits.
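
Once you have a peak Capacity Units figure from that observation window, turning it into instance bounds is simple arithmetic. The Python sketch below assumes each instance handles roughly 10 capacity units, which is an approximation you should confirm against current Microsoft documentation. The function name, the 25% headroom, and the floor of 2 are illustrative choices, not official guidance.

```python
import math

CAPACITY_UNITS_PER_INSTANCE = 10  # assumption: approximate figure, confirm in current docs


def suggested_instance_bounds(peak_capacity_units, headroom=0.25, floor=2, ceiling=125):
    """Return (minimum, maximum) instance counts for an observed peak in capacity units."""
    needed = math.ceil(peak_capacity_units * (1 + headroom) / CAPACITY_UNITS_PER_INSTANCE)
    maximum = min(max(needed, floor), ceiling)  # never below the floor or above the SKU limit
    return floor, maximum


# Example: an observed peak of 60 capacity units over the monitoring window.
print(suggested_instance_bounds(60))
```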

If you need fixed capacity (predictable billing, consistent workloads), you can disable autoscaling entirely and set a specific instance count. Go to Configuration > Autoscaling and switch to Manual. Microsoft's official documentation describes this as "fixed capacity mode"; it's useful when your traffic is predictable and you don't want surprise scaling events during off-hours maintenance windows.

Step 5: Migrate from Application Gateway v1 to v2 Before April 28, 2026

If you're still running on Application Gateway v1 SKU (Standard or WAF), stop reading the other sections and focus here. Microsoft announced the v1 retirement on April 28, 2023 with a firm end-of-support date of April 28, 2026. After that date, v1 receives no security patches, no support tickets, and no SLA coverage. Given today's date, you may have days or weeks left, not months.

The good news is that Microsoft provides an official PowerShell migration script. You'll need Azure PowerShell module version 6.2.0 or later. Here's the migration command sequence:

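# Install or update the Az module if needed
Install-Module -Name Az -AllowClobber -Scope CurrentUser

# Connect to your Azure account
Connect-AzAccount

# $certs and $trustedCerts must be defined before the script runs.
# Build them from your own files (names and paths below are placeholders):
$password = Read-Host -AsSecureString -Prompt "PFX password"
$certs = @(New-AzApplicationGatewaySslCertificate -Name "listener-cert" -CertificateFile ".\certificate.pfx" -Password $password)
$trustedCerts = @(New-AzApplicationGatewayTrustedRootCertificate -Name "backend-root" -CertificateFile ".\backend-root.cer")

# Download the official migration script from Microsoft's GitHub
# Then run it with your v1 gateway's resource ID and a name for the new v2 gateway:
.\AzureAppGWMigration.ps1 `
  -resourceId "/subscriptions/{sub-id}/resourceGroups/{rg-name}/providers/Microsoft.Network/applicationGateways/{gw-name}" `
  -subnetAddressRange 10.0.0.0/24 `
  -appgwname "{new-v2-gateway-name}" `
  -sslCertificates $certs `
  -trustedRootCertificates $trustedCerts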

A few things to know before running this. The migration script creates a new v2 gateway in the same resource group; it doesn't modify your existing v1 gateway in place. Your v1 gateway stays running until you're ready to cut over DNS. The script will output the new gateway's frontend IP; update your DNS records or Traffic Manager profile to point to the new v2 frontend IP, verify traffic is flowing correctly, then delete the v1 gateway.

Key differences you'll notice after migration: your gateway now has a static VIP (the public IP won't change on stop/start), autoscaling is available, and if you're in a region with availability zones, instances are zone-redundant by default. Also, the v2 SKU now supports TCP/TLS proxying as a public preview feature, which means you can terminate non-HTTP traffic at the gateway layer, something that wasn't possible with v1.

Budget roughly 2–4 hours for a straightforward migration in a test environment, and a maintenance window for production. Document your existing v1 rules, listeners, and backend pool settings before starting; screenshots or an ARM template export from the v1 gateway will save you if you need to reference the original configuration.

Advanced Azure Application Gateway Troubleshooting

Diagnosing WAF False Positives Blocking Legitimate Traffic

If you've enabled the Web Application Firewall on your Application Gateway WAF_v2 SKU and legitimate users are suddenly getting 403 errors, you're hitting WAF rule false positives. This is common after first enabling WAF or after uploading a new application that uses request patterns the OWASP Core Rule Set flags as suspicious.

The right approach is to put WAF in Detection mode first, not Prevention mode. In Detection mode, the WAF logs what it would block without actually blocking anything. Go to Application Gateway > Web Application Firewall > WAF policy > Policy settings and set Mode to Detection. Run your application normally for 24–48 hours, then pull the WAF logs from Log Analytics.

In your Log Analytics workspace, run this query to find which rules are firing most:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK"
| where Category == "ApplicationGatewayFirewallLog"
| where action_s == "Matched"
| summarize count() by ruleId_s, Message
| order by count_ desc

From these results, you can create WAF exclusions for specific rules and specific request attributes (like a particular cookie name or header that's triggering a SQL injection rule when it's actually benign). Go to WAF policy > Exclusions to add per-rule exclusions without disabling entire rule groups.
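
If you export those log rows rather than working in the query editor, the same tally is easy to reproduce offline. This Python sketch mirrors the query above using the ruleId_s and action_s column names it relies on. The function name and the sample rows are made up for illustration.

```python
from collections import Counter


def top_matched_rules(records, limit=5):
    """Count 'Matched' WAF log rows per rule ID, most frequent first."""
    matched = (r["ruleId_s"] for r in records if r.get("action_s") == "Matched")
    return Counter(matched).most_common(limit)


# Hypothetical exported rows; real rows carry many more columns.
records = [
    {"ruleId_s": "942130", "action_s": "Matched"},
    {"ruleId_s": "942130", "action_s": "Matched"},
    {"ruleId_s": "920300", "action_s": "Matched"},
    {"ruleId_s": "949110", "action_s": "Blocked"},
]
print(top_matched_rules(records))
```

The rule IDs at the top of the list are the first candidates for a targeted exclusion.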

Reading Application Gateway Access Logs in Log Analytics

Enable diagnostic logging if you haven't already. It's off by default, and it's the single most important thing you can do for ongoing Azure Application Gateway troubleshooting. Go to Application Gateway > Diagnostic settings > Add diagnostic setting and send ApplicationGatewayAccessLog, ApplicationGatewayPerformanceLog, and ApplicationGatewayFirewallLog to a Log Analytics workspace.

Once logs are flowing, this query identifies 5xx errors and which backend server they came from:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.NETWORK"
| where Category == "ApplicationGatewayAccessLog"
| where httpStatus_d >= 500
| project TimeGenerated, clientIP_s, requestUri_s, httpStatus_d, serverRouted_s, timeTaken_d
| order by TimeGenerated desc

The serverRouted_s column shows the exact backend IP and port the gateway sent the request to. This tells you whether a 502 is coming from a specific VM in your pool (replace that VM) or from all backends simultaneously (check your routing rules or the application itself).
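
The "one VM or all of them" decision can be sketched as a small Python function over exported access log rows. It uses the httpStatus_d and serverRouted_s columns from the query above. The function name, the returned strings, and the sample addresses are hypothetical, and you have to supply the list of pool members yourself.

```python
from collections import Counter


def classify_5xx(records, pool_members):
    """Say whether 5xx responses are isolated to some backends or hit every pool member."""
    failing = Counter(r["serverRouted_s"] for r in records if r["httpStatus_d"] >= 500)
    if not failing:
        return "no 5xx responses"
    if set(pool_members) <= set(failing):
        return "all backends failing: check routing rules or the application"
    return "isolated to: " + ", ".join(sorted(failing))


# Hypothetical rows and pool membership.
rows = [
    {"serverRouted_s": "10.0.1.4:80", "httpStatus_d": 502},
    {"serverRouted_s": "10.0.1.4:80", "httpStatus_d": 502},
    {"serverRouted_s": "10.0.1.5:80", "httpStatus_d": 200},
]
print(classify_5xx(rows, ["10.0.1.4:80", "10.0.1.5:80"]))
```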

Enterprise and Domain-Joined Scenarios

In enterprise environments, Application Gateway for Containers is increasingly relevant for AKS workloads. If you're running the ALB Controller for AKS and pods aren't receiving traffic, first check that the ApplicationLoadBalancer custom resource is in a Ready state with kubectl get applicationloadbalancer -A. Reference mismatches are the most common cause of routing failures in this setup: the annotations on your Gateway or Ingress that point at the load balancer (such as alb.networking.azure.io/alb-name and alb.networking.azure.io/alb-namespace) must exactly match the name and namespace of your ApplicationLoadBalancer resource.

When to Call Microsoft Support

Escalate to Microsoft Support when: your gateway fails to provision and the portal shows a generic "InternalServerError" with no actionable detail; you see intermittent packet drops that NSG flow logs don't explain; your v1-to-v2 migration script fails with a VNet address space conflict you can't resolve; or WAF is blocking traffic with rules you've already added exclusions for. Open a Severity B ticket for production issues. For the v1 retirement deadline specifically, Microsoft has a dedicated migration assistance program; mention "Application Gateway v1 retirement migration" in your ticket subject for routing to the right team.

Prevention & Best Practices for Azure Application Gateway

The teams I've seen run Application Gateway with the fewest incidents share a few habits that are worth building into your operations from day one.

Always keep a minimum instance count of 2 or higher in production. Zero-instance autoscaling is tempting for cost savings, but the cold-start time during traffic surges creates real user-facing outages. Two instances also puts your gateway across two availability zones, which is the zone redundancy design that Application Gateway v2 was built around. The cost difference between 0 and 2 minimum instances is small relative to the reliability gain.

Test your health probe configuration explicitly before going live. Create a dedicated health check endpoint in your application, such as /health, that returns HTTP 200 with a minimal response body. Don't use your application's root path for health probes, because authentication redirects (302s), maintenance pages, and A/B testing frameworks can all interfere with what the probe sees. Configure a custom probe in Application Gateway > Health probes pointing to this dedicated endpoint.
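
For illustration, a dedicated health endpoint can be as small as the WSGI sketch below, written with Python's standard library only. It is a minimal example, not a production server: the /health path, the "ok" body, port 8080, and both function names are assumptions, and a real application would expose the equivalent route in its own framework.

```python
from wsgiref.simple_server import make_server


def health_app(environ, start_response):
    """Answer /health with a bare 200 and everything else with 404."""
    if environ.get("PATH_INFO") == "/health":
        status, body = "200 OK", b"ok"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def serve(port=8080):
    """Run the app on the given port until interrupted (blocks the caller)."""
    with make_server("", port, health_app) as server:
        server.serve_forever()
```

Calling serve() starts the listener. The endpoint does no authentication and issues no redirects, which is what keeps the probe result stable.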

Export your ARM template after every configuration change. Go to Application Gateway > Export template and save the JSON to your version control system. This gives you an audit trail of configuration changes and a recovery path if a bad update bricks your gateway. It also makes disaster recovery dramatically faster: redeploying from an ARM template takes minutes versus hours of manual portal clicking.

Set up Azure Monitor alerts on key Application Gateway metrics. At minimum, alert on: UnhealthyHostCount greater than 0 (any backend going unhealthy), FailedRequests spike (sudden increase in 4xx/5xx), and ResponseStatus filtered to 5xx codes. Set these alerts to notify your on-call channel before users start reporting problems.

Quick Wins
  • Enable Application Gateway diagnostic logs to Log Analytics before you need them; you can't retroactively query logs that weren't being collected
  • Use Azure Policy to enforce that all Application Gateways in your subscription must have WAF enabled; this prevents shadow deployments without security coverage
  • Tag your Application Gateway subnets clearly; the subnet must be dedicated to the gateway (no other resources), and teams sometimes accidentally deploy VMs into it
  • Review the Azure Advisor recommendations blade monthly; Advisor specifically calls out Application Gateway instances approaching capacity limits or running on deprecated configurations

Frequently Asked Questions

What exactly is Azure Application Gateway and how is it different from Azure Load Balancer?

Azure Application Gateway is a Layer 7 (application layer) load balancer that makes routing decisions based on HTTP request content: things like URL paths, host headers, and query strings. So you can send /api requests to one set of servers and /images requests to a completely different set, all from the same public IP. Azure Load Balancer works at Layer 4 (transport layer) and only sees IP addresses and ports; it has no awareness of what's inside the HTTP requests. If you need to route based on HTTP attributes, integrate WAF for security, or terminate SSL at the gateway, Application Gateway is the right tool. For pure TCP/UDP load balancing across VMs, Load Balancer is simpler and faster.

Why does my Azure Application Gateway backend show as "Unknown" instead of Healthy or Unhealthy?

"Unknown" status almost always means the Application Gateway hasn't been able to run a health probe yet: either it was just deployed, just had its configuration updated, or something is preventing the probe from running at all. The most common cause of a persistent "Unknown" (more than 5 minutes after a configuration save) is an NSG rule blocking TCP traffic on ports 65200–65535 to the Application Gateway subnet. Check your NSG inbound rules and make sure there's an Allow rule for the GatewayManager service tag on that port range. If the NSG looks correct, check whether the backend subnet NSG is blocking inbound traffic from the Application Gateway subnet CIDR on your application port.

What is Application Gateway for Containers and should I use it instead of regular Application Gateway?

Application Gateway for Containers is a newer, Kubernetes-native variant designed specifically for AKS workloads. It uses the ALB Controller add-on and exposes load balancing configuration through standard Kubernetes Gateway API resources instead of Azure portal settings. If you're running AKS and want your platform team to manage infrastructure while application teams manage their own ingress through Kubernetes manifests, Application Gateway for Containers is the better fit. For traditional VM-based backends, App Service, or hybrid scenarios, the standard Application Gateway is still the right choice. The two products share the same underlying Layer 7 intelligence but have very different management interfaces and deployment models.

My Application Gateway is returning 502 errors to users but backend health shows all servers as Healthy. What's going on?

This specific combination, healthy backends plus 502s, points to a problem at the application level, not the infrastructure level. The gateway can reach your backend and the health probe is succeeding, but the actual application requests are failing. Pull your Application Gateway access logs from Log Analytics and look at the serverResponseLatency field. If the values sit at or near the request timeout configured in your HTTP Settings (20 seconds by default), your application is too slow to respond and the gateway is giving up on it. Increase the timeout in your HTTP Settings if needed. Also check whether the backend returns valid HTTP responses for the actual request paths (not just the health probe path): a dropped connection or malformed response surfaces as a 502 to the client, whereas a well-formed 500 from the application is passed through as a 500.

How do I check if I'm on Application Gateway v1 or v2, and what happens if I don't migrate before the deadline?

In the Azure portal, open your Application Gateway resource and look at the Overview blade. The SKU name tells you everything: Standard or WAF (without a version suffix) means you're on v1. Standard_v2 or WAF_v2 means you're on v2. You can also check with Azure CLI: az network application-gateway show --name {gw-name} --resource-group {rg} --query "sku". If you don't migrate before April 28, 2026, Microsoft will not forcibly delete your gateway immediately, but it will become unsupported, meaning no bug fixes, no security patches, and no SLA. Any support tickets you open will be directed to complete the migration first before receiving assistance.
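
If you have many gateways to audit, the JSON printed by that CLI query can be classified with a few lines of Python. This sketch assumes the output is a JSON object with name and tier fields and that v2 SKUs carry a _v2 suffix, as described above. The function name and the "unknown" fallback are illustrative.

```python
import json


def sku_generation(sku_json: str) -> str:
    """Classify the JSON from the 'sku' query as 'v1', 'v2', or 'unknown'."""
    sku = json.loads(sku_json)
    name, tier = sku.get("name", ""), sku.get("tier", "")
    if name.endswith("_v2") or tier.endswith("_v2"):
        return "v2"
    if tier in ("Standard", "WAF"):
        return "v1"
    return "unknown"  # anything else needs a manual look


# Example inputs shaped like the CLI output (values are illustrative).
print(sku_generation('{"capacity": 2, "name": "Standard_Medium", "tier": "Standard"}'))
print(sku_generation('{"name": "WAF_v2", "tier": "WAF_v2"}'))
```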

Can I use Azure Application Gateway with App Service (Web Apps) as my backend?

Yes, and it's a very common pattern. Add your App Service's default hostname (e.g., myapp.azurewebsites.net) to the backend pool. The important gotcha: enable the "Pick host name from backend address" option in your HTTP Settings. Without this, Application Gateway sends the original client's host header to App Service, but App Service validates that the host header matches its configured custom domain or the .azurewebsites.net hostname, and it will reject requests with the wrong header, returning a 404 or redirect loop. Turning on "Pick host name from backend address" makes the gateway send myapp.azurewebsites.net as the host header to App Service regardless of what the client sent, which is what App Service expects.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.