How to Troubleshoot Azure Logic Apps: Fix Failed Runs Fast

Microsoft Fix Intermediate 14 min read Official Docs Grounded Updated April 20, 2026

Why Azure Logic Apps Fail, and Why the Portal Won't Tell You Directly

I've seen this on so many Azure tenants: a Logic App that ran perfectly in staging suddenly starts failing in production at 2 AM, your on-call engineer gets paged, and the portal shows a bright red "Failed" status with an error message that says something like "The execution of template action 'Send_an_email' failed: the tracking ID is '...'". Helpful? Not really. I know that's frustrating, especially when the workflow is tied to a business-critical process like invoice processing or Teams notifications.

Azure Logic Apps is Microsoft's low-code workflow automation platform, but "low-code" doesn't mean "low maintenance." Under the hood, your Logic App is an orchestration engine coordinating HTTP calls, OAuth tokens, service bus queues, API connections, and retry loops, any one of which can silently break.

The most common root causes I see when troubleshooting Azure Logic Apps break down into five buckets:

  • Connector authentication failures: OAuth tokens expire, service principal secrets rotate, or managed identity permissions get revoked. This is the #1 cause in enterprise environments.
  • HTTP 429 throttling: you're hitting the connector's API rate limit. Office 365, SharePoint, and Salesforce connectors all have per-user, per-minute limits that are much lower than most people assume.
  • Timeout errors (HTTP 408/504): your action is waiting longer than the Logic Apps runtime allows. Consumption-tier workflows time out at 120 seconds per action by default.
  • Malformed dynamic content expressions: a @body('Parse_JSON')?['someField'] expression silently returns null when the upstream data shape changes, causing downstream actions to receive unexpected null inputs.
  • Misconfigured triggers: Recurrence triggers drift, HTTP request triggers lose their callback URL after a definition change, and event-based triggers (like Service Bus) silently stop polling when their connection expires.

There's also a less obvious category: infrastructure-level throttling from Azure itself. Logic Apps on the Consumption tier run on shared infrastructure. If another tenant on the same stamp is hammering resources, you can see intermittent failures that look random but are actually resource contention. This is almost never mentioned in the error message; you have to dig into Azure Resource Health to find it.

The good news is that Azure Logic Apps has excellent diagnostic tooling if you know where to look. Run History, diagnostic logs, Application Insights integration, and the Logic Apps Standard tier's built-in monitoring dashboard all give you the data you need. It's just buried a few clicks deep. This guide walks you through all of it, from a fast first-look fix to deep enterprise-level debugging.

The Quick Fix: Check Run History Before You Do Anything Else

Before you touch a single setting, open Run History. This is the single most valuable screen in the entire Logic Apps portal experience, and I'm always surprised how many engineers skip straight to the Designer when something breaks.

Here's how to get there fast:

  1. Open the Azure Portal at portal.azure.com and navigate to your Logic App resource.
  2. On the Overview page, open the Runs history tab. Alternatively, scroll down in the left navigation to Monitoring > Runs history.
  3. You'll see a list of every workflow execution with a timestamp, status (Succeeded / Failed / Cancelled / Skipped), and duration. Click any Failed row.
  4. The run detail view shows every action in your workflow as a collapsible card. Failed actions are highlighted in red. Click the red action card.
  5. Inside, you get three critical pieces of data: the Inputs tab (what the action received), the Outputs tab (what came back, including the raw HTTP response body), and the Error section at the top showing the exact error code and message.

Nine times out of ten, the raw HTTP response body in the Outputs tab tells you everything. A 401 Unauthorized means your connection needs to be re-authorized. A 429 Too Many Requests means you're being throttled. A 400 Bad Request usually means your dynamic content expression is producing a null or mistyped value.
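To make that first-look triage repeatable, here is a minimal Python sketch that maps a status code from the Outputs tab to the likely cause described in this guide. The helper name `likely_cause` and the wording of each diagnosis are my own illustration, not part of any Azure SDK.

```python
# Hypothetical triage helper: maps the HTTP status code seen in a failed
# action's Outputs tab to the most likely cause discussed in this guide.
LIKELY_CAUSES = {
    400: "Bad request: check dynamic content expressions for null or mistyped values",
    401: "Unauthorized: re-authorize the API connection",
    403: "Forbidden: check role assignments or connection permissions",
    408: "Timeout: the action waited longer than the runtime allows",
    429: "Throttled: add or tune a retry policy",
    504: "Gateway timeout: the downstream API was too slow",
}


def likely_cause(status_code: int) -> str:
    """Return a first-guess diagnosis for a failed action's status code."""
    if status_code in LIKELY_CAUSES:
        return LIKELY_CAUSES[status_code]
    if 500 <= status_code < 600:
        return "Server error: usually transient, check the downstream service"
    return "Unrecognized: read the raw response body for details"


print(likely_cause(401))
```

Treat the result as a starting hypothesis only; the raw response body is still the source of truth.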

For Logic Apps Standard (the newer single-tenant model), Run History is under Workflows, click your specific workflow, then Overview > Run History. The experience is nearly identical.

Pro Tip
When you click a failed action and see the raw HTTP response, copy the x-ms-client-request-id header value from the Outputs. This is your correlation ID. If you ever need to open a Microsoft support ticket, this ID lets their backend team pull the exact execution trace from their internal logs, which dramatically reduces back-and-forth with support.

Step 1: Read the Failed Action's Raw Error Output in Detail

Once you've opened a failed run, read it methodically; don't just skim for red. Logic Apps can have cascading failures where the first red action you notice isn't the root cause, because an earlier action it depended on failed first.

Look at the action list from top to bottom. Find the first action marked Failed, not Skipped: Skipped actions are a side effect of an upstream failure, not the cause. Click that action and expand the Outputs section.
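If you have exported a run's action list (for example, from the REST API), the same "first Failed, ignore Skipped" rule can be sketched in Python. The input shape below, a dict of action name to status and start time, is a simplified assumption for illustration rather than the exact API response.

```python
from datetime import datetime
from typing import Optional


def first_failed_action(actions: dict) -> Optional[str]:
    """Return the name of the earliest action whose status is 'Failed'.

    `actions` maps action name -> {"status": ..., "startTime": ISO-8601 string}.
    Skipped actions are ignored because they are a symptom, not a cause.
    """
    failed = [
        (datetime.fromisoformat(info["startTime"]), name)
        for name, info in actions.items()
        if info.get("status") == "Failed"
    ]
    return min(failed)[1] if failed else None


# Hypothetical run: Get_items fails first, the rest is fallout.
run_actions = {
    "Parse_JSON": {"status": "Succeeded", "startTime": "2026-04-20T02:00:01+00:00"},
    "Get_items": {"status": "Failed", "startTime": "2026-04-20T02:00:02+00:00"},
    "Send_an_email": {"status": "Skipped", "startTime": "2026-04-20T02:00:03+00:00"},
    "Log_error": {"status": "Failed", "startTime": "2026-04-20T02:00:04+00:00"},
}
print(first_failed_action(run_actions))  # Get_items
```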

Common error patterns and what they mean:

  • ActionFailed. Error code: ConnectionAuthorizationFailed: your API connection stored in Azure is no longer valid. The OAuth token expired or the service principal behind the connection lost its permissions.
  • The request failed with error: 'OperationTimedOut': the action hit the 120-second execution timeout. Your downstream API is too slow, or you're processing too much data in a single action.
  • InvalidTemplate. Unable to process template language expressions: a dynamic content expression like @{body('HTTP')?['data']} threw an evaluation error. The field path doesn't exist in the actual response.
  • WorkflowRunLimitReached: you're on the Consumption tier and have hit the concurrent run limit. Trigger concurrency is unlimited by default; once you turn concurrency control on, the default cap is 25 concurrent runs unless you set a different value.

If you see a deeply nested JSON error in the body, use the Raw outputs button (looks like <>) to get the unformatted response. Sometimes the formatted view truncates the most important part of the message.

Once you've identified the exact error, note the action name. Every action in a Logic App has an internal name (visible in Code View) that's used in all downstream references. If the action name contains spaces, Logic Apps replaces them with underscores in expressions. A mismatch between the name you see in the Designer and the name you type in an expression is another common silent failure.
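As a small illustration of that naming rule, this Python sketch converts a Designer display name into the form used inside expressions. The function names are hypothetical, and it only covers the space-to-underscore case described above.

```python
def expression_name(designer_name: str) -> str:
    """Convert a Designer action name to the name used in expressions."""
    return designer_name.strip().replace(" ", "_")


def body_reference(designer_name: str) -> str:
    """Build a body() reference for the given Designer action name."""
    return f"@body('{expression_name(designer_name)}')"


print(body_reference("Send an email"))  # @body('Send_an_email')
```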

Step 2: Re-Authorize Broken API Connections

This is the fix for the majority of "it worked yesterday, it's broken today" scenarios. API connections, the objects that store OAuth credentials for connectors like Outlook, SharePoint, Teams, and Salesforce, have tokens that expire. When they do, every Logic App that uses that connection starts failing with a 401 or ConnectionAuthorizationFailed.

Here's how to fix it:

  1. In the Azure Portal, search for API connections in the top search bar and open it. This is a separate resource type, not inside your Logic App blade.
  2. You'll see a list of all API connection objects in your subscription. Find the one your Logic App uses. It's usually named something like office365, sharepointonline, or azureblob.
  3. Click the connection. On the Overview page, look for a yellow warning banner that says "This connection is not authenticated." If you see it, click Edit API connection in the left blade.
  4. Click Authorize and sign in with the account that owns the connection. Then click Save.
  5. Go back to your Logic App and manually trigger a run using the Run Trigger button (top of the Designer page) to confirm it's fixed.

In enterprise environments where connections are owned by a service account, you may need to coordinate with your IT admin. If the service account's password rotated or its MFA policy changed, the connection will silently break on the next token refresh cycle, which happens roughly every hour for most OAuth2 providers.

For Logic Apps Standard tier using Managed Identity instead of stored OAuth connections, the fix is different: check that your Logic App's system-assigned identity still has the correct RBAC role assignments on the target resource (open the target resource, then Access control (IAM) > Role assignments). A role assignment accidentally removed during an access review is a surprisingly common cause of Managed Identity failures.

Step 3: Fix HTTP 429 Throttling Errors with Retry Policies

If Run History shows your Logic App failing with HTTP 429 Too Many Requests, you're being rate-limited by the connector's backend API. I see this most often with Office 365 connectors (the Office 365 Outlook connector reference has listed a limit of 300 API calls per connection per 60 seconds; check the connector's reference page for the current figure) and SharePoint connectors (throttling kicks in aggressively on list operations against large lists).

The short-term fix is to configure a proper retry policy on the throttled action. By default, Logic Apps uses an Exponential Interval retry policy, but the default settings are often too aggressive.

To set a custom retry policy on an action:

  1. In Logic Apps Designer, click the three dots (...) menu on the affected action.
  2. Select Settings.
  3. Under Retry Policy, change the type to Exponential Interval (if not already set) and configure:
    • Count: 4
    • Interval: PT5S (5 seconds minimum)
    • Maximum Interval: PT1H
    • Minimum Interval: PT5S
  4. Click Done and save your Logic App.
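The same settings end up in Code View as a retryPolicy object under the action's inputs. As a sketch, this Python helper builds that object and rejects values that are not simple ISO 8601 time durations. The helper name and the deliberately narrow duration check (hours, minutes, seconds only) are my own assumptions.

```python
import json
import re

# Accepts simple time-only ISO 8601 durations such as PT5S, PT1H, PT1M30S.
ISO_DURATION = re.compile(r"^PT(?=\d)(\d+H)?(\d+M)?(\d+S)?$")


def exponential_retry_policy(count: int, interval: str,
                             minimum: str, maximum: str) -> dict:
    """Build the retryPolicy object for an action's inputs."""
    for value in (interval, minimum, maximum):
        if not ISO_DURATION.match(value):
            raise ValueError(f"Not a simple ISO 8601 duration: {value!r}")
    return {
        "type": "exponential",
        "count": count,
        "interval": interval,
        "minimumInterval": minimum,
        "maximumInterval": maximum,
    }


policy = exponential_retry_policy(4, "PT5S", "PT5S", "PT1H")
print(json.dumps({"retryPolicy": policy}, indent=2))
```

Comparing the printed JSON with what you see in Code View is a quick way to confirm the Designer saved the values you intended.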

For persistent throttling on high-volume workflows, the better long-term fix is to add a Delay action before the throttled step, using a dynamic delay calculated from the Retry-After header in the 429 response. You can capture this header using an HTTP action's output expression: @outputs('HTTP_Action_Name')['headers']['Retry-After'].

// Example: Read Retry-After header in a Delay Until action
@{addSeconds(utcNow(), int(outputs('Send_an_email')['headers']['Retry-After']))}
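One caveat: HTTP allows Retry-After to be either a number of seconds or an HTTP date, and the expression above only handles the numeric form. The following Python sketch shows the logic for both forms, which is useful when you test a downstream API from a script before wiring it into the workflow. The function name `delay_until` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def delay_until(retry_after: str, now: datetime) -> datetime:
    """Turn a Retry-After header value into an absolute resume time."""
    value = retry_after.strip()
    if value.isdigit():
        # Numeric form: seconds to wait from now.
        return now + timedelta(seconds=int(value))
    # Date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    return parsedate_to_datetime(value)


now = datetime(2026, 4, 20, 2, 0, 0, tzinfo=timezone.utc)
print(delay_until("30", now).isoformat())
print(delay_until("Wed, 21 Oct 2015 07:28:00 GMT", now).isoformat())
```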

If you're processing large batches, say, sending 10,000 emails from a SharePoint list, consider switching the Logic App to use chunking via the built-in pagination settings (Settings > Pagination), combined with a concurrency limit of 1 on the For Each loop to serialize execution and reduce API call velocity.

Step 4: Debug Expression and Dynamic Content Errors

Expression errors are the sneakiest category of Logic Apps failures because they often don't cause the workflow to hard-fail immediately. Instead, a bad expression silently evaluates to null, which then gets passed downstream and blows up three actions later with a confusing error that looks totally unrelated.

If your Run History shows a failure in an action that's receiving dynamic content from an earlier step, go back and check the earlier step's outputs. Expand the action in the run detail view and look at what it actually produced. Then compare that to what your expression expected.

The most common expression bugs I fix:

Indexing into a possibly-null or empty parent:

// Breaks when 'results' is missing or the array is empty:
@body('Parse_JSON')['results'][0]['name']

// Safer version using an empty() check plus the ? operator on every step:
@{if(empty(body('Parse_JSON')?['results']), 'No results', body('Parse_JSON')?['results']?[0]?['name'])}

Datetime format mismatch:

// Always format dates explicitly for downstream APIs:
@{formatDateTime(triggerBody()?['created_at'], 'yyyy-MM-ddTHH:mm:ssZ')}

Encoding issues in HTTP body:

// Use base64 encoding when passing binary content:
@{base64(body('Get_blob_content'))}

To test an expression without re-running the full workflow, drop it into a Compose action in a small scratch workflow, feed it a sample JSON body matching your expected input, and check the Compose output in Run History. This saves enormous time compared to re-triggering the whole production workflow to test one expression change.

For complex expressions, switch to Code View (the </> button in the Designer toolbar) and inspect the raw workflow definition JSON. The full expression is easier to read and edit there than in the collapsed Designer card view.
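When you have a sample payload saved from Run History, you can also check offline whether a field path actually exists before touching the workflow. This Python sketch loosely imitates the ?[...] safe-navigation idea; it is an approximation for checking payload shape, not an emulation of the Logic Apps expression engine, and `safe_get` is a hypothetical helper.

```python
from typing import Any, Sequence, Union


def safe_get(payload: Any, path: Sequence[Union[str, int]],
             default: Any = None) -> Any:
    """Walk `path` through nested dicts and lists, returning `default`
    as soon as a key, index, or parent is missing."""
    current = payload
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


sample_ok = {"results": [{"name": "invoice-001"}]}
sample_empty = {"results": []}

print(safe_get(sample_ok, ["results", 0, "name"], "No results"))
print(safe_get(sample_empty, ["results", 0, "name"], "No results"))
```

If the second call returns the default for a payload you captured from production, the upstream data shape has changed and the expression needs a guard.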

Step 5: Enable Diagnostic Logs and Application Insights for Ongoing Monitoring

Run History is great for post-mortem analysis, but it doesn't give you alerting, trend analysis, or the ability to query across thousands of runs. For that, you need Diagnostic Settings connected to either a Log Analytics Workspace or Application Insights.

Here's how to set it up in about 5 minutes:

  1. In your Logic App's left blade, navigate to Monitoring > Diagnostic settings.
  2. Click + Add diagnostic setting.
  3. Give it a name (e.g., LogicApp-Diagnostics).
  4. Under Logs, check WorkflowRuntime (captures all run, trigger, and action events).
  5. Under Destination details, check Send to Log Analytics workspace and select your workspace.
  6. Click Save.

Once logs are flowing (allow 5–10 minutes), you can run KQL queries against them in Log Analytics. Here's a query that shows all failed actions in the last 24 hours with their error codes:

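// Column names follow the usual AzureDiagnostics naming for Logic Apps; confirm them against your workspace schema
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC" and Category == "WorkflowRuntime"
| where OperationName == "Microsoft.Logic/workflows/workflowActionCompleted"
| where status_s == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, resource_workflowName_s, resource_actionName_s, code_s, error_code_s, error_message_s
| order by TimeGenerated desc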

For Application Insights integration (Logic Apps Standard tier), go to your Logic App resource > Settings > Application Insights > toggle it on and point it at an existing App Insights resource. This gives you a live failure rate dashboard, end-to-end transaction search by correlation ID, and smart detection alerts, all without writing any custom code.

Once Application Insights is connected, use this PowerShell snippet to pull recent failure summaries via the REST API:

$appInsightsAppId = "YOUR_APP_INSIGHTS_APP_ID"
$apiKey = "YOUR_API_KEY"
$query = "requests | where success == false | summarize count() by name, resultCode | order by count_ desc"
Invoke-RestMethod -Uri "https://api.applicationinsights.io/v1/apps/$appInsightsAppId/query?query=$([uri]::EscapeDataString($query))" `
  -Headers @{"x-api-key" = $apiKey} | Select-Object -ExpandProperty tables

Advanced Troubleshooting for Enterprise and Complex Scenarios

If the steps above haven't resolved your issue, you're likely dealing with something at the infrastructure, network, or identity layer. These scenarios are less common but significantly harder to diagnose without the right tools.

Integration Service Environment (ISE) and VNet Connectivity Issues

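If your Logic App runs inside an ISE or uses the Logic Apps Standard tier with VNet integration, network connectivity is often the culprit for failures. Your Logic App might not be able to reach a private endpoint, an on-premises resource over ExpressRoute, or an Azure service with a Service Endpoint restriction. Note that Microsoft retired ISE on August 31, 2024, so the ISE guidance here applies only to legacy environments; new work should target Standard with VNet integration.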

Check your ISE's network health: go to Integration Service Environment > Network health. Any subnet showing a status other than "Healthy" will cause action failures. Common issues include NSG rules blocking ports 443 and 9350–9354 (which Logic Apps requires for outbound communication to the Azure Service Bus relay used by connectors).

For Logic Apps Standard with VNet integration, ensure your App Service Plan's outbound IPs are whitelisted in any downstream firewall rules. You can find the outbound IPs under your Logic App Standard resource > Properties > Outbound IP addresses.

Managed Identity Permission Errors

When using system-assigned or user-assigned Managed Identity for connector auth, a missing role assignment silently fails with a generic 403. Validate permissions using the Azure CLI:

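# Check role assignments for your Logic App's managed identity.
# az logicapp targets Standard logic apps; for Consumption, az logic workflow show
# (from the logic extension) exposes the same identity block.
principalId=$(az logicapp show --name YOUR_LOGIC_APP --resource-group YOUR_RG --query "identity.principalId" -o tsv)
az role assignment list --assignee "$principalId" --all --output table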

Stateful vs Stateless Workflow Behavior (Standard Tier)

In Logic Apps Standard, stateless workflows don't persist run history by default. They're faster, but invisible in Run History unless you turn history on. If you're not seeing failed runs in history for a Standard tier workflow, check whether it's stateless. If it is, add the app setting Workflows.{yourWorkflowName}.OperationOptions with the value WithStatelessRunHistory in the Logic App resource's application settings while you debug, then remove it afterward, because recording history slows the workflow down.

Workflow Definition Corruption After Export/Import

If you moved a Logic App via ARM template or exported and re-imported the definition, check whether any connector references are broken. In the Designer, broken connector references show as grey "Invalid" cards. The underlying issue is usually that the $connections parameter in the ARM template contains hardcoded resource IDs from the source subscription. Fix this by updating the connection ID and access key references in your ARM parameters file before re-deploying.
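As a sketch of that fix, the Python helper below rewrites the subscription and resource group segments in a $connections value so the IDs point at the target environment. The field names connectionId and id match what exported definitions typically contain, but treat the input shape, the function name, and the sample IDs as assumptions. It leaves the region segment of managed API IDs untouched.

```python
import re

SUB_PATTERN = re.compile(r"(/subscriptions/)[^/]+", re.IGNORECASE)
RG_PATTERN = re.compile(r"(/resourceGroups/)[^/]+", re.IGNORECASE)


def retarget_connections(connections: dict, subscription_id: str,
                         resource_group: str) -> dict:
    """Return a copy of `connections` with resource IDs pointed at the
    target subscription and resource group."""
    result = {}
    for key, conn in connections.items():
        updated = dict(conn)
        for field in ("connectionId", "id"):
            if field in updated:
                value = SUB_PATTERN.sub(
                    lambda m: m.group(1) + subscription_id, updated[field])
                value = RG_PATTERN.sub(
                    lambda m: m.group(1) + resource_group, value)
                updated[field] = value
        result[key] = updated
    return result


# Hypothetical exported value with placeholder IDs.
source = {
    "office365": {
        "connectionId": "/subscriptions/aaaa/resourceGroups/dev-rg/providers/Microsoft.Web/connections/office365",
        "connectionName": "office365",
        "id": "/subscriptions/aaaa/providers/Microsoft.Web/locations/westeurope/managedApis/office365",
    }
}
target = retarget_connections(source, "bbbb", "prod-rg")
print(target["office365"]["connectionId"])
```

In a real pipeline, parameterizing the IDs in the template is still the cleaner option; a rewrite like this is mainly useful for auditing an exported file.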

Azure Resource Health and Platform Incidents

Before spending hours debugging your workflow, check whether Azure itself is having a bad day. Navigate to your Logic App resource > Support + troubleshooting > Resource health. If the platform reports "Degraded" or "Unavailable," your failures may be a platform issue, not your code. You can also check the Azure Status History page for your region to correlate failure timestamps with known incidents.

When to Call Microsoft Support

If you've confirmed the workflow definition is correct, connections are authorized, retry policies are set, and Azure Resource Health shows healthy, but failures persist, it's time to escalate. Gather: (1) the x-ms-client-request-id from the failed action's outputs, (2) the exact timestamp of at least three failed runs, (3) the workflow resource ID (from Properties blade). With those three things, Microsoft Support can pull the internal execution trace and identify platform-level issues that are completely invisible in the portal UI.

Prevention & Best Practices to Keep Logic Apps Running Reliably

The best time to fix a Logic App failure is before it happens. These are the practices I've seen make the biggest difference on the production Logic Apps environments I've worked on, some of them managing thousands of runs per day.

Use Managed Identity everywhere you can. OAuth-based API connections expire and need periodic human re-authorization, which is a maintenance liability. Managed Identity-based connections don't need re-authorization and don't store credentials in your workflow; Azure manages and rotates the underlying credentials for you. For connectors that support it (Azure Blob, Key Vault, Service Bus, Event Hubs, SQL), switch to Managed Identity and remove the OAuth connection dependency entirely.

Set explicit retry policies on every HTTP and connector action. The default policy (Exponential Interval with up to 4 retries, with waits scaled from 7.5 seconds and capped between 5 and 45 seconds) is fine for many cases, but for connectors that are known to throttle (SharePoint, Outlook), configure Exponential Interval explicitly with a longer maximum interval. This single change removes many of the transient failures in high-volume workflows.

Add error handling with Scope + Run After logic. Wrap groups of related actions in a Scope action, then add a parallel branch that only runs if the Scope fails (set Run After to "has failed" and "has timed out"). In that error branch, send an alert to Teams or log to a storage table with the full error context captured via @result('Your_Scope_Name'). This gives you proactive alerting instead of discovering failures hours later.
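In Code View, that error branch comes down to a runAfter entry on the handler action. The Python sketch below generates such a fragment; the handler name and the use of a Compose action as a stand-in for a Teams or storage action are assumptions for illustration.

```python
import json


def error_handler(scope_name: str,
                  handler_name: str = "Notify_on_failure") -> dict:
    """Build an action that runs only when the scope fails or times out
    and captures the scope's per-action results."""
    return {
        handler_name: {
            "type": "Compose",
            # result() returns the outcome of every action inside the scope.
            "inputs": f"@result('{scope_name}')",
            "runAfter": {scope_name: ["Failed", "TimedOut"]},
        }
    }


fragment = error_handler("Process_invoice")
print(json.dumps(fragment, indent=2))
```

Swap the Compose action for whichever alerting action you use; the runAfter block is the part that matters.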

Test your trigger's callback URL after every definition change. When you save a Logic App with an HTTP Request trigger, the callback URL does not change, but if you ever delete and recreate the trigger, a new URL is generated and all upstream callers will break. Keep the callback URL documented and verify it after any significant workflow change.

Pin your connector versions. Microsoft occasionally releases breaking changes to managed connector APIs. In the Logic App Designer, some connectors show a version selector. Pin to a specific version rather than "Latest" for production workflows, and test version upgrades in a staging Logic App first.

Quick Wins
  • Enable Diagnostic Settings sending WorkflowRuntime logs to Log Analytics on every Logic App from day one. Retrofitting this after a failure is frustrating.
  • Set an alert rule on the "Runs Failed" metric (under Monitoring > Alerts) with a threshold of >0 over 5 minutes, so you'll know about failures before your users do.
  • Use the trackedProperties feature on key actions to inject custom business identifiers (order IDs, customer numbers) into your logs. This makes searching diagnostic logs much faster during incidents.
  • Store any secrets your Logic App needs (API keys, connection strings) in Azure Key Vault and reference them via Key Vault references in your workflow parameters. Never hardcode them in action inputs.

Frequently Asked Questions

Why does my Azure Logic App show "Succeeded" but the data didn't actually get sent or processed?

This usually happens when the final action in your workflow, like "Send an email" or "Insert a row", completed without an HTTP error code, but the downstream system silently rejected or discarded the payload. The Logic App has no way to know that the email landed in a spam folder or that the database trigger ignored the row. To catch these silent failures, enable the Split On feature for batch processing scenarios, and add response validation in an HTTP action by checking the status code expression: @{if(equals(outputs('HTTP')?['statusCode'], 200), 'OK', 'Failed')}. For email delivery specifically, use the "Get message" action 60 seconds later to verify the message exists in Sent Items.

My Logic App trigger stopped firing. How do I find out why?

The trigger history is separate from the run history. In your Logic App blade, go to Monitoring > Trigger history. You'll see every trigger evaluation, including ones that fired, ones that were skipped (conditions not met), and ones that failed. A Recurrence trigger that shows consistent "Skipped" entries usually means the workflow was disabled at that time. A polling trigger (like SharePoint "When an item is modified") that shows no entries at all means the trigger itself failed to poll; check the connection status in API connections. Event-based triggers that fail silently often have an expired webhook registration, which you can fix by disabling and re-enabling the Logic App.

How do I rerun a failed Logic App run without triggering it from scratch?

Logic Apps supports resubmitting failed runs directly from Run History. Open the failed run and click Resubmit at the top of the run detail page. This replays the exact same trigger payload that originally started the run, so your Logic App processes the same data again without requiring the original triggering event to happen again. Note that Resubmit is only available for stateful workflows (on both Consumption and Standard tiers). A run started by a Recurrence trigger has no payload to replay, so resubmitting it just starts a new run with the current timestamp. For HTTP-triggered workflows, Resubmit is especially handy because you don't have to re-POST from the calling system.

I'm getting "WorkflowRunThrottled" errors. How do I fix Logic Apps throttling?

The WorkflowRunThrottled error means you've hit Azure's Logic Apps platform limits, not a connector API limit. On the Consumption tier, the default limit is 100,000 action executions per rolling 5-minute interval per workflow. If you're genuinely exceeding this (common in data migration scenarios with giant For Each loops), you have a few options: break the work into smaller chunks using the "chunking" pattern, use the Concurrency control on For Each loops (Settings > Concurrency > Degree of Parallelism, reduced to between 1 and 5), or migrate to Logic Apps Standard tier, which has much higher limits. You can also see your current consumption against limits in Azure Monitor under Metrics: search for "Action Executions" as the metric name to plot your usage over time.
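To judge whether a job will cross that limit before you run it, a quick back-of-the-envelope calculation helps. This Python sketch assumes the 100,000-per-5-minutes figure quoted above and a fixed number of actions per item; both function names and the sample numbers are illustrative.

```python
ACTION_LIMIT = 100_000   # action executions per window (figure quoted above)
WINDOW_MINUTES = 5


def max_items_per_window(actions_per_item: int,
                         limit: int = ACTION_LIMIT) -> int:
    """How many items fit in one window without crossing the limit."""
    return limit // actions_per_item


def windows_needed(total_items: int, actions_per_item: int,
                   limit: int = ACTION_LIMIT) -> int:
    """Minimum number of windows needed to process all items (ceiling)."""
    total_actions = total_items * actions_per_item
    return (total_actions + limit - 1) // limit


# Hypothetical migration: 500,000 rows, 4 actions executed per row.
chunk_size = max_items_per_window(4)
windows = windows_needed(500_000, 4)
print(chunk_size)                 # 25000
print(windows * WINDOW_MINUTES)   # 100 (minutes, at best)
```

In this example the job cannot finish in under about 100 minutes on one workflow, which is the kind of number that tells you whether to chunk, throttle, or move tiers.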

Can I use PowerShell to programmatically check Logic App run failures?

Absolutely. This is one of the most useful things to put in an ops script or Azure Automation runbook. The Az PowerShell module includes Logic Apps cmdlets for this. Here's the command to pull failed runs from the last hour:

# Run start times are reported in UTC, so compare against a UTC cutoff
$cutoff = (Get-Date).ToUniversalTime().AddHours(-1)
Get-AzLogicAppRunHistory -ResourceGroupName "YourRG" -Name "YourLogicApp" `
  | Where-Object { $_.Status -eq "Failed" -and $_.StartTime -gt $cutoff } `
  | Select-Object Name, Status, StartTime, EndTime, Error

You can also use the Azure REST API directly with Invoke-RestMethod and a Bearer token from Get-AzAccessToken if you need more granular control over the response shape, or if you're calling this from a non-Azure environment like a GitHub Actions workflow or an external monitoring system.
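For the REST route, the request URL is the part people most often get wrong. This Python sketch builds the run-listing URL for a Consumption workflow (the Microsoft.Logic/workflows resource type) with a status filter. The function name and the placeholder IDs are hypothetical, and the api-version shown is the one I have used; check the REST reference before relying on it. You still need to send a Bearer token with the request.

```python
from urllib.parse import quote

API_VERSION = "2016-06-01"  # assumption: verify against the REST reference


def failed_runs_url(subscription_id: str, resource_group: str,
                    workflow_name: str) -> str:
    """Build the ARM URL that lists failed runs for a Consumption workflow."""
    base = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Logic/workflows/{workflow_name}/runs"
    )
    filter_expr = quote("status eq 'Failed'")
    return f"{base}?api-version={API_VERSION}&$filter={filter_expr}"


url = failed_runs_url("YOUR_SUBSCRIPTION_ID", "YourRG", "YourLogicApp")
print(url)
```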

My Logic App works in the portal but fails when deployed via ARM template or Bicep. Why?

ARM/Bicep deployments of Logic Apps are notoriously tricky because the workflow definition JSON is embedded in the ARM template, and the $connections object references specific resource IDs and access keys that differ between environments. The most common issues are: (1) the connection resource IDs point to the source subscription instead of the target, so parameterize every connection resource ID using [resourceId(...)] expressions rather than hardcoding them; (2) the connection access key is stale or missing in your parameters file, so retrieve it dynamically in your deployment script using listKeys(resourceId(...), apiVersion); (3) the Managed Identity principal ID is different in the target environment, so role assignments from the source don't exist in the target and must be recreated there. Run az deployment group what-if before applying any ARM deployment to preview exactly what will change.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.