Fix Azure Cache for Redis Errors & Migrate to Azure Managed Redis
Why This Is Happening
I've seen this exact situation play out on dozens of Azure projects: your application was humming along fine, your Azure Cache for Redis instance was doing its job, and then something broke, or you logged into the Azure portal and got blindsided by a retirement notice you weren't expecting. Either way, you're here because something isn't working the way it should, and Azure's error messages rarely tell you the full story.
Let's get the biggest issue out of the way first. In October 2025, Microsoft officially announced the retirement of all Azure Cache for Redis SKUs. That means the Basic, Standard, Premium, Enterprise, and Enterprise Flash tiers are all going away. If you're running caches in the Enterprise or Enterprise Flash tiers, new cache creation was blocked starting April 1, 2026, and any remaining caches in those tiers will be automatically migrated to Azure Managed Redis by March 31, 2027. For the Basic, Standard, and Premium tiers, the timeline is slightly longer: new caches for existing customers were blocked from October 1, 2026, and full retirement lands on September 30, 2028.
That retirement news is why a lot of teams right now are suddenly hitting errors they've never seen before. You might be trying to create a new Azure Cache for Redis instance and getting a blocked-operation error with no helpful explanation in the portal. Or your DevOps pipeline is failing because a Terraform script is trying to provision a new Enterprise-tier cache. These aren't bugs in your code; they're the result of the service lifecycle hitting hard dates.
Beyond the retirement issues, the day-to-day Azure Cache for Redis problems I see most often fall into three buckets. First, connection timeouts, especially the StackExchange.Redis.RedisTimeoutException in .NET apps, which is almost always caused by thread pool starvation or misconfigured retry logic. Second, memory pressure errors where your cache starts evicting data aggressively and your application suddenly behaves like it has no cache at all. Third, configuration drift: someone changed the maxmemory-policy setting months ago, the team forgot about it, and now cache behavior is unpredictable under load.
The frustrating part is that Azure's portal error messages for all of these are often vague. "Operation failed" or "Cache unavailable" doesn't help you triage fast. I know how much this hurts when it's blocking a production deployment or causing a customer-facing outage.
The good news: every one of these is fixable with the right approach.
The Quick Fix: Try This First
If you're hitting connection errors right now and need your application talking to Azure Cache for Redis again in the next ten minutes, start here. Open the Azure portal, navigate to your cache resource, and click Console in the left-hand sidebar under "Support + troubleshooting." This drops you into a live Redis CLI connected directly to your instance.
Run this command immediately:
INFO stats
Look at the rejected_connections field in the output. If that number is anything above zero, especially if it's climbing, your cache is actively refusing new client connections because you've hit the maxclients limit for your tier. That's your smoking gun.
Next, run:
INFO memory
Check used_memory_human against maxmemory_human. If used memory is at or above the maxmemory threshold, Redis is under pressure and your eviction policy is actively working, which can cause seemingly random cache misses that look like connection problems to the application layer.
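If you'd rather not eyeball the raw output, here's a minimal sketch of the same two checks in shell. It assumes you've saved the INFO output to a file (or can pipe it from a Redis client); the function name check_redis_info is hypothetical, and it reads the numeric used_memory and maxmemory fields rather than the _human variants so it can compute a percentage.

```shell
# Hypothetical helper: reads Redis INFO output on stdin and prints a short summary.
check_redis_info() {
  # INFO lines look like "field:value" and end in CRLF, so strip the \r first.
  tr -d '\r' | awk -F: '
    $1 == "rejected_connections" { rejected = $2 }
    $1 == "used_memory"          { used = $2 }
    $1 == "maxmemory"            { max = $2 }
    END {
      if (rejected + 0 > 0)
        printf "rejected_connections=%d: clients are being refused (maxclients)\n", rejected
      else
        print "rejected_connections=0: no refused connections"
      if (max + 0 > 0)
        printf "memory: %d%% of maxmemory in use\n", (used * 100) / max
      else
        print "memory: maxmemory not reported"
    }'
}
```

Usage would look like check_redis_info < info.txt, where info.txt holds the combined output of INFO stats and INFO memory.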
If you've confirmed the issue is maxclients, the fastest resolution for Basic/Standard/Premium tiers is to scale up your cache. In the Azure portal, go to your cache resource → Scale in the left menu → select a larger tier → click Save. The scale operation takes a few minutes, but your application will remain connected throughout it for Standard and Premium tiers.
If you're on the Enterprise tier and can't create new caches as of April 1, 2026, skip straight to Step 4 in this guide about migrating to Azure Managed Redis. That's your path forward.
Step 1: Diagnose the Problem with Azure Monitor
Before you change anything, you need to know what's actually broken. Guessing costs time. Azure Monitor has Azure Cache for Redis-specific metrics that show you exactly what's happening.
In the Azure portal, open your cache resource and click Metrics in the left menu under "Monitoring." Add the following metrics to a single chart:
- Connected Clients: how many clients are currently connected
- Cache Misses: requests that didn't find a key in cache
- Server Load: percentage of time the Redis server is processing requests
- Used Memory: current memory usage
Set the time range to the last 24 hours and look for spikes in Server Load or Connected Clients that correlate with the time your errors started. A Server Load consistently above 80% is a red flag. Redis is single-threaded for command processing, so sustained high server load means your application is overwhelming the cache with requests.
You can also pull these diagnostics via the Azure CLI if you prefer working in a terminal. Replace the placeholders with your actual resource group and cache name:
az monitor metrics list \
--resource /subscriptions/{sub-id}/resourceGroups/{rg-name}/providers/Microsoft.Cache/Redis/{cache-name} \
--metric "connectedclients,usedmemory,serverLoad" \
--interval PT1M \
--output table
If you see Server Load above 80% for more than a few minutes at a time, that confirms you need to scale up or optimize your command patterns. If Used Memory is consistently near the max, your eviction policy needs attention (covered in Step 2). If Connected Clients is plateauing at exactly your tier's limit, that's the maxclients ceiling being hit.
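Here's a rough sketch of that decision logic as a shell function, in case you want to drop it into a runbook or a pipeline check. The function name triage_cache is hypothetical, the inputs are whole numbers you'd read off the metrics yourself, and the 90% memory threshold is my assumption for "consistently near the max".

```shell
# Hypothetical helper. Arguments (integers):
#   $1 server load %, $2 used memory %, $3 connected clients, $4 tier client limit
triage_cache() {
  load=$1; mem_pct=$2; clients=$3; max_clients=$4
  found=0
  if [ "$load" -gt 80 ]; then
    echo "server load ${load}%: scale up or optimize command patterns"
    found=1
  fi
  # Assumption: treat 90% or more as "near the max".
  if [ "$mem_pct" -ge 90 ]; then
    echo "memory ${mem_pct}%: review the eviction policy"
    found=1
  fi
  if [ "$clients" -ge "$max_clients" ]; then
    echo "clients ${clients}/${max_clients}: maxclients ceiling reached"
    found=1
  fi
  [ "$found" -eq 1 ] || echo "no threshold crossed"
}
```

For example, triage_cache 85 50 10 1000 reports only the server load finding, because the other two values are under their thresholds.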
When this step works, you'll have a clear picture of which specific metric is the root cause, not a vague "it's slow." That context makes every subsequent fix step faster and more confident.
Step 2: Fix the maxmemory-policy Setting
One of the most common Azure Cache for Redis configuration mistakes I see is leaving the maxmemory-policy at its default or setting it to something that made sense at launch but doesn't match how the application actually uses the cache under real load. When Redis runs out of memory and there's no clear eviction strategy, it either throws errors or starts evicting keys your application didn't expect to lose.
The default policy for most Azure Cache for Redis tiers is volatile-lru, which only evicts keys that have an expiry set. If your application stores a lot of keys without expiry, volatile-lru does nothing and Redis returns an OOM (out of memory) error instead.
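To see whether this applies to you, compare the total key count with the number of keys that carry an expiry. Here's a minimal sketch that does that from INFO keyspace output, which reports one line per database in the form db0:keys=N,expires=M,avg_ttl=T. The function name keys_without_ttl is hypothetical.

```shell
# Hypothetical helper: reads INFO keyspace output on stdin and reports
# how many keys have no expiry (and so are invisible to volatile-lru).
keys_without_ttl() {
  tr -d '\r' | awk -F'[:=,]' '
    # Fields after splitting "db0:keys=1000,expires=200,...":
    # $1=db0 $2=keys $3=1000 $4=expires $5=200
    /^db[0-9]+:keys=/ { keys += $3; expires += $5 }
    END {
      if (keys == 0) { print "no keys found"; exit }
      printf "%d of %d keys have no TTL (%d%%)\n", keys - expires, keys, ((keys - expires) * 100) / keys
    }'
}
```

If most of your keys have no TTL, volatile-lru has very little it is allowed to evict, and a policy from the allkeys family is usually the better fit.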
To check your current policy, open the Azure portal → your cache resource → Advanced settings in the left menu. You'll see a maxmemory-policy dropdown. Common options and when to use each:
- allkeys-lru: evicts the least recently used key across all keys, regardless of expiry. Best for general caching workloads.
- volatile-lru: only evicts keys with expiry set. Good when you have a mix of session data (with TTL) and permanent reference data you never want evicted.
- allkeys-lfu: evicts least frequently used keys. Better than LRU when access patterns are highly skewed.
- noeviction: never evicts, returns errors on write when memory is full. Only use this if your application explicitly handles OOM errors and you want full control.
You can also set this via Azure CLI to make it repeatable and scriptable across environments:
az redis update \
--name {cache-name} \
--resource-group {rg-name} \
--set "redisConfiguration.maxmemory-policy=allkeys-lru"
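After saving, confirm the change took effect. Azure Cache for Redis disables the CONFIG command, so running CONFIG GET maxmemory-policy in the portal Console will not work; instead, reopen Advanced settings in the portal or inspect the redisConfiguration section of the az redis show output. The new policy applies without a cache restart.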
Step 3: Fix StackExchange.Redis Timeouts in .NET
If your application is .NET and you're seeing StackExchange.Redis.RedisTimeoutException: Timeout performing GET (5000ms), or similar, in your application logs, the problem is almost never actually Azure Cache for Redis itself. I know that sounds counterintuitive, but hear me out.
The most common cause of this error is .NET thread pool starvation. When your application is under load, the .NET thread pool can't keep up with the number of async operations, and StackExchange.Redis's async calls time out waiting for a thread to complete them. The cache is fine. Your app is choking.
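First, check your application logs for this pattern alongside the timeout: look for high ThreadPool.QueueLength values or the thread pool statistics that StackExchange.Redis appends to the exception message, such as "IOCP: (Busy=X, Free=Y, Min=Z, Max=W)" and the matching WORKER section. If Busy is greater than Min, the pool has to inject new threads, which it does slowly, and requests time out while they wait. That's thread pool starvation.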
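If you have a pile of these exceptions in a log file, here's a small sketch that pulls the IOCP and WORKER counters out of each message and flags the case where Busy exceeds Min, which is the signal the StackExchange.Redis timeout guidance calls out. The function name check_threadpool is hypothetical, and the pattern assumes the counters appear in the "(Busy=..,Free=..,Min=..,Max=..)" form shown in the exception text.

```shell
# Hypothetical helper: reads RedisTimeoutException messages on stdin.
check_threadpool() {
  # Pull out each "IOCP: (...)" and "WORKER: (...)" section on its own line.
  grep -oE '(IOCP|WORKER): \(Busy=[0-9]+, ?Free=[0-9]+, ?Min=[0-9]+, ?Max=[0-9]+\)' |
    # Turn punctuation into spaces so awk can split on whitespace.
    tr -c 'A-Za-z0-9\n' ' ' |
    awk '{
      # Fields are now: name Busy <n> Free <n> Min <n> Max <n>
      if ($3 + 0 > $7 + 0)
        printf "%s: Busy=%d > Min=%d, pool is throttled\n", $1, $3, $7
      else
        printf "%s: Busy=%d <= Min=%d, ok\n", $1, $3, $7
    }'
}
```

Run it as check_threadpool < app.log. Any "pool is throttled" line means the minimum thread count is too low for the load at the moment of the timeout.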
The fix is to configure minimum thread pool sizes in your application startup. In your Program.cs or Startup.cs:
// Add this early in your startup code
ThreadPool.SetMinThreads(200, 200);
For the StackExchange.Redis connection itself, make sure you're using a singleton connection multiplexer, not creating a new connection per request. A new ConnectionMultiplexer per request is one of the most destructive patterns I see in Azure Redis implementations:
// WRONG: creates a new connection on every request. Do not do this.
var redis = ConnectionMultiplexer.Connect(connectionString);
var db = redis.GetDatabase();

// RIGHT: singleton, created once on first use and shared for the life of the app.
// Note: connectionString must be a static field or constant for this to compile.
private static readonly Lazy<ConnectionMultiplexer> _lazyConnection =
    new Lazy<ConnectionMultiplexer>(() =>
        ConnectionMultiplexer.Connect(connectionString));

public static ConnectionMultiplexer Connection => _lazyConnection.Value;
Also set connectTimeout=5000 and syncTimeout=5000 in your connection string, and add abortConnect=false so the client retries on startup rather than crashing immediately if the cache isn't ready. If the error persists after these changes, the issue may be a genuine network latency spike, so check the Cache Latency metric in Azure Monitor at the time of the timeouts.
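To make those options concrete, here's a minimal sketch that assembles a StackExchange.Redis configuration string with the settings above, using the TLS port. The function name build_redis_conn and the REDIS_KEY variable in the usage note are hypothetical; in practice the access key should come from a secret store rather than a literal.

```shell
# Hypothetical helper. Arguments: $1 cache host name, $2 access key.
build_redis_conn() {
  host=$1; key=$2
  # Port 6380 is the TLS port; the timeouts are in milliseconds.
  printf '%s:6380,password=%s,ssl=True,abortConnect=False,connectTimeout=5000,syncTimeout=5000\n' \
    "$host" "$key"
}
```

For example, build_redis_conn mycache.redis.cache.windows.net "$REDIS_KEY" prints a string you can pass straight to ConnectionMultiplexer.Connect.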
Step 4: Migrate to Azure Managed Redis
This is the most important step in this guide right now, given where we are in the Azure Cache for Redis retirement timeline. If you're still running on any Azure Cache for Redis tier (Basic, Standard, Premium, Enterprise, or Enterprise Flash), you need a migration plan. The good news is Azure provides tooling to make this less painful than it sounds.
Azure Managed Redis reached General Availability in May 2025. It's the successor product, built on a more modern architecture. The migration path varies by which tier you're currently on.
For Enterprise and Enterprise Flash tiers, Microsoft will auto-migrate remaining caches by March 31, 2027. But waiting for an auto-migration is risky: you lose control of the timing and could face a cutover during a high-traffic period. Do it on your own schedule instead.
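For Basic, Standard, and Premium tiers, the deadline is softer but still real: existing customers lose the ability to create new caches on October 1, 2026, and the tiers retire on September 30, 2028. Plan your migration before the October 1, 2026 cutoff.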
To start a migration in the Azure portal, navigate to your existing cache resource and look for the Migration to Azure Managed Redis option in the left sidebar. If it's not there yet, you can move the data yourself by exporting it to a storage container via CLI (export is a Premium tier feature):
az redis export \
--name {cache-name} \
--resource-group {rg-name} \
--prefix {export-prefix} \
--container {sas-url-to-storage-container}
Then create your new Azure Managed Redis instance and import the RDB file. Choose your target tier carefully. Azure Managed Redis offers Memory Optimized, Balanced, Compute Optimized, and Flash Optimized tiers. Note that all Flash Optimized tiers are still in Public Preview as of this writing. Tiers using over 120 GB of storage (like Memory Optimized M150 and above, Balanced B150 and above, and Compute Optimized X150 and above) are also still in Public Preview. For production workloads, stay on GA tiers until those reach full availability.
After migration, update your application connection strings to point to the new Azure Managed Redis endpoint. Then run your test suite and monitor the Cache Hits and Cache Misses metrics for the first 24 hours to verify your warm-up strategy is working correctly.
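Alongside the portal metrics, you can compute the hit ratio directly from the Redis counters. Here's a minimal sketch that reads the keyspace_hits and keyspace_misses fields from INFO stats output; the function name hit_ratio is hypothetical, and it assumes you can capture INFO stats from the new instance with whatever Redis client you already use.

```shell
# Hypothetical helper: reads INFO stats output on stdin and prints the hit ratio.
hit_ratio() {
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "no lookups recorded yet"; exit }
      printf "hit ratio: %d%% (%d hits, %d misses)\n", (hits * 100) / total, hits, misses
    }'
}
```

A freshly migrated cache that was imported from an RDB file should start with a reasonable ratio; one that starts empty will show a low ratio that climbs as it warms up. Capture the output a few times over the first day and compare.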
Step 5: Verify Zone Redundancy
Since November 2024, when you create a new Standard or Premium tier Azure Cache for Redis cache, Azure automatically enables zone redundancy using Automatic_Zonal_Allocation in regions that support availability zones. That's a great default. But a lot of teams who migrated to Azure Managed Redis or scaled between tiers found that zone redundancy didn't carry over as expected, and they're now running without the HA protection they thought they had.
Here's how to verify your zone redundancy status. In the Azure portal, go to your cache resource → Overview. Look for the "Availability Zones" field in the properties panel. If it shows "Not configured" or is blank, zone redundancy is off.
To check via CLI:
az redis show \
--name {cache-name} \
--resource-group {rg-name} \
--query "zones" \
--output tsv
If the output is empty, you're not zone-redundant. For Premium tier caches, you can choose availability zones explicitly during creation or update. For Standard tier, let Azure choose automatically. You cannot add zone redundancy to an existing non-redundant cache without recreating it, so this is something to verify before you consider a deployment complete, not after.
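If you want this check to fail a deployment pipeline rather than rely on someone reading the output, here's a minimal sketch that interprets the result of the az redis show command above. The function name zone_status is hypothetical; treating a literal "None" the same as empty output is a defensive assumption on my part, not documented behavior.

```shell
# Hypothetical helper. $1 is the raw output of:
#   az redis show --name ... --resource-group ... --query "zones" --output tsv
zone_status() {
  # Collapse tabs and newlines into single spaces and trim the ends.
  zones=$(printf '%s' "$1" | tr '\t\n' '  ' | tr -s ' ' | sed 's/^ //; s/ $//')
  # Assumption: treat a literal "None" like empty output.
  if [ -z "$zones" ] || [ "$zones" = "None" ]; then
    echo "not zone-redundant"
    return 1
  fi
  echo "zone-redundant across zones: $zones"
}
```

Call it as zone_status "$(az redis show --name {cache-name} --resource-group {rg-name} --query "zones" --output tsv)". The non-zero return code on the "not zone-redundant" path is what lets a pipeline step stop on it.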
When creating a new Premium cache with explicit zone selection via CLI:
az redis create \
--name {cache-name} \
--resource-group {rg-name} \
--location eastus \
--sku Premium \
--vm-size P1 \
--zones 1 2 3
For Standard tier, use --zone-redundancy Enabled and let Azure handle zone assignment automatically. After creation, re-run the az redis show command above and confirm zones are listed in the output. If your region doesn't support availability zones, the command will succeed but the zones field will still be empty. That's expected, not a bug.
Advanced Troubleshooting
If the steps above didn't resolve your issue, or if you're working in an enterprise or domain-joined environment with tighter network controls, this section is for you.
Diagnosing with Azure Diagnostic Logs
Enable diagnostic logs for your cache if you haven't already. In the Azure portal, go to your cache resource → Diagnostic settings → Add diagnostic setting. Send logs to a Log Analytics workspace. The key log category to enable is ConnectedClientList, which gives you a snapshot of all connected clients every hour. This is invaluable for identifying which specific client IPs are consuming the most connections.
Once logs are flowing, run this Kusto query in Log Analytics to spot connection hogs:
AzureDiagnostics
| where ResourceType == "REDIS"
| where Category == "ConnectedClientList"
| extend clients = parse_json(properties_s)
| mv-expand client = clients
| summarize count() by tostring(client.ip), bin(TimeGenerated, 1h)
| order by count_ desc
Firewall and Private Endpoint Issues
If your application can't reach the cache at all (not timeouts, but outright connection refused), the issue is almost always the firewall rules or private endpoint configuration. Azure Cache for Redis defaults to allowing all Azure services, but if you've tightened the firewall, check Firewall in the left menu of your cache resource. Verify that your application's outbound IP or VNet subnet is in the allowlist.
For caches deployed inside a VNet, confirm that the subnet has the required NSG rules open. Redis uses port 6379 (non-SSL) and 6380 (SSL/TLS). If your application is forcing TLS (which it should be), only 6380 needs to be open. Check that your NSG isn't blocking 6380 on the subnet where the cache sits:
az network nsg rule list \
--nsg-name {nsg-name} \
--resource-group {rg-name} \
--output table
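Reading NSG rules tells you what should be allowed; a quick TCP probe from the application host tells you what actually is. Here's a minimal sketch using bash's /dev/tcp pseudo-device. The function name check_port is hypothetical, and it assumes bash and the coreutils timeout command are available on the machine you run it from.

```shell
# Hypothetical helper. Arguments: $1 host, $2 port (defaults to the TLS port 6380).
check_port() {
  host=$1; port=${2:-6380}
  # Opening /dev/tcp/<host>/<port> in bash attempts a TCP connection.
  # The host and port are passed as arguments ($0 and $1 inside bash -c).
  if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port NOT reachable (check NSG rules, firewall, or private endpoint DNS)"
    return 1
  fi
}
```

Run check_port {cache-name}.redis.cache.windows.net from the same subnet as your application. A successful TCP connection only proves the port is open; it doesn't validate TLS or authentication.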
Redis Version Upgrade Considerations
Azure Cache for Redis introduced Redis 7.2 preview support on the Enterprise tier in June 2024. If you're on Enterprise or Enterprise Flash and have upgraded (or are planning to), be aware that Redis 7.2 includes breaking changes to certain commands. The SINTERCARD command, LCS (Longest Common Subsequence) operations, and some ACL command behaviors changed between 6.x, 7.0, and 7.2. If your application was written against Redis 6.x APIs and you triggered a version upgrade (either manually through Redis Version in the portal, or automatically), test your command patterns against the new version before committing to it in production.
You can manually trigger an upgrade from the portal under your cache resource → Advanced settings → Redis version. Manual upgrades let you pick your maintenance window, which is the main reason to do it yourself rather than waiting for the automatic upgrade Microsoft runs.
Prevention & Best Practices
The best Azure Cache for Redis troubleshooting session is the one you never have to do. Based on what I've seen break most often, and the official Microsoft guidance backing these up, here are the practices that keep Azure Cache for Redis healthy long-term.
Monitor proactively, not reactively. Set up Azure Monitor alerts before something breaks. The three alerts every cache deployment should have: an alert when Server Load exceeds 80% for more than 5 minutes; an alert when Used Memory Percentage exceeds 90%; and an alert when Cache Misses spikes more than 200% above your 7-day baseline. These three alerts catch the vast majority of problems before your application notices them.
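Here's a minimal sketch of the first two alerts using az monitor metrics alert create. The function name create_cache_alerts and the alert names are hypothetical, and the metric names serverLoad and usedmemorypercentage are the ones I'd expect for Azure Cache for Redis, so confirm them against your cache's metric list before relying on this. The cache-miss alert is left out because comparing against a 7-day baseline needs a dynamic threshold rather than a fixed number.

```shell
# Hypothetical helper. Arguments: $1 resource group, $2 full resource ID of the cache.
create_cache_alerts() {
  rg=$1; cache_id=$2

  # Server Load above 80%, averaged over a 5-minute window.
  az monitor metrics alert create \
    --name "redis-server-load-high" \
    --resource-group "$rg" \
    --scopes "$cache_id" \
    --condition "avg serverLoad > 80" \
    --window-size 5m \
    --evaluation-frequency 1m

  # Used Memory Percentage above 90%.
  az monitor metrics alert create \
    --name "redis-memory-high" \
    --resource-group "$rg" \
    --scopes "$cache_id" \
    --condition "avg usedmemorypercentage > 90" \
    --window-size 5m \
    --evaluation-frequency 1m
}
```

The resource ID is the same /subscriptions/{sub-id}/resourceGroups/{rg-name}/providers/Microsoft.Cache/Redis/{cache-name} path used for az monitor metrics list. Neither alert notifies anyone until you attach an action group.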
Plan your Azure Managed Redis migration now, not when the deadline forces your hand. Given the retirement timeline (Enterprise tiers auto-migrated by March 31, 2027; Basic/Standard/Premium retired September 30, 2028), treating this as a Q3 or Q4 project is fine for most teams. Still, blocking out the time and doing a test migration in a non-production environment this quarter means you understand the gotchas before you're under pressure.
For new cache deployments (noting that new Enterprise tier creation is blocked as of April 1, 2026), use Azure Managed Redis going forward. It's GA, it's the direction Microsoft is investing in, and starting fresh on the successor product saves you from having to migrate later.
Always use TLS. Port 6379 (non-TLS) should never be used for production workloads. In your connection string, always specify ssl=true and port 6380. This is non-negotiable for any data that matters.
- Enable zone redundancy for all Standard and Premium tier caches. It's automatic for new caches, but verify it's active on existing ones
- Set a keyspace notification so your application can react to eviction events rather than discovering keys are gone on the next cache miss
- Use the Azure Cache for Redis connection string from Key Vault, not hardcoded in app config. This makes key rotation and migration easier
- Run SLOWLOG GET 25 in the Redis Console monthly to catch commands that are starting to slow down before they cause timeouts
Frequently Asked Questions
Is Azure Cache for Redis actually being shut down? When exactly?
Yes, it's being retired across all tiers. Microsoft announced this in October 2025. For Enterprise and Enterprise Flash tiers, you can't create new caches as of April 1, 2026, and remaining caches will be migrated to Azure Managed Redis by March 31, 2027. For Basic, Standard, and Premium tiers, new cache creation for existing customers was blocked from October 1, 2026, and the service fully retires September 30, 2028. The replacement product is Azure Managed Redis, which reached General Availability in May 2025.
My Azure Cache for Redis connection string stopped working after I tried to create a new cache, what happened?
If you're trying to provision a new cache in the Enterprise or Enterprise Flash tier after April 1, 2026, creation is blocked. That's the retirement restriction, not a bug. Your existing caches' connection strings are still valid; only new cache creation is blocked. To get a new cache, you need to create one in Azure Managed Redis instead. If your existing cache connection string broke independently of this, check whether the cache resource still exists in the Azure portal and whether its firewall rules changed.
What's the difference between Azure Cache for Redis and Azure Managed Redis?
Azure Managed Redis is the next-generation successor to Azure Cache for Redis. It offers different performance tiers (Memory Optimized, Balanced, Compute Optimized, and Flash Optimized) instead of the Basic/Standard/Premium/Enterprise tiers you're used to. The underlying Redis protocol is the same, so your application code and connection libraries (StackExchange.Redis, Jedis, ioredis, etc.) work without modification. The main differences are in the management layer, pricing model, and additional features around scaling and persistence.
Why am I getting "StackExchange.Redis.RedisTimeoutException" even though my cache looks healthy in Azure Monitor?
When the cache metrics look clean but your .NET application keeps timing out, the culprit is almost always .NET thread pool starvation on the application side, not the cache itself. The StackExchange.Redis library uses async I/O, and when your thread pool is saturated, async continuations queue up and exceed the timeout threshold before a thread becomes available to execute them. Add ThreadPool.SetMinThreads(200, 200) at application startup and make sure you're using a singleton ConnectionMultiplexer rather than creating a new connection per request.
Is zone redundancy automatic for Azure Cache for Redis, or do I need to enable it?
Since November 2024, new Standard and Premium tier caches in supported regions are created with zone redundancy enabled by default using Automatic_Zonal_Allocation. However, caches created before November 2024 do not have this automatically applied. You can't add zone redundancy to an existing cache without recreating it, so check your existing caches. For Premium tier, you can also explicitly choose which availability zones to use during creation. To verify, run az redis show --name {name} --resource-group {rg} --query "zones" and confirm the output is non-empty.
What Azure Managed Redis tiers are safe for production right now?
As of May 2025, when Azure Managed Redis hit GA, the tiers confirmed for production use are Memory Optimized up to M120, Balanced up to B120, and Compute Optimized up to X120. Anything above 120 GB in those tiers (M150, B150, X150 and higher) is still in Public Preview. All Flash Optimized tiers are also still in Public Preview. Microsoft's guidance is to avoid Preview features for production workloads unless you have explicit support commitments. For production, stick to the sub-150 GB tiers until the higher tiers reach GA.