How to Fix Azure DocumentDB Errors & Setup Issues

Intermediate · 18 min read · Updated April 20, 2026

Why This Is Happening

I've seen this exact situation on dozens of Azure deployments: you spin up a new Azure DocumentDB cluster, paste in what looks like a perfectly valid connection string, hit run, and get smacked with a connection timeout or an authentication error that tells you absolutely nothing useful. Or maybe you got the cluster running fine last week, and now your application just can't reach it. Either way, you're stuck, and the Azure portal error messages aren't exactly writing you a love letter of explanation.

Azure DocumentDB is Microsoft's fully managed NoSQL database service with MongoDB compatibility, and that's precisely where a lot of the confusion starts. Because it speaks MongoDB's wire protocol, developers often assume they can treat it exactly like a self-hosted MongoDB instance. You can't, not entirely. There are Azure-specific layers around firewall rules, authentication, encryption, and cluster configuration that don't exist in a plain MongoDB setup, and when something goes wrong, the error surfaces as a generic MongoDB driver error rather than an Azure-specific one.

The most common Azure DocumentDB problems I see fall into four buckets. First, firewall and network rules: your client IP isn't on the allowlist, or a virtual network rule is blocking traffic. Second, authentication failures: wrong credentials, expired passwords, or a secondary user that was created without the right role assignments. Third, connection string format issues: Azure DocumentDB uses a specific endpoint format that differs slightly from standard MongoDB URIs, and a single wrong character will kill the connection silently. Fourth, vector search configuration problems: this is newer territory for most teams, and the indexing pipeline setup trips people up constantly.

What makes Azure DocumentDB troubleshooting particularly annoying is that it's built on the open-source DocumentDB project (which itself is built on PostgreSQL), so you've got three layers of the technology stack to reason about when something breaks. A problem that looks like a MongoDB driver issue might actually be a PostgreSQL-level configuration constraint surfacing through the compatibility layer.

The good news is that almost every common Azure DocumentDB error has a clear, fixable cause. I'll walk you through the full diagnostic and repair process right here.

The Quick Fix: Try This First

Before you go deep on diagnostics, try this first. It resolves about 60% of the Azure DocumentDB connection failures I see in the wild.

Open the Azure portal and navigate to your DocumentDB cluster. In the left sidebar, click Networking. Look at the Firewall rules section. Nine times out of ten, you'll find that either your current public IP address is missing from the allowed list, or the entire public access toggle is set to Disabled.

Click Add current client IP address; Azure will auto-detect your IP and add it. If you're running this from an application server rather than your local machine, you'll need to manually type in that server's outbound IP. Click Save and wait about 90 seconds for the rule to propagate.

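Now go back to your application and retry the connection. Use the connection string format below (if the Connection strings page in the portal shows a different host or port for your cluster, use the portal's values). Azure DocumentDB requires the tls=true parameter and the correct port: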

mongodb://<username>:<password>@<cluster-name>.mongocluster.cosmos.azure.com:10255/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false

That retrywrites=false flag matters. Azure DocumentDB doesn't support retryable writes the same way a MongoDB replica set does, and leaving it enabled can cause write operations to fail with a cryptic driver-level error. I've watched senior engineers spend two hours debugging what was ultimately a one-parameter fix.

If the connection works now, great, you're done. If you're still getting errors, keep reading. The next sections cover every other scenario systematically.

Pro Tip
When testing Azure DocumentDB connectivity, always try MongoDB Shell (mongosh) before blaming your application code. Run mongosh "<your-connection-string>" from your terminal first, pasting the full string including its mongodb:// prefix. If mongosh connects but your app doesn't, the problem is in your driver configuration or connection pool settings, not Azure. This single test saves hours of chasing the wrong root cause.

Step 1: Verify Your Cluster Status and Grab the Correct Connection String

The first thing to do is confirm your Azure DocumentDB cluster is actually running and healthy. This sounds obvious, but I've seen teams spend 45 minutes debugging a connection that was failing simply because the cluster was in a Provisioning or Updating state.

Go to the Azure portal and search for DocumentDB in the top search bar. Select your cluster from the list. On the Overview page, check the Status field in the top properties panel. It should say Ready. If it says anything else (Provisioning, Updating, Stopping), wait it out. Provisioning a new cluster typically takes 5–10 minutes.

Once you've confirmed the cluster is Ready, click Connection strings in the left sidebar. You'll see pre-generated connection strings for several languages and tools. Don't type these manually; copy them directly. A single transposed character in a cluster hostname will produce a DNS resolution failure that looks exactly like a firewall block.

If you're connecting with MongoDB Compass, use the Compass connection string shown in the portal. For application code, use the driver-specific string. Azure DocumentDB officially supports connections via MongoDB Shell, MongoDB Compass, and Studio 3T, plus all major language drivers including Node.js, Python, C#, Java, and Go.

Copy the connection string, paste it into a plain text editor, and visually inspect it. Confirm:

  • The hostname ends in .mongocluster.cosmos.azure.com
  • The port is 10255
  • The string includes tls=true

If all three are present and the cluster status is Ready, the connection string itself is not your problem. Move to Step 2.
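The three checks above can also be scripted so they run the same way every time. The following is a minimal sketch, assuming Node.js and its built-in URL parser; the function name checkConnectionString is hypothetical, and the expected port is simply the value quoted in this guide, so pass a different one if the string you copied from the portal shows it.

```javascript
// Sketch: sanity-check a connection string against the three points listed above.
// EXPECTED_PORT is the value quoted in this guide (an assumption for your cluster);
// if the portal's string shows a different port, pass that as the second argument.
const EXPECTED_HOST_SUFFIX = ".mongocluster.cosmos.azure.com";
const EXPECTED_PORT = "10255";

function checkConnectionString(uri, expectedPort = EXPECTED_PORT) {
  const problems = [];
  let parsed;
  try {
    // The WHATWG URL parser built into Node.js can parse mongodb:// URIs.
    parsed = new URL(uri);
  } catch {
    return ["not a parseable URI"];
  }
  if (!parsed.hostname.endsWith(EXPECTED_HOST_SUFFIX)) {
    problems.push(`hostname does not end in ${EXPECTED_HOST_SUFFIX}`);
  }
  if (parsed.port !== expectedPort) {
    problems.push(`port is "${parsed.port}", expected "${expectedPort}"`);
  }
  if (parsed.searchParams.get("tls") !== "true") {
    problems.push("tls=true is missing");
  }
  return problems;
}

// Illustrative values only; substitute the string copied from the portal.
const sample =
  "mongodb://appuser:secret@cluster-name.mongocluster.cosmos.azure.com:10255/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false";

// An empty array means all three checks passed.
console.log(checkConnectionString(sample));
```

An empty result only tells you the string is well-formed; it says nothing about whether the network path or the credentials are valid, which is what the next steps cover.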

Step 2: Configure Firewall Rules to Allow Your Client

Azure DocumentDB's firewall is the most common single source of connection failures, and it's completely separate from any Azure Virtual Network settings you may have configured at a higher level. The DocumentDB cluster has its own network access controls, and by default they are restrictive.

In the Azure portal, navigate to your DocumentDB cluster and click Networking in the left menu. You'll see two sections: Public network access and Firewall rules.

First, make sure Public network access is set to Enabled. If it's disabled, no external client can reach the cluster regardless of firewall rules, and that's your problem right there. Toggle it on and save.

Second, add your IP address to the firewall allowlist. For local development, click Add current client IP address. For production application servers, you need the outbound IP of your compute layer. Find this in your App Service's Properties blade, or by running this from your server:

curl -s https://api.ipify.org

Add that IP to the firewall rules list and click Save. Changes propagate within about 60–90 seconds.
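When a cluster has accumulated several rules, it can be quicker to check in code whether a given outbound IP is already covered. The sketch below assumes each rule is an inclusive IPv4 start/end range; the rule shape { name, start, end } and the function names are hypothetical, and the addresses come from the reserved documentation ranges.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.trim().split(".");
  if (parts.length !== 4) throw new Error(`not an IPv4 address: ${ip}`);
  return parts.reduce((acc, part) => {
    const n = Number(part);
    if (!/^\d{1,3}$/.test(part) || n > 255) {
      throw new Error(`not an IPv4 address: ${ip}`);
    }
    return acc * 256 + n;
  }, 0);
}

// Return the names of all rules whose inclusive range contains clientIp.
// The { name, start, end } shape is an assumption made for this example.
function matchingRules(clientIp, rules) {
  const ip = ipv4ToInt(clientIp);
  return rules
    .filter((r) => ipv4ToInt(r.start) <= ip && ip <= ipv4ToInt(r.end))
    .map((r) => r.name);
}

// Illustrative rules; copy the real ranges from the Networking page.
const rules = [
  { name: "office", start: "203.0.113.0", end: "203.0.113.255" },
  { name: "app-server", start: "198.51.100.7", end: "198.51.100.7" },
];

console.log(matchingRules("198.51.100.7", rules)); // covered by the app-server rule
console.log(matchingRules("192.0.2.10", rules)); // no match: this address needs a rule
```

An empty result for the IP that curl reported is a strong hint that the firewall, not the application, is rejecting the connection.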

If you're running inside an Azure Virtual Network, you have a better option than IP-based rules. Under Networking, you can add a Virtual Network rule that allows your entire subnet to communicate with DocumentDB without exposing it to the public internet at all. This is the approach I recommend for any production workload: it's more secure and eliminates IP rotation headaches when your compute layer scales or redeploys.

After saving the firewall rule, wait 90 seconds, then retry your mongosh test command from earlier. A successful connection here means your network path is clear. If you're still getting connection refused or network timeout, check whether your corporate network has an outbound firewall or proxy that's blocking port 10255.

Step 3: Fix Authentication (Create or Reset Secondary Users)

Authentication errors in Azure DocumentDB are usually one of three things: a wrong password, a user that doesn't exist in the database you're connecting to, or a user that was created without the right role. The error message you get, typically Authentication failed or Command not authorized, doesn't tell you which one, which is why this step trips people up.

Azure DocumentDB supports creating secondary users with specific role-based access. This is different from the admin credentials you set when you first created the cluster. If your application is using a secondary user account, that account needs to exist at the database level, not just at the cluster level.

To create or verify a secondary user, go to your cluster in the Azure portal and click MongoDB users in the left sidebar. You'll see a list of all users associated with the cluster. If your application's username isn't in this list, that's your problem: the user doesn't exist.

Click + Add to create a new user. Set a username, generate a strong password, and assign the appropriate role. For read-write application access, use the readWrite role on the specific database your application connects to. For admin-level operations, use dbOwner. Don't give your application the cluster admin role; that's a security risk and violates the principle of least privilege.

Once the user is created, update your connection string with the new credentials:

mongodb://appuser:YourSecurePassword123%21@cluster-name.mongocluster.cosmos.azure.com:10255/myDatabase?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false

Note that special characters in passwords must be URL-encoded: an exclamation mark ! becomes %21, an at-sign @ becomes %40, and so on. Unencoded special characters in a connection string URI will silently corrupt the parsed credentials, producing an authentication error that looks completely unrelated to the real cause.
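One trap worth knowing in JavaScript: the built-in encodeURIComponent leaves a handful of characters, including !, unencoded. The sketch below wraps it to cover those as well. The helper names encodeCredential and buildConnectionString are hypothetical, and the host, port, and database are the illustrative values used in this guide.

```javascript
// encodeURIComponent leaves ! ' ( ) * untouched, so percent-encode those too.
function encodeCredential(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

// Hypothetical helper: assembles a URI in the format shown in this guide.
function buildConnectionString({ user, password, host, port, database }) {
  const credentials = `${encodeCredential(user)}:${encodeCredential(password)}`;
  const options = "tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false";
  return `mongodb://${credentials}@${host}:${port}/${database}?${options}`;
}

// Illustrative values; in real code the password should come from a secret store.
console.log(
  buildConnectionString({
    user: "appuser",
    password: "YourSecurePassword123!",
    host: "cluster-name.mongocluster.cosmos.azure.com",
    port: 10255,
    database: "myDatabase",
  })
);
```

Encoding the username as well as the password costs nothing and avoids the same failure if a username ever contains a reserved character.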

If the user exists but you've forgotten the password, there's no "view password" option. You'll need to delete and recreate the user, or use the cluster admin credentials to run a db.changeUserPassword() command via mongosh.

Step 4: Configure and Troubleshoot Vector Search Indexing

If you're using Azure DocumentDB for AI-driven applications, which is increasingly common given its built-in vector search capabilities, there's a specific class of errors that shows up around index configuration. I've seen teams build their entire embedding pipeline and then hit a wall because the vector index wasn't set up correctly before they started inserting documents.

Azure DocumentDB supports vector search and AI-driven embeddings directly in the database. But unlike a regular field index, a vector index has specific requirements around the number of dimensions, the similarity metric, and the index type. Getting any of these wrong either produces an error on index creation or, worse, silently creates an index that produces wrong similarity results.

To create a vector index via mongosh, connect to your cluster and run:

db.runCommand({
  createIndexes: "myCollection",
  indexes: [
    {
      name: "vectorIndex",
      key: { embedding: "cosmosSearch" },
      cosmosSearchOptions: {
        kind: "vector-ivf",
        numLists: 100,
        similarity: "COS",
        dimensions: 1536
      }
    }
  ]
})

The dimensions value must exactly match the output dimension of your embedding model. OpenAI's text-embedding-ada-002 produces 1536 dimensions. OpenAI's text-embedding-3-small defaults to 1536 but can be reduced. If your dimensions don't match what the model actually outputs, every vector search query will return incorrect results or throw a dimension mismatch error.
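A cheap way to catch a mismatch early is to validate every embedding in application code before it reaches the database. This is a minimal sketch, not part of any driver API: validateEmbedding is a hypothetical helper, and INDEX_DIMENSIONS is assumed to be kept in sync with the dimensions value in your index definition.

```javascript
// Must match the `dimensions` value in the index definition (1536 in the example above).
const INDEX_DIMENSIONS = 1536;

// Throws if the embedding cannot be stored in, or queried against, the index.
function validateEmbedding(embedding, expected = INDEX_DIMENSIONS) {
  if (!Array.isArray(embedding)) {
    throw new TypeError("embedding must be an array of numbers");
  }
  if (embedding.length !== expected) {
    throw new RangeError(
      `embedding has ${embedding.length} dimensions, index expects ${expected}`
    );
  }
  if (!embedding.every((x) => typeof x === "number" && Number.isFinite(x))) {
    throw new TypeError("embedding contains a non-numeric or non-finite value");
  }
  return embedding;
}

// Usage: wrap the model output before building the document to insert.
// The placeholder vector stands in for a real embedding.
const doc = {
  text: "hello",
  embedding: validateEmbedding(new Array(1536).fill(0.01)),
};
console.log(doc.embedding.length); // 1536
```

Running the same check on query vectors means a model swap fails loudly in your application instead of quietly degrading search results.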

For production workloads, Azure DocumentDB also supports Product Quantization and Half-Precision Vector Indexing to improve search speed and reduce storage requirements. Product Quantization compresses vectors into smaller codes, dramatically improving query speed at a small accuracy tradeoff. Half-Precision Indexing stores vectors in 16-bit floats instead of 32-bit, cutting index memory in half. Both are worth enabling once you've validated your pipeline is returning correct results at full precision.

If your vector search queries are returning results but they look wrong, verify that you're using the same embedding model at query time as you used at indexing time. Mixing embedding models is the single most common cause of bad vector search results, and it's completely silent, producing no error, just nonsensical rankings.
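One way to spot-check for a model mismatch is to re-embed the text of a document you already stored, using the query-time model, and compare the result with the stored vector on the client. With the same model the similarity should be very close to 1. The sketch below is a plain cosine similarity in JavaScript with toy vectors; how you fetch the stored vector and call your embedding model is left out and depends on your setup.

```javascript
// Cosine similarity between two equal-length numeric arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new RangeError(`length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new RangeError("cosine similarity is undefined for a zero vector");
  }
  return dot / Math.sqrt(normA * normB);
}

// Toy vectors standing in for a stored embedding and a fresh re-embedding.
const stored = [0.2, 0.1, 0.7];
const reEmbedded = [0.2, 0.1, 0.7];
const unrelated = [0.7, -0.6, 0.0];

console.log(cosineSimilarity(stored, reEmbedded)); // close to 1: same model, same text
console.log(cosineSimilarity(stored, unrelated)); // much lower
```

A length-mismatch error here is itself a finding: it means the two models do not even agree on dimensions.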

Step 5: Fix High Availability and Cross-Region Replication Issues

If you're seeing intermittent connection drops, sudden latency spikes, or failover-related errors, the problem is likely in your high availability configuration. Azure DocumentDB has strong built-in HA capabilities, but they only protect you if they're set up correctly from the start.

Azure DocumentDB supports cross-region replication to ensure your applications stay available even if an entire Azure region goes offline. This is critical for production workloads. To check your current replication configuration, go to your cluster in the Azure portal and click Replicate data globally (or Geo-replication, depending on your portal version). You'll see a world map showing your primary region and any configured secondary regions.

If you're running a single-region setup and experiencing downtime during Azure maintenance windows, this is your fix: add at least one secondary replica in a geographically distinct region. The best practice from Microsoft's own high availability documentation is to choose a secondary region that's in the same Azure geography but a different data center pair; for example, East US with a secondary in East US 2.

For disaster recovery planning, the key metrics are your RPO (Recovery Point Objective) and RTO (Recovery Time Objective). Azure DocumentDB's asynchronous replication to secondary regions means there's a small replication lag, typically under 5 seconds under normal conditions but potentially higher under heavy write load. If your workload absolutely cannot tolerate any data loss, consider synchronous writes with write concern settings:

// In your MongoDB driver connection options
writeConcern: { w: "majority", wtimeout: 5000 }

One thing I want to flag clearly: if you're experiencing a cluster-wide outage and neither your primary nor secondary region is responding, that's beyond what any configuration change will fix in the moment. That's the scenario where you need Azure's support team and your own DR runbook. Document your failover procedures before you need them, not during the incident.

After configuring or verifying your replication setup, test your failover by temporarily modifying your application's connection string to point at the secondary region's endpoint. Verify reads and writes succeed. Then switch back. This test takes 10 minutes and gives you confidence that your HA configuration will actually hold when it matters.

Advanced Troubleshooting

If the steps above didn't resolve your issue, or if you're in an enterprise environment with more complex requirements, here's where to dig deeper.

Diagnosing with Azure Monitor and Diagnostic Logs

Azure DocumentDB integrates with Azure Monitor for logging and metrics. If you're not sure what's actually failing, turn on diagnostic logging first. Go to your cluster in the Azure portal, click Diagnostic settings in the left menu, then click + Add diagnostic setting. Enable MongoRequests logs and route them to a Log Analytics workspace. Within a few minutes of enabling this, you can query your logs with Kusto Query Language (KQL):

AzureDiagnostics
| where ResourceType == "MONGOCLUSTERS"
| where statusCode_s != "200"
| project TimeGenerated, operationType_s, statusCode_s, errorCode_s, durationMs_s
| order by TimeGenerated desc
| take 100

This query shows you every failed operation with the actual error code and duration. Error code 13 means the operation was unauthorized, which usually points to a missing role; error code 18 is an outright authentication failure. Error code 11000 means duplicate key violation. Error code 16500 means you've hit a request rate limit. Each of these points to a completely different fix.
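If you export these log rows and triage them in a script, a small lookup table keeps the mapping from code to next step in one place. This is a sketch under assumptions: the names for 13, 18, and 11000 are MongoDB's standard server error names (18, AuthenticationFailed, is included alongside 13 because the two are easy to confuse), 16500 is described as in this guide, and the suggested next steps are this guide's, not an official mapping.

```javascript
// Lookup table from error code to a name and a suggested next step.
// The next-step hints are editorial suggestions from this guide.
const ERROR_HINTS = new Map([
  [13, { name: "Unauthorized", next: "check the user's role assignments" }],
  [18, { name: "AuthenticationFailed", next: "check username, password, and URL-encoding" }],
  [11000, { name: "DuplicateKey", next: "check unique indexes and the _id values being inserted" }],
  [16500, { name: "TooManyRequests", next: "reduce the request rate or scale the cluster" }],
]);

// Accepts numbers or numeric strings, since log columns are often exported as text.
function explainErrorCode(code) {
  const hint = ERROR_HINTS.get(Number(code));
  return hint ?? { name: "Unknown", next: "look the code up in the driver or service documentation" };
}

for (const code of ["13", 11000, 42]) {
  const { name, next } = explainErrorCode(code);
  console.log(`${code}: ${name} -> ${next}`);
}
```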

Customer-Managed Key (CMK) Encryption Issues

Azure DocumentDB supports customer-managed key encryption through Azure Key Vault, and this is a known source of subtle, hard-to-diagnose failures. If your cluster was working and then suddenly stopped accepting connections after a key rotation or vault policy change, CMK is almost certainly the cause.

The cluster needs continuous read access to your CMK in Key Vault to decrypt data. If that access is revoked, even temporarily, even accidentally, the cluster will refuse all connections until access is restored. Go to your Azure Key Vault, click Access policies, and verify that your DocumentDB cluster's managed identity still has Get, Wrap Key, and Unwrap Key permissions. If any of these are missing, add them back. The cluster should recover within a few minutes of permissions being restored.
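The permission check itself is a simple set difference, which makes it easy to automate after every vault policy change. The sketch below assumes you already have the list of key permissions granted to the cluster's identity, for example copied from the portal; the names get, wrapKey, and unwrapKey follow Key Vault access policy naming, and missingKeyPermissions is a hypothetical helper.

```javascript
// Key permissions this guide says the cluster's managed identity needs.
const REQUIRED_KEY_PERMISSIONS = ["get", "wrapKey", "unwrapKey"];

// Returns the required permissions that are absent from `granted`.
// Comparison is case-insensitive because tools differ in how they capitalize names.
function missingKeyPermissions(granted) {
  const have = new Set(granted.map((p) => p.toLowerCase()));
  return REQUIRED_KEY_PERMISSIONS.filter((p) => !have.has(p.toLowerCase()));
}

// Illustrative input: a policy that lost unwrapKey during a cleanup.
console.log(missingKeyPermissions(["Get", "List", "WrapKey"])); // unwrapKey is missing
console.log(missingKeyPermissions(["get", "wrapKey", "unwrapKey"])); // nothing missing
```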

Microsoft Entra ID and Role-Based Access Control

For enterprise deployments using Microsoft Entra ID (formerly Azure Active Directory) for authentication, there's an additional RBAC layer on top of the database-level user management. If your application uses a managed identity or service principal to authenticate, that identity needs the DocumentDB Account Contributor role or a custom role with the appropriate data plane permissions assigned at the resource level in Azure RBAC.

To check role assignments, go to your DocumentDB cluster in the portal, click Access control (IAM), then Role assignments. Find your service principal or managed identity and verify its role. If it's missing, click + Add > Add role assignment and assign the appropriate role.

Infrastructure as Code Deployment Failures

Teams deploying Azure DocumentDB with Bicep or Terraform frequently hit configuration drift issues: the IaC template creates the cluster correctly but doesn't configure firewall rules or secondary users, so the first deployment works but the second one (after a destroy/recreate) leaves the cluster in an inaccessible state. The fix is to include your firewall rules and user creation in the same IaC template as the cluster, so they're always deployed together atomically.

When to Call Microsoft Support
If you've worked through every step in this guide and you're still seeing failures, or if your cluster is in a Failed or Degraded state in the Azure portal that hasn't self-resolved after 30 minutes, it's time to open a support ticket. Don't wait too long on a production cluster. Microsoft's SLA response times are tier-dependent, so the sooner you escalate, the sooner they can engage the DocumentDB engineering team if needed. Go to Microsoft Support, select Azure, and choose Technical as the support type. Include your cluster resource ID, the time window when the issue started, and the specific error messages or KQL query results from your diagnostic logs.

Prevention & Best Practices

Most Azure DocumentDB problems are entirely preventable. I've watched teams hit the same wall repeatedly because they stood up clusters quickly for a proof of concept, never went back to harden the configuration, and then promoted that same setup to production. Here's how to do it right from the start, or how to retrofit best practices onto an existing cluster.

First, treat your connection strings and credentials as secrets from day one. Store them in Azure Key Vault or your CI/CD platform's secret management system, never in environment files checked into source control, never hardcoded in application code. Azure DocumentDB supports integration with managed identities specifically to eliminate the need to handle credential strings in application configuration at all.

Second, set up monitoring before you go to production, not after your first incident. Configure Azure Monitor alerts for connection count thresholds, error rate spikes, and storage utilization. A sudden drop in connection count is often the first signal of a firewall misconfiguration or credential expiry, and catching it via an alert is infinitely better than catching it via an angry user report at 2 AM.

Third, plan your index strategy before you start inserting data. Changing index configuration on a collection that already contains millions of documents is slow and can impact query performance during the rebuild. For vector indexes in particular, the dimension count is locked at index creation time: you can't change it without dropping and recreating the index, which means reindexing all your documents.

Fourth, test your disaster recovery process on a schedule. Cross-region replication gives you the infrastructure for failover, but if you've never actually done a failover drill, you don't know if your application handles it correctly. Run a DR test every quarter. It takes about an hour and gives you confidence that your HA investment will pay off when you actually need it.

Quick Wins
  • Use Virtual Network rules instead of IP-based firewall rules for production workloads; they're more secure and don't break when your compute layer's IP changes
  • Create separate secondary users with minimum required permissions for each application service; never share the cluster admin credential across applications
  • Enable diagnostic logging from day one; it costs almost nothing and makes troubleshooting dramatically faster
  • Version-pin your MongoDB driver to match Azure DocumentDB's supported protocol version, and check the official compatibility matrix before upgrading drivers

Frequently Asked Questions

Why does Azure DocumentDB say "connection refused" even though my firewall rule is saved?

Firewall rule propagation takes 60–90 seconds after you click Save in the portal, so if you tested immediately after saving, try again. Also double-check that you've enabled Public network access in the Networking blade; the firewall rules don't matter if the public access toggle is set to Disabled. Finally, verify you're connecting to port 10255 specifically, not the standard MongoDB port 27017. Azure DocumentDB listens on 10255, and a connection attempt to 27017 will be refused regardless of firewall settings.

Can I use Azure DocumentDB as a drop-in replacement for MongoDB Atlas?

For most use cases, yes. Azure DocumentDB's MongoDB compatibility means your existing MongoDB drivers, queries, and aggregation pipelines will work without modification. The main gaps are in features that are specific to MongoDB Atlas, like Atlas Search (full-text search via Lucene), Atlas Triggers, and some advanced aggregation operators that haven't been implemented in the compatibility layer yet. If your application uses any MongoDB Atlas-specific APIs rather than standard MongoDB driver calls, test those specific operations against Azure DocumentDB before committing to a migration. The open-source DocumentDB project's GitHub issues list is a good place to check current compatibility gaps.

My vector search is returning wrong results and the similarity scores don't make sense. What's wrong?

The most common cause is an embedding model mismatch: you indexed your documents with one embedding model and you're querying with a different one, or a different version with different output dimensions. Even slight differences in tokenization or normalization between model versions will produce cosine similarity scores that look plausible but are semantically meaningless. The second most common cause is unnormalized vectors combined with a magnitude-sensitive metric. Cosine similarity ignores vector length, but inner product and Euclidean distance do not, so if your index uses one of those and your embedding model doesn't output unit vectors, normalize them yourself before inserting. Check that dimensions in your index definition exactly matches your model's actual output size, and verify you're using the same model endpoint for both indexing and querying.

How do I rotate credentials for Azure DocumentDB without downtime?

The cleanest approach is to create a new secondary user with the new credentials before deleting the old one, then do a rolling update of your application configuration: update one instance at a time so at least some instances are always connected with valid credentials. In Azure App Service, you can use deployment slots to swap configurations without any gap in connectivity. If you're using Azure Key Vault references for your connection strings, updating the secret in Key Vault will be picked up automatically by your application on the next secret refresh cycle, with no deployment required. Only delete the old user after you've confirmed all application instances are successfully connecting with the new credentials.

What's the difference between Azure DocumentDB and Azure Cosmos DB for MongoDB?

These are two different services that both offer MongoDB compatibility, and the naming is genuinely confusing, so I understand why people mix them up. Azure Cosmos DB for MongoDB is the older, proprietary Microsoft service with MongoDB API compatibility. Azure DocumentDB is newer and is built on the open-source DocumentDB project, which is itself built on PostgreSQL. Azure DocumentDB is designed to be more aligned with open standards and is specifically positioned as the home for the open-source DocumentDB community. For new projects, Azure DocumentDB is generally the recommended path; for existing Cosmos DB for MongoDB workloads, migration tooling exists but isn't automatic.

My Bicep/Terraform deployment creates the cluster but it's not accessible. What am I missing?

The most common oversight in IaC deployments of Azure DocumentDB is forgetting to declare firewall rules and secondary users as resources in the same template. The cluster resource provisions successfully, but without explicit firewall rule resources, no client can connect, and without secondary user resources, your application has no credentials. Make sure your Bicep or Terraform template includes firewallRules child resources with your application's IP ranges, and uses mongodbUserDefinitions to create application-level database users. Store the generated passwords in Azure Key Vault as part of the same deployment pipeline, never as plain text in your state file.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.