Azure AI Custom Vision: Fix Setup, Training & API Errors

Microsoft Fix Intermediate 14 min read Official Docs Grounded Updated April 20, 2026

What's in This Guide

Why This Is Happening
The Quick Fix
Step-by-Step Solution
Advanced Troubleshooting
Prevention & Best Practices
FAQ

Why This Is Happening

I've seen this scenario play out more times than I can count: you spin up an Azure AI Custom Vision project, upload your training images, hit the "Train" button, and something breaks. Maybe the prediction endpoint returns a 401. Maybe your Azure Custom Vision model accuracy is stuck at 65% no matter how many images you throw at it. Maybe you're staring at a blank portal screen and wondering if your subscription is even active.

Here's the reality: Azure AI Custom Vision is a powerful image recognition service, but it sits at the intersection of Azure subscriptions, resource keys, training pipelines, and prediction APIs, and each of those layers has its own failure modes. Microsoft's error messages don't always tell you which layer is the problem. A generic "Request failed" in the portal could mean your training key is wrong, your resource region is mismatched, or you've exceeded a quota you didn't know existed.

There's also a major piece of context you need right now if you're building anything production-facing with Azure Custom Vision: Microsoft has announced the planned retirement of the Azure Custom Vision service on September 25, 2028. That doesn't mean you should stop using it today (full support continues until that date), but it does mean every new project you start should have a migration plan in its back pocket. I'll cover your migration options (Azure Machine Learning AutoML, Azure AI Foundry, Azure AI Content Understanding) later in this guide.

The most common Azure Custom Vision setup problems I see break down into four buckets:

Authentication failures: wrong endpoint, wrong key, wrong region, keys swapped between training and prediction resources
Training accuracy problems: too few images per label, wrong domain selected, subtle visual differences the service wasn't designed for
Prediction API errors: unpublished iterations, endpoint URL typos, content-type header missing on REST calls
Quota and limit hits: exceeding the free tier's project or image caps without realizing it

I know this is frustrating, especially when you're trying to validate a prototype fast and every hour counts. Let's work through each of these systematically.

The Quick Fix: Try This First

Before diving deep, run through this checklist. In my experience, roughly 70% of Azure AI Custom Vision issues get resolved by catching one of these basics:

1. Verify you have two separate resources, not one. Azure Custom Vision requires a Training resource and a Prediction resource. These are separate Azure resources, each with their own keys and endpoints. I constantly see people using their Training key against the Prediction endpoint (or vice versa) and then debugging for an hour. Go to the Custom Vision portal, click the gear icon (Settings) in the top right, and you'll see both resource keys listed. Cross-reference these with your Azure Portal under your Custom Vision resource.

2. Confirm the region matches everywhere. If your Custom Vision resource is in East US, your endpoint must reference East US. Using a West Europe endpoint URL with an East US key will fail every time. The endpoint format looks like https://<your-resource-name>.cognitiveservices.azure.com/. Grab it directly from Azure Portal → your Custom Vision resource → Keys and Endpoint.

3. Check that your iteration is published. Training a model is not the same as publishing it. Until you explicitly publish an iteration, the Prediction API has nothing to call. In the Custom Vision portal, go to the Performance tab of your project, select your trained iteration, and click Publish. Give the iteration a name (you'll use this name in your prediction URL).

4. Make sure you have at least 5 images per tag for a test, but aim for 50+ for real accuracy. The service won't train at all if you're below the minimum threshold per label. The official recommendation is 50 images per label as a starting baseline for decent model performance.

5. Check your Azure subscription status. Navigate to portal.azure.com → Subscriptions and confirm the subscription isn't suspended or over budget. A suspended subscription silently kills all API calls with unhelpful error responses.

Pro Tip

When you first set up Azure Custom Vision, immediately save both keys (Key 1 and Key 2) and both endpoints (Training and Prediction) into a local .env file or Azure Key Vault before doing anything else. The single most time-consuming debugging session I've ever had came from mixing up which key belonged to which resource three days after initial setup, when memory gets fuzzy and the portal shows you four keys at once.
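The checklist above can be partly automated with a small pre-flight check that runs before any API call. This is a minimal sketch, not an official tool: the environment variable names (CV_TRAINING_KEY and so on) and the check_config helper are hypothetical, and the endpoint test assumes the <your-resource-name>.cognitiveservices.azure.com format described in this guide, so a resource with a different endpoint style would trigger a warning that you can ignore.

```python
import os
from urllib.parse import urlparse

# Hypothetical variable names; use whatever your .env file or Key Vault defines.
REQUIRED = [
    "CV_TRAINING_KEY",
    "CV_PREDICTION_KEY",
    "CV_TRAINING_ENDPOINT",
    "CV_PREDICTION_ENDPOINT",
]


def check_config(env):
    """Return a list of human-readable problems found in the settings mapping."""
    problems = []
    for name in REQUIRED:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")

    # Keys are resource-specific, so identical values usually mean a copy-paste slip.
    train_key = env.get("CV_TRAINING_KEY", "").strip()
    pred_key = env.get("CV_PREDICTION_KEY", "").strip()
    if train_key and train_key == pred_key:
        problems.append("Training and Prediction keys are identical")

    for name in ("CV_TRAINING_ENDPOINT", "CV_PREDICTION_ENDPOINT"):
        value = env.get(name, "").strip()
        if not value:
            continue  # already reported as missing
        parsed = urlparse(value)
        if parsed.scheme != "https":
            problems.append(f"{name} should start with https://")
        if not (parsed.hostname or "").endswith(".cognitiveservices.azure.com"):
            problems.append(f"{name} does not look like a cognitiveservices.azure.com endpoint")
    return problems


if __name__ == "__main__":
    for problem in check_config(os.environ):
        print("WARNING:", problem)
```

Running it at application startup turns a vague 401 later on into a specific warning up front.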

Verify Your Azure Custom Vision Resource Configuration in the Portal

Start at the source of truth: the Azure Portal itself. Browse to portal.azure.com, search for "Custom Vision" in the top search bar, and open your Custom Vision resource. You should see two resources if you set things up correctly, one with a kind of CustomVision.Training and one with CustomVision.Prediction. If you only see one, that's your first problem.

Click your Training resource and navigate to Keys and Endpoint in the left sidebar. Copy the Endpoint URL; it should look like:

https://YOUR-RESOURCE-NAME.cognitiveservices.azure.com/

Now open the Custom Vision web portal at customvision.ai in Microsoft Edge or Google Chrome. These are the only officially supported browsers; running the portal in Firefox or an older Edge version is an unsupported configuration that can cause unpredictable portal behavior. Sign in with the same Microsoft account tied to your Azure subscription.

Click the gear icon at the top right. Under "Accounts," you should see your Azure resources listed. If your subscription doesn't appear here, click Add subscription and enter your Azure directory and subscription details manually.

If everything looks connected but you're still getting errors, scroll down to the "Limited Trial" section. If you're on the free Limited Trial (not tied to an Azure subscription), you're capped at two projects and 5,000 training images total. Migrate to an Azure subscription-backed resource to get past these limits.

When this step is done correctly, the Custom Vision portal should show your projects listed under the correct Azure resource, and hovering over the resource should show its region matching what's in Azure Portal.

Fix Azure Custom Vision Training API Authentication Errors

Authentication errors are the number-one cause of Azure Custom Vision training failures when using the SDK or REST API directly. The symptom is typically an HTTP 401 Unauthorized or HTTP 403 Forbidden when calling the Training API endpoint.

The Training API requires the Training-Key header, not an Authorization Bearer token, not an API key in the query string. Here's the correct header format for a REST call:

POST https://YOUR-RESOURCE.cognitiveservices.azure.com/customvision/v3.3/training/projects
Training-Key: YOUR-TRAINING-KEY
Content-Type: application/json

A very common mistake is using the Prediction key in the Training-Key header. They look identical (32-character hex strings) but they're resource-specific. Double-check by going to Azure Portal → your Training resource → Keys and Endpoint. The key shown there is your Training key. Then go to your Prediction resource and note that key separately.

If you're using one of the official SDKs (.NET, Python, Java, Go), the client constructor takes your Training endpoint and Training key explicitly:

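# Python SDK example
import os

from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from msrest.authentication import ApiKeyCredentials

# Training resource endpoint and key (these variable names are illustrative, not required by the SDK)
ENDPOINT = os.environ["CV_TRAINING_ENDPOINT"]
TRAINING_KEY = os.environ["CV_TRAINING_KEY"]

credentials = ApiKeyCredentials(in_headers={"Training-key": TRAINING_KEY})
trainer = CustomVisionTrainingClient(ENDPOINT, credentials)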

If you're getting a ProjectNotFoundException or 404 Not Found when trying to access an existing project, the project GUID in your code may be wrong. Get the correct project ID from the Custom Vision portal: open your project and look at the URL, which contains the project GUID in the format /projects/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

Once authentication is correct, the Training API call should return an HTTP 200 with the project details JSON. If you see a 429, you've hit the rate limit: wait, then retry with exponential backoff.
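Exponential backoff for 429 responses can be sketched as follows. This is an illustrative helper, not part of the Custom Vision SDK: RateLimited is a hypothetical exception you would raise yourself when your client reports a 429, and the base delay, attempt count, and 10% jitter are assumed values to tune for your workload.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical marker: raise this when the service answers with HTTP 429."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimited, wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter so parallel clients don't retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it, wrap the SDK or REST call in a function that translates whatever your client raises for a 429 into RateLimited, then pass that function to call_with_backoff. The sleep parameter exists only so the waiting can be stubbed out in tests.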

Fix Low Azure Custom Vision Model Accuracy and Training Quality Issues

This one takes more finesse than a simple config fix. If your Azure Custom Vision image classification model is training but the accuracy numbers are disappointing (say, below 80% precision and recall on your quick test), the problem usually isn't the service. It's the training data or domain selection.

First, check your domain setting. The Custom Vision service offers several specialized algorithm domains optimized for specific subject material, including General, Food, Landmarks, Retail, and their compact variants. If you're classifying retail product images but your project is set to the General domain, you're leaving accuracy on the table. In the Custom Vision portal, go to your project's Settings tab (the gear icon inside the project) and change the Domain. Changing the domain requires retraining from scratch, but it's worth it.

For image count, the official guidance is to start with 50 images per label. I'd say 50 is the floor, not the target. For reliable accuracy in a production context, I push clients toward 100–200 images per label, with genuine variation in lighting, angle, background, and scale. Duplicate or very similar images hurt more than they help, because the model overfits to your training set instead of generalizing.

Also watch for class imbalance. If you have 200 images of "cat" and 15 images of "dog," the model will learn to predict "cat" aggressively. Balance your label counts as evenly as you can.
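A quick audit of label counts before training can flag both problems described above. The sketch below is illustrative only: audit_label_counts is a hypothetical helper, the 5 and 50 thresholds come from this guide, and the 2.0 imbalance ratio is an assumed cutoff rather than anything the service enforces.

```python
def audit_label_counts(counts, min_images=5, recommended=50, max_ratio=2.0):
    """counts maps tag name -> number of training images. Returns a list of warnings."""
    warnings = []
    for tag, n in sorted(counts.items()):
        if n < min_images:
            warnings.append(f"{tag}: {n} images, below the minimum of {min_images}")
        elif n < recommended:
            warnings.append(f"{tag}: {n} images, below the recommended {recommended}")

    if counts:
        largest = max(counts.values())
        smallest = min(counts.values())
        # Skip the ratio when a label is empty; it is already reported above.
        if smallest > 0 and largest / smallest > max_ratio:
            warnings.append(
                f"imbalance: largest label has {largest / smallest:.1f}x the images of the smallest"
            )
    return warnings


if __name__ == "__main__":
    for warning in audit_label_counts({"cat": 200, "dog": 15}):
        print(warning)
```

With the cat and dog counts from the example above, the audit reports that "dog" is under the recommended count and that the two labels are heavily imbalanced.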

The Custom Vision portal's Smart Labeler feature can accelerate labeling after initial training. Once you have a trained iteration, Smart Labeler suggests tags for new untagged images. Go to the Training Images tab, select untagged images, and click the Smart Labeler button. Review its suggestions rather than blindly accepting them, then use the accepted labels to retrain.

One hard limit from official documentation: Azure Custom Vision is not designed for detecting subtle differences in images, such as minor cracks or dents in quality assurance scenarios. If that's your use case, you'll get frustratingly inconsistent results regardless of how much data you add. Consider Azure Machine Learning AutoML for those scenarios instead.

Fix Azure Custom Vision Prediction API Not Returning Results

You've trained your model, you've published your iteration, you're making prediction calls, and you get either a 404, a blank response, or confidence scores that are all near zero. Here's how to work through this.

First, confirm the prediction URL format. The Prediction API endpoint has a specific structure that trips people up constantly:

POST https://YOUR-PREDICTION-RESOURCE.cognitiveservices.azure.com/customvision/v3.0/Prediction/{projectId}/classify/iterations/{publishedName}/image

There are two variants, one for sending an image file (binary) and one for sending an image URL. Make sure you're hitting the right one. For a URL-based call, replace the trailing /image segment with /url and send a JSON body with the Url field. For a binary image, keep /image and send the raw bytes with Content-Type: application/octet-stream.

The {publishedName} segment must exactly match the name you gave the iteration when you published it, and it's case-sensitive. Go to the Performance tab in the Custom Vision portal, find your iteration, and check the "Published as" name. Copy it character for character.

For authentication on the Prediction API, use the Prediction-Key header (not Training-Key):

Prediction-Key: YOUR-PREDICTION-KEY
Content-Type: application/octet-stream
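Building the URL and headers in one place helps avoid typos in either. The following is a minimal sketch that follows the v3.0 classification URL pattern shown above; build_prediction_request is a hypothetical helper, not an SDK function, and the argument values in the usage example are placeholders.

```python
from urllib.parse import quote


def build_prediction_request(endpoint, project_id, published_name, prediction_key, from_url=False):
    """Return (url, headers) for a classification call against a published iteration."""
    base = endpoint.rstrip("/")
    # The final segment selects the variant: /image for raw bytes, /url for a JSON body.
    variant = "url" if from_url else "image"
    url = (
        f"{base}/customvision/v3.0/Prediction/{project_id}"
        f"/classify/iterations/{quote(published_name, safe='')}/{variant}"
    )
    headers = {
        "Prediction-Key": prediction_key,
        "Content-Type": "application/json" if from_url else "application/octet-stream",
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_prediction_request(
        "https://YOUR-PREDICTION-RESOURCE.cognitiveservices.azure.com/",
        "YOUR-PROJECT-ID",
        "YOUR-PUBLISHED-NAME",
        "YOUR-PREDICTION-KEY",
    )
    print(url)
    print(headers)
```

Printing the URL and comparing it against the portal's "Prediction URL" dialog is a quick way to spot a mismatch in the project ID or published name.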

If predictions come back with all confidences near 0.0 or 0.33 (for a 3-class model), this suggests the model never truly trained on meaningful differences. Check that your training images are genuinely distinct across labels and that you have enough of them. Also verify you're calling the right project ID: calling a classification model's endpoint with object-detection-formatted expectations (or vice versa) produces garbage output, not an error.

A successful prediction response looks like this:

{
  "predictions": [
    {
      "tagName": "Cat",
      "probability": 0.9973,
      "tagId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    }
  ]
}

If you see this JSON structure with sensible probability values, the Prediction API is working correctly.
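On the client side, a response of this shape can be reduced to a single answer with a confidence cutoff. This is a rough sketch that assumes the predictions, tagName, and probability fields shown above; top_prediction is a hypothetical helper, and the 0.5 threshold is an arbitrary starting point, not a recommended value.

```python
import json


def top_prediction(response_body, min_probability=0.5):
    """Return (tag_name, probability) for the best prediction, or None if nothing clears the threshold."""
    data = json.loads(response_body)
    predictions = data.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["probability"])
    if best["probability"] < min_probability:
        return None
    return best["tagName"], best["probability"]


if __name__ == "__main__":
    sample = '{"predictions": [{"tagName": "Cat", "probability": 0.9973, "tagId": "x"}]}'
    print(top_prediction(sample))
```

Returning None for low-confidence results makes the "model never learned anything" case explicit in your application instead of silently picking a near-random label.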

Export Your Azure Custom Vision Model for Offline and Mobile Use

Exporting a trained Azure Custom Vision model for offline inference is a common need, whether you're deploying to a mobile app, an edge device, or a Windows ML application. This step only works if you trained your project with an exportable (compact) domain. The standard "General" domain is not exportable. You must use "General (compact)" or another compact variant.

If you trained on a non-compact domain and now want to export, you'll need to create a new project with the compact domain, re-upload your images, and retrain. There's no way to convert an existing non-compact model to an exportable format after the fact.

Assuming you used a compact domain, here's how to export from the portal:

Go to your project in the Custom Vision portal
Click the Performance tab
Select the trained iteration you want to export
Click Export
Choose your target format: CoreML (iOS), TensorFlow (Android/Python), ONNX (Windows ML), Docker, or Vision AI DevKit

For programmatic export via the Training API:

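# Python SDK: export as ONNX (assumes trainer, project_id, and iteration_id are already defined)
import time

import requests

export = trainer.export_iteration(project_id, iteration_id, "ONNX")
while export.status == "Exporting":
    time.sleep(1)
    # Re-fetch the exports and select the ONNX entry instead of assuming it is first
    exports = trainer.get_exports(project_id, iteration_id)
    export = next(e for e in exports if e.platform == "ONNX")

if export.status != "Done":
    raise RuntimeError(f"Export ended with status {export.status}")

# Download the exported model
response = requests.get(export.download_uri)
response.raise_for_status()
with open("model.onnx", "wb") as f:
    f.write(response.content)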

For Windows ML integration, take the downloaded .onnx file and add it to your Visual Studio project as a content file. Windows ML generates a wrapper class automatically when you add an ONNX file to a UWP project. The generated class handles input tensor preparation and output parsing, so you just call EvaluateAsync() with your image data.

For TensorFlow export targeting Python inference, load the SavedModel directory and run predictions using the standard tf.saved_model.load() approach. The official docs include a Python sample for exactly this use case. It's worth pulling from the Azure Samples GitHub repository rather than writing the tensor reshaping code from scratch.

Advanced Troubleshooting

Diagnosing Azure Custom Vision Quota and Rate Limit Errors

The Azure Custom Vision service enforces limits at multiple levels, and hitting any one of them produces errors that look identical on the surface. Here's what the official Limits and Quotas documentation specifies that you need to know:

Free (F0) tier: 2 projects max, 5,000 training images per project, 10,000 predictions per month
Standard (S0) tier: 100 projects, 100,000 training images per project, unlimited predictions (billed per 1,000)
Training API: rate-limited to prevent runaway training job submissions

If you're hitting project limits on the free tier, upgrading your resource to S0 in Azure Portal resolves it immediately. Go to your Custom Vision Training resource → Pricing tier in the left sidebar → select S0 → Apply. Pricing tier changes take effect within a few minutes and don't require retraining your models.
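If you track your own usage numbers, a small headroom check can warn you before a limit turns into an opaque error. This is a sketch under stated assumptions: the F0 values are the free-tier figures quoted above, while quota_headroom, the metric names, and the 80% warning threshold are hypothetical choices, and the usage figures must come from your own bookkeeping.

```python
# Free (F0) tier limits as quoted in this guide.
F0_LIMITS = {
    "projects": 2,
    "images_per_project": 5000,
    "predictions_per_month": 10000,
}


def quota_headroom(usage, limits=F0_LIMITS, warn_at=0.8):
    """Return {metric: fraction_used} for every metric at or above warn_at of its limit."""
    flagged = {}
    for name, limit in limits.items():
        if name not in usage:
            continue  # nothing recorded for this metric
        fraction = usage[name] / limit
        if fraction >= warn_at:
            flagged[name] = fraction
    return flagged


if __name__ == "__main__":
    current = {"projects": 2, "images_per_project": 1000, "predictions_per_month": 8000}
    for metric, fraction in quota_headroom(current).items():
        print(f"{metric}: {fraction:.0%} of limit used")
```

Pass a different limits dictionary if your resource is on another tier.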

Enterprise and Domain-Joined Scenarios

In enterprise environments, Azure Custom Vision requests may fail due to outbound proxy or firewall rules. The Custom Vision portal (customvision.ai) and API endpoint (*.cognitiveservices.azure.com) both need HTTPS (port 443) outbound access. If your organization uses a TLS inspection proxy, the proxy certificate needs to be trusted by whatever HTTP client your application uses; the SDK's underlying HttpClient won't trust it automatically.

For access control based on Microsoft Entra ID (formerly Azure Active Directory), note that Custom Vision uses subscription keys for authentication, not Entra ID tokens. RBAC roles on the resource control who can view and manage the resource in Azure Portal, but API calls themselves always require the subscription key from Keys and Endpoint. There is no OAuth flow for Custom Vision API calls, so don't try to use a service principal Bearer token directly against the Custom Vision API; it won't work.

Debugging with Azure Monitor and Diagnostic Logs

For persistent, hard-to-reproduce errors, enable Diagnostic Settings on your Custom Vision resource. In Azure Portal, open your resource → Diagnostic settings → Add diagnostic setting. Route logs to a Log Analytics Workspace and enable the RequestResponse log category. Within 5–10 minutes of enabling this, failed API calls will show up in Log Analytics with full request/response details including status codes and latency.

Query your logs in Log Analytics with:

AzureDiagnostics
| where ResourceType == "COGNITIVESERVICES/ACCOUNTS"
| where OperationName contains "CustomVision"
| where ResultType != "Success"
| project TimeGenerated, OperationName, ResultDescription, CallerIpAddress
| order by TimeGenerated desc

This query surfaces failed Custom Vision API calls with their error descriptions, which are much more useful than the generic client-side error messages.

When to Call Microsoft Support

If you're seeing consistent 5xx errors from the Custom Vision API, your resource is showing as healthy in Azure Portal but API calls are failing, or your billing shows charges for calls you didn't make, those are signals to escalate. Before calling, collect your resource ID (from Azure Portal → Properties), the exact error message and HTTP status code, approximate timestamps of failures, and your region. File a support ticket at Microsoft Support with severity based on business impact. For production outages, use Severity A. For non-blocking issues, Severity C is fine.

Prevention & Best Practices

Start Your Azure Custom Vision Migration Planning Now

I want to be direct about something: the retirement date of September 25, 2028 sounds far away, but if you're building a product on Azure AI Custom Vision today, you're building on a sunset service. That's not a reason to panic, but it is a reason to design with migration in mind from day one.

Microsoft has outlined three migration paths depending on your use case. For custom image classification and object detection models, Azure Machine Learning AutoML is the recommended path; it uses classic machine learning techniques and supports both model types. For teams who want to use generative AI approaches, models in the Azure AI Foundry model catalog offer prompt-engineering-based solutions that can achieve high accuracy in custom scenarios without the traditional labeled-dataset training paradigm. If you need a managed solution specifically for image classification that also handles documents, audio, and video, Azure AI Content Understanding (currently in public preview) is worth evaluating.

My practical recommendation: keep your labeled training image datasets well-organized and exported. Those labeled images are your most valuable asset, and they'll transfer to any migration target. Don't let them exist only inside the Custom Vision portal.

Structuring Your Training Data for Long-Term Reliability

Good Azure Custom Vision model performance comes down to data discipline. Tag your images consistently: tag names are case-sensitive, so if both "Cat" and "cat" are used in your project, you'll end up training a confused model. Standardize tag naming conventions before you upload anything.

Keep a local copy of all training images organized by label folder. The Custom Vision portal is the runtime, not your source of truth. If the project is ever accidentally deleted, having local copies means you can rebuild in hours rather than starting over.
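A local dataset laid out as one folder per label is easy to sanity-check with a short script. The sketch below is illustrative: scan_dataset and find_case_collisions are hypothetical helpers, the "training_images" path is a placeholder, and the extension list is an assumed set of common image types that you should adjust to match what you actually upload.

```python
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust to your own data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def scan_dataset(root):
    """Count image files in each immediate subfolder of root (one folder per label)."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1
                for f in folder.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def find_case_collisions(tag_names):
    """Group tag names that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for name in tag_names:
        groups[name.strip().lower()].add(name)
    return [sorted(names) for _, names in sorted(groups.items()) if len(names) > 1]


if __name__ == "__main__":
    counts = scan_dataset("training_images")  # placeholder path
    print(counts)
    print(find_case_collisions(counts))
```

Running this before every upload catches inconsistent tag spellings while they are still cheap to fix, and the per-label counts double as a record of what the project should contain if you ever need to rebuild it.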

Quick Wins

Store Training key, Prediction key, project IDs, and published iteration names in Azure Key Vault, never hardcode them in source files
Use at least 50 images per tag during initial training, with real variation in lighting, scale, and background
Always use a compact domain if there's any chance you'll need offline or mobile deployment, because you can't convert later
Set up Azure Monitor Diagnostic logs from day one so you have a paper trail if billing or error questions arise

Frequently Asked Questions

What exactly is Azure AI Custom Vision and what can I build with it?

Azure AI Custom Vision is an image recognition service that lets you train your own machine learning models to classify images or detect objects within images, without needing deep ML expertise. You bring labeled images, the service trains a model on them, and you get a prediction API you can call from any application. Real-world uses include retail product identification, manufacturing defect screening, wildlife species classification, and document type sorting. The service handles the ML training pipeline entirely; you focus on your data and labels.

Is Azure Custom Vision shutting down? Should I stop using it?

Microsoft announced that Azure Custom Vision will be retired on September 25, 2028. Full support, including SLAs, security patches, and new customer onboarding, continues until that date. So no, you shouldn't abandon existing projects immediately. What you should do is start evaluating migration paths: Azure Machine Learning AutoML for custom model training, Azure AI Foundry models for generative AI approaches, and Azure AI Content Understanding for managed classification workflows. Microsoft has published a dedicated Custom Vision Migration Guide to help you plan the transition.

Why is my Custom Vision prediction API returning 401 Unauthorized even though I'm using the right key?

The most likely cause is using your Training key where the Prediction key is required (or vice versa). These are two separate Azure resources, each with their own distinct keys. For prediction calls, use the key from your Prediction resource: go to Azure Portal, find your Custom Vision Prediction resource specifically, and grab the key from Keys and Endpoint there. Also verify your endpoint URL uses the Prediction resource's endpoint, not the Training resource endpoint. The two endpoints have different base URLs even if they look similar.

How many training images do I actually need for decent Azure Custom Vision accuracy?

The official guidance is 50 images per label as a starting point, and you can technically train with as few as 5 (though the accuracy will be unreliable). For production use, I'd push for 100–200 images per label with genuine variation: different lighting conditions, angles, backgrounds, and scales. The number matters less than the diversity. Two hundred near-identical photos of the same object in the same lighting tell the model almost nothing useful compared to 50 genuinely varied shots. Also keep your label counts balanced, since a large disparity between classes causes the model to favor the over-represented class.

Can I export my Azure Custom Vision model to run offline on Android or iOS?

Yes, but only if your project was trained using a compact domain. Standard domains (General, Food, Landmarks, etc.) produce cloud-only models. You need to create your project with the corresponding compact variant (for example "General (compact)") before training. If you already trained on a non-compact domain, you'll need to start a new project with a compact domain and retrain. Once trained on a compact domain, you can export to CoreML for iOS, TensorFlow for Android, or ONNX for Windows ML directly from the Performance tab in the Custom Vision portal, or programmatically via the Training API.

What's the difference between image classification and object detection in Azure Custom Vision?

Image classification assigns one or more labels to an entire image; it tells you "this image contains a cat." Object detection goes further: it finds where in the image the labeled item appears and returns bounding box coordinates along with the label and confidence score. Use classification when you just need to know what's in an image. Use object detection when you need to locate and potentially count multiple distinct items within a single image, such as counting products on a shelf or identifying the position of defects on a component. Both features are available through the same Custom Vision portal and APIs, but they require separate project types.


Sai Kiran Pandrala

Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.