What is custom text classification?
| Product family | Azure AI Services |
|---|---|
| Document source | Azure AI Language Service |
| Guide type | Hands-on Reference |
| Skill level | Intermediate to advanced |
| Time | 20 - 75 minutes depending on tenant scale |
Custom Text Classification (CTC) is the "I have a stack of documents, I need to tag each one" Azure AI service. Think: support tickets routed by topic, contract clauses flagged by type, customer reviews bucketed by sentiment plus theme.
I've shipped CTC into a logistics company that classifies 8,000 dispatch emails a day across 14 categories. Training cost: about ₹6,200 ($75 USD) for the labelled corpus prep. Inference: roughly ₹0.18 per 1,000 characters at the time of writing. The savings versus a human triage team are real.
Reference content and what it actually means
The Microsoft Learn page "What is custom text classification?" reads like a feature inventory. Useful, but it doesn't tell you which knobs matter for an engineer shipping to production this sprint. Let me re-frame it.
Three things drive the behaviour of this service: the API version you target, the resource SKU you're calling against, and the region your data lives in. Overlook any one of them and you'll get surprising results.
API versions and what changed
Azure AI Language ships breaking changes only at major version bumps. The current GA is stable. The preview channel moves fast — I've seen response shapes change between Monday and Friday on a preview build. Pin your version explicitly in the URL. Do not trust the "latest" alias.
# Pin the version explicitly in every call
POST https://<resource>.cognitiveservices.azure.com/language/:analyze-text?api-version=2024-11-15-preview
Content-Type: application/json
Ocp-Apim-Subscription-Key: <key>
Two clients calling the same endpoint with different API versions will get different shapes back. That's by design, not a bug. I've seen this cause four-hour debugging sessions when one microservice was upgraded and the others weren't.
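One way to keep every microservice on the same version is to build the URL through a single shared helper. This is a minimal sketch, not an Azure SDK feature: build_analyze_url and PINNED_API_VERSION are hypothetical names, and the version string is simply the one from the example above.

```python
import re
from urllib.parse import urlencode

# Hypothetical shared constant: the one version every caller must use.
PINNED_API_VERSION = "2024-11-15-preview"

# Accept only dated versions (YYYY-MM-DD, optionally -preview), never an alias.
_VERSION_RE = re.compile(r"^\d{4}-\d{2}-\d{2}(-preview)?$")


def build_analyze_url(resource: str, api_version: str = PINNED_API_VERSION) -> str:
    """Return the analyze-text URL with an explicit, validated api-version."""
    if not _VERSION_RE.match(api_version):
        raise ValueError(f"Refusing unpinned api-version: {api_version!r}")
    base = f"https://{resource}.cognitiveservices.azure.com/language/:analyze-text"
    return f"{base}?{urlencode({'api-version': api_version})}"
```

Importing the helper from one internal package means a version bump is a single reviewed change, and passing something like "latest" fails at call time instead of in production.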
Authentication options
You have three auth modes. Key-based via Ocp-Apim-Subscription-Key header. Managed identity through Azure RBAC. Microsoft Entra (formerly Azure AD) token. For anything that ships to production, use managed identity. Keys end up in logs, in env files committed by accident, in Slack messages. I've watched all three happen.
# Get an Entra token for Azure AI Language
az login
az account get-access-token --resource https://cognitiveservices.azure.com
If you're calling from an Azure Function or App Service, enable managed identity on the resource and assign it the Cognitive Services User role on the Language resource. Twenty minutes of setup, and you never rotate a key again.
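On the calling side, the difference between the auth modes comes down to which header you send. The sketch below is an illustration under assumptions: auth_headers is a hypothetical helper, and it expects you to have already obtained the token (for example from a credential library such as azure-identity, requesting the scope shown in the constant).

```python
from typing import Optional

# Scope to request when acquiring an Entra token for Azure AI services.
COGNITIVE_SERVICES_SCOPE = "https://cognitiveservices.azure.com/.default"


def auth_headers(bearer_token: Optional[str] = None,
                 subscription_key: Optional[str] = None) -> dict:
    """Build request headers, preferring a token over a key."""
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        # Managed identity and Entra both end up as a bearer token.
        headers["Authorization"] = f"Bearer {bearer_token}"
    elif subscription_key:
        # Key auth: fine for a prototype, risky in production.
        headers["Ocp-Apim-Subscription-Key"] = subscription_key
    else:
        raise ValueError("No credential supplied")
    return headers
```

Preferring the token when both are supplied is a deliberate choice here: it stops a leftover key in the environment from quietly taking over.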
Regions and data residency
Language service runs in 25+ regions at the time of writing. East US and West Europe have the broadest feature coverage. Indian regions (Central India, South India) have base features but lag on preview capabilities by 3-6 months. For an Indian customer doing PII-sensitive work, this matters more than the latency math.
How to apply this in practice
Real deployments don't happen in playgrounds. They happen in pipelines. Here's how I move from the reference above to a working production setup.
- Provision the Language resource in the region matching your data residency constraint. Use az cognitiveservices account create --kind TextAnalytics --sku S0 --location centralindia --resource-group <resource-group> --name my-lang-prod as a starting point. The S0 SKU runs around ₹1.85 per 1,000 text records; confirm against the pricing calculator on the day.
- Enable managed identity on the caller. az webapp identity assign for App Service, az functionapp identity assign for Functions, or set identity.type = "SystemAssigned" in your Bicep / Terraform.
- Assign the Cognitive Services User role: az role assignment create --assignee <principal-id> --role "Cognitive Services User" --scope <resource-id>. Wait two to ten minutes for propagation.
- Write a 20-line smoke test that calls the endpoint, prints the response, and asserts the version field matches what you pinned. Run it on every deploy.
- Wire up Azure Monitor diagnostic settings. Send logs to a Log Analytics workspace. Build one alert: 5xx response rate over 2% for 5 minutes. That alert has saved me twice this year.
I've seen this fail when teams skip step 4. The smoke test is twenty lines of Python. Skipping it costs a sprint when the model gets silently demoted to a fallback in your code.
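A smoke test along these lines could look like the sketch below. It is an assumption-laden outline, not the exact script I run: smoke_test is a hypothetical name, post is any callable shaped like requests.post, and the version check is done against the api-version in the URL because the name of any version field in the response depends on the API you call.

```python
from urllib.parse import parse_qs, urlparse


def smoke_test(post, url, headers, body, expected_api_version):
    """Call the endpoint once and fail loudly if anything looks off.

    `post` is any callable with the same signature as requests.post.
    """
    # Check the pin on the URL before spending a request on it.
    pinned = parse_qs(urlparse(url).query).get("api-version", [None])[0]
    assert pinned == expected_api_version, f"URL pins {pinned!r}"

    response = post(url, headers=headers, json=body, timeout=30)
    # Print a truncated body so the deploy log shows what came back.
    print(response.status_code, response.text[:500])
    assert response.status_code < 300, f"Unexpected status {response.status_code}"
    return response
```

In a pipeline this would be called with requests.post as the first argument; passing the transport in keeps the check itself testable without network access.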
Caveats and what to double-check
- Microsoft's preview features are not covered by the standard 99.9% Azure SLA. If your contract requires SLA, stick to GA only. Read the small print on the specific feature page.
- Quota limits are per region, per resource. The default is 1,000 transactions per minute on S0. Hit that, and you'll see 429s. Request a quota increase three weeks before launch: Microsoft support runs on its own clock.
- Some features only run in specific regions. Custom Text Classification training, for example, was East US, West Europe, and UK South only at one point. Check the per-feature availability matrix on Learn before committing to a region.
- Bring-Your-Own-Storage (BYOS) configurations have a separate billing line. Easy to miss in cost reviews. I once had a customer billed an extra ₹14,000 a month for storage egress because nobody traced the BYOS lifecycle.
- The Azure portal UI lags the API by 4-6 weeks. If something works in the REST API but the portal can't show it, that's normal. Trust the REST response.
Related work in your environment
- Document this reference in your team's Confluence / Notion / wiki along with the specific resource name and region you're using. Future you will thank present you.
- Add a Microsoft Learn RSS subscription on the source page. When Microsoft updates the canonical doc, you want to be notified rather than discovering it through a customer ticket.
- Run a quarterly review of every Language resource in your subscription. az cognitiveservices account list --query "[?kind=='TextAnalytics'].[name,location,sku.name]" -o table takes 8 seconds and surfaces every resource. Kill the ones nobody uses; they leak budget.
- If you're on a hybrid setup, mirror your Language resource configuration in IaC (Bicep or Terraform). Resource drift is the silent killer of multi-region deployments.
- For the Indian market specifically, factor in the MeitY data localisation guidelines for regulated workloads. Central India and South India regions store data in-country. Confirm with your DPO before going live.
Troubleshooting the failures I keep seeing
Three failure modes account for 80% of the Azure AI Language support tickets I've worked. Knowing them in advance saves hours.
401 / 403 after an Entra migration
You moved from key auth to managed identity. The first call works. Then 403s start. The cause is usually role propagation delay or a missing scope. Run az role assignment list --assignee <principal-id> --scope <resource-id> and confirm Cognitive Services User is in the list. Wait ten minutes. Retry. If it still fails, the resource was created without "local auth disabled" and your code is silently falling back to key auth that was rotated.
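If you want that role check in a script instead of eyeballing the table, something like the following would do. It is a sketch that assumes you run the command with -o json and capture the output as a string; has_role is a hypothetical helper, and the roleDefinitionName field is what I expect in the CLI's JSON output, so verify it against your own CLI version.

```python
import json


def has_role(assignments_json: str,
             role_name: str = "Cognitive Services User") -> bool:
    """Return True if role_name appears in `az role assignment list -o json` output."""
    assignments = json.loads(assignments_json)
    # Each assignment is expected to carry the human-readable role name.
    return any(a.get("roleDefinitionName") == role_name for a in assignments)
```

An empty list parses cleanly and returns False, which is exactly the "role never got assigned" case you are trying to catch.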
429 throttling under burst load
Default S0 quota is 1,000 transactions per minute, but it's enforced as a sliding window. Burst above it and you get 429s with a Retry-After header in seconds. Your code must honour the header; exponential backoff alone will keep failing. I parse the header explicitly and sleep the exact amount returned.
# Python pattern that has saved me twice this quarter
import time

import requests


def call_with_retry(url, headers, body, max_retries=5):
    """POST with retries, sleeping for Retry-After on each 429."""
    for attempt in range(max_retries):
        r = requests.post(url, headers=headers, json=body, timeout=30)
        if r.status_code == 429:
            # Honour the server's Retry-After (seconds); fall back to
            # exponential backoff only when the header is missing.
            wait = int(r.headers.get('Retry-After', 2 ** attempt))
            time.sleep(wait)
            continue
        return r
    raise RuntimeError(f"Exhausted retries after {max_retries} attempts")
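The pattern above converts Retry-After with int(), which is fine while the header is a number of seconds. The HTTP specification also allows a date in that header, so a more defensive parser might look like this sketch. retry_after_seconds is a hypothetical helper, and I have not seen the service send the date form myself; treat that branch as insurance.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, fallback, now=None):
    """Convert a Retry-After header value to a non-negative wait in seconds."""
    if value is None:
        return float(fallback)
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "7".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return float(fallback)
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Inside the retry loop this would replace the int(...) line, with 2 ** attempt passed as the fallback.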
Inconsistent classifications across regions
The same input string can return different class labels in East US versus Central India when one region has been quietly upgraded and the other hasn't. Pin the API version explicitly in every call (not just at SDK init). When you see drift, log the x-ms-correlation-id response header and open a Microsoft support ticket with that ID. They can trace it back to the model version that served the call.
Last week I had a customer hit this exact issue. Two pods of the same service, two different regions, two different intent predictions on the same payload. The fix was three lines: pin the API version, pin the project version, pin the deployment slot.
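For custom text classification, pinning the project and deployment means naming both in every request body. The sketch below builds a job body in the shape I understand the analyze-text jobs API to expect; build_classification_job is a hypothetical helper, the task name and display name are placeholders, and you should check the field names against the REST reference for the API version you pin.

```python
def build_classification_job(documents, project_name, deployment_name,
                             kind="CustomSingleLabelClassification",
                             language="en"):
    """Build a job body that names the project and deployment explicitly."""
    return {
        "displayName": f"classify-{project_name}-{deployment_name}",
        "analysisInput": {
            "documents": [
                {"id": str(i), "language": language, "text": text}
                for i, text in enumerate(documents, start=1)
            ]
        },
        "tasks": [
            {
                "kind": kind,
                "taskName": "classification",
                "parameters": {
                    # Both names come from the caller; nothing is defaulted,
                    # so two pods cannot silently target different deployments.
                    "projectName": project_name,
                    "deploymentName": deployment_name,
                },
            }
        ],
    }
```

Keeping project_name and deployment_name as required arguments is the point: there is no default to drift.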
Cost notes and a rollback plan
Azure AI Language pricing has four major levers. Text records processed, custom model training hours, hosted model count, and storage egress for BYOS configurations. The first dominates spend for most workloads. The fourth is the one teams forget about.
A text record is up to 1,000 characters. The S0 SKU prices a text record around ₹0.15 ($0.0018) for sentiment, ₹0.20 for entity recognition, ₹0.30 for CLU inference. Multiply by your volume. A high-volume call-centre workload running 5 million records a month on CLU runs to roughly ₹15 lakh a month (5,000,000 × ₹0.30) before commitment-tier discounts. The Commitment Tier brings it to about ₹9 lakh at the 10M tier; Microsoft's account team can quote yours.
Training is billed per training hour. A typical Custom Text Classification project trains in 2-4 hours. Each retrain is billed separately. I budget ₹3,500 per training run for a medium-sized project and that's been accurate within ±20%.
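The arithmetic is simple enough to keep in a script next to your budget sheet. This is a rough sketch using only the per-record and per-run figures quoted above; the function names are hypothetical, and real invoices will differ with tiering and exchange rates.

```python
LAKH = 100_000  # 1 lakh = 100,000 rupees


def inference_cost_inr(records: int, price_per_record_inr: float) -> float:
    """Cost of processing `records` text records at a flat per-record price."""
    return records * price_per_record_inr


def training_cost_inr(runs: int, cost_per_run_inr: float = 3_500) -> float:
    """Cost of `runs` training runs at a budgeted per-run figure."""
    return runs * cost_per_run_inr


def total_cost_lakh(records: int, price_per_record_inr: float, runs: int) -> float:
    """Inference plus training for the same period, expressed in lakh."""
    total = inference_cost_inr(records, price_per_record_inr) + training_cost_inr(runs)
    return total / LAKH
```

With the figures above, 5,000,000 records at ₹0.30 each come to ₹15 lakh, and four retrains in the same period add ₹14,000 on top.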
Rollback plan. If the new feature you've enabled is causing regressions in production, you have three options. Roll back the API version in your client (fastest: a 5-minute deploy). Roll back the model deployment (medium: re-promote the prior model via the REST API). Roll back the project version (slow: requires retraining if you've committed changes). I always keep the prior model deployment alive for 14 days post-promotion so option two is one API call away.
# Roll back to a prior model deployment via REST
PUT https://<resource>.cognitiveservices.azure.com/language/authoring/analyze-text/projects/<project>/deployments/<name>?api-version=2023-04-01
{
"trainedModelLabel": "<label-of-prior-good-model>"
}
This deploy-swap pattern has saved me twice in the last six months. The second time was a Friday afternoon during a holiday week; the prior model deployment was the only thing that let me close the ticket before signing off.
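If you want the rollback scripted ahead of time, the request can be assembled by a small helper like the one below. It is a sketch that mirrors the REST call shown above; build_rollback_request is a hypothetical name, and it only builds the request so you can review it before sending it with whatever HTTP client and auth you already use.

```python
import json
from urllib.parse import quote


def build_rollback_request(resource, project, deployment, prior_model_label,
                           api_version="2023-04-01"):
    """Return (method, url, body) for re-pointing a deployment at a prior model."""
    url = (
        f"https://{resource}.cognitiveservices.azure.com/language/authoring/"
        f"analyze-text/projects/{quote(project, safe='')}"
        f"/deployments/{quote(deployment, safe='')}"
        f"?api-version={api_version}"
    )
    # Same body as the REST example: only the model label changes.
    body = json.dumps({"trainedModelLabel": prior_model_label})
    return "PUT", url, body
```

Keeping this in the runbook with the prior model label filled in at promotion time is what makes the rollback a one-call operation under pressure.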
References
- Microsoft Learn, official documentation for Azure AI Services
- Microsoft tech community forums and Q&A
- Azure Service Health and Microsoft 365 Service health dashboards
- Azure pricing calculator (azure.microsoft.com/pricing/calculator)