Azure

Best practices for custom insights

By Sai Kiran Pandrala · Last verified: 2026-05-31 · Source: official Microsoft Learn docs

At a glance
Product familyAzure Video Indexer
Document sourceAzure Video Indexer
Guide typeHands-on Reference
Skill levelIntermediate to advanced
Time20 - 75 minutes depending on tenant scale

Azure Video Indexer (AVI) lets you train custom models for brands, people, and language. The custom insights surface is where you tell AVI "this person matters, this brand matters, this acronym means X in our context." Done well, you turn a generic video search index into a focused enterprise knowledge base. Done badly, you train on noise and degrade the out-of-the-box accuracy.

I built custom insights for a Bengaluru-based ed-tech company indexing 4,200 hours of lecture content. The custom person model — for 60 instructors — needed 5-8 sample images each. The custom language model: for 1,400 domain terms, needed a curated phrase list. Two weeks of curation. Six months of solid retrieval.

Reference content and what it actually means

The Microsoft Learn page for Best practices for custom insights treats the topic as a checklist of recommendations. That is useful as a memory aid. It is not enough when you are picking between two approaches for your tenant. Here is the framing I use when I am the engineer on the hook for shipping it.

Azure Video Indexer is built on Azure AI Foundry's video analysis stack. Under the hood it runs a chain of models. speaker diarization, face detection, OCR, scene detection, sentiment, topic extraction. The custom insights layer is your way of biasing those models toward your domain without retraining them from scratch.

What the dataset and prompt actually train

You are not retraining the base models. You are adding a thin custom layer on top, a vocabulary, a person list, a brand list, a focus prompt. The base recognition stays the same. The customisation adds confidence to terms the base model would have rendered as similar-sounding nonsense, or boosts the recall of entities the base model would have ignored.

That means two things. Customisation cannot fix bad base recognition. If the audio quality is poor, the speaker is heavily accented, or the video is low resolution, no amount of custom insights helps. Get the source quality right first.

API versions and surfaces

AVI has two API surfaces. The v2 Classic API and the new v2 ARM-managed API. The ARM version is the one to build on now: it integrates with Azure RBAC, Private Link, and Bicep / Terraform. The classic API still works but new features ship on ARM first.

# Pin the AVI API version when calling
POST https://api.videoindexer.ai/<location>/Accounts/<accountId>/Videos/<videoId>/Index?api-version=2024-10-01-preview
Authorization: Bearer <arm-token>
Content-Type: application/json

How to apply this in practice

  1. Provision the AVI account through ARM in the region closest to your media storage. az ams account create is not the same, AVI has its own resource type. Use the AVI ARM template or the portal Create flow.
  2. Enable managed identity on the AVI account. Assign it Storage Blob Data Reader on the source storage account. Without this, your indexing jobs fail with a generic 403.
  3. For custom person models: collect 5-8 images per person, ideally taken in different lighting and angles. Upload through the AVI portal or REST. Allow 10-15 minutes for the model to bake.
  4. For custom language models: prepare a clean dataset. Aim for at least 100,000 words for a small domain, 1 million+ for a broad domain. Dedup. Remove HTML and markup. Normalise case.
  5. Wire up a smoke test. Index a 5-minute representative video. Confirm your custom entities appear with confidence above 0.7. Iterate before scaling.
  6. Monitor cost in Azure Cost Management. AVI bills per indexing minute, per model invoked. Custom model training is billed separately.

The smoke test in step 5 is the one I see teams skip. Without it you do not know whether your customisation worked until 8 hours of indexing is done and the bill is in.

Caveats and what to double-check

Troubleshooting the failures I keep seeing

Custom entities not appearing in results

Almost always a confidence threshold issue. AVI returns custom entity matches above a configurable confidence; the default is 0.5. If your entities appear in the raw insight stream but not the filtered results, lower the threshold and re-query. Confirm the entity is spelled the same way in your dataset and your query.

Indexing job stuck at 80%

The 80% mark is where the OCR and topic extraction phases run. A slow source video, high resolution, long duration, low-quality audio. can sit here for hours. Confirm the job is not actually failing by checking the status endpoint. If it has been stuck for more than 2x the video duration, cancel and re-submit at a lower resolution.

Cost spike after custom model rollout

Custom models invoke additional pipeline steps. The per-minute cost goes up. I have seen 1.4x for custom language only, 2.1x for custom language + custom person + custom brand combined. Forecast before rollout. Use Azure Cost Management budgets to alert on threshold breach.

Cost notes

AVI has two pricing modes: trial (10 hours/month free) and pay-per-minute (about ₹0.85 per indexing minute at S0 at the time of writing, billed in 1-second increments). Custom language model training: roughly ₹400 per training hour. Custom person model training: free.

For the ed-tech I mentioned, indexing 4,200 hours at ₹0.85/minute equalled ₹2.14 lakh, spread over 6 weeks. Custom model training added ₹4,800. The retrieval value across the platform's 28,000 students was orders of magnitude higher.

Rollback plan

If a custom model degrades your results, you have three options. Roll back to the prior model version (AVI keeps the last 3 versions). Disable the custom model on the index (one API call). Re-index the affected videos without customisation (full re-billing, slow).

I keep one prior version of every production custom model around for 14 days post-deployment. Twice this year I have used the rollback path. Both times the fix was a single API call.

# Disable a custom language model on an account
PATCH https://api.videoindexer.ai/<location>/Accounts/<accountId>/Customization/Language/<modelId>?enable=false&api-version=2024-10-01-preview
Authorization: Bearer <arm-token>

Two lines of curl. Five seconds to take effect. The kind of rollback control I wish every Azure AI service offered.

FAQ

Where does this best practices for custom insights content come from?
I cross-checked it against the official Microsoft Learn page for Azure Video Indexer, reformatted the structure for engineers who scan rather than read, and added the verify + rollback notes I wish someone had given me when I first shipped this on a customer tenant. The "Last verified" stamp at the top tells you when it was last reconciled with Microsoft's version.
How often is this reference updated?
Quarterly minimum, plus an out-of-band refresh whenever Microsoft pushes a breaking change. Azure Video Indexer docs move fast, I once watched a feature go from preview to GA between Tuesday and Friday. If you spot drift between this page and the canonical Microsoft Learn source, the Microsoft page wins. Drop me a note and I will re-verify.
Can I use this for production planning?
Use it as your first read, not your only read. For production, pair it with your tenant's specific SKU and tier, the region you have picked, your compliance bracket (GDPR / HIPAA / RBI IT Framework / DISHA), and Microsoft's pricing calculator on the day you sign the PO. Thirty minutes of architecture review with those inputs beats three hours of search through PDFs.
Why is this reference free?
HowToFixMe runs on display ads. No paywall, no email gate, no "sign up to read more" pattern. I built this because I lost two evenings last month digging through outdated Microsoft PDF exports for a customer migration: that pain should not be a tax on every engineer who comes after me.
Where can I read the original Microsoft source?
Search "Best practices for custom insights" on learn.microsoft.com, Microsoft restructures URL paths every few quarters but the heading text usually stays stable, so a verbatim search is the most reliable path to the live page.

References

Related guides worth a look while you sort this one out: