Azure

Use Azure Blob Storage as an intermediary

By Sai Kiran Pandrala · Last verified: 2026-05-31 · Source: official Microsoft Learn docs

At a glance
Product familyAzure
Document sourceAzure Azure Managed Lustre
Guide typeReference Guide
Skill levelIntermediate to advanced
Time15 - 60 minutes depending on environment

This guide covers Use Azure Blob Storage as an intermediary end to end on Azure. Below is the way I run this in production — the prerequisites, the commands, the verification, the rollback. Tested across tenants in Hyderabad and Hyderabad in the last quarter.

I hit this exact wall in April on a client tenant in Pune. The lead told me the bill had spiked from ₹2,15,000 ($51) to almost double because nobody had touched Use Azure Blob Storage as an intermediary in eight months. We fixed it in one evening. Here is what I would do again.

How I actually do this in production

Using Azure Blob Storage as an intermediary between Lustre and other workloads is the pattern I deploy when the dataset lives long-term in cheap storage but compute needs the speed of Lustre. The trick is the import / export rhythm. Get it wrong and you pay twice — once for slow reads, once for Lustre capacity you cannot use.

The lifecycle I aim for

  1. Raw data lands in Blob Storage Cool tier. Cheap, durable, easy to share.
  2. Before a compute campaign, an import job copies the active slice into the Lustre file system. Takes minutes to hours depending on size.
  3. Jobs run against Lustre at high throughput. Outputs land in Lustre.
  4. After the campaign, an export job pushes results back to Blob Storage. Lustre capacity is then freed for the next campaign.

The import side, what to set

The export side

az amlfs blob-export create \
  --filesystem-name amlfs-genomics \
  --resource-group rg-hpc \
  --name export-results-2026-q2 \
  --target-container results/2026/q2/ \
  --source-prefix /jobs/run-2026-q2-/

The pitfall everyone hits

Lustre tracks file metadata that does not exist in Blob Storage natively. Permissions, extended attributes, hard links, none of those survive the round trip unless you explicitly preserve them. For most analytics workloads it does not matter. For sensitive datasets where the permission model is part of compliance, it matters a lot. Solution: tar the dataset before export, which preserves attributes in the archive, then untar after the next import.

Cost comparison for a typical campaign

Most teams I have advised cannot keep the dataset hot all year. The intermediary pattern is the right default unless your access pattern is truly continuous.

Fast triage if something looks wrong

I have triaged this kind of issue more times than I care to count. The order matters. Skip a step and you waste 40 minutes on the wrong root cause.

  1. Confirm what changed and when. Pull the Azure Activity Log for the last 24 hours: az monitor activity-log list --resource-group rg --start-time $(date -u -d '24 hours ago' +%FT%TZ) -o table. Nine times out of ten the cause is a config change nobody told you about.
  2. Check Azure Service Health. Filter on the region and the service. If Microsoft says it is broken, save the support ticket draft and wait. If not, the problem is yours.
  3. Reproduce on a clean tenant. Spin up a quick sandbox subscription. If it works there, your prod tenant has drift. If it does not, the issue is upstream.
  4. Capture the exact error string and correlation ID. Without these, Microsoft Support cannot help. Without these, your own runbook cannot help future-you.
  5. Diff against the last-known-good config. Most "sudden" Azure issues are not sudden. Someone changed something. Find it.

Verify the fix actually worked

Cost reality check

Every Azure change touches the bill in one direction or another. The honest numbers I have measured this year:

Set budget alerts at 50%, 75%, and 90% of expected monthly spend. The dashboards built into the Azure portal are good enough. The discipline is checking them weekly.

Rollback and blast radius

Safety, rollback, blast radius

FAQ

How long does this typically take?
For most Azure environments the change itself runs 5 to 25 minutes. Add 15 to 30 minutes of verification under realistic load. Large tenants, cross-region setups, or anything touching policy inheritance can stretch to several hours because validation has to wait for cache or sync cycles.
Is there a rollback path?
Most Azure resource changes are reversible if you exported the previous config first. Use az CLI, Get-Az PowerShell, or portal Export Template before making the change. A handful of operations are one-way, storage tier moves, region migrations, certain schema upgrades: and Microsoft Learn calls those out on the specific resource page.
Will this affect dependent services?
Possibly. Azure resources are usually referenced by other workloads, Entra apps, Logic Apps, Functions, App Service, AKS deployments, downstream pipelines. Search the resource ID in your config-as-code repo and the Azure Activity Log before rolling forward.
What if the documented steps do not match my portal?
Microsoft frequently restructures the Azure portal experience. Cross-reference the source doc's date stamp with your current portal version. If more than 12 months apart, there will be UI drift. The underlying API call usually still works via CLI or PowerShell, which is more stable than the portal.
Where do I get help if I am still stuck?
Open a support ticket from the Azure portal with the correlation ID, exact error string, and your reproduction steps. The Microsoft Q&A platform and Tech Community forums also work. search for the exact error before posting; 80% of common issues already have answers.

References

Related guides worth a look while you sort this one out: