Azure Dedicated HSM: Setup, Errors & Migration Guide
Why This Is Happening
If you're here, you're probably staring at a provisioning failure, a connectivity error, or a configuration headache with Azure Dedicated HSM, and the Azure portal error message is giving you absolutely nothing useful. I've seen this exact situation play out on enterprise deployments where security architects have spent weeks planning a rollout, only to hit a wall the moment they try to get the HSM device talking to their virtual network. I know that's frustrating, especially when the hardware is sitting idle and your compliance deadline isn't.
Azure Dedicated HSM is one of the most specialized services in the Azure catalog. It's built around physical Thales Luna 7 HSM Model A790 appliances, deployed inside Microsoft's globally distributed datacenters, and connected directly to your virtual network. That last part, the direct VNet injection, is where most problems start. Unlike Software-as-a-Service offerings that abstract the underlying infrastructure away from you, Azure Dedicated HSM hands you the keys to a physical device. That's exactly what organizations in financial services, government, and regulated healthcare want. But it also means you carry the full weight of configuration, management, and troubleshooting.
The most common reasons people land on this guide include: network peering mistakes that prevent the HSM device from being reachable, VPN configuration errors when trying to access the appliance from on-premises management tools, high availability pairing failures when setting up a second device for redundancy, partition setup confusion when preparing the device for multiple application instances, and, increasingly, questions about what to do now that Microsoft has announced Azure Dedicated HSM will be fully retired on July 31, 2028.
There's also a less-talked-about category of issue: organizations that provisioned an Azure Dedicated HSM for a specific workload, then found that their other Azure services (things like Azure Disk Encryption, Azure Storage encryption, or Azure SQL Database Transparent Data Encryption) don't actually integrate with it. The service was designed for "lift-and-shift" migrations and running software like Apache/Nginx SSL offload, Oracle TDE in a VM, or Active Directory Certificate Services. If you assumed it was a universal encryption backend for all Azure services, that mismatch is going to cause real architectural pain.
One more thing worth getting straight up front: qualifying to use Azure Dedicated HSM in the first place has a hard financial threshold. You need at least $5 million USD in overall committed Azure revenue annually, plus an assigned Microsoft Account Manager. If your organization doesn't hit that bar, you may not even be able to onboard, and no amount of troubleshooting will fix that. For organizations in that situation, Azure Managed HSM or Azure Key Vault are the appropriate alternatives.
Whatever your specific problem, this guide walks through the most common Azure Dedicated HSM setup issues, configuration errors, and the migration path you'll need if you're on an existing deployment.
The Quick Fix, Try This First
Most Azure Dedicated HSM connectivity failures (the kind where your HSM device is provisioned but completely unreachable) come down to one root cause: the HSM device is connected to your virtual network, but your network security groups, routing tables, or DNS settings are blocking traffic before it ever reaches the appliance. This is the single most common misconfiguration I see, and it's fixable in under 10 minutes if that's actually your problem.
Open the Azure portal, go to Virtual Networks, select the VNet your HSM is connected to, and click Subnets in the left-hand menu. Find the subnet dedicated to your HSM (Azure Dedicated HSM requires its own dedicated subnet; this is a hard requirement). Check that no Network Security Group (NSG) in the traffic path is blocking inbound traffic on TCP port 1792 (the NTLS port that Thales Luna clients connect on), TCP port 22 (SSH, used to administer the appliance), or the ports required by your specific application integration.
If an NSG is attached, click on it, go to Inbound security rules, and verify that traffic from your management workstation's IP range is allowed. Add a rule if it's missing:
Priority: 100
Source: [Your management subnet CIDR or specific IP]
Source port ranges: *
Destination: [HSM subnet CIDR]
Destination port ranges: 1792, 22
Protocol: TCP
Action: Allow
Save the rule and wait about 30 seconds for it to propagate. Then try reaching your HSM device again using the Thales Luna Network HSM client tools from your management VM. If it responds, you found your problem.
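If you would rather script the rule than click through the portal, the sketch below wraps az network nsg rule create with the same values as the rule above. The function name, the rule name AllowHsmAccess, and the DRY_RUN switch are illustrative assumptions, not part of any Azure tooling. With DRY_RUN=1 the command is only printed, so you can review it before anything changes.

```shell
#!/bin/sh
# Sketch only: run, allow_hsm_access, AllowHsmAccess, and DRY_RUN are illustrative names.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Args: resource group, NSG name, management source CIDR, HSM subnet CIDR.
allow_hsm_access() {
  run az network nsg rule create \
    --resource-group "$1" \
    --nsg-name "$2" \
    --name AllowHsmAccess \
    --priority 100 \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes "$3" \
    --source-port-ranges '*' \
    --destination-address-prefixes "$4" \
    --destination-port-ranges 1792 22
}
```

For example, setting DRY_RUN=1 and calling allow_hsm_access MyResourceGroup MyHsmNsg 10.1.0.0/24 10.2.0.0/28 prints the full command without touching Azure (MyHsmNsg and 10.1.0.0/24 are placeholder values).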
If connectivity still doesn't work after the NSG fix, the next fastest check is your route table. Go to Route tables in the portal, find the table associated with the HSM subnet, and confirm there's no user-defined route sending HSM traffic to a network virtual appliance or firewall that's silently dropping packets. A hairpin routing problem through a firewall that doesn't understand the Thales HSM protocol will look exactly like an NSG block from a symptom standpoint.
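The route check can also be done from the CLI. The sketch below is a small filter, under the assumption that you feed it route-table output as tab-separated lines of address prefix and next hop type; flag_risky_routes is an illustrative helper name, not an Azure command. It flags routes whose next hop is a virtual appliance or None, since those are the ones that can silently swallow HSM traffic.

```shell
#!/bin/sh
# Sketch only: flag_risky_routes is an illustrative helper, not an Azure command.
# Input (stdin): one route per line as "addressPrefix<TAB>nextHopType", e.g. from
#   az network route-table route list --resource-group <rg> \
#     --route-table-name <table> \
#     --query "[].[addressPrefix, nextHopType]" --output tsv
# Prints routes that hand traffic to an appliance or drop it ("None").
# Exit status: 0 if at least one such route was found, 1 otherwise.
flag_risky_routes() {
  awk -F '\t' '
    $2 == "VirtualAppliance" || $2 == "None" {
      printf "check route %s (next hop: %s)\n", $1, $2
      found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Any route it prints that covers the HSM subnet's address range is worth tracing through the appliance it points at.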
Finally, confirm the HSM device itself shows as Provisioned (not "Updating" or "Failed") in the Azure portal under Dedicated HSMs. A device stuck in "Updating" usually means the provisioning request hit a backend resource allocation issue. Open a support ticket immediately, as this can't be self-resolved.
Step 1: Prepare the Virtual Network and HSM Subnet
Azure Dedicated HSM has strict networking prerequisites that must be in place before you attempt provisioning. Skipping this step and trying to fix the network after the fact is how you end up with a broken deployment that requires a support ticket and a delete-and-redeploy cycle. I've seen teams waste two weeks this way.
First, confirm your target virtual network exists in an Azure region where Azure Dedicated HSM is available. Not every Azure region carries the Thales Luna 7 hardware. Check the current regional availability in the Azure portal under All services > Security > Dedicated HSMs and attempt to create a resource; the region dropdown will only show supported locations.
Next, create a dedicated subnet exclusively for HSM use. This subnet cannot be shared with other resources, and it must be delegated to the Dedicated HSM service. A /28 subnet is the minimum; Azure reserves five addresses in every subnet, which leaves 11 usable IPs, more than sufficient for a two-device HA pair. Name it something explicit like hsm-subnet to avoid confusion later:
az network vnet subnet create \
--resource-group MyResourceGroup \
--vnet-name MyVNet \
--name hsm-subnet \
--address-prefixes 10.2.0.0/28 \
--delegations Microsoft.HardwareSecurityModules/dedicatedHSMs
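A note on sizing: Azure reserves five addresses in every subnet (the network address, the default gateway, two for Azure DNS, and the broadcast address), so a /28 leaves 11 assignable addresses rather than the 14 that conventional subnet math suggests. The helper below is a hypothetical sketch of that arithmetic, handy when you are deciding whether a prefix leaves enough room.

```shell
#!/bin/sh
# Sketch only: azure_usable_ips is an illustrative helper.
# Azure reserves 5 addresses per subnet, so usable = 2^(32 - prefix) - 5.
azure_usable_ips() {
  total=$(( 1 << (32 - $1) ))
  echo $(( total - 5 ))
}

azure_usable_ips 28   # /28: 16 addresses in the block, 11 usable
azure_usable_ips 27   # /27: 32 addresses in the block, 27 usable
```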
Also verify that your VNet has a gateway subnet if you need point-to-site or site-to-site VPN access for on-premises management. The gateway subnet is separate from the HSM subnet. Go to Virtual network gateways in the portal and confirm a VPN Gateway is provisioned and in a Succeeded state. If it's not there, creating it takes 30–45 minutes, so plan for that time in your deployment window.
When this step is done correctly, you should be able to look at your VNet's subnet list and see at minimum: one GatewaySubnet for VPN, one hsm-subnet for the HSM devices, and one management subnet for your admin VMs.
Step 2: Register the Resource Provider and Deploy the First Device
The Azure portal UI for Azure Dedicated HSM is minimal. Most serious deployments happen through the Azure CLI or PowerShell, and the CLI gives you better error output when something goes wrong. Make sure you're on Azure CLI version 2.x or later (run az --version to check).
Register the Dedicated HSM resource provider if you haven't already. Missing this step produces a confusing "subscription not registered" error that looks like a permissions problem:
az provider register --namespace Microsoft.HardwareSecurityModules
az provider show --namespace Microsoft.HardwareSecurityModules --query "registrationState"
Wait until the output shows "Registered" before proceeding. Then deploy your first HSM device:
az dedicated-hsm create \
--resource-group MyResourceGroup \
--name MyHSMDevice \
--location eastus \
--sku "SafeNet Luna Network HSM A790" \
--network-profile-network-interfaces private-ip-address=10.2.0.5 \
--subnet id="/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/hsm-subnet" \
--stamp-id stamp1 \
--zones 1
Provisioning takes 25–35 minutes. Monitor the deployment with:
az dedicated-hsm show \
--resource-group MyResourceGroup \
--name MyHSMDevice \
--query "provisioningState"
A healthy response is "Succeeded". If you see "Failed", the error details are in the activity log: go to the Azure portal, open Monitor > Activity log, filter by your resource group, and look for the failed operation. The JSON error body there will tell you whether it was a capacity issue, a network conflict, or a quota problem.
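Because provisioning takes a while, it can help to poll instead of re-running the show command by hand. The following is a minimal sketch of a generic polling helper; wait_for_value and its argument order are assumptions made for illustration. It only waits for one expected value, so it will keep polling until attempts run out even if the state has already become "Failed".

```shell
#!/bin/sh
# Sketch only: wait_for_value is an illustrative helper.
# Re-runs a command until its output equals the expected value.
# Args: expected value, seconds between attempts, max attempts, then the command.
# Exit status: 0 once the value matches, 1 if attempts run out.
wait_for_value() {
  expected="$1"
  interval="$2"
  attempts="$3"
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    actual="$("$@" 2>/dev/null)" || actual=""
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Example: check once a minute for up to 45 minutes.
#   wait_for_value Succeeded 60 45 az dedicated-hsm show \
#     --resource-group MyResourceGroup --name MyHSMDevice \
#     --query provisioningState --output tsv
```

The --output tsv flag in the example strips the JSON quotes so the comparison sees Succeeded rather than "Succeeded".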
Step 3: Deploy a Second Device for High Availability
A single Azure Dedicated HSM device is a single point of failure. For any production workload, you need a minimum of two devices configured as an HA pair. Microsoft explicitly supports this and the Thales Luna 7 appliance's HA capabilities are what make it possible, but you have to do the configuration yourself through the Thales Luna Network HSM client software, not through Azure.
Deploy your second device into a different Availability Zone within the same region:
az dedicated-hsm create \
--resource-group MyResourceGroup \
--name MyHSMDevice2 \
--location eastus \
--sku "SafeNet Luna Network HSM A790" \
--network-profile-network-interfaces private-ip-address=10.2.0.6 \
--subnet id="/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/hsm-subnet" \
--stamp-id stamp2 \
--zones 2
Note the different --stamp-id and --zones values. Using stamp1 and stamp2 with zones 1 and zones 2 ensures the two physical devices land in different fault domains within the Azure datacenter. You do not want both HSMs on the same physical rack; that defeats the purpose of HA entirely.
Once both devices show "Succeeded" provisioning state, connect to each one using the Thales Luna Network HSM client (downloaded from the Thales customer support portal) and run the HA group configuration from your management VM. This involves exchanging certificates between the two devices and creating a virtual HA slot that your applications connect to instead of connecting directly to either physical device. The Thales integration guide documents this process in detail. Follow it step by step and do not skip the certificate exchange step, as that's the most common cause of HA group formation failures.
Step 4: Initialize the HSM and Create Partitions
Once your device is provisioned and reachable, it's a blank slate. You need to initialize it through the Thales client software before any application can store or retrieve keys. The Thales Luna 7 A790 supports up to 10 partitions; think of each partition as an isolated key store for a specific application or team. This is one of the features that makes Azure Dedicated HSM genuinely useful for organizations running multiple workloads: Oracle TDE can have its own partition, Apache SSL offload gets another, and ADCS gets a third, all on the same physical device.
Connect to your HSM from your management VM using the Thales LunaCM utility:
lunacm
lunacm:> slot set -slot 0
lunacm:> hsm init -label MyHSMPrimary
You'll be prompted to set the HSM SO (Security Officer) password. This is the administrative credential for the device. Store it in a password vault immediately. If you lose it, device recovery requires a full zeroization (factory reset), which destroys all keys. There is no back door. Microsoft has no administrative access after this point. That's by design, but it puts the entire burden of credential management on you.
After initialization, create your first partition:
lunacm:> partition create -label AppPartition1
Assign a Partition SO password and a Crypto Officer password separately. Applications use the Crypto Officer credential to perform cryptographic operations. The Crypto Officer cannot modify partition configuration; that separation of duties is important for meeting FIPS 140-2 Level 3 requirements.
If initialization fails with error code CKR_PIN_INCORRECT on a device you haven't touched yet, the device may have been pre-configured during provisioning with a default credential. Check the Thales documentation from the customer portal for factory default credentials, but change them immediately after your first login.
Step 5: Connect from On-Premises over VPN or ExpressRoute
Many organizations deploying Azure Dedicated HSM still need to manage the device from on-premises tools: their existing Thales management infrastructure, their HSM management workstations, or their application servers that haven't migrated to Azure yet. This requires either a site-to-site VPN or Azure ExpressRoute between your on-premises network and the Azure VNet where the HSM lives.
For site-to-site VPN, your VNet already needs a Virtual Network Gateway (covered in Step 1). Once the gateway is in place, configure the local network gateway to represent your on-premises network:
az network local-gateway create \
--resource-group MyResourceGroup \
--name MyOnPremGateway \
--gateway-ip-address [Your-OnPrem-VPN-Device-Public-IP] \
--local-address-prefixes 192.168.0.0/24
Then create the VPN connection:
az network vpn-connection create \
--resource-group MyResourceGroup \
--name MyVPNConnection \
--vnet-gateway1 MyVNetGateway \
--local-gateway2 MyOnPremGateway \
--shared-key [YourPSK]
Once the VPN connection shows Connected status in the portal (Virtual network gateways > Connections), test reachability from an on-premises management station to the HSM's private IP address using a simple ping first, then an NTLS connection test with the Thales client:
ping 10.2.0.5
vtl verify
The vtl verify command from the Thales VTL (Virtual Token Library) utility should return the partition list and confirm the NTLS (Network Trust Link Service) channel is healthy. If vtl verify fails but ping succeeds, you have a TLS certificate trust issue; re-run the client certificate registration steps from the Thales integration guide. If both fail, go back and check the NSG rules on the HSM subnet from Step 1, and also verify the on-premises firewall allows outbound traffic to TCP ports 1792 and 1093 toward the Azure VNet IP range.
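The decision logic above can be captured in a few lines. This is a sketch under two assumptions: that vtl verify returns a non-zero exit status on failure (check this against your client version), and that a working NTLS channel with a failed ping just means ICMP is filtered somewhere, a case the text above does not cover. The function name diagnose_hsm_link is hypothetical.

```shell
#!/bin/sh
# Sketch only: diagnose_hsm_link is an illustrative helper.
# Args: exit status of the ping test, exit status of "vtl verify" (0 = success).
diagnose_hsm_link() {
  ping_rc="$1"
  vtl_rc="$2"
  if [ "$vtl_rc" -eq 0 ]; then
    # NTLS works end to end; a failed ping here is assumed to be filtered ICMP.
    echo "ok: NTLS channel is healthy"
  elif [ "$ping_rc" -eq 0 ]; then
    echo "trust: network path is fine, re-run client certificate registration"
  else
    echo "network: check NSG rules, routes, and firewall rules for TCP 1792 and 1093"
  fi
}

# Typical use from a management station (10.2.0.5 is the example HSM address):
#   ping -c 3 10.2.0.5 >/dev/null 2>&1; ping_rc=$?
#   vtl verify >/dev/null 2>&1; vtl_rc=$?
#   diagnose_hsm_link "$ping_rc" "$vtl_rc"
```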
Advanced Troubleshooting
If the step-by-step process hasn't resolved your issue, here's where I go deeper on the problems that take more investigation to solve.
Azure Dedicated HSM Provisioning Stuck in "Updating" State
This is a backend allocation issue. The Azure Dedicated HSM service deploys physical hardware; unlike virtual machines, there's no instant capacity scaling. If provisioning hangs in "Updating" for more than 45 minutes, the request has almost certainly hit a capacity wall in your target region. Check the Azure Service Health dashboard (Monitor > Service Health > Health advisories) for any active capacity constraints in your region. If none are listed, open a support ticket with severity B and provide the correlation ID from the failed provisioning operation. You'll find the correlation ID in Monitor > Activity log: look for the Create operation on your HSM resource and expand the JSON.
Event Viewer and Serial Port Monitoring
Microsoft maintains monitor-level access to Azure Dedicated HSM devices via a serial port connection. This is not an administrative channel; it only covers hardware telemetry like temperature readings, power supply health, and fan status. You cannot and should not attempt to block this access; if you do, you lose Microsoft's ability to proactively alert you to hardware failures before they cause an outage. On the Azure side, hardware health signals surface through Azure Monitor. Set up a diagnostic settings rule on your HSM resource to forward telemetry to a Log Analytics workspace:
az monitor diagnostic-settings create \
--resource [HSM-Resource-ID] \
--name HSMDiagnostics \
--workspace [LogAnalytics-Workspace-ID] \
--logs '[{"category": "AuditEvent","enabled": true}]' \
--metrics '[{"category": "AllMetrics","enabled": true}]'
In Log Analytics, query for hardware events using KQL:
AzureDiagnostics
| where ResourceType == "DEDICATEDHSMS"
| where Category == "AuditEvent"
| order by TimeGenerated desc
Performance Problems, RSA Operations Below Expected Throughput
The Thales Luna 7 A790 is rated for 10,000 RSA-2048 operations per second. If your application is seeing significantly lower throughput, the most common culprits are: (1) not using the HA virtual slot, since connecting directly to a physical device instead of the HA group means you're not load-balancing across both devices; (2) network round-trip latency between your application VM and the HSM subnet, so keep application VMs in the same VNet and ideally the same Availability Zone as the HSM; (3) partition contention: if multiple applications are hammering a single partition, consider distributing them across separate partitions. Remember you get up to 10 partitions per device.
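To see why round-trip latency matters so much, consider that a client thread making synchronous calls cannot complete more than one operation per network round trip. The sketch below works through that arithmetic; the helper names and the 2 ms round trip are illustrative assumptions, and the model ignores HSM processing time, so real ceilings will be lower.

```shell
#!/bin/sh
# Sketch only: both helpers are illustrative and ignore HSM processing time.

# Upper bound on ops/s for N synchronous workers at a given round trip (ms).
# Args: round trip in ms, number of workers.
latency_bound_ops() {
  awk -v rtt="$1" -v n="$2" 'BEGIN { printf "%d\n", n * 1000 / rtt }'
}

# Minimum synchronous workers needed to reach a target ops/s at that round trip.
# Args: target ops/s, round trip in ms.
workers_needed() {
  awk -v target="$1" -v rtt="$2" 'BEGIN {
    x = target * rtt / 1000
    w = int(x)
    if (w < x) w++
    print w
  }'
}

latency_bound_ops 2 1     # one worker at a 2 ms round trip
workers_needed 10000 2    # workers needed to approach 10,000 ops/s
```

With these inputs, one worker tops out at 500 operations per second, and it takes at least 20 concurrent workers to approach the rated figure. If your measured throughput matches the single-worker number, the bottleneck is client concurrency or latency rather than the HSM.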
Migration Planning, Retirement Deadline July 31, 2028
This is the conversation that every Azure Dedicated HSM customer needs to have now, not in 2027. Microsoft is not accepting new customers, and existing deployments must migrate before the July 31, 2028 deadline. Your two primary migration targets are Azure Cloud HSM (the direct successor, now generally available) and Azure Managed HSM. The right choice depends on your control requirements: Azure Cloud HSM is the closest functional match if you need single-tenant dedicated hardware. Azure Managed HSM is multi-tenant but FIPS 140-2 Level 3 validated and far simpler to operate. If you don't actually need physical device control and just need FIPS 140-2 Level 3 compliance, Azure Managed HSM is likely the better fit going forward.
Escalate immediately for: provisioning failures that persist beyond 45 minutes, hardware health alerts you receive via Azure Monitor that indicate physical device degradation, any situation where you suspect a device has been zeroized unexpectedly, and migration planning assistance for moving off Azure Dedicated HSM before the retirement deadline. For technical HSM configuration issues (Thales software, partition setup, HA group configuration), your first call should actually be to Thales customer support, since they own the device software. For Azure-layer issues (networking, provisioning, monitoring), go to Microsoft Support. Know which layer your problem is in before you open a ticket; it'll save you being bounced between vendors.
Prevention & Best Practices
Getting Azure Dedicated HSM working correctly is only half the job. Keeping it healthy and avoiding the failures I see on repeat deployments requires building good operational habits from day one.
Design for Redundancy Before You Go Live
Every Azure Dedicated HSM production deployment should have at minimum two devices in two separate Availability Zones within the same region, configured as a Thales HA group. If your compliance requirements also mandate geographic redundancy, provision a second HA pair in a secondary Azure region and configure cross-region application failover. The cost of a second device is real, but it's trivially small compared to the cost of key unavailability for a workload that depends on HSM-backed cryptography. Plan for this architecture on day one; retrofitting HA into an existing deployment is painful and requires application downtime.
Credential Management is Your Responsibility, Take It Seriously
Once you change the HSM SO password after first access (which you must do; Microsoft has zero access after that point), you are the sole custodian of that credential. Store HSM administrative credentials in a separate, air-gapped password management system, not in Azure Key Vault (which is somewhat circular), and implement M-of-N access controls so that no single person can access the device alone. Document your credential recovery procedures before you need them, not during an incident. A zeroized HSM means all keys are gone permanently.
Test Your HA Failover Before Production Traffic Hits
Setting up a Thales HA group is not the same as testing it. Before you route real application traffic through the HSM, deliberately take one device offline and verify that the HA virtual slot automatically fails over to the surviving device without application errors. This test should be part of your go-live checklist, not something you discover works (or doesn't) during your first real outage.
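One way to make that test measurable is to run a repeated health check while you take the device offline and count how many checks fail. This is a minimal sketch; probe_during_failover is a hypothetical helper, and the check command you pass in (for example, a script that performs one signing operation through the HA virtual slot) is something you would need to supply yourself.

```shell
#!/bin/sh
# Sketch only: probe_during_failover is an illustrative helper.
# Runs a check command repeatedly and reports how many runs failed.
# Args: number of runs, seconds between runs, then the command to run.
# Exit status: 0 if every run succeeded, 1 otherwise.
probe_during_failover() {
  runs="$1"
  interval="$2"
  shift 2
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$failures of $runs checks failed"
  [ "$failures" -eq 0 ]
}

# Example: one check per second for two minutes while a device is taken offline.
#   probe_during_failover 120 1 ./your-hsm-health-check.sh
```

A handful of failures clustered around the moment of failover tells you how long the client takes to switch over; failures that continue afterward mean the HA group is not doing its job.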
Start Migration Planning Now
With the July 31, 2028 retirement deadline, any organization running Azure Dedicated HSM should already have a migration project in progress. The migration is not a lift-and-shift operation. It requires re-enrolling applications, migrating key material (which may require application downtime depending on your key export policies), and revalidating compliance documentation. Six months is the minimum realistic timeline for a complex deployment. Start now.
- Enable Azure Monitor diagnostic settings on your HSM resources on day one, so you have telemetry history when a problem occurs instead of starting from zero after the fact
- Deploy a dedicated management VM in the same VNet as your HSM and keep it running; cold-start provisioning a management VM during an incident adds unnecessary delay
- Document your Thales HA group configuration including partition names, slot assignments, and client certificate fingerprints; this information is critical for disaster recovery and isn't stored anywhere Azure can retrieve it for you
- Review the Azure Dedicated HSM retirement announcement and schedule a migration assessment with your Microsoft Account Manager before the end of 2026
Frequently Asked Questions
What is Azure Dedicated HSM and when do I actually need it?
Azure Dedicated HSM gives you a physical Thales Luna 7 appliance in an Azure datacenter that's yours alone; no other customers share it. You get full administrative control, and Microsoft has no access to your keys or partitions once you change the default credentials. The scenarios where it genuinely makes sense are lift-and-shift migrations of applications that were running against on-premises HSMs (like Oracle TDE, Apache SSL offload, or ADCS), and cases where regulations explicitly mandate FIPS 140-2 Level 3 validated hardware with sole-tenant control. If your requirement is simply FIPS 140-2 Level 3 compliance without the need for sole physical device access, Azure Managed HSM meets that bar with far less operational overhead. Most Azure workloads are better served by Azure Key Vault or Azure Managed HSM; Azure Dedicated HSM is a specialized tool for a specific profile of requirement.
Why is Azure Dedicated HSM being retired and what do I do about it?
Microsoft announced the retirement of Azure Dedicated HSM with a final support date of July 31, 2028. The service is not accepting new customers. The retirement reflects the general availability of Azure Cloud HSM, which Microsoft has positioned as the direct successor for customers who need dedicated single-tenant HSM hardware in the cloud. Existing Azure Dedicated HSM customers need to migrate to Azure Cloud HSM, Azure Managed HSM, or Azure Key Vault, depending on their specific workload requirements. Microsoft recommends using the "How to choose the right Azure key management solution" guide to evaluate which target service fits your needs. Start your migration assessment now; waiting until 2027 leaves little room for the application re-enrollment, key migration, and compliance revalidation work that a real migration requires.
Does my organization qualify to use Azure Dedicated HSM?
Qualification for Azure Dedicated HSM has two requirements: you must have an assigned Microsoft Account Manager, and your organization must have at least $5 million USD in overall committed Azure revenue annually. This threshold exists because the service requires dedicated physical hardware allocation and premium support engagement; it's not a standard self-service offering. If your organization doesn't meet the $5M threshold, you cannot onboard regardless of your technical requirements. In that case, Azure Managed HSM is the appropriate alternative for FIPS 140-2 Level 3 requirements, and Azure Key Vault covers most standard key management needs. Note that because the service stopped accepting new customers with the retirement announcement, this question is now moot for anyone not already onboarded; Azure Cloud HSM is where new deployments should go.
How do I set up high availability for Azure Dedicated HSM?
High availability for Azure Dedicated HSM requires provisioning at minimum two physical devices in different Availability Zones within the same Azure region, then configuring them as a Thales HA group using the Thales Luna Network HSM client software. Azure provides the infrastructure layer (deploying the two devices into separate fault domains), but the HA group configuration itself happens entirely within the Thales management software; this is not something you configure in the Azure portal. The HA group creates a virtual slot that your applications connect to; the Thales client handles load balancing and automatic failover transparently. You should also consider cross-region redundancy for workloads with the strictest availability requirements, deploying a second HA pair in a different Azure region and configuring application-level failover logic. Test failover explicitly before going live; assuming it works is not the same as confirming it works.
Can I access my Azure Dedicated HSM from my on-premises servers?
Yes, and this is actually one of the primary use cases for Azure Dedicated HSM, supporting hybrid scenarios where applications span on-premises and Azure. You need either a site-to-site VPN or Azure ExpressRoute connection between your on-premises network and the Azure VNet where your HSM is deployed. Once the network layer is in place, on-premises servers use the same Thales Luna Network HSM client software to establish NTLS connections to the HSM's private IP address, exactly as they would with an on-premises appliance. The key requirement is that your on-premises firewall allows outbound traffic to the HSM subnet on TCP ports 1792 and 1093. The Thales integration guides, available from the Thales customer support portal, walk through the client certificate exchange process required to establish trusted NTLS connections from on-premises clients.
What's the difference between Azure Dedicated HSM and Azure Managed HSM?
The core difference is physical versus logical isolation. Azure Dedicated HSM gives you a physical Thales appliance that no other customer shares. You have full administrative control, including the ability to manage partitions, set policies, and configure HA entirely on your own terms. Azure Managed HSM is a managed service backed by FIPS 140-2 Level 3 validated hardware, but it's a multi-tenant service from a hardware perspective: Microsoft manages the underlying HSMs and exposes key management through a standardized API. Managed HSM requires dramatically less operational expertise: no Thales client software, no partition management, no HA configuration. For most organizations that need FIPS 140-2 Level 3 compliance, Azure Managed HSM is the right answer. Azure Dedicated HSM makes sense when you have an existing Thales-based application that requires direct PKCS#11 or JCE access to the hardware, or when you have regulatory requirements that specifically mandate physical sole-tenant control of the HSM device itself, not just logical key isolation.