Nvidia (Mellanox): How to request a hardware RMA
By Sai Kiran Pandrala · reviewed by Sai Kiran Pandrala, Editor Last verified: 2026-05-30
| Vendor | Nvidia (Mellanox) |
|---|---|
| Operating system | Cumulus Linux / NVOS / SONiC |
| Category | Warranty / RMA / Support |
| Skill level | Intermediate to advanced |
| DIY-able? | Yes with CLI access; some scenarios need Nvidia Enterprise Support + RMA. |
What this guide covers
How to request a hardware RMA in the Nvidia (Mellanox) support ecosystem.
Step-by-step
- Open a support case with the failure symptom.
- Engineer typically requests diagnostic output to confirm a hardware fault.
- Once confirmed, the RMA is initiated.
- Replacement ships per your support tier SLA.
- Swap, return the failed unit using the return label, update inventory.
Useful URLs
- Support portal: https://enterprise-support.nvidia.com/
- Open a case: https://enterprise-support.nvidia.com/s/createcase
- Bug / advisory search: https://docs.nvidia.com/networking/category/etmnetworkingdevices
- Knowledge base: https://docs.nvidia.com/networking/
- Security advisories (PSIRT): https://www.nvidia.com/en-us/security/
- Warranty lookup: https://www.nvidia.com/en-us/support/enterprise/services-and-support/
Frequently asked questions
Will this work on my specific Cumulus Linux / NVOS / SONiC version?
The procedure reflects current Cumulus Linux / NVOS / SONiC behaviour. Older releases may need minor syntax adjustments, use the CLI help (? or tab-completion) to verify.
Should I open a Nvidia Enterprise Support case immediately?
Open one if you suspect hardware failure or the symptom persists after a maintenance-window reload. Make sure your support entitlement is active first.
Where can I find the Nvidia (Mellanox) official documentation?
https://docs.nvidia.com/networking/: search the product family + feature name.
Is this procedure safe in production?
Test in a lab or maintenance window first. Capture pre-change state so you can roll back.
Related guides
Related fixes
Related guides worth a look while you sort this one out:
- Best Nvidia (Mellanox) switch for branch office
- Best Nvidia (Mellanox) switch for enterprise data centre
- Best Nvidia (Mellanox) switch for retail store
- Best Nvidia (Mellanox) switch for SD-WAN deployment
- Best Nvidia (Mellanox) switch for small office
- Best Nvidia (Mellanox) switch for SMB under USD 5000
References
- Nvidia (Mellanox) support portal: https://enterprise-support.nvidia.com/
- Nvidia (Mellanox) knowledge base: https://docs.nvidia.com/networking/
- Nvidia (Mellanox) security advisories: https://www.nvidia.com/en-us/security/
- Open a case: https://enterprise-support.nvidia.com/s/createcase
Reference material, not professional advice. Validate against your specific Cumulus Linux / NVOS / SONiC version and test in a non-production environment before applying.
What changed recently?
Fault diagnosis on a Nvidia device goes faster when you map the symptom to a recent change:
- Did firmware update in the last 7 days?
- Did the network (router, ISP, VPN) change?
- Was the device moved physically?
- Did paired devices (phone, hub, app) update?
- Were any accessories swapped in or out?
The answer narrows the root cause to a manageable subset.
Safety + preconditions
Before any work on a Nvidia device:
- Unplug from mains for any internal-access procedure.
- Discharge stored energy (capacitors in PSUs, residual battery charge) per manufacturer guidance.
- Use ESD-safe handling for boards and modules, no carpet, no wool sleeves.
- Avoid moisture; never apply liquids near vents or connectors.
- If you smell smoke, see scorch marks, or feel uneven heat, stop and escalate.
Quick verification
Before you walk away from a Nvidia device fix, run through:
1. Reproduce the original trigger. does the issue reappear? 2. Check the device's status / health screen for any new alerts. 3. Confirm paired devices (app, hub, controller) reconnected. 4. Save / commit any configuration changes per the device's normal workflow. 5. Note the change in your maintenance log with date + firmware version.
When to call Nvidia support instead
Escalate if:
- The same symptom returns within 24 hours of a clean fix.
- You see physical damage (burn marks, swollen battery, cracked PCB).
- The device is in warranty and a hardware replacement is the cheaper outcome.
- Repair requires specialised tools you don't own (alignment jigs, calibration software).
- Following the official path keeps the warranty intact, which matters more than the time spent.
More frequently asked questions
Is it safe to apply during business hours?
If the device is in production use, apply during a scheduled maintenance window. Most procedures need 2-15 minutes of downtime. Capture pre-change state so you can roll back if needed.
How long does this fix usually take?
Most users complete the steps in 20-45 minutes the first time, and 5-10 minutes on subsequent runs once the menu paths are familiar.
Are there safer alternatives for non-technical users?
Yes, the manufacturer's self-service troubleshooter (HP Smart, LG ThinQ, Samsung Members, similar) usually walks through the same steps in a guided UI. Use that first if you're not comfortable with menu paths.
Does this affect other devices on my network?
Generally no. The procedure is local to this device. Network-side changes (firmware updates that affect TLS, SMB, or routing) are flagged explicitly in the steps.
Will the procedure work on the international variant?
Some features and firmware paths are region-locked. Check the model spec sheet to confirm your variant supports the menu option referenced. If you're outside the US/EU, look for the regional support portal.