Upgrade Paths

Nvidia (Mellanox) SN3420: Upgrade Path to latest LTS / GA

By Sai Kiran Pandrala · reviewed by Sai Kiran Pandrala, Editor · Last verified: 2026-05-30

⚡ At a glance
Vendor: Nvidia (Mellanox)
Operating system: Cumulus Linux / NVOS / SONiC
Category: Upgrade Paths
Skill level: Intermediate to advanced
DIY-able? Yes with CLI access; some scenarios need Nvidia Enterprise Support + RMA.

Image upgrades on Nvidia (Mellanox) platforms have one cardinal rule: verify the running image first. `nv show system` is the single most useful command in a change window because it tells you exactly which release you are rolling back to if something breaks. It is an NVUE command as used on Cumulus Linux; on SONiC the equivalent check is `show version`.

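Across the SN3420 family this guide uses `onie-nos-install /home/cumulus/cumulus-linux-5.x.img` as the upgrade command, where `5.x` stands in for the target release; substitute the real filename of the image you downloaded. Note that `onie-nos-install` is the installer command of the ONIE environment, while a running Cumulus Linux system normally stages the image with `onie-install -i <image>` and activates it with `-a`; confirm which form your release expects. Pay attention to the activation step, because Cumulus Linux / NVOS / SONiC treats download and activate as separate transactions. Forgetting to activate is the single most common reason an 'upgrade' silently does nothing.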

Nvidia Enterprise Support expects you to capture pre-upgrade state and have a console session open during the change window. Without both, a support case is much harder to work if the upgrade goes sideways.
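One way to capture pre-upgrade state is to save the output of each show command into its own file before touching the image. The sketch below is a minimal POSIX shell helper, not a vendor tool: the function name `capture_state` and the output directory layout are assumptions made for illustration, and the commands passed in are the ones already listed in this guide.

```shell
# Save the output of each given command into its own file under $1.
# A failing or missing command is logged instead of aborting the capture.
# capture_state is a hypothetical helper name, not part of any Nvidia tooling.
capture_state() (
    outdir="$1"
    shift
    mkdir -p "$outdir" || exit 1
    for cmd in "$@"; do
        # "nv show system" becomes "nv_show_system.txt".
        name="$(printf '%s' "$cmd" | tr -c 'A-Za-z0-9' '_')"
        if ! sh -c "$cmd" >"$outdir/$name.txt" 2>&1; then
            echo "FAILED: $cmd" >>"$outdir/errors.log"
        fi
    done
)

# Example on the switch (commands taken from this guide):
# capture_state "/home/cumulus/pre-upgrade-$(date +%Y%m%d-%H%M%S)" \
#     "nv show system" "nv show platform inventory"
```

Copy the resulting directory off the switch before the change window, since a fresh image install may not preserve local files.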

What this guide covers

Upgrade procedure for the Nvidia (Mellanox) SN3420 to the latest LTS / GA release of Cumulus Linux / NVOS / SONiC.

Notes specific to this combination

Verify the supported upgrade path in the Nvidia (Mellanox) release notes before proceeding. Some Cumulus Linux / NVOS / SONiC releases require an intermediate hop; some support direct upgrade.
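Before consulting the release notes it can help to confirm that the planned move really is an upgrade and not an accidental downgrade or a reinstall of the same release. The sketch below compares two version strings; it assumes GNU `sort -V` is available, and the function names `version_lt` and `classify_upgrade` are hypothetical. It does not know which hops are supported; only the release notes can tell you that.

```shell
# Succeed if version $1 is strictly older than version $2.
# Relies on GNU sort -V for version-aware ordering (an assumption).
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print "same", "upgrade", or "downgrade" for a move from $1 to $2.
classify_upgrade() {
    if [ "$1" = "$2" ]; then
        echo "same"
    elif version_lt "$1" "$2"; then
        echo "upgrade"
    else
        echo "downgrade"
    fi
}

# Example: classify_upgrade "<running version>" "<target version>"
```

Version-aware sorting matters here because a plain string comparison would rank a two-digit minor release below a single-digit one.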

Step-by-step

  1. Verify the current version: `nv show system`.
  2. Read the release notes for supported upgrade paths.
  3. Confirm minimum RAM / disk for the target release.
  4. Download the target image; verify its checksum.
  5. Schedule a maintenance window.
  6. Back up the running configuration.
  7. Copy the image to local flash.
  8. Run `onie-nos-install /home/cumulus/cumulus-linux-5.x.img`.
  9. Reboot: `nv action reboot system`.
  10. Verify the new version with `nv show system`; run `nv config save` if healthy.
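Steps 3 and 4 above can be checked mechanically. The following is a minimal sketch, assuming the image is published with a SHA-256 digest and that `sha256sum`, `df`, and `awk` are present on the machine doing the check; the function names `verify_sha256` and `has_free_kib` are hypothetical, and the required free space must come from the release notes for your target release.

```shell
# Succeed if the SHA-256 of file $1 matches the published digest $2.
# The published digest is lower-cased first so either case is accepted.
verify_sha256() (
    expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    actual="$(sha256sum "$1" 2>/dev/null | awk '{print $1}')"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
)

# Succeed if the filesystem holding path $1 has at least $2 KiB available.
has_free_kib() (
    avail="$(df -Pk "$1" | awk 'NR==2 {print $4}')"
    [ "$avail" -ge "$2" ]
)

# Example (placeholders, not real values):
# verify_sha256 /home/cumulus/cumulus-linux-5.x.img "<digest from download page>" \
#     && has_free_kib /home/cumulus "<required KiB>" \
#     && echo "image and disk space look OK"
```

If the vendor publishes a different digest type, swap in the matching tool rather than skipping the check.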

CLI / commands

nv show system
nv show platform inventory
onie-nos-install /home/cumulus/cumulus-linux-5.x.img
nv config save

Frequently asked questions

Will this work on my specific Cumulus Linux / NVOS / SONiC version?

The procedure reflects current Cumulus Linux / NVOS / SONiC behaviour. Older releases may need minor syntax adjustments; use the CLI help (? or tab completion) to verify.

Should I open an Nvidia Enterprise Support case immediately?

Open one if you suspect hardware failure or the symptom persists after a maintenance-window reload. Make sure your support entitlement is active first.

Where can I find the Nvidia (Mellanox) official documentation?

See https://docs.nvidia.com/networking/ and search for the product family plus the feature name.

Is this procedure safe in production?

Test in a lab first, then apply the change in a maintenance window. Capture pre-change state so you can roll back.
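Captured state is most useful when it is compared against the same output taken after the change. The sketch below assumes you have two directories of saved command output, one from before and one from after the upgrade; the function name `compare_state` is hypothetical. Expect the version line to differ after a successful upgrade; anything else that changed deserves a look.

```shell
# Compare two directories of saved command output.
# Prints "no differences" and succeeds when they match;
# otherwise prints a unified diff and returns non-zero.
compare_state() (
    if out="$(diff -ru "$1" "$2")"; then
        echo "no differences"
    else
        printf '%s\n' "$out"
        exit 1
    fi
)

# Example: compare_state /path/to/pre-upgrade /path/to/post-upgrade
```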

Reference material, not professional advice. Validate against your specific Cumulus Linux / NVOS / SONiC version and test in a non-production environment before applying.

Why this matters for your day-to-day

An Nvidia device that's misbehaving costs more than the fix itself: lost productivity, missed calls, security risk, even safety risk in some categories. Treating the symptom quickly with a documented procedure is cheaper than letting it persist. The steps above are written to get you back to working in under an hour where possible, and to flag clearly when escalation is the right call.

Before you start

A few things to confirm so the Nvidia device fix goes cleanly:

How to confirm it's actually fixed

On an Nvidia device, the test is rarely "reboot and see". Use this list:

Escalation guide

For an Nvidia device, the right escalation depends on impact:

More frequently asked questions

Why is this happening on a brand-new unit?

Out-of-box defects do occur. If you've owned the device under 30 days and the symptom persists after a factory reset, escalate to the seller for replacement under DOA terms before opening a manufacturer support case.

Does this affect other devices on my network?

The configuration change itself is local to this switch, but the reboot in step 9 interrupts forwarding for everything attached to it unless those devices have a redundant path. Plan the maintenance window around that outage.

What if the fix returns after a reboot?

A fault that returns after a reboot usually means one of three things: a hardware fault (escalate), a configuration that is being overwritten by a sync source (check cloud profiles), or a regression in a recent firmware update (roll back).

How long does this fix usually take?

Most users complete the steps in 20-45 minutes the first time, and 5-10 minutes on subsequent runs once the commands are familiar.

Are there safer alternatives for non-technical users?

Not for this platform. Consumer self-service troubleshooters (HP Smart, LG ThinQ, Samsung Members, and similar) do not apply to a data-center switch, and the procedure needs CLI access. If you are not comfortable at the CLI, have an experienced engineer or Nvidia Enterprise Support carry out the upgrade.