How to Fix CVE-2026-2069: Stack Buffer Overflow in llama.cpp
Related fixes
Other recently published vulnerabilities that are worth patching alongside this one:
- How to Fix CVE-2026-2957: Denial of service in dst-admin
- How to Fix CVE-2026-3309: Improper Control of Generation of Code ('Code Injection')
- How to Fix CVE-2026-31386: Improper neutralization of special elements used in an OS command ('OS Command Injection')
- How to Fix CVE-2026-4712: Critical Vulnerability in Firefox
- How to Fix CVE-2026-32308: OneUptime: Stored XSS via Mermaid Diagram Rendering (securityLevel: "loose")
*By Sai Kiran Pandrala*
| Field | Details |
|---|---|
| Severity | CVSS 4.8 - Medium |
| Actively exploited? | Not currently listed in CISA KEV |
| Affected | llama.cpp builds up to commit 55abc39 |
| Fixed in | See vendor advisory |
| Type (CWE) | CWE-121: Stack-based Buffer Overflow |
What is CVE-2026-2069?
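CVE-2026-2069 is a stack-based buffer overflow in llama.cpp. Vendor description: A flaw has been found in ggml-org llama.cpp up to 55abc39. Impacted is the function llama_grammar_advance_stack of the file llama.cpp/src/llama-grammar.cpp of the component GBNF Grammar Handler. In practical terms, input that reaches the grammar handler can corrupt stack memory. The likely outcome is a crash of the process; on builds without stack protections, memory corruption of this class can in principle be escalated to code execution. The Medium CVSS score suggests the flaw is not trivially reachable in every deployment, so check the advisory for the exact attack vector before deciding how exposed you are.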
Why this CVE matters
Stack-based buffer overflows in network-reachable services have driven some of the highest-impact incidents of the past two years. Modern compiler protections raise the bar, but real-world exploits for unpatched appliances continue to appear quickly after disclosure.
For deployments of llama.cpp that have been exposed to the public internet during the disclosure window, the operating assumption should be that scanning has already happened. Even where exploitation has not been publicly observed, scanning for the vulnerable fingerprint is cheap and routine. Patching closes the door; log review and credential rotation close out the rest of the response.
Am I affected?
You are affected if your installation matches the following:
- llama.cpp: builds up to commit 55abc39
Check your installed build against the list above. If you cannot determine the build, treat the system as affected and follow the upgrade path below.
llama.cpp is a command-line and library project and has no About dialog. Run one of its binaries with the --version flag (for example, llama-cli --version), which prints the build number and commit hash. Compare the commit against the affected range in the advisory.
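The version check can be scripted. The sketch below is a minimal helper, assuming the version output contains a line shaped like `version: <build-number> (<commit>)`; the function names and the sample build number are illustrative and not part of llama.cpp.

```python
import re
from typing import Optional, Tuple

# Assumed shape of the first line printed by `llama-cli --version`,
# e.g. "version: 4567 (55abc39)". The build number 4567 is made up.
_VERSION_RE = re.compile(r"version:\s*(\d+)\s*\(([0-9a-fA-F]{7,40})\)")


def parse_llama_version(output: str) -> Optional[Tuple[int, str]]:
    """Return (build_number, commit_hash), or None if the text does not match."""
    match = _VERSION_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).lower()


def matches_listed_commit(output: str, listed: str = "55abc39") -> Optional[bool]:
    """True if the reported commit is the listed one; None if unparseable.

    This is only an exact-commit comparison. "Up to 55abc39" also covers
    older commits, which needs a git ancestry check, not string matching.
    """
    parsed = parse_llama_version(output)
    if parsed is None:
        return None
    commit = parsed[1]
    listed = listed.lower()
    # Accept either hash being an abbreviation of the other.
    return commit.startswith(listed) or listed.startswith(commit)
```

A `None` result means the output could not be parsed, which this guide treats the same as "affected".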
How to fix CVE-2026-2069
- Read the VulDB advisory in full: https://vuldb.com/?id.344636
- Upgrade llama.cpp to the patched build listed in the vendor advisory.
- Back up the configuration (launch scripts, service units, and server settings) before upgrading.
- Apply the patch in a maintenance window. If you run redundant instances, upgrade the standby first, fail over, then upgrade the former primary.
- Restart the affected service so the patched binary loads, then verify the new version (see verification section).
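For source builds, whether a checkout still falls in the "up to 55abc39" range can be checked with git ancestry. The sketch below is one way to do that, assuming a local clone of the repository and `git` on the PATH; the helper names are hypothetical.

```python
import subprocess
from typing import List

AFFECTED_COMMIT = "55abc39"  # last affected commit named in the advisory


def ancestry_command(repo_dir: str, commit: str, descendant: str = "HEAD") -> List[str]:
    """Build the git command that asks: is `commit` an ancestor of `descendant`?"""
    return ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, descendant]


def interpret_exit_code(returncode: int) -> bool:
    """git merge-base --is-ancestor exits 0 for yes, 1 for no, other values on error."""
    if returncode == 0:
        return True
    if returncode == 1:
        return False
    raise RuntimeError(f"git reported an error (exit code {returncode})")


def head_is_at_or_before_affected(repo_dir: str) -> bool:
    """True if HEAD is 55abc39 or older, i.e. the checkout is in the affected range."""
    result = subprocess.run(ancestry_command(repo_dir, "HEAD", AFFECTED_COMMIT))
    return interpret_exit_code(result.returncode)
```

A `False` result only shows that HEAD is not an ancestor of the affected commit. It does not prove the fix is present (a diverged fork would also return `False`), so confirm against the advisory's fixed-in build as well.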
npm / Yarn / pnpm
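# Assumption: Node.js projects consume llama.cpp through a binding such as node-llama-cpp, not a package named "llama.cpp".
# Confirm which binding release bundles the patched build against the upstream fix: https://github.com/ggml-org/llama.cpp/pull/18993
# Update to that release <binding-release-with-fix>.
npm install node-llama-cpp@<binding-release-with-fix>
# Alternative (unpinned, tracks the newest release):
npm install node-llama-cpp@latest
npm ls node-llama-cpp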
PyPI (pip / Poetry)
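# Assumption: Python projects consume llama.cpp through the llama-cpp-python binding, not a package named "llama.cpp".
# Confirm which binding release bundles the patched build against the upstream fix: https://github.com/ggml-org/llama.cpp/pull/18993
pip install --upgrade "llama-cpp-python==<binding-release-with-fix>"
pip show llama-cpp-python | grep -i version
# Poetry equivalent:
poetry add llama-cpp-python@<binding-release-with-fix>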
Docker / container
# Confirm the patched build against the vendor advisory: https://github.com/ggml-org/llama.cpp/pull/18993
docker pull <your-registry>/llama.cpp:<patched-version-from-advisory>
docker stop <app> && docker rm <app>
docker run -d --name <app> <your-registry>/llama.cpp:<patched-version-from-advisory>
docker image inspect <your-registry>/llama.cpp:<patched-version-from-advisory> --format '{{.Id}}'
Verify the fix landed
# Confirm the patched build against the vendor advisory: https://github.com/ggml-org/llama.cpp/pull/18993
# 1. Confirm the running version equals the advisory's fixed-in build.
# (Use the platform-specific version probe from the commands above.)
# 2. Re-scan with your vulnerability scanner (for example Nessus, Qualys, or OpenVAS).
# The scanner should no longer flag CVE-2026-2069 on the patched target.
# 3. Inspect recent service and kernel logs for crash-loops or rollback events.
journalctl --since "10 minutes ago" | tail -200
dmesg --since "10 minutes ago" | tail -100
If you cannot patch immediately
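Block network reachability to the vulnerable service from untrusted networks until you can apply the patched build. Because the flaw sits in the GBNF grammar handler, also avoid accepting grammars from untrusted users in the meantime. Memory-corruption bugs cannot be reliably mitigated at the network layer; the patch is the fix.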
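As a stopgap in front of a llama.cpp server, a proxy or wrapper could drop grammar-related fields from incoming requests before forwarding them. The sketch below is a minimal illustration, assuming JSON request bodies and assuming that `grammar` and `json_schema` are the top-level fields that trigger grammar handling in your deployment; verify the field names against the API you actually expose. It reduces exposure only and does not replace the patch.

```python
from typing import Any, Dict, FrozenSet, Tuple

# Assumed names of request fields that cause the server to build a grammar.
# Adjust this set to match the API surface you actually expose.
GRAMMAR_FIELDS: FrozenSet[str] = frozenset({"grammar", "json_schema"})


def strip_grammar_fields(
    payload: Dict[str, Any],
    blocked: FrozenSet[str] = GRAMMAR_FIELDS,
) -> Tuple[Dict[str, Any], bool]:
    """Return a copy of payload without blocked top-level keys,
    plus a flag saying whether anything was removed."""
    cleaned = {key: value for key, value in payload.items() if key not in blocked}
    return cleaned, len(cleaned) != len(payload)
```

Logging every request where the flag is `True` also gives you a record of who tried to submit a grammar during the unpatched window.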
How to verify the fix worked
- After applying the patch, verify the running build from the binary's --version output (for example, llama-cli --version).
- Confirm the patched build matches the version listed in the vendor advisory.
- Run an authenticated vulnerability scan with a current signature set and confirm the scanner no longer flags CVE-2026-2069.
- Review logs for the entire pre-patch window for indicators of compromise listed in the vendor or CISA advisory.
- Confirm any network-layer mitigations that were applied as a stopgap have been reverted (or left in place intentionally) once the patch is verified.
If your installation was internet-reachable during the disclosure window, treat log review as part of the remediation rather than an optional follow-up. Look for repeated service restarts, crash logs from the affected daemon, and core files generated around the time of any anomalous traffic. A memory-corruption flaw used for exploitation often leaves a trail of failed attempts before the successful one.
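The log review described above can be partly automated. The sketch below filters log lines for common Linux crash markers that also mention the service; the marker list and the `llama` process hint are assumptions to adapt to your own unit and binary names.

```python
import re
from typing import Iterable, List

# Substrings commonly seen when a native process crashes on Linux.
_CRASH_RE = re.compile(
    r"segfault|SIGSEGV|core dumped|stack smashing detected|status=11/SEGV",
    re.IGNORECASE,
)


def find_crash_lines(lines: Iterable[str], process_hint: str = "llama") -> List[str]:
    """Return log lines that mention the process hint and a crash marker."""
    hint = process_hint.lower()
    return [
        line.rstrip("\n")
        for line in lines
        if hint in line.lower() and _CRASH_RE.search(line)
    ]
```

Feed it the saved output of `journalctl` for the pre-patch window. A cluster of hits close together in time is worth correlating with request logs from the same period.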
Frequently asked questions
Is CVE-2026-2069 being exploited in the wild?
Public exploitation has not been confirmed by CISA at the time of writing. Treat the patch as time-sensitive anyway; reports often lag actual abuse.
Will a WAF or IDS rule fully mitigate CVE-2026-2069?
No. Network-layer filters can reduce noise and slow opportunistic scanners, but they will not stop a determined attacker. The vendor patch is the only durable fix.
How long should I plan for the upgrade?
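llama.cpp is usually deployed as a standalone binary or an embedded library, so the upgrade itself is typically a rebuild or binary swap followed by a service restart. Plan for a few minutes to under an hour, depending mainly on how many instances you run and how long your models take to reload. Test in a staging environment first.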
References
- VulDB advisory: https://vuldb.com/?id.344636
- NVD entry: https://nvd.nist.gov/vuln/detail/CVE-2026-2069
- CISA KEV catalog: https://www.cisa.gov/known-exploited-vulnerabilities-catalog
- VulDB CTI entry: https://vuldb.com/?ctiid.344636
- VulDB submission: https://vuldb.com/?submit.745263
- Upstream issue: https://github.com/ggml-org/llama.cpp/issues/18988
- Upstream issue event: https://github.com/ggml-org/llama.cpp/issues/18988#event-4426704865
*This guide was assembled from the official vendor advisory, the NVD record, and the CISA KEV catalog entry on 2026-05-25. Always confirm against the vendor advisory before applying changes in production.*