CVE-2026-44223 | vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.18.0 up to, but not including, 0.20.0, the extract_hidden_states speculative decoding proposer returns a tensor with an incorrect shape after the first decode step, raising a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses a sampling penalty parameter (repetition_penalty, frequency_penalty, or presence_penalty), so a single request that sets one (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
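
To make the trigger condition concrete, the following is a minimal sketch of a check that flags request payloads carrying any of the three penalty parameters named above. The helper name `has_penalty_params` and the idea of running it in a gateway in front of an affected server are illustrative assumptions, not part of the advisory or of vLLM. The check is deliberately conservative: it flags any explicitly set penalty, including values that may equal the server defaults.

```python
from typing import Any, Mapping

# Penalty parameters named in the advisory as crash triggers.
PENALTY_PARAMS = ("repetition_penalty", "frequency_penalty", "presence_penalty")


def has_penalty_params(payload: Mapping[str, Any]) -> bool:
    """Return True if the request body explicitly sets any penalty parameter.

    Hypothetical helper: conservative on purpose, so an explicitly set
    value is flagged even if it happens to match the server default.
    """
    return any(payload.get(name) is not None for name in PENALTY_PARAMS)


if __name__ == "__main__":
    risky = {"prompt": "Hello", "repetition_penalty": 1.1}
    plain = {"prompt": "Hello", "max_tokens": 16}
    print(has_penalty_params(risky), has_penalty_params(plain))
```

A check like this could be used to log or reject such requests until the deployment is upgraded to 0.20.0; whether that is acceptable depends on how much clients rely on penalty sampling.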

Published: 2026-05-12
Last update: 2026-06-22
Assigner: [email protected]
Source: [email protected]

Conclusion & alert: CVE-2026-44223 is rated Low Risk (36.2/100): CVSS Medium severity, with low exploitation likelihood (EPSS 0.37%). Mandatory action: Monitor for updates and reassess as exploit intelligence or EPSS changes.

Risk is dynamic; we continuously reassess and refresh what is shown on this page as upstream context changes.

Exploit prediction scoring system (EPSS) score for CVE-2026-44223

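EPSS is updated daily and estimates the relative likelihood that a vulnerability will be exploited; the percentile ranks this CVE among all scored vulnerabilities (a higher percentile means a higher exploitation likelihood relative to other CVEs, not a higher severity).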

#  Date        Old EPSS score  New EPSS score  Delta (New - Old)
1  2026-06-15  0.04%           0.37%           +0.33%
2  2026-05-13  -               0.04%           -
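
The Delta column is simply the new score minus the old score. As a small worked example, the sketch below recomputes the 2026-06-15 delta from the two percentages in the table. The function name `epss_delta` is hypothetical, and `Decimal` is used only to avoid binary floating-point rounding noise in the subtraction.

```python
from decimal import Decimal


def epss_delta(old_pct: str, new_pct: str) -> str:
    """Return the signed change between two EPSS percentages, e.g. '+0.33%'."""
    old = Decimal(old_pct.rstrip("%"))
    new = Decimal(new_pct.rstrip("%"))
    # The '+' format spec forces an explicit sign on the result.
    return f"{new - old:+}%"


if __name__ == "__main__":
    print(epss_delta("0.04%", "0.37%"))
```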

Full EPSS history (2 records total)

Common vulnerability scoring system (CVSS) metrics for CVE-2026-44223

Base score: 6.5
Version: 3.1
Severity: MEDIUM
Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Exploitability: 2.8
Impact: 3.6
Score source: [email protected]

Vector breakdown:
Attack vector (AV:N): Can be attacked over the internet or any normal routed network, not only by someone sitting at the machine.
Attack complexity (AC:L): Once an attacker can reach the bug, exploiting it is straightforward; no race conditions or rare setup are needed.
Privileges required (PR:L): A normal user session is enough; the attacker does not have to be an admin.
User interaction (UI:N): Nobody has to click "OK" or open a trap file; the attack works without a victim helping.
Scope (S:U): Damage stays within the same trust boundary as the vulnerable component and does not spill into unrelated systems.
Confidentiality (C:N): No meaningful disclosure of secrets.
Integrity (I:N): Data is not meaningfully altered or forged.
Availability (A:H): Can take the service down or make it unusable for the people who depend on it.
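
The base, exploitability, and impact scores listed above can be reproduced from the vector string using the CVSS v3.1 base-score equations. The sketch below is a minimal illustration that handles only the unchanged-scope case (S:U), which is what this vector uses; the function names are hypothetical, and the metric weights and round-up rule are taken from the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for S:U).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_scores(vector: str) -> tuple[float, float, float]:
    """Return (base, exploitability, impact) for an unchanged-scope vector."""
    # Skip the leading "CVSS:3.1" prefix, then split "AV:N" style pairs.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


if __name__ == "__main__":
    print(cvss31_scores("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the only non-zero impact term is availability, so the impact subscore is 6.42 × 0.56 and the exploitability subscore is 8.22 × 0.85 × 0.77 × 0.62 × 0.85; their sum, rounded up to one decimal, gives the 6.5 base score.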

Weakness enumeration for CVE-2026-44223

GitHub Security Advisory for CVE-2026-44223

GHSA-83vm-p52w-f9pw · Severity: medium · Ecosystem: pip — vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters

Affected software / configurations for CVE-2026-44223

Vendor  Product  Version              Raw CPE
vllm    vllm     >= 0.18.0, < 0.20.0  cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
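
The affected range above can be checked programmatically. The following is a minimal sketch, using only the standard library, that tests whether a version string falls in ">= 0.18.0, < 0.20.0". The helper names `parse_release` and `is_affected` are hypothetical, and the parser is a simplifying assumption: it reads only the leading numeric release segment, so pre-release or local suffixes (for example an "rc" tag) are ignored rather than ordered correctly.

```python
import re


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '0.19.1' -> (0, 19, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(version: str) -> bool:
    """Return True if the version is in the advisory's range: >= 0.18.0, < 0.20.0."""
    release = parse_release(version)
    # Pad short versions such as '0.18' to three components before comparing.
    release = release + (0,) * (3 - len(release))
    return (0, 18, 0) <= release < (0, 20, 0)


if __name__ == "__main__":
    for candidate in ("0.17.9", "0.18.0", "0.19.1", "0.20.0"):
        print(candidate, is_affected(candidate))
```

In a real environment the installed version string could be obtained with `importlib.metadata.version("vllm")` from the standard library and passed to `is_affected`.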

References for CVE-2026-44223

cvelogic Threat Intelligence