CVE-2026-44223 | vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters

vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
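For deployments that cannot upgrade immediately, one possible stopgap is to screen request payloads for the three penalty parameters named above before they reach the engine. The following is a minimal sketch, not an official mitigation: the helper names `uses_penalties` and `strip_penalties` are hypothetical, the payload is assumed to be an already-parsed JSON object, and the "neutral" values are taken from vLLM's documented sampling defaults. The advisory does not say whether explicitly passing a neutral value also triggers the crash, so `strip_penalties` removes the keys unconditionally.

```python
from typing import Any, Dict, Mapping

# Neutral (no-op) values for each penalty parameter. Treating these as
# "no penalty requested" is an assumption of this sketch.
NEUTRAL_PENALTIES = {
    "repetition_penalty": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}


def uses_penalties(payload: Mapping[str, Any]) -> bool:
    """Return True if the payload sets any penalty parameter to a non-neutral value."""
    for name, neutral in NEUTRAL_PENALTIES.items():
        value = payload.get(name)
        if value is None:
            continue
        try:
            if float(value) != neutral:
                return True
        except (TypeError, ValueError):
            # A non-numeric value is unexpected; flag it rather than pass it on.
            return True
    return False


def strip_penalties(payload: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the payload with all penalty parameters removed."""
    return {k: v for k, v in payload.items() if k not in NEUTRAL_PENALTIES}
```

A gateway could log requests for which `uses_penalties` returns true and forward `strip_penalties(payload)` instead of the original body. This changes sampling behaviour for those requests, so it is only a temporary measure until the server runs 0.20.0 or later.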

Published: 2026-05-12 Last update: 2026-06-22 Assigner: [email protected] Source: [email protected]

NVD Status: Modified, CVE State: Published


Threat Intelligence & Risk Assessment for CVE-2026-44223


Conclusion & alert: CVE-2026-44223 is rated Low Risk (36.2/100): CVSS Medium severity, with low exploitation likelihood (EPSS 0.37%). Mandatory action: Monitor for updates and reassess as exploit intelligence or EPSS changes.

Risk is dynamic; we continuously reassess and refresh what is shown on this page as upstream context changes.

Exploit prediction scoring system (EPSS) score for CVE-2026-44223

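EPSS estimates the probability that a vulnerability will be exploited in the wild within the next 30 days and is refreshed daily. The percentile ranks this CVE among all scored vulnerabilities; a higher percentile means exploitation is more likely relative to other CVEs, not that the flaw is more severe.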

#	Date	Old EPSS score	New EPSS score	Delta (New - Old)
1	2026-06-15	0.04%	0.37%	+0.33%
2	2026-05-13	—	0.04%	—


Common vulnerability scoring system (CVSS) metrics for CVE-2026-44223

CVSS metrics for this CVE.

Base score	Version	Severity	Vector	Exploitability	Impact	Score source
6.5	3.1	MEDIUM	`CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H`	2.8	3.6	[email protected]

Vector breakdown:
Attack vector (AV:N): Exploitable over the internet or any routed network, not only by someone with local access to the machine.
Attack complexity (AC:L): Once the attacker can reach the bug, exploitation is straightforward; no race conditions or unusual setup are required.
Privileges required (PR:L): An ordinary user session is enough; administrator rights are not needed.
User interaction (UI:N): No victim has to click a prompt or open a file.
Scope (S:U): The impact stays within the security scope of the vulnerable component and does not spread to unrelated systems.
Confidentiality (C:N): No meaningful disclosure of data.
Integrity (I:N): No meaningful alteration or forgery of data.
Availability (A:H): The service can be taken down or made unusable for those who depend on it.
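The base score and the two sub-scores in the table can be reproduced from the vector string using the CVSS v3.1 base-metric equations. The sketch below is illustrative only: the function name `cvss31_scores` is hypothetical, it handles base metrics only (no temporal or environmental metrics), and it performs minimal validation.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_scores(vector: str):
    """Return (base, exploitability, impact) for a CVSS:3.1 base vector."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(item.split(":") for item in metrics)
    scope = m["S"]

    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    if impact <= 0:
        base = 0.0
    elif scope == "U":
        base = roundup(min(impact + exploitability, 10))
    else:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    # Sub-scores are conventionally displayed rounded to one decimal place.
    return base, round(exploitability, 1), round(impact, 1)
```

Called with the vector from the table, `cvss31_scores("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H")` yields a base score of 6.5 with sub-scores 2.8 (exploitability) and 3.6 (impact), matching the row above.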

Weakness enumeration for CVE-2026-44223

CWE-131 (Incorrect Calculation of Buffer Size), CWE-704 (Incorrect Type Conversion or Cast)

GitHub Security Advisory for CVE-2026-44223

GHSA-83vm-p52w-f9pw · Severity: medium · Ecosystem: pip — vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters

Affected software / configurations for CVE-2026-44223

Vendor	Product	Version	Raw CPE
vllm	vllm	>= 0.18.0, < 0.20.0	cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
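The affected range above (>= 0.18.0, < 0.20.0) can be checked programmatically. The following is a rough sketch using only the standard library; the helper names `release_tuple` and `is_affected` are hypothetical, and the comparison looks only at the leading numeric release components, so pre-release or local suffixes (for example an `rc` or `+cu…` tag) are ignored rather than ordered according to PEP 440.

```python
import re
from typing import Tuple

AFFECTED_FROM = (0, 18, 0)   # first affected release (inclusive)
FIXED_IN = (0, 20, 0)        # first fixed release (exclusive upper bound)


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Extract the leading numeric release (major, minor, patch) from a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad "0.19" to (0, 19, 0)
    return (parts[0], parts[1], parts[2])


def is_affected(version: str) -> bool:
    """Return True if the version falls in the range >= 0.18.0, < 0.20.0."""
    return AFFECTED_FROM <= release_tuple(version) < FIXED_IN
```

On a host with vLLM installed, the version string could be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`. Note that, per the description, the crash also depends on the extract_hidden_states speculative decoding proposer being in use, which this version check does not inspect.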

References for CVE-2026-44223

URL	Tags
https://github.com/vllm-project/vllm/pull/38610	Issue Tracking Patch
https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw	Mitigation Vendor Advisory

cvelogic Threat Intelligence