CVE-2024-5206 | Sensitive Data Leakage in sklearn.feature_extraction.text.TfidfVectorizer in scikit-learn/scikit-learn

A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer. It affects versions up to and including 1.4.1.post1 and was fixed in version 1.5.0. The vulnerability arises from the unexpected storage of all tokens present in the training data within the `stop_words_` attribute, rather than only the subset of tokens required for the TF-IDF technique to function. As a result, `stop_words_` could retain tokens that were meant to be discarded and not stored, such as passwords or keys, and so leak sensitive information. The impact of this vulnerability varies based on the nature of the data being processed by the vectorizer.
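The mechanism described above can be illustrated with a small, standard-library-only sketch. It is a toy model, not scikit-learn's code: the `ToyVectorizer` class, its `min_df` pruning rule, and the sample documents are all assumptions made for illustration. It shows how a fitted object that keeps its discarded tokens in a `stop_words_`-style attribute carries them into a serialized model, and how clearing that attribute before serialization removes them.

```python
import pickle
from collections import Counter


class ToyVectorizer:
    """Hypothetical stand-in for a text vectorizer (NOT scikit-learn's implementation)."""

    def __init__(self, min_df=2):
        self.min_df = min_df

    def fit(self, docs):
        # Count in how many documents each whitespace-separated token appears.
        doc_freq = Counter()
        for doc in docs:
            doc_freq.update(set(doc.lower().split()))
        # Tokens that are frequent enough become features.
        self.vocabulary_ = sorted(t for t, n in doc_freq.items() if n >= self.min_df)
        # The leak: rare tokens are dropped as features but still stored on the object.
        self.stop_words_ = {t for t, n in doc_freq.items() if n < self.min_df}
        return self


# Made-up training data; the third document contains a secret that appears only once.
docs = [
    "reset the password for alice",
    "reset the password for bob",
    "temporary password hunter2-s3cret issued",
]

model = ToyVectorizer(min_df=2).fit(docs)
secret_retained = "hunter2-s3cret" in model.stop_words_

# Serializing the fitted object also serializes the discarded tokens.
blob_before = pickle.dumps(model)

# Clearing the attribute before serialization removes them from the artifact.
model.stop_words_ = None
blob_after = pickle.dumps(model)

print(secret_retained)                    # secret kept although it is not a feature
print(b"hunter2-s3cret" in blob_before)   # secret present in the pickled model
print(b"hunter2-s3cret" in blob_after)    # secret absent once the attribute is cleared
```

For a real fitted vectorizer on an affected release, the analogous precaution would be to clear `stop_words_` before persisting or sharing the model; upgrading to 1.5.0 or later remains the fix named in the description.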

Published: 2024-06-06 | Last update: 2024-11-21 | Assigner: [email protected] | Source: [email protected]

Conclusion & alert: CVE-2024-5206 is rated Low Risk (21.9/100): CVSS Medium severity, with low exploitation likelihood (EPSS 0.19%). Mandatory action: Monitor for updates and reassess as exploit intelligence or EPSS changes.

Risk is dynamic; we continuously reassess and refresh what is shown on this page as upstream context changes.

Exploit prediction scoring system (EPSS) score for CVE-2024-5206

EPSS is updated daily and estimates the relative likelihood of exploitation; the percentile ranks this CVE among scored vulnerabilities (higher = more likely to be exploited relative to other CVEs).

# | Date | Old EPSS score | New EPSS score | Delta (New - Old)
1 | 2026-06-15 | 0.04% | 0.19% | +0.15%
2 | 2025-11-21 | 0.05% | 0.04% | -0.01%
3 | 2025-11-18 | - | 0.05% | -

Full EPSS history (6 records total)

Common vulnerability scoring system (CVSS) metrics for CVE-2024-5206

CVSS metrics for this CVE.

Base score | Version | Severity | Vector | Exploitability | Impact | Score source
4.7 | 3.1 | MEDIUM | CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N | 1.0 | 3.6 | [email protected]
4.7 | 3.0 | MEDIUM | CVSS:3.0/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N | 1.0 | 3.6 | [email protected]

Vector breakdown (the 3.1 and 3.0 vectors share the same metric values):
Attack vector (AV:L)
They already need access on the box, or another person has to do something wrong; it’s not a remote drive-by.
Attack complexity (AC:H)
Even with access, the exploit needs extra luck, timing, or a fussy environment to actually work.
Privileges required (PR:L)
A normal user session is enough; they don’t have to be admin.
User interaction (UI:N)
Nobody has to click “OK” or open a trap file; it can work without a victim helping.
Scope (S:U)
Damage stays in the same “trust bubble” as the broken component; no big spill into unrelated systems.
Confidentiality (C:H)
Serious risk that confidential data gets exposed in a big way.
Integrity (I:N)
Data isn’t meaningfully altered or forged.
Availability (A:N)
Service keeps running; no real outage angle.
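The base score of 4.7 and the exploitability and impact sub-scores of 1.0 and 3.6 can be reproduced from the vector string. The sketch below follows the base-metric equations and weights of the CVSS v3.1 specification; the function names `roundup` and `base_score` are illustrative choices, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Return (base score, exploitability, impact) for a CVSS v3.x base vector."""
    # Skip the "CVSS:3.x" prefix and map each metric to its value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    if impact <= 0:
        base = 0.0
    elif changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))
    return base, round(exploitability, 1), round(impact, 1)


score, exploitability, impact = base_score("CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N")
print(score, exploitability, impact)  # → 4.7 1.0 3.6
```

Most of the score comes from the High confidentiality impact (3.6); the local attack vector and high attack complexity keep the exploitability term at about 1.0.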

GitHub Security Advisory for CVE-2024-5206

GHSA-jw8x-6495-233v · Severity: medium · Ecosystem: pip — scikit-learn sensitive data leakage vulnerability

OS Trackers for CVE-2024-5206

vendor | priority | summary | link
debian | unimportant | 1 source package (scikit-learn); 5 status rows across 5 suites (bookworm, bullseye, forky, sid, trixie): open 5 | https://security-tracker.debian.org/tracker/CVE-2024-5206
redhat | medium | - | https://access.redhat.com/security/cve/CVE-2024-5206
suse | medium | - | https://www.suse.com/security/cve/CVE-2024-5206/
ubuntu | medium | 1 source package (scikit-learn); 11 status rows across 11 suites (bionic, focal, jammy, mantic, noble, oracular, plucky, questing, trusty, upstream, xenial): needs-triage 8, ignored 3 | https://ubuntu.com/security/CVE-2024-5206

Affected software / configurations for CVE-2024-5206

Vendor | Product | Version | Raw CPE
scikit-learn | scikit-learn | < 1.5.0 | cpe:2.3:a:scikit-learn:scikit-learn:*:*:*:*:*:python:*:*
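The `< 1.5.0` range can be checked against a version string with a few lines of standard-library Python. This is a minimal sketch: the helper names `release_tuple`, `is_affected`, and `installed_is_affected` are hypothetical, and the comparison only looks at the numeric release segment, so pre-releases such as `1.5.0rc1` are treated as `1.5.0`. A full PEP 440 comparison would need a dedicated version parser.

```python
import re
from importlib import metadata

# First fixed release, taken from the affected-version range above.
FIXED_RELEASE = (1, 5, 0)


def release_tuple(version):
    """Extract the leading numeric release segment, e.g. '1.4.1.post1' -> (1, 4, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '1.5' to three components.
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """True if the release segment is below the first fixed release."""
    return release_tuple(version) < FIXED_RELEASE


def installed_is_affected(dist_name="scikit-learn"):
    """Check the locally installed distribution; returns None if it is not installed."""
    try:
        return is_affected(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


print(is_affected("1.4.1.post1"))  # → True
print(is_affected("1.5.0"))        # → False
```

Calling `installed_is_affected()` in the environment that trains or loads the vectorizer gives a quick first check; it does not inspect already-serialized models, which may have been produced by an older release.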
