CVE-2024-5206

Publication date 6 June 2024

Last updated 30 May 2025

Ubuntu priority

CVSS 3 Severity Score

A sensitive data leakage vulnerability was identified in scikit-learn’s TfidfVectorizer. It affects versions up to and including 1.4.1.post1 and was fixed in version 1.5.0. The vectorizer unexpectedly stores every token present in the training data in its `stop_words_` attribute, rather than only the subset of tokens that the TF-IDF technique needs to function. As a result, `stop_words_` can contain tokens that were meant to be discarded and never stored, such as passwords or keys, which can leak sensitive information. The impact of this vulnerability depends on the nature of the data processed by the vectorizer.
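For a fitted vectorizer produced by an affected version, one possible mitigation before pickling or sharing the object is to drop the `stop_words_` attribute. The sketch below is illustrative only: `scrub_stop_words` is a hypothetical helper name, and the `SimpleNamespace` object is a stand-in for a fitted vectorizer so that the example runs without scikit-learn installed. It relies on the note in the scikit-learn documentation for the affected releases that `stop_words_` exists only for introspection and can be removed with `delattr` before pickling.

```python
from types import SimpleNamespace


def scrub_stop_words(vectorizer):
    """Delete stop_words_ from a fitted vectorizer, if present.

    Returns the number of tokens that were being retained, so the
    caller can log how much data was held in the attribute.
    """
    retained = getattr(vectorizer, "stop_words_", None)
    if retained is None:
        # Attribute absent (e.g. fixed version) or already cleared.
        return 0
    count = len(retained)
    delattr(vectorizer, "stop_words_")
    return count


# Stand-in for a fitted vectorizer (assumption for illustration only).
fitted = SimpleNamespace(
    vocabulary_={"report": 0},
    stop_words_={"hunter2", "api-key-123"},
)
removed = scrub_stop_words(fitted)
```

Calling the helper a second time is harmless and returns 0. Upgrading to version 1.5.0 or later remains the actual fix; scrubbing only helps with objects that were already fitted on an affected version.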

Status

Package	Ubuntu Release	Status
scikit-learn	25.04 plucky	Needs evaluation
	24.10 oracular	Needs evaluation
	24.04 LTS noble	Needs evaluation
	23.10 mantic	Ignored end of life, was needs-triage
	22.04 LTS jammy	Needs evaluation
	20.04 LTS focal	Needs evaluation
	18.04 LTS bionic	Needs evaluation
	16.04 LTS xenial	Needs evaluation
	14.04 LTS trusty	Needs evaluation

Severity score breakdown

Parameter	Value
Base score	4.7 · Medium
Attack vector	Local
Attack complexity	High
Privileges required	Low
User interaction	None
Scope	Unchanged
Confidentiality	High
Integrity impact	None
Availability impact	None
Vector	CVSS:3.0/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
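
The base score in the table can be reproduced from the vector string. The following is a minimal sketch of the CVSS v3.0 base score arithmetic for the Scope Unchanged case only, using the metric weights from the CVSS v3.0 specification. The function name `base_score_scope_unchanged` is a hypothetical one chosen for this example, and Scope Changed vectors are deliberately rejected rather than handled.

```python
import math

# CVSS v3.0 metric weights; the PR values apply to Scope Unchanged only.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def base_score_scope_unchanged(vector):
    """Compute the CVSS v3.0 base score for a Scope Unchanged vector."""
    # Skip the leading "CVSS:3.0" prefix, then map metric -> value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope Unchanged")

    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    # The specification rounds up to one decimal place.
    return math.ceil(min(impact + exploitability, 10) * 10) / 10


score = base_score_scope_unchanged(
    "CVSS:3.0/AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N"
)
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 1.05, so the sum of roughly 4.64 rounds up to the 4.7 shown in the table.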
