Vulnerabilities related to vllm - vllm
CVE-2025-48887 (GCVE-0-2025-48887)
Vulnerability from cvelistv5
Published
2025-05-30 17:36
Modified
2025-05-30 17:58
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-1333 - Inefficient Regular Expression Complexity
Summary
vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue.
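The advisory attributes the ReDoS to nested quantifiers and optional groups in the tool-call detection regex. The snippet below is a minimal sketch of that failure class (CWE-1333), using the deliberately simple `(a+)+` pattern rather than the actual pattern from `pythonic_tool_parser.py`; it only illustrates how matching time explodes on near-miss inputs.

```python
import re
import time

# A classic nested-quantifier pattern; near-miss inputs trigger exponential
# backtracking in Python's backtracking regex engine.
EVIL = re.compile(r"^(a+)+$")

def time_match(payload: str) -> float:
    start = time.perf_counter()
    EVIL.match(payload)
    return time.perf_counter() - start

for n in range(16, 25, 2):
    # "a"*n followed by "!" can never match, so the engine tries ~2^n splits.
    print(f"n={n:2d}  {time_match('a' * n + '!'):7.3f}s")
```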
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.6.4, < 0.9.0
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-48887", "options": [ { "Exploitation": "poc" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-30T17:58:00.784274Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-30T17:58:23.074Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "references": [ { "tags": [ "exploit" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25" } ], "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.6.4, \u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-1333", "description": "CWE-1333: Inefficient Regular Expression Complexity", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-30T17:36:16.716Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25" }, { "name": "https://github.com/vllm-project/vllm/pull/18454", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/18454" }, { "name": "https://github.com/vllm-project/vllm/commit/4fc1bf813ad80172c1db31264beaef7d93fe0601", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/4fc1bf813ad80172c1db31264beaef7d93fe0601" } ], "source": { "advisory": "GHSA-w6q7-j642-7c25", "discovery": "UNKNOWN" }, "title": "vLLM has a Regular Expression Denial of Service (ReDoS, Exponential Complexity) Vulnerability in `pythonic_tool_parser.py`" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-48887", "datePublished": "2025-05-30T17:36:16.716Z", "dateReserved": "2025-05-27T20:14:34.297Z", "dateUpdated": "2025-05-30T17:58:23.074Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-29783 (GCVE-0-2025-29783)
Vulnerability from cvelistv5
Published
2025-03-19 15:33
Modified
2025-03-22 00:02
Severity: 9.1 CRITICAL (CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H)
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0.
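For context on why CWE-502 over a network transport amounts to remote code execution, here is a minimal, self-contained sketch: unpickling attacker-controlled bytes runs whatever callable the payload's `__reduce__` names. It does not use vLLM or Mooncake APIs.

```python
import pickle

class Exploit:
    def __reduce__(self):
        # On unpickling, the loader calls print(...); a real attacker would
        # substitute os.system or similar.
        return (print, ("arbitrary code ran inside pickle.loads",))

malicious_bytes = pickle.dumps(Exploit())

# The receiver's perspective: trusting bytes read from a ZMQ/TCP socket and
# handing them to pickle.loads executes the payload.
pickle.loads(malicious_bytes)
```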
References
► | URL | Tags | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
Impacted products
Vendor | Product | Version | ||
---|---|---|---|---|
vllm-project | vllm |
Version: >= 0.6.5, < 0.8.0 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-29783", "options": [ { "Exploitation": "none" }, { "Automatable": "yes" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-03-19T18:30:27.510910Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-03-19T18:30:38.466Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.6.5, \u003c 0.8.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "ADJACENT_NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.1, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "LOW", "scope": "CHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502: Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-03-22T00:02:54.404Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7" }, { "name": "https://github.com/vllm-project/vllm/pull/14228", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/14228" }, { "name": "https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2" } ], "source": { "advisory": "GHSA-x3m8-f7g5-qhm7", "discovery": "UNKNOWN" }, "title": "vLLM Allows Remote Code Execution via Mooncake Integration" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-29783", "datePublished": "2025-03-19T15:33:28.951Z", "dateReserved": "2025-03-11T14:23:00.475Z", "dateUpdated": "2025-03-22T00:02:54.404Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-29770 (GCVE-0-2025-29770)
Vulnerability from cvelistv5
Published
2025-03-19 15:31
Modified
2025-03-19 20:15
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-770 - Allocation of Resources Without Limits or Throttling
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0.
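A minimal sketch of the resource-exhaustion pattern (CWE-770) the summary describes: an on-disk cache keyed by the full schema grows by one entry per unique request, and nothing ever evicts entries. The cache layout and names are illustrative assumptions, not the outlines or vLLM cache code.

```python
import hashlib
import json
import os
import tempfile

CACHE_DIR = tempfile.mkdtemp(prefix="grammar_cache_")

def compile_grammar_cached(schema: dict) -> str:
    key = hashlib.sha256(json.dumps(schema, sort_keys=True).encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key)
    if not os.path.exists(path):
        # Every unique schema adds a new file; nothing ever evicts it.
        with open(path, "w") as f:
            f.write(json.dumps(schema))  # stand-in for a compiled grammar
    return path

# A client streaming unique schemas adds one cache entry per request,
# eventually exhausting the filesystem.
for i in range(5):
    compile_grammar_cached({"type": "object", "properties": {f"field_{i}": {"type": "string"}}})
print(len(os.listdir(CACHE_DIR)), "cache entries")
```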
References
Impacted products
Vendor | Product | Version | ||
---|---|---|---|---|
vllm-project | vllm |
Version: < 0.8.0 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-29770", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-03-19T20:14:04.764365Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-03-19T20:15:47.505Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003c 0.8.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-770", "description": "CWE-770: Allocation of Resources Without Limits or Throttling", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-03-19T15:31:00.403Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8" }, { "name": "https://github.com/vllm-project/vllm/pull/14837", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/14837" }, { "name": "https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py" } ], "source": { "advisory": "GHSA-mgrm-fgjv-mhv8", "discovery": "UNKNOWN" }, "title": "vLLM denial of service via outlines unbounded cache on disk" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-29770", "datePublished": "2025-03-19T15:31:00.403Z", "dateReserved": "2025-03-11T14:23:00.474Z", 
"dateUpdated": "2025-03-19T20:15:47.505Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-48943 (GCVE-0-2025-48943)
Vulnerability from cvelistv5
Published
2025-05-30 18:36
Modified
2025-05-30 18:56
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-248 - Uncaught Exception
Summary
vLLM is an inference and serving engine for large language models (LLMs). Versions 0.8.0 up to but excluding 0.9.0 have a denial of service vulnerability that causes the vLLM server to crash if an invalid regex is provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but for a regex instead of a JSON schema. Version 0.9.0 fixes the issue.
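A sketch of the uncaught-exception failure mode (CWE-248) and the kind of guard that prevents it, assuming hypothetical helper names: compiling a client-supplied pattern without catching `re.error` lets one bad request take the worker down, whereas catching it turns the failure into a rejected request.

```python
import re

def build_guided_regex_unsafe(user_pattern: str):
    return re.compile(user_pattern)  # re.error propagates and kills the worker

def build_guided_regex_safe(user_pattern: str):
    try:
        return re.compile(user_pattern)
    except re.error as exc:
        # 0.9.0-style behaviour: reject the request instead of crashing.
        raise ValueError(f"invalid regex in structured output request: {exc}") from None

try:
    build_guided_regex_safe("[unclosed")  # an invalid pattern
except ValueError as e:
    print("rejected:", e)
```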
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.8.0, < 0.9.0
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-48943", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-30T18:54:23.948239Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-30T18:56:18.715Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.8.0, \u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). Version 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) that causes the vLLM server to crash if an invalid regex was provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-248", "description": "CWE-248: Uncaught Exception", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-30T18:37:17.548Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9hcf-v7m4-6m2j", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9hcf-v7m4-6m2j" }, { "name": "https://github.com/vllm-project/vllm/issues/17313", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/issues/17313" }, { "name": "https://github.com/vllm-project/vllm/pull/17623", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/17623" }, { "name": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff" } ], "source": { "advisory": "GHSA-9hcf-v7m4-6m2j", "discovery": "UNKNOWN" }, "title": "vLLM allows clients to crash the openai server with invalid regex" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-48943", "datePublished": "2025-05-30T18:36:01.519Z", "dateReserved": "2025-05-28T18:49:07.582Z", "dateUpdated": "2025-05-30T18:56:18.715Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-25183 (GCVE-0-2025-25183)
Vulnerability from cvelistv5
Published
2025-02-07 19:59
Modified
2025-02-12 20:51
Severity: 2.6 LOW (CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:N/I:L/A:N)
CWE
- CWE-354 - Improper Validation of Integrity Check Value
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python's built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try to exploit hash collisions. The impact of a collision would be the reuse of cache entries that were generated from different content. Given knowledge of prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2 and all users are advised to upgrade. There are no known workarounds for this vulnerability.
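A small sketch of the underlying issue and one content-based alternative. The cache-key shape shown here is an illustrative assumption, not vLLM's actual prefix-cache code; the point is that a key built on Python's `hash()` of a tuple containing `None` is predictable on Python 3.12+, while a cryptographic digest of the serialized content is not reliant on `hash()` at all.

```python
import hashlib
import pickle
import sys

block_tokens = (101, 102, 103)
prev_block_hash = None  # first block in a sequence has no parent

# Predictable on Python 3.12+, where hash(None) is a fixed constant.
predictable_key = hash((prev_block_hash, block_tokens))
print(sys.version_info[:2], "hash(None) =", hash(None))

# Content-based digest: depends only on the serialized tuple contents.
digest_key = hashlib.sha256(pickle.dumps((prev_block_hash, block_tokens))).hexdigest()
print("content digest:", digest_key[:16], "...")
```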
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | < 0.7.2
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-25183", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-02-07T20:33:57.205558Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-02-12T20:51:46.402Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003c 0.7.2" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python\u0027s built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try exploit hash collisions. The impact of a collision would be using cache that was generated using different content. Given knowledge of prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2 and all users are advised to upgrade. There are no known workarounds for this vulnerability." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "NONE", "baseScore": 2.6, "baseSeverity": "LOW", "confidentialityImpact": "NONE", "integrityImpact": "LOW", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:N/I:L/A:N", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-354", "description": "CWE-354: Improper Validation of Integrity Check Value", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-02-07T19:59:01.370Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rm76-4mrf-v9r8", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rm76-4mrf-v9r8" }, { "name": "https://github.com/vllm-project/vllm/pull/12621", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/12621" }, { "name": "https://github.com/python/cpython/commit/432117cd1f59c76d97da2eaff55a7d758301dbc7", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/python/cpython/commit/432117cd1f59c76d97da2eaff55a7d758301dbc7" } ], "source": { "advisory": "GHSA-rm76-4mrf-v9r8", "discovery": "UNKNOWN" }, "title": "vLLM using built-in hash() from Python 3.12 leads to predictable hash collisions in vLLM prefix cache" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-25183", "datePublished": "2025-02-07T19:59:01.370Z", "dateReserved": "2025-02-03T19:30:53.399Z", "dateUpdated": "2025-02-12T20:51:46.402Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-30202 (GCVE-0-2025-30202)
Vulnerability from cvelistv5
Published
2025-04-30 00:24
Modified
2025-04-30 13:16
Severity: 7.5 HIGH (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-770 - Allocation of Resources Without Limits or Throttling
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By opening many connections to this socket and not reading the data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.
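A minimal pyzmq sketch of the exposure: an XPUB socket bound to all interfaces is reachable by any host with network access, so arbitrary clients can subscribe (data exposure) or connect repeatedly without reading (slow-subscriber denial of service), while binding to a specific internal interface limits the surface. Port numbers are illustrative and this is not vLLM's actual setup code.

```python
import zmq

ctx = zmq.Context.instance()

exposed = ctx.socket(zmq.XPUB)
exposed.bind("tcp://*:55555")             # reachable from every interface

restricted = ctx.socket(zmq.XPUB)
restricted.bind("tcp://127.0.0.1:55556")  # only local / interconnect traffic

exposed.close()
restricted.close()
ctx.term()
```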
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.5.2, < 0.8.5
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-30202", "options": [ { "Exploitation": "none" }, { "Automatable": "yes" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-04-30T13:16:29.868734Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-04-30T13:16:43.914Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.5.2, \u003c 0.8.5" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5." 
} ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-770", "description": "CWE-770: Allocation of Resources Without Limits or Throttling", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-04-30T00:24:45.723Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9f8f-2vmf-885j", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9f8f-2vmf-885j" }, { "name": "https://github.com/vllm-project/vllm/pull/6183", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/6183" }, { "name": "https://github.com/vllm-project/vllm/commit/a0304dc504c85f421d38ef47c64f83046a13641c", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/a0304dc504c85f421d38ef47c64f83046a13641c" } ], "source": { "advisory": "GHSA-9f8f-2vmf-885j", "discovery": "UNKNOWN" }, "title": "Data exposure via ZeroMQ on multi-node vLLM deployment" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-30202", "datePublished": "2025-04-30T00:24:45.723Z", "dateReserved": "2025-03-18T18:15:13.849Z", "dateUpdated": "2025-04-30T13:16:43.914Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2024-11041 (GCVE-0-2024-11041)
Vulnerability from cvelistv5
Published
2025-03-20 10:10
Modified
2025-03-20 18:18
Severity: 9.8 CRITICAL (CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to deserialize data received from the socket directly, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim's machine to execute arbitrary code.
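One common hardening sketch for this class of bug when a pickle-based wire format cannot be replaced outright: a restricted `Unpickler` that refuses to resolve any global, so payloads that rely on `__reduce__` to import callables fail to load. This is illustrative only, not vLLM's MessageQueue code, and switching to a non-executable format (e.g. JSON or msgpack) is the safer fix.

```python
import io
import pickle

class NoGlobalsUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def safe_loads(data: bytes):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()

print(safe_loads(pickle.dumps({"op": "dequeue", "seq": 7})))  # plain data: ok

class Exploit:
    def __reduce__(self):
        return (print, ("should never run",))

try:
    safe_loads(pickle.dumps(Exploit()))
except pickle.UnpicklingError as e:
    print("blocked:", e)
```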
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm-project/vllm | unspecified, <= latest
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2024-11041", "options": [ { "Exploitation": "poc" }, { "Automatable": "yes" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-03-20T17:51:10.653784Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-03-20T18:18:48.224Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm-project/vllm", "vendor": "vllm-project", "versions": [ { "lessThanOrEqual": "latest", "status": "affected", "version": "unspecified", "versionType": "custom" } ] } ], "descriptions": [ { "lang": "en", "value": "vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to parse received sockets directly, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim\u0027s machine to execute arbitrary code." } ], "metrics": [ { "cvssV3_0": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.8, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "version": "3.0" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502 Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-03-20T10:10:40.868Z", "orgId": "c09c270a-b464-47c1-9133-acb35b22c19a", "shortName": "@huntr_ai" }, "references": [ { "url": "https://huntr.com/bounties/00136195-11e0-4ad0-98d5-72db066e867f" } ], "source": { "advisory": "00136195-11e0-4ad0-98d5-72db066e867f", "discovery": "EXTERNAL" }, "title": "Remote Code Execution in vllm-project/vllm" } }, "cveMetadata": { "assignerOrgId": "c09c270a-b464-47c1-9133-acb35b22c19a", "assignerShortName": "@huntr_ai", "cveId": "CVE-2024-11041", "datePublished": "2025-03-20T10:10:40.868Z", "dateReserved": "2024-11-09T04:47:57.295Z", "dateUpdated": "2025-03-20T18:18:48.224Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-46722 (GCVE-0-2025-46722)
Vulnerability from cvelistv5
Published
2025-05-29 16:36
Modified
2025-05-29 18:13
Severity: 4.2 MEDIUM (CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L)
CWE
- CWE-1288 - Improper Validation of Consistency within Input
- CWE-1023 - Incomplete Comparison with Missing Factors
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image’s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0.
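A short sketch of the collision: two images with different shapes but identical raw pixel bytes hash the same if only `tobytes()` is fed to the hasher, while including mode and size in the digest (conceptually what the fix does) disambiguates them. Requires Pillow; this is not vLLM's MultiModalHasher code, and the metadata encoding shown is an assumption.

```python
import hashlib
from PIL import Image

a = Image.new("L", (30, 100), color=0)
b = Image.new("L", (100, 30), color=0)

def pixels_only(im):
    return hashlib.sha256(im.tobytes()).hexdigest()

def with_metadata(im):
    return hashlib.sha256(f"{im.mode}:{im.size}".encode() + im.tobytes()).hexdigest()

print(pixels_only(a) == pixels_only(b))      # True  -> collision
print(with_metadata(a) == with_metadata(b))  # False -> distinct
```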
References
► | URL | Tags | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
Impacted products
Vendor | Product | Version | ||
---|---|---|---|---|
vllm-project | vllm |
Version: >= 0.7.0, < 0.9.0 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-46722", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-29T18:12:29.713264Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-29T18:13:02.824Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.7.0, \u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image\u2019s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "LOW", "baseScore": 4.2, "baseSeverity": "MEDIUM", "confidentialityImpact": "LOW", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-1288", "description": "CWE-1288: Improper Validation of Consistency within Input", "lang": "en", "type": "CWE" } ] }, { "descriptions": [ { "cweId": "CWE-1023", "description": "CWE-1023: Incomplete Comparison with Missing Factors", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-29T16:36:12.879Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-c65p-x677-fgj6", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-c65p-x677-fgj6" }, { "name": "https://github.com/vllm-project/vllm/pull/17378", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/17378" }, { "name": "https://github.com/vllm-project/vllm/commit/99404f53c72965b41558aceb1bc2380875f5d848", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/99404f53c72965b41558aceb1bc2380875f5d848" } ], "source": { "advisory": "GHSA-c65p-x677-fgj6", "discovery": "UNKNOWN" }, "title": "vLLM has a Weakness in MultiModalHasher Image Hashing Implementation" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-46722", "datePublished": "2025-05-29T16:36:12.879Z", "dateReserved": "2025-04-28T20:56:09.084Z", "dateUpdated": "2025-05-29T18:13:02.824Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-48942 (GCVE-0-2025-48942)
Vulnerability from cvelistv5
Published
2025-05-30 18:33
Modified
2025-05-30 20:37
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-248 - Uncaught Exception
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for a JSON schema instead of a regex. Version 0.9.0 fixes the issue.
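A sketch of the missing up-front check behind this CWE-248 report: validating a client-supplied guided-decoding schema before anything compiles it, and turning failures into a rejected request instead of an exception that kills the worker. It uses the third-party jsonschema package purely for illustration; the helper name is an assumption and this is not where vLLM's actual fix lives.

```python
import jsonschema

def validate_guided_json(schema) -> None:
    try:
        jsonschema.Draft202012Validator.check_schema(schema)
    except jsonschema.SchemaError as exc:
        raise ValueError(f"invalid guided_json schema: {exc.message}") from None

validate_guided_json({"type": "object", "properties": {"x": {"type": "integer"}}})  # ok
try:
    validate_guided_json({"type": "not-a-real-type"})
except ValueError as e:
    print("rejected:", e)
```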
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.8.0, < 0.9.0
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-48942", "options": [ { "Exploitation": "poc" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-30T20:36:50.679547Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-30T20:37:06.015Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.8.0, \u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with a invalid json_schema as a Guided Param kills the vllm server. This vulnerability is similar GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-248", "description": "CWE-248: Uncaught Exception", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-30T18:37:10.641Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6qc9-v4r8-22xg", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6qc9-v4r8-22xg" }, { "name": "https://github.com/vllm-project/vllm/issues/17248", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/issues/17248" }, { "name": "https://github.com/vllm-project/vllm/pull/17623", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/17623" }, { "name": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff" } ], "source": { "advisory": "GHSA-6qc9-v4r8-22xg", "discovery": "UNKNOWN" }, "title": "vLLM DOS: Remotely kill vllm over http with invalid JSON schema" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-48942", "datePublished": "2025-05-30T18:33:40.488Z", "dateReserved": "2025-05-28T18:49:07.581Z", "dateUpdated": "2025-05-30T20:37:06.015Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-48944 (GCVE-0-2025-48944)
Vulnerability from cvelistv5
Published
2025-05-30 18:38
Modified
2025-05-30 18:56
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-20 - Improper Input Validation
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue.
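A sketch of the kind of validation the advisory describes as missing: the "pattern" and "type" fields of a tool parameter schema are checked before anything is compiled or parsed. The field names come from the advisory text; the validator itself is an illustrative assumption, not vLLM code.

```python
import re

ALLOWED_TYPES = {"string", "number", "integer", "boolean", "object", "array", "null"}

def validate_tool_property(prop: dict) -> None:
    t = prop.get("type")
    if t is not None and t not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {t!r}")
    pattern = prop.get("pattern")
    if pattern is not None:
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"invalid pattern: {exc}") from None

validate_tool_property({"type": "string", "pattern": "^[a-z]+$"})  # ok
for bad in ({"type": 123}, {"pattern": "("}):
    try:
        validate_tool_property(bad)
    except ValueError as e:
        print("rejected:", e)
```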
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.8.0, < 0.9.0
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-48944", "options": [ { "Exploitation": "poc" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-30T18:56:49.162584Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-30T18:56:56.406Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.8.0, \u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the \"pattern\" and \"type\" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-20", "description": "CWE-20: Improper Input Validation", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-30T18:38:45.505Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vrq3-r879-7m65", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vrq3-r879-7m65" }, { "name": "https://github.com/vllm-project/vllm/pull/17623", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/17623" } ], "source": { "advisory": "GHSA-vrq3-r879-7m65", "discovery": "UNKNOWN" }, "title": "vLLM Tool Schema allows DoS via Malformed pattern and type Fields" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-48944", "datePublished": "2025-05-30T18:38:45.505Z", "dateReserved": "2025-05-28T18:49:07.582Z", "dateUpdated": "2025-05-30T18:56:56.406Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-46570 (GCVE-0-2025-46570)
Vulnerability from cvelistv5
Published
2025-05-29 16:32
Modified
2025-05-29 18:05
Severity: 2.6 LOW (CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:N/A:N)
CWE
- CWE-208 - Observable Timing Discrepancy
Summary
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0.
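A toy model of the side channel: when a prefix is already cached, prefill is faster, and that difference is visible in time-to-first-token. This is purely a simulation to show why per-chunk timing differences are observable; the chunk size, sleep durations, and threshold are arbitrary assumptions and the code does not talk to a vLLM server.

```python
import time

cache = set()

def prefill(prompt: str) -> float:
    """Return a simulated TTFT: cached prefixes skip most of the work."""
    start = time.perf_counter()
    chunk = prompt[:16]
    time.sleep(0.001 if chunk in cache else 0.02)  # cache hit vs full prefill
    cache.add(chunk)
    return time.perf_counter() - start

secret_prompt = "internal system prompt: do not reveal"
prefill(secret_prompt)  # victim request warms the cache

for guess in ["internal system p", "totally unrelated"]:
    ttft = prefill(guess)
    status = "prefix cached" if ttft < 0.01 else "cold"
    print(f"{guess!r:24} TTFT={ttft * 1000:5.1f} ms -> {status}")
```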
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | < 0.9.0
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-46570", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-05-29T18:04:57.706360Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-29T18:05:10.768Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003c 0.9.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "NONE", "baseScore": 2.6, "baseSeverity": "LOW", "confidentialityImpact": "LOW", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:N/A:N", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-208", "description": "CWE-208: Observable Timing Discrepancy", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-29T16:32:42.794Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r" }, { "name": "https://github.com/vllm-project/vllm/pull/17045", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/17045" }, { "name": "https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f" } ], "source": { "advisory": "GHSA-4qjh-9fv9-r85r", "discovery": "UNKNOWN" }, "title": "vLLM\u2019s Chunk-Based Prefix Caching Vulnerable to Potential Timing Side-Channel" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-46570", "datePublished": "2025-05-29T16:32:42.794Z", "dateReserved": "2025-04-24T21:10:48.175Z", "dateUpdated": "2025-05-29T18:05:10.768Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-32444 (GCVE-0-2025-32444)
Vulnerability from cvelistv5
Published
2025-04-30 00:25
Modified
2025-04-30 13:08
Severity: 10.0 CRITICAL (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H)
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5 that use the vLLM integration with Mooncake are vulnerable to remote code execution due to using pickle-based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the Mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.
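The exploitation mechanics are the same as in CVE-2025-29783 above (untrusted `pickle.loads`). As a complementary sketch, the snippet below shows the general serializer swap that removes the code-execution primitive: plain metadata can round-trip through a data-only format such as JSON, whose worst case on malicious input is a parse error. The field names are illustrative assumptions, not Mooncake's actual message schema.

```python
import json
import pickle

message = {"request_id": "abc123", "kv_block_ids": [4, 8, 15], "layer": 2}

wire_unsafe = pickle.dumps(message)   # pickle.loads() of attacker bytes can run code
wire_safe = json.dumps(message).encode()  # json.loads() can only yield data or raise

assert json.loads(wire_safe) == message
print("round-tripped safely:", json.loads(wire_safe))
```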
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.6.5, < 0.8.5
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-32444", "options": [ { "Exploitation": "none" }, { "Automatable": "yes" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-04-30T13:08:21.425422Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-04-30T13:08:35.928Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.6.5, \u003c 0.8.5" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 10, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "CHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502: Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-04-30T00:25:00.655Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5" }, { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7" }, { "name": "https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c" }, { "name": "https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179" } ], "source": { "advisory": "GHSA-hj4w-hm2g-p6w5", "discovery": "UNKNOWN" }, "title": "vLLM Vulnerable to Remote Code Execution via Mooncake Integration" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-32444", "datePublished": "2025-04-30T00:25:00.655Z", "dateReserved": "2025-04-08T10:54:58.369Z", "dateUpdated": "2025-04-30T13:08:35.928Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-30165 (GCVE-0-2025-30165)
Vulnerability from cvelistv5
Published
2025-05-06 16:53
Modified
2025-05-06 17:26
Severity: 8.0 HIGH (CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H)
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a `SUB` ZeroMQ socket and connect to an `XPUB` socket on the primary vLLM host. When data is received on this `SUB` socket, it is deserialized with `pickle`. This is unsafe, as it can be abused to execute code on a remote machine. Since the vulnerability exists in a client that connects to the primary vLLM host, this vulnerability serves as an escalation point. If the primary vLLM host is compromised, this vulnerability could be used to compromise the rest of the hosts in the vLLM deployment. Attackers could also use other means to exploit the vulnerability without requiring access to the primary vLLM host. One example would be the use of ARP cache poisoning to redirect traffic to a malicious endpoint used to deliver a payload with arbitrary code to execute on the target machine. Note that this issue only affects the V0 engine, which has been off by default since v0.8.0. Further, the issue only applies to a deployment using tensor parallelism across multiple hosts, which we do not expect to be a common deployment pattern. Since V0 has been off by default since v0.8.0 and the fix is fairly invasive, the maintainers of vLLM have decided not to fix this issue. Instead, the maintainers recommend that users ensure their environment is on a secure network in case this pattern is in use. The V1 engine is not affected by this issue.
References
Impacted products
Vendor | Product | Version
---|---|---
vllm-project | vllm | >= 0.5.2, <= 0.8.5.post1
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-30165", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-05-06T17:22:47.717996Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-06T17:26:58.974Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.5.2, \u003c= 0.8.5.post1" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a `SUB` ZeroMQ socket and connect to an `XPUB` socket on the primary vLLM host. When data is received on this `SUB` socket, it is deserialized with `pickle`. This is unsafe, as it can be abused to execute code on a remote machine. Since the vulnerability exists in a client that connects to the primary vLLM host, this vulnerability serves as an escalation point. If the primary vLLM host is compromised, this vulnerability could be used to compromise the rest of the hosts in the vLLM deployment. Attackers could also use other means to exploit the vulnerability without requiring access to the primary vLLM host. One example would be the use of ARP cache poisoning to redirect traffic to a malicious endpoint used to deliver a payload with arbitrary code to execute on the target machine. Note that this issue only affects the V0 engine, which has been off by default since v0.8.0. Further, the issue only applies to a deployment using tensor parallelism across multiple hosts, which we do not expect to be a common deployment pattern. Since V0 is has been off by default since v0.8.0 and the fix is fairly invasive, the maintainers of vLLM have decided not to fix this issue. Instead, the maintainers recommend that users ensure their environment is on a secure network in case this pattern is in use. The V1 engine is not affected by this issue." 
} ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "ADJACENT_NETWORK", "availabilityImpact": "HIGH", "baseScore": 8, "baseSeverity": "HIGH", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502: Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-06T16:53:52.836Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm" }, { "name": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L295-L301", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L295-L301" }, { "name": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L468-L470", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L468-L470" } ], "source": { "advisory": "GHSA-9pcc-gvx5-r5wm", "discovery": "UNKNOWN" }, "title": "Remote Code Execution Vulnerability in vLLM Multi-Node Cluster Configuration" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-30165", "datePublished": "2025-05-06T16:53:52.836Z", "dateReserved": "2025-03-17T12:41:42.567Z", "dateUpdated": "2025-05-06T17:26:58.974Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-46560 (GCVE-0-2025-46560)
Vulnerability from cvelistv5
Published
2025-04-30 00:24
Modified
2025-04-30 13:09
Severity: 6.5 MEDIUM (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
CWE
- CWE-1333 - Inefficient Regular Expression Complexity
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.
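The quadratic blow-up described above is the classic repeated-list-concatenation anti-pattern. A minimal sketch of the pattern and its linear-time fix (this is NOT the actual phi4mm preprocessing code, only an illustration of the failure class):

```python
# Illustrative sketch of the O(n^2) pattern described above and its O(n) fix;
# this is NOT the vLLM phi4mm code, just the general anti-pattern.
import time

def expand_quadratic(tokens, placeholder, repeat_len):
    out = []
    for tok in tokens:
        if tok == placeholder:
            out = out + [0] * repeat_len   # `+` copies the whole list each time
        else:
            out = out + [tok]
    return out

def expand_linear(tokens, placeholder, repeat_len):
    out = []
    for tok in tokens:
        if tok == placeholder:
            out.extend([0] * repeat_len)   # amortized O(1) per appended token
        else:
            out.append(tok)
    return out

if __name__ == "__main__":
    prompt = [-1] * 500 + [1] * 500        # -1 stands in for a placeholder token
    for fn in (expand_quadratic, expand_linear):
        start = time.perf_counter()
        fn(prompt, placeholder=-1, repeat_len=200)
        print(f"{fn.__name__}: {time.perf_counter() - start:.3f}s")
```

Crafted inputs containing many placeholder tokens push the quadratic variant toward resource exhaustion, which is the denial-of-service vector the advisory describes.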
References
Impacted products
| Vendor | Product | Version |
|---|---|---|
| vllm-project | vllm | >= 0.8.0, < 0.8.5 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-46560", "options": [ { "Exploitation": "poc" }, { "Automatable": "no" }, { "Technical Impact": "partial" } ], "role": "CISA Coordinator", "timestamp": "2025-04-30T13:09:10.349287Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-04-30T13:09:13.422Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "references": [ { "tags": [ "exploit" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg" } ], "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.8.0, \u003c 0.8.5" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_|\u003e, \u003c|image_|\u003e) with repeated tokens based on precomputed lengths. Due to \u200b\u200binefficient list concatenation operations\u200b\u200b, the algorithm exhibits \u200b\u200bquadratic time complexity (O(n\u00b2))\u200b\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-1333", "description": "CWE-1333: Inefficient Regular Expression Complexity", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-04-30T00:24:53.750Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg" }, { "name": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197" } ], "source": { "advisory": "GHSA-vc6m-hm49-g9qg", "discovery": "UNKNOWN" }, "title": "vLLM phi4mm: Quadratic Time Complexity in Input Token Processing\u200b leads to denial of service" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-46560", "datePublished": "2025-04-30T00:24:53.750Z", "dateReserved": "2025-04-24T21:10:48.174Z", "dateUpdated": "2025-04-30T13:09:13.422Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-24357 (GCVE-0-2025-24357)
Vulnerability from cvelistv5
Published
2025-01-27 17:38
Modified
2025-02-12 20:41
Severity ?
VLAI Severity ?
EPSS score ?
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0.
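For background on the description above: pickle-based checkpoint loading executes code at deserialization time, and `torch.load(..., weights_only=True)` (available since PyTorch 1.13) restricts what can be reconstructed. A self-contained, hedged sketch of the risk class, not vLLM's loader:

```python
# Minimal sketch of the risk class described above, not the vLLM code path.
# A pickle payload can run code at load time via __reduce__; torch.load with
# weights_only=True (available since PyTorch 1.13) rejects such objects.
import pickle
import torch

class Payload:
    def __reduce__(self):
        # A real attack would invoke os.system / subprocess here.
        return (print, ("arbitrary code executed during unpickling",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # prints -> code ran just by deserializing the blob

# Benign checkpoint round-trip with the hardened flag:
torch.save({"weight": torch.zeros(2, 2)}, "demo_ckpt.pt")
state = torch.load("demo_ckpt.pt", map_location="cpu", weights_only=True)
print(state["weight"].shape)
```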
References
Impacted products
| Vendor | Product | Version |
|---|---|---|
| vllm-project | vllm | < 0.7.0 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-24357", "options": [ { "Exploitation": "none" }, { "Automatable": "no" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-01-27T18:20:10.107207Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-02-12T20:41:36.324Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003c 0.7.0" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0." } ], "metrics": [ { "cvssV3_1": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502: Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-01-27T17:38:20.070Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54" }, { "name": "https://github.com/vllm-project/vllm/pull/12366", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/12366" }, { "name": "https://github.com/vllm-project/vllm/commit/d3d6bb13fb62da3234addf6574922a4ec0513d04", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/d3d6bb13fb62da3234addf6574922a4ec0513d04" }, { "name": "https://pytorch.org/docs/stable/generated/torch.load.html", "tags": [ "x_refsource_MISC" ], "url": "https://pytorch.org/docs/stable/generated/torch.load.html" } ], "source": { "advisory": "GHSA-rh4j-5rhw-hr54", "discovery": "UNKNOWN" }, "title": "vLLM allows a malicious model RCE by torch.load in hf_model_weights_iterator" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-24357", "datePublished": "2025-01-27T17:38:20.070Z", "dateReserved": "2025-01-20T15:18:26.988Z", "dateUpdated": "2025-02-12T20:41:36.324Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
CVE-2025-47277 (GCVE-0-2025-47277)
Vulnerability from cvelistv5
Published
2025-05-20 17:32
Modified
2025-05-20 17:52
Severity ?
VLAI Severity ?
EPSS score ?
CWE
- CWE-502 - Deserialization of Untrusted Data
Summary
vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected. vLLM supports the use of the `PyNcclPipe` class to establish a peer-to-peer communication domain for data transmission between distributed nodes. The GPU-side KV-Cache transmission is implemented through the `PyNcclCommunicator` class, while CPU-side control message passing is handled via the `send_obj` and `recv_obj` methods. The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter. The vLLM documentation covers how this must be limited to a secured network. The default and intentional behavior from PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided; the IP address given was only used as the client-side connection address. vLLM was fixed with a workaround that forces the `TCPStore` instance to bind its socket to a specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured.
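To make the bind behavior concrete, the following sketch (assuming PyTorch is installed and a Linux host with `ss` available; this is not the vLLM fix itself) constructs a `TCPStore` with a specific address and then inspects the actual listener, which per the advisory ends up on all interfaces on affected versions:

```python
# Sketch of the bind behavior described above (assumes PyTorch is installed
# and a Linux host with `ss` available); not the vLLM fix itself.
import subprocess
from datetime import timedelta

from torch.distributed import TCPStore

# The host argument is used as the address clients connect to; per the
# advisory, the listener itself ends up on all interfaces.
store = TCPStore("127.0.0.1", 29999, world_size=1, is_master=True,
                 timeout=timedelta(seconds=10), wait_for_workers=False)

# Inspect the actual listener; expect 0.0.0.0:29999 or [::]:29999.
print(subprocess.run(["ss", "-ltn", "sport", "=", ":29999"],
                     capture_output=True, text=True).stdout)
```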
References
Impacted products
| Vendor | Product | Version |
|---|---|---|
| vllm-project | vllm | >= 0.6.5, < 0.8.5 |
{ "containers": { "adp": [ { "metrics": [ { "other": { "content": { "id": "CVE-2025-47277", "options": [ { "Exploitation": "none" }, { "Automatable": "yes" }, { "Technical Impact": "total" } ], "role": "CISA Coordinator", "timestamp": "2025-05-20T17:52:22.643444Z", "version": "2.0.3" }, "type": "ssvc" } } ], "providerMetadata": { "dateUpdated": "2025-05-20T17:52:31.274Z", "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "shortName": "CISA-ADP" }, "title": "CISA ADP Vulnrichment" } ], "cna": { "affected": [ { "product": "vllm", "vendor": "vllm-project", "versions": [ { "status": "affected", "version": "\u003e= 0.6.5, \u003c 0.8.5" } ] } ], "descriptions": [ { "lang": "en", "value": "vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected. vLLM supports the use of the\u00a0`PyNcclPipe`\u00a0class to establish a peer-to-peer communication domain for data transmission between distributed nodes. The GPU-side KV-Cache transmission is implemented through the\u00a0`PyNcclCommunicator`\u00a0class, while CPU-side control message passing is handled via the\u00a0`send_obj`\u00a0and\u00a0`recv_obj`\u00a0methods on the CPU side.\u200b The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter. The vLLM documentation covers how this must be limited to a secured network. The default and intentional behavior from PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided. The IP address given was only used as a client-side address to use. vLLM was fixed to use a workaround to force the `TCPStore` instance to bind its socket to a specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured." 
} ], "metrics": [ { "cvssV3_1": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.8, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "version": "3.1" } } ], "problemTypes": [ { "descriptions": [ { "cweId": "CWE-502", "description": "CWE-502: Deserialization of Untrusted Data", "lang": "en", "type": "CWE" } ] } ], "providerMetadata": { "dateUpdated": "2025-05-20T17:32:27.034Z", "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "shortName": "GitHub_M" }, "references": [ { "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hjq4-87xh-g4fv", "tags": [ "x_refsource_CONFIRM" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hjq4-87xh-g4fv" }, { "name": "https://github.com/vllm-project/vllm/pull/15988", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/pull/15988" }, { "name": "https://github.com/vllm-project/vllm/commit/0d6e187e88874c39cda7409cf673f9e6546893e7", "tags": [ "x_refsource_MISC" ], "url": "https://github.com/vllm-project/vllm/commit/0d6e187e88874c39cda7409cf673f9e6546893e7" }, { "name": "https://docs.vllm.ai/en/latest/deployment/security.html", "tags": [ "x_refsource_MISC" ], "url": "https://docs.vllm.ai/en/latest/deployment/security.html" } ], "source": { "advisory": "GHSA-hjq4-87xh-g4fv", "discovery": "UNKNOWN" }, "title": "vLLM Allows Remote Code Execution via PyNcclPipe Communication Service" } }, "cveMetadata": { "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa", "assignerShortName": "GitHub_M", "cveId": "CVE-2025-47277", "datePublished": "2025-05-20T17:32:27.034Z", "dateReserved": "2025-05-05T16:53:10.373Z", "dateUpdated": "2025-05-20T17:52:31.274Z", "state": "PUBLISHED" }, "dataType": "CVE_RECORD", "dataVersion": "5.1" }
Vulnerability from fkie_nvd
Published
2025-04-30 01:15
Modified
2025-05-28 19:12
Severity ?
10.0 (Critical) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H
9.8 (Critical) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.
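The exposure above follows from using `pickle` for bytes received off the network. A minimal sketch (assuming `pyzmq` is installed; this is not the Mooncake integration code) contrasting a data-only format with the vulnerable pattern:

```python
# Illustrative sketch, not the Mooncake/vLLM code: calling pickle.loads() on
# bytes received from a network socket hands code execution to whoever can
# reach that socket. A data-only format such as JSON avoids that bug class
# for simple control messages. Requires pyzmq.
import json
import zmq

ctx = zmq.Context.instance()
receiver = ctx.socket(zmq.PULL)
receiver.bind("tcp://127.0.0.1:5555")   # bind to loopback, never 0.0.0.0
sender = ctx.socket(zmq.PUSH)
sender.connect("tcp://127.0.0.1:5555")

sender.send(json.dumps({"op": "put", "key": "kv_chunk_0"}).encode())
msg = json.loads(receiver.recv())       # data only, no object construction
print(msg)

# The vulnerable pattern is the same transport with:
#   obj = pickle.loads(receiver.recv())  # executes attacker-supplied reducers
```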
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179 | Product |
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5 | Exploit, Vendor Advisory |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7 | Not Applicable |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "24BAE45E-0FCF-4E74-953A-88F12E093C0F", "versionEndExcluding": "0.8.5", "versionStartIncluding": "0.6.5", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en memoria para LLM. Las versiones a partir de la 0.6.5 y anteriores a la 0.8.5, que integran vLLM con mooncake, son vulnerables a la ejecuci\u00f3n remota de c\u00f3digo debido al uso de serializaci\u00f3n basada en pickle sobre sockets ZeroMQ no seguros. Los sockets vulnerables estaban configurados para escuchar en todas las interfaces de red, lo que aumenta la probabilidad de que un atacante pueda acceder a los sockets ZeroMQ vulnerables para ejecutar un ataque. Las instancias de vLLM que no utilizan la integraci\u00f3n con mooncake no son vulnerables. Este problema se ha corregido en la versi\u00f3n 0.8.5." } ], "id": "CVE-2025-32444", "lastModified": "2025-05-28T19:12:58.377", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 10.0, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "CHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 6.0, "source": "security-advisories@github.com", "type": "Secondary" }, { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.8, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 5.9, "source": "nvd@nist.gov", "type": "Primary" } ] }, "published": "2025-04-30T01:15:51.953", "references": [ { "source": "security-advisories@github.com", "tags": [ "Product" ], "url": "https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179" }, { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5" }, { "source": "security-advisories@github.com", "tags": [ "Not Applicable" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7" } ], "sourceIdentifier": 
"security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-29 17:15
Modified
2025-06-24 18:25
Severity ?
Summary
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0.
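For context on why the timing difference is observable, time-to-first-token can be measured directly from a streaming request against an OpenAI-compatible endpoint. A hedged sketch follows; the URL and model name are placeholders and the `requests` dependency is an assumption:

```python
# Hedged sketch of measuring time-to-first-token (TTFT) against an
# OpenAI-compatible endpoint; URL and model name are placeholders. It only
# illustrates why a prefix-cache hit is observable from outside.
import time
import requests

def ttft(prompt: str, base_url: str = "http://localhost:8000") -> float:
    start = time.perf_counter()
    with requests.post(
        f"{base_url}/v1/completions",
        json={"model": "my-model", "prompt": prompt,
              "max_tokens": 1, "stream": True},
        stream=True,
        timeout=30,
    ) as resp:
        for _ in resp.iter_lines():
            return time.perf_counter() - start  # time until the first chunk
    return float("inf")

# A prompt whose prefix is already cached by the server typically returns its
# first token measurably faster than a cold prompt of the same length.
```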
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/17045 | Issue Tracking, Vendor Advisory |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "A8F1E19D-D7C6-477D-B737-277EF3E3F20F", "versionEndExcluding": "0.9.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0." }, { "lang": "es", "value": "vLLM es un motor de inferencia y entrega para modelos de lenguaje grandes (LLM). Antes de la versi\u00f3n 0.9.0, al procesar una nueva solicitud, si el mecanismo PageAttention encuentra un fragmento de prefijo coincidente, el proceso de precompletado se acelera, lo que se refleja en el TTFT (Tiempo hasta el Primer Token). Estas diferencias de tiempo causadas por la coincidencia de fragmentos son lo suficientemente significativas como para ser detectadas y explotadas. Este problema se ha corregido en la versi\u00f3n 0.9.0." } ], "id": "CVE-2025-46570", "lastModified": "2025-06-24T18:25:31.883", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "NONE", "baseScore": 2.6, "baseSeverity": "LOW", "confidentialityImpact": "LOW", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:N/A:N", "version": "3.1" }, "exploitabilityScore": 1.2, "impactScore": 1.4, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-29T17:15:21.327", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/77073c77bc2006eb80ea6d5128f076f5e6c6f54f" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/pull/17045" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-4qjh-9fv9-r85r" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-208" } ], "source": "security-advisories@github.com", "type": "Primary" }, { "description": [ { "lang": "en", "value": "CWE-203" } ], "source": "nvd@nist.gov", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-01-27 18:15
Modified
2025-06-27 19:30
Severity ?
7.5 (High) - CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H
8.8 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
Summary
vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0.
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/d3d6bb13fb62da3234addf6574922a4ec0513d04 | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/12366 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54 | Vendor Advisory |
| security-advisories@github.com | https://pytorch.org/docs/stable/generated/torch.load.html | Technical Description |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "78210BFE-5D31-4D84-BA73-75C1594A3A3C", "versionEndExcluding": "0.7.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0." }, { "lang": "es", "value": "vLLM es una librer\u00eda para la inferencia y el servicio de LLM. vllm/model_executor/weight_utils.py implementa hf_model_weights_iterator para cargar el punto de control del modelo, que se descarga desde huggingface. Utiliza la funci\u00f3n Torch.load y el par\u00e1metro weights_only tiene el valor predeterminado Falso. Cuando Torch.load carga datos pickle maliciosos, ejecutar\u00e1 c\u00f3digo arbitrario durante el desensamblaje. Esta vulnerabilidad se corrigi\u00f3 en la versi\u00f3n v0.7.0." } ], "id": "CVE-2025-24357", "lastModified": "2025-06-27T19:30:59.223", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 1.6, "impactScore": 5.9, "source": "security-advisories@github.com", "type": "Secondary" }, { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 8.8, "baseSeverity": "HIGH", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 5.9, "source": "nvd@nist.gov", "type": "Primary" } ] }, "published": "2025-01-27T18:15:41.523", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/d3d6bb13fb62da3234addf6574922a4ec0513d04" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/12366" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rh4j-5rhw-hr54" }, { "source": "security-advisories@github.com", "tags": [ "Technical Description" ], "url": "https://pytorch.org/docs/stable/generated/torch.load.html" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-03-19 16:15
Modified
2025-07-01 20:52
Severity ?
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0.
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2 | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/14228 | Issue Tracking, Vendor Advisory |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7 | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "090A99DD-3EC3-40E2-9615-1DFDCF3B8A6A", "versionEndExcluding": "0.8.0", "versionStartIncluding": "0.6.5", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en el uso de memoria para LLM. Cuando vLLM se configura para usar Mooncake, la deserializaci\u00f3n insegura expuesta directamente a trav\u00e9s de ZMQ/TCP en todas las interfaces de red permitir\u00e1 a los atacantes ejecutar c\u00f3digo remoto en hosts distribuidos. Esta vulnerabilidad de ejecuci\u00f3n remota de c\u00f3digo afecta a cualquier implementaci\u00f3n que use Mooncake para distribuir KV entre hosts distribuidos. Esta vulnerabilidad se corrigi\u00f3 en la versi\u00f3n 0.8.0." } ], "id": "CVE-2025-29783", "lastModified": "2025-07-01T20:52:17.273", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "ADJACENT_NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.0, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "LOW", "scope": "CHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 2.3, "impactScore": 6.0, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-03-19T16:15:32.477", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/pull/14228" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security-advisories@github.com", "type": "Secondary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-30 19:15
Modified
2025-06-24 17:44
Severity ?
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param crashes the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but involves a JSON schema instead of a regex. Version 0.9.0 fixes the issue.
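A hedged sketch of the request shape involved follows. The `guided_json` extension parameter and the endpoint reflect vLLM's OpenAI-compatible server, but treat the exact field names as an assumption; on affected versions a malformed schema could take down the whole server instead of yielding an HTTP 400:

```python
# Hedged sketch only; field names are an assumption, not a verified reproducer.
import requests  # assumption: requests is installed

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "my-model",
        "prompt": "List three fruits as JSON.",
        "max_tokens": 64,
        # Malformed schema: "type" must be a string such as "object".
        "guided_json": {"type": 123, "properties": "not-an-object"},
    },
    timeout=30,
)
print(resp.status_code, resp.text[:200])
```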
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/issues/17248 | Issue Tracking |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/17623 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-6qc9-v4r8-22xg | Exploit, Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "8E26EB9F-426B-4A73-B8CA-2D9F3727AF2C", "versionEndExcluding": "0.9.0", "versionStartIncluding": "0.8.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with a invalid json_schema as a Guided Param kills the vllm server. This vulnerability is similar GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio para modelos de lenguaje grandes (LLM). En las versiones 0.8.0 y 0.9.0, excepto esta, al acceder a la API /v1/completions con un json_schema no v\u00e1lido como par\u00e1metro guiado, se desactiva el servidor vllm. Esta vulnerabilidad es similar a la vulnerabilidad GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, pero para expresiones regulares en lugar de un esquema JSON. La versi\u00f3n 0.9.0 corrige el problema." } ], "id": "CVE-2025-48942", "lastModified": "2025-06-24T17:44:47.737", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-30T19:15:30.130", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking" ], "url": "https://github.com/vllm-project/vllm/issues/17248" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/17623" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6qc9-v4r8-22xg" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-248" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-20 18:15
Modified
2025-08-13 16:35
Severity ?
Summary
vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected. vLLM supports the use of the `PyNcclPipe` class to establish a peer-to-peer communication domain for data transmission between distributed nodes. The GPU-side KV-Cache transmission is implemented through the `PyNcclCommunicator` class, while CPU-side control message passing is handled via the `send_obj` and `recv_obj` methods. The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter. The vLLM documentation covers how this must be limited to a secured network. The default and intentional behavior from PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided; the IP address given was only used as the client-side connection address. vLLM was fixed with a workaround that forces the `TCPStore` instance to bind its socket to a specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured.
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://docs.vllm.ai/en/latest/deployment/security.html | Technical Description |
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/0d6e187e88874c39cda7409cf673f9e6546893e7 | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/15988 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-hjq4-87xh-g4fv | Exploit, Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "24BAE45E-0FCF-4E74-953A-88F12E093C0F", "versionEndExcluding": "0.8.5", "versionStartIncluding": "0.6.5", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected. vLLM supports the use of the\u00a0`PyNcclPipe`\u00a0class to establish a peer-to-peer communication domain for data transmission between distributed nodes. The GPU-side KV-Cache transmission is implemented through the\u00a0`PyNcclCommunicator`\u00a0class, while CPU-side control message passing is handled via the\u00a0`send_obj`\u00a0and\u00a0`recv_obj`\u00a0methods on the CPU side.\u200b The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter. The vLLM documentation covers how this must be limited to a secured network. The default and intentional behavior from PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided. The IP address given was only used as a client-side address to use. vLLM was fixed to use a workaround to force the `TCPStore` instance to bind its socket to a specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured." }, { "lang": "es", "value": "vLLM, un motor de inferencia y servicio para modelos de lenguaje grandes (LLM), presenta un problema en las versiones 0.6.5 a 0.8.4 que SOLO afecta a entornos que utilizan la integraci\u00f3n de transferencia de cach\u00e9 KV `PyNcclPipe` con el motor V0. Ninguna otra configuraci\u00f3n se ve afectada. vLLM admite el uso de la clase `PyNcclPipe` para establecer un dominio de comunicaci\u00f3n punto a punto para la transmisi\u00f3n de datos entre nodos distribuidos. La transmisi\u00f3n de cach\u00e9 KV del lado de la GPU se implementa mediante la clase `PyNcclCommunicator`, mientras que el paso de mensajes de control del lado de la CPU se gestiona mediante los m\u00e9todos `send_obj` y `recv_obj` en el lado de la CPU. El objetivo era que esta interfaz solo se expusiera a una red privada utilizando la direcci\u00f3n IP especificada por el par\u00e1metro de CLI `--kv-ip`. La documentaci\u00f3n de vLLM explica c\u00f3mo esto debe limitarse a una red segura. El comportamiento predeterminado e intencional de PyTorch es que la interfaz `TCPStore` escucha en TODAS las interfaces, independientemente de la direcci\u00f3n IP proporcionada. La direcci\u00f3n IP proporcionada solo se usaba como direcci\u00f3n del cliente. vLLM se corrigi\u00f3 para usar una soluci\u00f3n alternativa que obligaba a la instancia `TCPStore` a vincular su socket a una interfaz privada espec\u00edfica. A partir de la versi\u00f3n 0.8.5, vLLM limita el socket `TCPStore` a la interfaz privada configurada." 
} ], "id": "CVE-2025-47277", "lastModified": "2025-08-13T16:35:57.357", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.8, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 5.9, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-20T18:15:46.730", "references": [ { "source": "security-advisories@github.com", "tags": [ "Technical Description" ], "url": "https://docs.vllm.ai/en/latest/deployment/security.html" }, { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/0d6e187e88874c39cda7409cf673f9e6546893e7" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/15988" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hjq4-87xh-g4fv" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-30 18:15
Modified
2025-06-19 00:55
Severity ?
Summary
vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue.
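The failure class here is textbook catastrophic backtracking. The pattern below is a standard demonstration regex, not the vLLM tool-call pattern; it shows how a nested quantifier makes match time explode on a near-miss input:

```python
# Textbook catastrophic-backtracking demo; NOT the vLLM tool-call pattern,
# only the same failure class (nested quantifiers forcing exponential retries).
import re
import time

evil = re.compile(r"^(a+)+$")       # nested quantifier
payload = "a" * 24 + "!"            # almost matches, so the engine backtracks

start = time.perf_counter()
evil.match(payload)
print(f"{time.perf_counter() - start:.2f}s for a {len(payload)}-char input")

# Mitigations include rewriting the pattern to remove nesting, bounding input
# length before matching, or using a backtracking-free engine (e.g. RE2
# bindings) or the third-party `regex` module's timeout support.
```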
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/4fc1bf813ad80172c1db31264beaef7d93fe0601 | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/18454 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25 | Exploit, Vendor Advisory |
| 134c704f-9b21-4f2e-91b3-4a467353bcc0 | https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25 | Exploit, Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "18A93B72-AD3E-46D7-8948-E0765D4A7CB1", "versionEndExcluding": "0.9.0", "versionStartIncluding": "0.6.4", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue." }, { "lang": "es", "value": "vLLM, un motor de inferencia y servicio para modelos de lenguaje grandes (LLM), presenta una vulnerabilidad de denegaci\u00f3n de servicio por expresi\u00f3n regular (ReDoS) en el archivo `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` de las versiones 0.6.4 a 0.9.0, excepto esta \u00faltima. La causa principal es el uso de una expresi\u00f3n regular anidada y altamente compleja para la detecci\u00f3n de llamadas a herramientas, que un atacante puede explotar para causar una degradaci\u00f3n grave del rendimiento o inhabilitar el servicio. El patr\u00f3n contiene m\u00faltiples cuantificadores anidados, grupos opcionales y repeticiones internas, lo que lo hace vulnerable a un retroceso catastr\u00f3fico. La versi\u00f3n 0.9.0 incluye un parche para este problema." } ], "id": "CVE-2025-48887", "lastModified": "2025-06-19T00:55:27.710", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-30T18:15:32.500", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/4fc1bf813ad80172c1db31264beaef7d93fe0601" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/18454" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25" }, { "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-w6q7-j642-7c25" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-1333" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-29 17:15
Modified
2025-06-24 18:12
Severity ?
4.2 (Medium) - CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L
7.3 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image’s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0.
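The collision property described above can be reproduced with Pillow alone. The shape-aware variant below is a generic fix sketch (hash the mode and size along with the pixel bytes), not necessarily the exact change shipped in 0.9.0:

```python
# Concrete reproduction of the property described above (requires Pillow).
import hashlib
from PIL import Image

raw = bytes(i % 256 for i in range(30 * 100 * 3))   # 9000 RGB bytes
tall = Image.frombytes("RGB", (30, 100), raw)
wide = Image.frombytes("RGB", (100, 30), raw)

def naive(img: Image.Image) -> str:
    return hashlib.sha256(img.tobytes()).hexdigest()

def shape_aware(img: Image.Image) -> str:
    h = hashlib.sha256()
    h.update(repr((img.mode, img.size)).encode())    # include metadata
    h.update(img.tobytes())
    return h.hexdigest()

print(naive(tall) == naive(wide))              # True  -> cross-shape collision
print(shape_aware(tall) == shape_aware(wide))  # False -> collision removed
```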
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/99404f53c72965b41558aceb1bc2380875f5d848 | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/17378 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-c65p-x677-fgj6 | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "08DBEEAC-7BAC-4A44-894B-5F544B5CF9D3", "versionEndExcluding": "0.9.0", "versionStartIncluding": "0.7.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image\u2019s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio para modelos de lenguaje grandes (LLM). En versiones desde la 0.7.0 hasta anteriores a la 0.9.0, en el archivo vllm/multimodal/hasher.py, la clase MultiModalHasher presenta un problema de seguridad e integridad de datos en su m\u00e9todo de hash de im\u00e1genes. Actualmente, serializa los objetos PIL.Image.Image utilizando \u00fanicamente obj.tobytes(), que devuelve \u00fanicamente los datos de p\u00edxeles sin procesar, sin incluir metadatos como la forma de la imagen (ancho, alto, modo). Como resultado, dos im\u00e1genes de diferentes tama\u00f1os (p. ej., 30x100 y 100x30) con la misma secuencia de bytes de p\u00edxeles podr\u00edan generar el mismo valor hash. Esto puede provocar colisiones de hash, aciertos de cach\u00e9 incorrectos e incluso fugas de datos o riesgos de seguridad. Este problema se ha corregido en la versi\u00f3n 0.9.0." 
} ], "id": "CVE-2025-46722", "lastModified": "2025-06-24T18:12:30.023", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "LOW", "baseScore": 4.2, "baseSeverity": "MEDIUM", "confidentialityImpact": "LOW", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:L/I:N/A:L", "version": "3.1" }, "exploitabilityScore": 1.6, "impactScore": 2.5, "source": "security-advisories@github.com", "type": "Secondary" }, { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "LOW", "baseScore": 7.3, "baseSeverity": "HIGH", "confidentialityImpact": "LOW", "integrityImpact": "LOW", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 3.4, "source": "nvd@nist.gov", "type": "Primary" } ] }, "published": "2025-05-29T17:15:21.523", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/99404f53c72965b41558aceb1bc2380875f5d848" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/17378" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-c65p-x677-fgj6" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-1023" }, { "lang": "en", "value": "CWE-1288" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-02-07 20:15
Modified
2025-07-01 20:58
Severity ?
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python's built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try exploit hash collisions. The impact of a collision would be using cache that was generated using different content. Given knowledge of prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2 and all users are advised to upgrade. There are no known workarounds for this vulnerability.
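A small sketch of the behavior discussed above, plus a keyed, content-based alternative for cache keys. The keyed-hash part is a generic hardening pattern, not necessarily the exact fix vLLM shipped in 0.7.2:

```python
# Sketch: predictable hash(None) vs. a keyed, content-based cache key.
import hashlib
import os
import pickle
import sys

# On Python 3.12+ this value no longer depends on the process's memory
# layout, so cache keys derived from hash() become predictable to outsiders.
print(sys.version_info[:2], hash(None))

_SECRET = os.urandom(16)   # per-process salt, never exposed to clients

def cache_key(block) -> str:
    data = pickle.dumps(block, protocol=pickle.HIGHEST_PROTOCOL)
    return hashlib.blake2b(data, key=_SECRET, digest_size=16).hexdigest()

print(cache_key(("prev_block_hash", (1, 2, 3, None))))
```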
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/python/cpython/commit/432117cd1f59c76d97da2eaff55a7d758301dbc7 | Not Applicable |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/12621 | Issue Tracking |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-rm76-4mrf-v9r8 | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "A5911C1A-F107-4B9B-BAE9-36A2B5181321", "versionEndExcluding": "0.7.2", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python\u0027s built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try exploit hash collisions. The impact of a collision would be using cache that was generated using different content. Given knowledge of prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2 and all users are advised to upgrade. There are no known workarounds for this vulnerability." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y uso eficiente de la memoria para LLM. Las declaraciones construidas de forma malintencionada pueden provocar colisiones de hash, lo que da como resultado la reutilizaci\u00f3n de la memoria cach\u00e9, lo que puede interferir con las respuestas posteriores y provocar un comportamiento no deseado. El almacenamiento en cach\u00e9 de prefijos utiliza la funci\u00f3n hash() incorporada de Python. A partir de Python 3.12, el comportamiento de hash(None) ha cambiado para ser un valor constante predecible. Esto hace que sea m\u00e1s factible que alguien pueda intentar explotar las colisiones de hash. El impacto de una colisi\u00f3n ser\u00eda el uso de la memoria cach\u00e9 generada con un contenido diferente. Dado el conocimiento de los mensajes en uso y el comportamiento predecible del hash, alguien podr\u00eda rellenar intencionalmente la memoria cach\u00e9 utilizando un mensaje que se sabe que colisiona con otro mensaje en uso. Este problema se ha solucionado en la versi\u00f3n 0.7.2 y se recomienda a todos los usuarios que actualicen. No existen workarounds para esta vulnerabilidad." 
} ], "id": "CVE-2025-25183", "lastModified": "2025-07-01T20:58:00.170", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "HIGH", "attackVector": "NETWORK", "availabilityImpact": "NONE", "baseScore": 2.6, "baseSeverity": "LOW", "confidentialityImpact": "NONE", "integrityImpact": "LOW", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "REQUIRED", "vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:N/I:L/A:N", "version": "3.1" }, "exploitabilityScore": 1.2, "impactScore": 1.4, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-02-07T20:15:34.083", "references": [ { "source": "security-advisories@github.com", "tags": [ "Not Applicable" ], "url": "https://github.com/python/cpython/commit/432117cd1f59c76d97da2eaff55a7d758301dbc7" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking" ], "url": "https://github.com/vllm-project/vllm/pull/12621" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-rm76-4mrf-v9r8" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-354" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-30 19:15
Modified
2025-06-24 17:40
Severity ?
Summary
vLLM is an inference and serving engine for large language models (LLMs). Versions 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) vulnerability that causes the vLLM server to crash if an invalid regex is provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but involves a regex instead of a JSON schema. Version 0.9.0 fixes the issue.
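A generic input-validation sketch for this bug class follows (not necessarily how the vLLM patch handles it): compile the user-supplied pattern at the API boundary and turn failures into a client error instead of an engine crash:

```python
# Generic pre-validation sketch for user-supplied guided-decoding regexes;
# not a reproduction of the vLLM patch.
import re

def validate_guided_regex(pattern: str) -> str:
    try:
        re.compile(pattern)
    except re.error as exc:
        # The serving layer would map this to an HTTP 400 response.
        raise ValueError(f"invalid guided regex: {exc}") from None
    return pattern

print(validate_guided_regex(r"\d{3}-\d{4}"))   # accepted
try:
    validate_guided_regex(r"(unclosed")        # rejected, server stays up
except ValueError as err:
    print(err)
```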
References
| Source | URL | Tags |
|---|---|---|
| security-advisories@github.com | https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff | Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/issues/17313 | Issue Tracking |
| security-advisories@github.com | https://github.com/vllm-project/vllm/pull/17623 | Issue Tracking, Patch |
| security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-9hcf-v7m4-6m2j | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "8E26EB9F-426B-4A73-B8CA-2D9F3727AF2C", "versionEndExcluding": "0.9.0", "versionStartIncluding": "0.8.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). Version 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) that causes the vLLM server to crash if an invalid regex was provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio para modelos de lenguaje grandes (LLM). Las versiones 0.8.0 y 0.9.0, excepto esta, presentan una vulnerabilidad de denegaci\u00f3n de servicio (ReDoS) que provoca el bloqueo del servidor vLLM si se proporciona una expresi\u00f3n regular no v\u00e1lida al usar la salida estructurada. Esta vulnerabilidad es similar a GHSA-6qc9-v4r8-22xg/CVE-2025-48942, pero para expresiones regulares en lugar de un esquema JSON. La versi\u00f3n 0.9.0 corrige el problema." } ], "id": "CVE-2025-48943", "lastModified": "2025-06-24T17:40:52.923", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-30T19:15:30.280", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking" ], "url": "https://github.com/vllm-project/vllm/issues/17313" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/17623" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9hcf-v7m4-6m2j" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-248" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-06 17:16
Modified
2025-07-31 18:05
Severity ?
Summary
vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a `SUB` ZeroMQ socket and connect to an `XPUB` socket on the primary vLLM host. When data is received on this `SUB` socket, it is deserialized with `pickle`. This is unsafe, as it can be abused to execute code on a remote machine. Since the vulnerability exists in a client that connects to the primary vLLM host, this vulnerability serves as an escalation point. If the primary vLLM host is compromised, this vulnerability could be used to compromise the rest of the hosts in the vLLM deployment. Attackers could also use other means to exploit the vulnerability without requiring access to the primary vLLM host. One example would be the use of ARP cache poisoning to redirect traffic to a malicious endpoint used to deliver a payload with arbitrary code to execute on the target machine. Note that this issue only affects the V0 engine, which has been off by default since v0.8.0. Further, the issue only applies to a deployment using tensor parallelism across multiple hosts, which we do not expect to be a common deployment pattern. Since V0 has been off by default since v0.8.0 and the fix is fairly invasive, the maintainers of vLLM have decided not to fix this issue. Instead, the maintainers recommend that users ensure their environment is on a secure network in case this pattern is in use. The V1 engine is not affected by this issue.
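Because the fix was declined, the practical defense sits at the deployment and deserialization layer. The sketch below isolates the unsafe pattern and shows one common hardening approach, a restricted unpickler; the class and function names are illustrative assumptions, not vLLM code.

```python
import io
import pickle

def unsafe_handle(frame: bytes):
    # The pattern described above: bytes from the ZeroMQ SUB socket go straight
    # into pickle, so a malicious publisher controls what runs during load().
    return pickle.loads(frame)  # DANGEROUS with untrusted peers

class RestrictedUnpickler(pickle.Unpickler):
    # Hedged mitigation sketch: only a small allow-list of harmless built-ins
    # may be reconstructed; everything else raises instead of executing.
    ALLOWED = {("builtins", "dict"), ("builtins", "list"),
               ("builtins", "tuple"), ("builtins", "str"), ("builtins", "int")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked unpickling of {module}.{name}")

def safe_handle(frame: bytes):
    return RestrictedUnpickler(io.BytesIO(frame)).load()

print(safe_handle(pickle.dumps({"rank": 1, "status": "ready"})))
```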
References
▶ | URL | Tags | |
---|---|---|---|
security-advisories@github.com | https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L295-L301 | Product | |
security-advisories@github.com | https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L468-L470 | Product | |
security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm | Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "E2646F2B-C4B5-4D2B-B8E0-4113504AD8FF", "versionStartIncluding": "0.5.2", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models. In a multi-node vLLM deployment using the V0 engine, vLLM uses ZeroMQ for some multi-node communication purposes. The secondary vLLM hosts open a `SUB` ZeroMQ socket and connect to an `XPUB` socket on the primary vLLM host. When data is received on this `SUB` socket, it is deserialized with `pickle`. This is unsafe, as it can be abused to execute code on a remote machine. Since the vulnerability exists in a client that connects to the primary vLLM host, this vulnerability serves as an escalation point. If the primary vLLM host is compromised, this vulnerability could be used to compromise the rest of the hosts in the vLLM deployment. Attackers could also use other means to exploit the vulnerability without requiring access to the primary vLLM host. One example would be the use of ARP cache poisoning to redirect traffic to a malicious endpoint used to deliver a payload with arbitrary code to execute on the target machine. Note that this issue only affects the V0 engine, which has been off by default since v0.8.0. Further, the issue only applies to a deployment using tensor parallelism across multiple hosts, which we do not expect to be a common deployment pattern. Since V0 is has been off by default since v0.8.0 and the fix is fairly invasive, the maintainers of vLLM have decided not to fix this issue. Instead, the maintainers recommend that users ensure their environment is on a secure network in case this pattern is in use. The V1 engine is not affected by this issue." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio para modelos de lenguaje extensos. En una implementaci\u00f3n de vLLM multinodo con el motor V0, vLLM utiliza ZeroMQ para la comunicaci\u00f3n multinodo. Los hosts secundarios de vLLM abren un socket \"SUB\" de ZeroMQ y se conectan a un socket \"XPUB\" en el host principal de vLLM. Cuando se reciben datos en este socket \"SUB\", se deserializan con \"pickle\". Esto es peligroso, ya que puede utilizarse para ejecutar c\u00f3digo en una m\u00e1quina remota. Dado que la vulnerabilidad existe en un cliente que se conecta al host principal de vLLM, sirve como punto de escalada. Si el host principal de vLLM se ve comprometido, esta vulnerabilidad podr\u00eda utilizarse para comprometer el resto de los hosts de la implementaci\u00f3n de vLLM. Los atacantes tambi\u00e9n podr\u00edan utilizar otros medios para explotar la vulnerabilidad sin necesidad de acceder al host principal de vLLM. Un ejemplo ser\u00eda el uso de envenenamiento de cach\u00e9 ARP para redirigir el tr\u00e1fico a un endpoint malicioso utilizado para entregar un payload con c\u00f3digo arbitrario que se ejecuta en la m\u00e1quina objetivo. Tenga en cuenta que este problema solo afecta al motor V0, que ha estado desactivado por defecto desde la versi\u00f3n v0.8.0. Adem\u00e1s, el problema solo se aplica a implementaciones que utilizan paralelismo tensorial en varios hosts, lo cual no esperamos que sea un patr\u00f3n de implementaci\u00f3n com\u00fan. 
Dado que V0 ha estado desactivado por defecto desde la versi\u00f3n v0.8.0 y la soluci\u00f3n es bastante invasiva, los responsables de vLLM han decidido no corregir este problema. En su lugar, recomiendan a los usuarios que se aseguren de que su entorno est\u00e9 en una red segura en caso de que se utilice este patr\u00f3n. El motor V1 no se ve afectado por este problema." } ], "id": "CVE-2025-30165", "lastModified": "2025-07-31T18:05:30.623", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "ADJACENT_NETWORK", "availabilityImpact": "HIGH", "baseScore": 8.0, "baseSeverity": "HIGH", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H", "version": "3.1" }, "exploitabilityScore": 2.1, "impactScore": 5.9, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-06T17:16:11.660", "references": [ { "source": "security-advisories@github.com", "tags": [ "Product" ], "url": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L295-L301" }, { "source": "security-advisories@github.com", "tags": [ "Product" ], "url": "https://github.com/vllm-project/vllm/blob/c21b99b91241409c2fdf9f3f8c542e8748b317be/vllm/distributed/device_communicators/shm_broadcast.py#L468-L470" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9pcc-gvx5-r5wm" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-03-20 10:15
Modified
2025-07-31 14:48
Severity ?
Summary
vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to deserialize data received from the socket directly, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim's machine to execute arbitrary code.
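The reason `pickle.loads` on untrusted socket data amounts to remote code execution is that deserialization itself can invoke attacker-chosen callables via `__reduce__`. A harmless, self-contained illustration follows; the `Payload` class is hypothetical and not part of vLLM.

```python
import os
import pickle

class Payload:
    # An attacker-controlled object: __reduce__ tells pickle to call os.system
    # with an arbitrary string when the bytes are loaded on the victim.
    def __reduce__(self):
        return (os.system, ("echo pwned",))  # benign stand-in for arbitrary code

malicious_frame = pickle.dumps(Payload())
# pickle.loads(malicious_frame)  # uncommenting this would run `echo pwned` locally
```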
References
▶ | URL | Tags | |
---|---|---|---|
security@huntr.dev | https://huntr.com/bounties/00136195-11e0-4ad0-98d5-72db066e867f | Exploit, Third Party Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:0.6.2:*:*:*:*:*:*:*", "matchCriteriaId": "5C723AC6-7D43-4776-B486-9F870A5645A6", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to parse received sockets directly, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim\u0027s machine to execute arbitrary code." }, { "lang": "es", "value": "vllm-project vllm versi\u00f3n v0.6.2 contiene una vulnerabilidad en la funci\u00f3n de la API MessageQueue.dequeue(). Esta funci\u00f3n utiliza pickle.loads para analizar directamente los sockets recibidos, lo que genera una vulnerabilidad de ejecuci\u00f3n remota de c\u00f3digo. Un atacante puede explotar esto enviando un payload a MessageQueue, lo que provoca que el equipo de la v\u00edctima ejecute c\u00f3digo arbitrario." } ], "id": "CVE-2024-11041", "lastModified": "2025-07-31T14:48:32.163", "metrics": { "cvssMetricV30": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 9.8, "baseSeverity": "CRITICAL", "confidentialityImpact": "HIGH", "integrityImpact": "HIGH", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "version": "3.0" }, "exploitabilityScore": 3.9, "impactScore": 5.9, "source": "security@huntr.dev", "type": "Secondary" } ] }, "published": "2025-03-20T10:15:23.420", "references": [ { "source": "security@huntr.dev", "tags": [ "Exploit", "Third Party Advisory" ], "url": "https://huntr.com/bounties/00136195-11e0-4ad0-98d5-72db066e867f" } ], "sourceIdentifier": "security@huntr.dev", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-502" } ], "source": "security@huntr.dev", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-03-19 16:15
Modified
2025-07-31 15:58
Severity ?
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0.
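One general mitigation for this class of issue is to cap how much the on-disk grammar cache may grow before new entries are refused. The sketch below is an assumption-laden illustration, not the change that landed in 0.8.0; the cache path and size budget are hypothetical.

```python
import os

CACHE_DIR = os.path.expanduser("~/.cache/outlines")  # hypothetical location
MAX_CACHE_BYTES = 512 * 1024 * 1024                  # illustrative 512 MiB budget

def cache_dir_size(path: str) -> int:
    # Walk the cache directory and total the sizes of the files currently stored.
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def may_cache_grammar() -> bool:
    # Refuse new cache entries once the budget is exhausted so a flood of
    # unique schemas cannot fill the filesystem.
    if not os.path.isdir(CACHE_DIR):
        return True
    return cache_dir_size(CACHE_DIR) < MAX_CACHE_BYTES
```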
References
▶ | URL | Tags | |
---|---|---|---|
security-advisories@github.com | https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py | Product | |
security-advisories@github.com | https://github.com/vllm-project/vllm/pull/14837 | Issue Tracking, Patch | |
security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8 | Vendor Advisory, Patch |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "4596758A-5F3D-4330-BB37-EEF73CC90D9E", "versionEndExcluding": "0.8.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en memoria para LLM. La librer\u00eda de esquemas es uno de los backends que vLLM utiliza para la salida estructurada (tambi\u00e9n conocida como decodificaci\u00f3n guiada). Outlines proporciona una cach\u00e9 opcional para sus gram\u00e1ticas compiladas en el sistema de archivos local. Esta cach\u00e9 est\u00e1 activada por defecto en vLLM. Outlines tambi\u00e9n est\u00e1 disponible por defecto a trav\u00e9s del servidor de API compatible con OpenAI. El c\u00f3digo afectado en vLLM es vllm/model_executor/guided_decoding/outlines_logits_processors.py, que utiliza incondicionalmente la cach\u00e9 de outlines. Un usuario malintencionado puede enviar un flujo de solicitudes de decodificaci\u00f3n muy cortas con esquemas \u00fanicos, lo que resulta en una adici\u00f3n a la cach\u00e9 para cada solicitud. Esto puede provocar una denegaci\u00f3n de servicio si el sistema de archivos se queda sin espacio. Tenga en cuenta que, incluso si vLLM se configur\u00f3 para usar un backend diferente por defecto, a\u00fan es posible seleccionar esquemas por solicitud mediante la clave `guided_decoding_backend` del campo `extra_body` de la solicitud. Este problema solo afecta al motor V0 y se solucion\u00f3 en la versi\u00f3n 0.8.0." 
} ], "id": "CVE-2025-29770", "lastModified": "2025-07-31T15:58:58.277", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-03-19T16:15:31.977", "references": [ { "source": "security-advisories@github.com", "tags": [ "Product" ], "url": "https://github.com/vllm-project/vllm/blob/53be4a863486d02bd96a59c674bbec23eec508f6/vllm/model_executor/guided_decoding/outlines_logits_processors.py" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/14837" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory", "Patch" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-mgrm-fgjv-mhv8" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-770" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-05-30 19:15
Modified
2025-07-01 20:42
Severity ?
Summary
vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue.
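As with the invalid-regex case above, the generic shape of the fix is to validate the user-controlled schema fields at the API boundary before anything is compiled. A hedged sketch, assuming a FastAPI-style handler; the helper and the allowed-type set are illustrative, not vLLM's actual 0.9.0 patch.

```python
import re
from fastapi import HTTPException

ALLOWED_JSON_TYPES = {"string", "number", "integer", "boolean", "object", "array", "null"}

def validate_tool_property(prop: dict) -> None:
    # Check the user-supplied "type" and "pattern" fields before they reach the
    # structured-output compiler, so a malformed request returns HTTP 400 instead
    # of crashing the inference worker.
    schema_type = prop.get("type")
    if schema_type is not None and schema_type not in ALLOWED_JSON_TYPES:
        raise HTTPException(status_code=400, detail=f"unsupported schema type: {schema_type!r}")
    pattern = prop.get("pattern")
    if pattern is not None:
        try:
            re.compile(pattern)
        except re.error as exc:
            raise HTTPException(status_code=400, detail=f"invalid pattern: {exc}")
```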
References
▶ | URL | Tags | |
---|---|---|---|
security-advisories@github.com | https://github.com/vllm-project/vllm/pull/17623 | Issue Tracking, Vendor Advisory | |
security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-vrq3-r879-7m65 | Exploit, Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "8E26EB9F-426B-4A73-B8CA-2D9F3727AF2C", "versionEndExcluding": "0.9.0", "versionStartIncluding": "0.8.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the \"pattern\" and \"type\" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio para modelos de lenguaje grandes (LLM). Desde la versi\u00f3n 0.8.0 hasta la 0.9.0 (excluyendo esta \u00faltima), el backend de vLLM utilizado con el endpoint de OpenAPI /v1/chat/completions no valida entradas inesperadas o incorrectas en los campos \"patr\u00f3n\" y \"tipo\" al invocar la funcionalidad de herramientas. Estas entradas no se validan antes de compilarse o analizarse, lo que provoca un bloqueo del trabajador de inferencia con una sola solicitud. El trabajador permanece inactivo hasta que se reinicia. La versi\u00f3n 0.9.0 corrige este problema." } ], "id": "CVE-2025-48944", "lastModified": "2025-07-01T20:42:13.840", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" } ] }, "published": "2025-05-30T19:15:30.433", "references": [ { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/pull/17623" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vrq3-r879-7m65" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-20" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-04-30 01:15
Modified
2025-05-14 19:59
Severity ?
7.5 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.
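The hardening described by the 0.8.5 patch is, in spirit, to stop binding the coordination socket to every interface. The sketch below uses loopback purely so it runs anywhere; in a real cluster the bind address would be the private inter-node interface. The address and port are assumptions, not vLLM's actual values.

```python
import zmq

ctx = zmq.Context()
xpub = ctx.socket(zmq.XPUB)

# Before (the exposed pattern): reachable from any interface on the host.
#   xpub.bind("tcp://*:5557")
# After (hedged sketch): bind only to a specific interface. Loopback is used here so
# the example runs standalone; a real deployment would use the cluster-internal IP.
xpub.bind("tcp://127.0.0.1:5557")
print(xpub.getsockopt(zmq.LAST_ENDPOINT))
```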
References
▶ | URL | Tags | |
---|---|---|---|
security-advisories@github.com | https://github.com/vllm-project/vllm/commit/a0304dc504c85f421d38ef47c64f83046a13641c | Patch | |
security-advisories@github.com | https://github.com/vllm-project/vllm/pull/6183 | Issue Tracking, Patch | |
security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-9f8f-2vmf-885j | Vendor Advisory, Exploit |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "15AC3826-5B31-40E2-9964-6F8930043285", "versionEndExcluding": "0.8.5", "versionStartIncluding": "0.5.2", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en memoria para LLM. Las versiones a partir de la 0.5.2 y anteriores a la 0.8.5 son vulnerables a denegaci\u00f3n de servicio y exposici\u00f3n de datos a trav\u00e9s de ZeroMQ en implementaciones de vLLM multinodo. En una implementaci\u00f3n de vLLM multinodo, vLLM utiliza ZeroMQ para algunos fines de comunicaci\u00f3n multinodo. El host vLLM principal abre un socket XPUB ZeroMQ y lo vincula a TODAS las interfaces. Si bien el socket siempre est\u00e1 abierto para implementaciones multinodo, solo se utiliza al realizar paralelismo tensorial en varios hosts. Cualquier cliente con acceso de red a este host puede conectarse a este socket XPUB a menos que su puerto est\u00e9 bloqueado por un firewall. Una vez conectados, estos clientes arbitrarios recibir\u00e1n los mismos datos transmitidos a todos los hosts vLLM secundarios. Estos datos son informaci\u00f3n interna del estado de vLLM que no es \u00fatil para un atacante. Al conectarse a este socket muchas veces y no leer los datos publicados, un atacante tambi\u00e9n puede causar una denegaci\u00f3n de servicio al ralentizar o incluso bloquear al publicador. Este problema se ha corregido en la versi\u00f3n 0.8.5." 
} ], "id": "CVE-2025-30202", "lastModified": "2025-05-14T19:59:42.390", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" }, { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 3.6, "source": "nvd@nist.gov", "type": "Primary" } ] }, "published": "2025-04-30T01:15:51.800", "references": [ { "source": "security-advisories@github.com", "tags": [ "Patch" ], "url": "https://github.com/vllm-project/vllm/commit/a0304dc504c85f421d38ef47c64f83046a13641c" }, { "source": "security-advisories@github.com", "tags": [ "Issue Tracking", "Patch" ], "url": "https://github.com/vllm-project/vllm/pull/6183" }, { "source": "security-advisories@github.com", "tags": [ "Vendor Advisory", "Exploit" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-9f8f-2vmf-885j" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-770" } ], "source": "security-advisories@github.com", "type": "Primary" } ] }
Vulnerability from fkie_nvd
Published
2025-04-30 01:15
Modified
2025-05-28 19:15
Severity ?
6.5 (Medium) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
7.5 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
Summary
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.
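The quadratic behavior comes from rebuilding the token list with `+` on every placeholder expansion, which copies the whole prefix each time; appending in place is linear. A simplified illustration of the complexity difference, not the actual phi4mm preprocessing code:

```python
def expand_quadratic(token_ids, placeholder_id, lengths):
    # Each `out + [...]` copies the growing list, so n expansions cost O(n^2).
    out, i = [], 0
    for tok in token_ids:
        if tok == placeholder_id:
            out = out + [placeholder_id] * lengths[i]
            i += 1
        else:
            out = out + [tok]
    return out

def expand_linear(token_ids, placeholder_id, lengths):
    # extend/append mutate in place, so the same work is amortized O(n).
    out, i = [], 0
    for tok in token_ids:
        if tok == placeholder_id:
            out.extend([placeholder_id] * lengths[i])
            i += 1
        else:
            out.append(tok)
    return out

assert expand_quadratic([1, 9, 2], 9, [3]) == expand_linear([1, 9, 2], 9, [3])
```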
References
▶ | URL | Tags | |
---|---|---|---|
security-advisories@github.com | https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197 | Product | |
security-advisories@github.com | https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg | Exploit, Vendor Advisory | |
134c704f-9b21-4f2e-91b3-4a467353bcc0 | https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg | Exploit, Vendor Advisory |
{ "configurations": [ { "nodes": [ { "cpeMatch": [ { "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*", "matchCriteriaId": "19C6D0C7-632B-4AA7-97E5-CCF21EC350E5", "versionEndExcluding": "0.8.5", "versionStartIncluding": "0.8.0", "vulnerable": true } ], "negate": false, "operator": "OR" } ] } ], "cveTags": [], "descriptions": [ { "lang": "en", "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_|\u003e, \u003c|image_|\u003e) with repeated tokens based on precomputed lengths. Due to \u200b\u200binefficient list concatenation operations\u200b\u200b, the algorithm exhibits \u200b\u200bquadratic time complexity (O(n\u00b2))\u200b\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5." }, { "lang": "es", "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en memoria para LLM. Las versiones a partir de la 0.8.0 y anteriores a la 0.8.5 se ven afectadas por una vulnerabilidad cr\u00edtica de rendimiento en la l\u00f3gica de preprocesamiento de entrada del tokenizador multimodal. El c\u00f3digo reemplaza din\u00e1micamente los tokens de marcador de posici\u00f3n (p. ej., \u0026lt;|audio_|\u0026gt;, \u0026lt;|image_|\u0026gt;) con tokens repetidos basados ??en longitudes precalculadas. Debido a las ineficientes operaciones de concatenaci\u00f3n de listas, el algoritmo presenta una complejidad temporal cuadr\u00e1tica (O(n\u00b2)), lo que permite a los actores maliciosos activar el agotamiento de recursos mediante entradas especialmente manipuladas. Este problema se ha corregido en la versi\u00f3n 0.8.5." 
} ], "id": "CVE-2025-46560", "lastModified": "2025-05-28T19:15:56.887", "metrics": { "cvssMetricV31": [ { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 6.5, "baseSeverity": "MEDIUM", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "LOW", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 2.8, "impactScore": 3.6, "source": "security-advisories@github.com", "type": "Secondary" }, { "cvssData": { "attackComplexity": "LOW", "attackVector": "NETWORK", "availabilityImpact": "HIGH", "baseScore": 7.5, "baseSeverity": "HIGH", "confidentialityImpact": "NONE", "integrityImpact": "NONE", "privilegesRequired": "NONE", "scope": "UNCHANGED", "userInteraction": "NONE", "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H", "version": "3.1" }, "exploitabilityScore": 3.9, "impactScore": 3.6, "source": "nvd@nist.gov", "type": "Primary" } ] }, "published": "2025-04-30T01:15:52.097", "references": [ { "source": "security-advisories@github.com", "tags": [ "Product" ], "url": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197" }, { "source": "security-advisories@github.com", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg" }, { "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0", "tags": [ "Exploit", "Vendor Advisory" ], "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg" } ], "sourceIdentifier": "security-advisories@github.com", "vulnStatus": "Analyzed", "weaknesses": [ { "description": [ { "lang": "en", "value": "CWE-1333" } ], "source": "security-advisories@github.com", "type": "Secondary" } ] }