vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0.
vLLM: OOM Denial of Service via Audio Decompression Bomb
Problem type
CWE-409: Improper Handling of Highly Compressed Data (Data Amplification)
Affected products
vllm-project / vllm
< 0.23.1rc0 - AFFECTED
References
https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc
https://github.com/vllm-project/vllm/pull/44970
GitHub Security Advisories
GHSA-6pr9-rp53-2pmc
vLLM: OOM Denial of Service via Audio Decompression Bomb
https://github.com/advisories/GHSA-6pr9-rp53-2pmc
Summary
vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. Decoding a single 25MB OPUS file can drive peak memory use to ~14.9GB. Tested on vLLM v0.19.0.
Details
SpeechToTextProcessor rejects uploads over VLLM_MAX_AUDIO_CLIP_FILESIZE_MB (default 25MB) based on compressed byte length, but the audio decoder in audio.py accumulates all decoded frames into memory with no size limit before returning:
    # speech_to_text.py L184-189
    if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
        raise VLLMValidationError(...)
    y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

    # audio.py L77-107
    chunks: list[npt.NDArray] = []
    for frame in container.decode(stream):
        chunks.append(frame.to_ndarray())
    audio = np.concatenate(chunks, axis=-1).astype(np.float32)  # single contiguous allocation
A 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and np.concatenate then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. SpeechToTextConfig.max_audio_clip_s (default 30s) applies only after the full decode and does not prevent the allocation.
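The amplification figure above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is an estimate only: it assumes 48 kHz mono float32 output and ignores container overhead, neither of which the advisory states, and the function name is hypothetical. Under those assumptions it gives an upper bound in the same range as the reported ~5.7GB and 232x.

```python
def estimate_decoded_pcm_bytes(
    compressed_bytes: int,
    bitrate_bps: int,
    sample_rate_hz: int = 48_000,  # assumption: decoder output rate
    channels: int = 1,             # assumption: mono
    bytes_per_sample: int = 4,     # float32
) -> int:
    """Estimate decoded PCM size for a constant-bitrate stream.

    duration_s = compressed_bits / bitrate
    pcm_bytes  = duration_s * sample_rate * channels * bytes_per_sample
    Integer arithmetic is used so the result is exact for these inputs.
    """
    compressed_bits = compressed_bytes * 8
    return (
        compressed_bits * sample_rate_hz * channels * bytes_per_sample
        // bitrate_bps
    )


compressed = 25 * 1024 * 1024          # 25MB upload limit
pcm = estimate_decoded_pcm_bytes(compressed, bitrate_bps=6_000)
amplification = pcm / compressed       # 256.0 under the assumptions above

print(f"decoded PCM: {pcm / 1024**3:.2f} GiB, amplification: {amplification:.0f}x")
```

Note that the amplification factor depends only on the output format and the bitrate, not on the file size, so lowering the upload limit shrinks the absolute allocation but not the ratio.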
Impact
An unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.
Fix
A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970
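The linked pull request is the authoritative fix and its contents are not reproduced here. As a general illustration of how this class of bug (CWE-409) is usually mitigated, the sketch below enforces a budget on decoded samples while frames are still being produced, rather than after the full decode. All names are hypothetical, and plain Python sequences stand in for decoded audio frames.

```python
from typing import Iterable, Iterator, Sequence


class DecodedAudioTooLargeError(ValueError):
    """Raised when decoded audio exceeds the configured sample budget."""


def bounded_frames(
    frames: Iterable[Sequence[float]], max_samples: int
) -> Iterator[Sequence[float]]:
    """Yield decoded chunks, aborting once the running total exceeds the budget."""
    total = 0
    for chunk in frames:
        total += len(chunk)
        if total > max_samples:
            # Stop before the oversized chunk is accumulated by the caller.
            raise DecodedAudioTooLargeError(
                f"decoded audio exceeds {max_samples} samples"
            )
        yield chunk


def decode_with_budget(
    frames: Iterable[Sequence[float]],
    sample_rate_hz: int,
    max_duration_s: float,
) -> list[float]:
    """Accumulate decoded samples, bounded by a maximum clip duration."""
    max_samples = int(sample_rate_hz * max_duration_s)
    samples: list[float] = []
    for chunk in bounded_frames(frames, max_samples):
        samples.extend(chunk)
    return samples
```

With this shape, peak memory is bounded by the budget plus one decoded chunk, regardless of how much audio the compressed upload encodes.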
https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc
https://github.com/vllm-project/vllm/pull/44970
https://github.com/vllm-project/vllm/commit/1b1359c33269446f13c05da9a90c25174cbea590
https://github.com/vllm-project/vllm/releases/tag/v0.23.1rc0
https://github.com/advisories/GHSA-6pr9-rp53-2pmc
JSON source
https://cveawg.mitre.org/api/cve/CVE-2026-54233
{
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2",
  "cveMetadata": {
    "cveId": "CVE-2026-54233",
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "dateUpdated": "2026-06-22T22:10:45.689Z",
    "dateReserved": "2026-06-12T16:25:43.084Z",
    "datePublished": "2026-06-22T22:10:45.689Z",
    "state": "PUBLISHED"
  },
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M",
        "dateUpdated": "2026-06-22T22:10:45.689Z"
      },
      "title": "vLLM: OOM Denial of Service via Audio Decompression Bomb",
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0."
        }
      ],
      "affected": [
        {
          "vendor": "vllm-project",
          "product": "vllm",
          "versions": [
            {
              "version": "< 0.23.1rc0",
              "status": "affected"
            }
          ]
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "lang": "en",
              "description": "CWE-409: Improper Handling of Highly Compressed Data (Data Amplification)",
              "cweId": "CWE-409",
              "type": "CWE"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc",
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc",
          "tags": [
            "x_refsource_CONFIRM"
          ]
        },
        {
          "url": "https://github.com/vllm-project/vllm/pull/44970",
          "name": "https://github.com/vllm-project/vllm/pull/44970",
          "tags": [
            "x_refsource_MISC"
          ]
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "version": "3.1",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
            "attackVector": "NETWORK",
            "attackComplexity": "LOW",
            "privilegesRequired": "LOW",
            "userInteraction": "NONE",
            "scope": "UNCHANGED",
            "confidentialityImpact": "NONE",
            "integrityImpact": "NONE",
            "availabilityImpact": "HIGH",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM"
          }
        }
      ]
    }
  }
}