vLLM: OOM Denial of Service via Audio Decompression Bomb

vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0.

Problem type

CWE-409: Improper Handling of Highly Compressed Data (Data Amplification)

Affected products

Vendor: vllm-project

Product: vllm

Versions: < 0.23.1rc0 (affected)

References

https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc

x_refsource_CONFIRM

https://github.com/vllm-project/vllm/pull/44970

x_refsource_MISC

GitHub Security Advisories

GHSA-6pr9-rp53-2pmc

vLLM: OOM Denial of Service via Audio Decompression Bomb

https://github.com/advisories/GHSA-6pr9-rp53-2pmc

Summary

vLLM's /v1/audio/transcriptions endpoint limits the compressed upload size but not the size of the decoded PCM output. Decoding a 25MB OPUS file produces ~5.7GB of float32 PCM and drives peak RSS to ~14.9GB. Tested on vLLM v0.19.0.

Details

SpeechToTextProcessor rejects uploads over VLLM_MAX_AUDIO_CLIP_FILESIZE_MB (default 25MB) based on compressed byte length, but the audio decoder in audio.py accumulates all decoded frames into memory with no size limit before returning:

# speech_to_text.py L184-189
if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
    raise VLLMValidationError(...)
y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

# audio.py L77-107
chunks: list[npt.NDArray] = []
for frame in container.decode(stream):
    chunks.append(frame.to_ndarray())
audio = np.concatenate(chunks, axis=-1).astype(np.float32)  # single contiguous allocation

A 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and np.concatenate then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. SpeechToTextConfig.max_audio_clip_s (default 30s) applies only after the full decode and does not prevent the allocation.
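
The amplification figures above can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes mono output decoded at 48 kHz (the advisory states neither the channel count nor the decode rate) and ignores container overhead, so it yields an idealized estimate of roughly 9.7 hours and 256x, somewhat above the advisory's measured ~8.7 hours and 232x. The function name and its parameters are illustrative only.

```python
def estimate_decoded_size(
    compressed_bytes: int,
    bitrate_bps: int,
    sample_rate_hz: int = 48_000,  # assumption: decode rate is not stated in the advisory
    channels: int = 1,             # assumption: mono output
    bytes_per_sample: int = 4,     # float32
) -> tuple[float, float]:
    """Return (duration in seconds, decoded PCM size in bytes)."""
    # Duration the compressed payload can represent at the given bitrate.
    duration_s = compressed_bytes * 8 / bitrate_bps
    # Raw PCM size once every sample is expanded to float32.
    decoded_bytes = duration_s * sample_rate_hz * channels * bytes_per_sample
    return duration_s, decoded_bytes


compressed = 25 * 1024**2  # the default 25MB upload limit
duration_s, decoded = estimate_decoded_size(compressed, bitrate_bps=6_000)
print(f"duration: {duration_s / 3600:.1f} h")
print(f"decoded:  {decoded / 1024**3:.2f} GiB")
print(f"ratio:    {decoded / compressed:.0f}x")
```

The ratio depends only on the bitrate and the PCM format, not on the upload limit, which is why a cap on compressed bytes alone cannot bound decode-time memory.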

Impact

An unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970
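
The merged change is not reproduced here. As a generic illustration of how a decode loop like the one shown in Details can be bounded, the sketch below counts decoded samples as frames arrive and aborts once a cap is exceeded, so memory use is limited before any concatenation happens. The names bounded_frames and DecodedAudioTooLarge are hypothetical and not part of vLLM, and frames are modeled as plain 1-D sequences of samples rather than PyAV frames.

```python
from typing import Iterable, Iterator, Sequence


class DecodedAudioTooLarge(ValueError):
    """Raised when decoded audio exceeds the configured sample budget."""


def bounded_frames(
    frames: Iterable[Sequence[float]], max_samples: int
) -> Iterator[Sequence[float]]:
    """Yield frames until their cumulative length would exceed max_samples."""
    total = 0
    for frame in frames:
        total += len(frame)
        if total > max_samples:
            # Abort before the caller buffers the frame that breaks the budget.
            raise DecodedAudioTooLarge(
                f"decoded audio exceeds {max_samples} samples"
            )
        yield frame


# Hypothetical budget: 30 s of audio at 16 kHz.
MAX_SAMPLES = 30 * 16_000

# Stand-in for a decoder: a lazy generator of 1 s frames that would
# run for an hour if nothing stopped it.
decoder = ([0.0] * 16_000 for _ in range(3600))

chunks = []
try:
    for frame in bounded_frames(decoder, MAX_SAMPLES):
        chunks.append(frame)
except DecodedAudioTooLarge as exc:
    print(f"rejected: {exc}")
print(f"buffered {len(chunks)} frames")
```

Because the check runs inside the loop, no more than max_samples samples are ever buffered, however long the compressed input turns out to be. Whether a service should reject or truncate over-budget audio is a policy choice this sketch does not settle.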

https://github.com/vllm-project/vllm/commit/1b1359c33269446f13c05da9a90c25174cbea590

https://github.com/vllm-project/vllm/releases/tag/v0.23.1rc0

JSON source

https://cveawg.mitre.org/api/cve/CVE-2026-54233


{
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2",
  "cveMetadata": {
    "cveId": "CVE-2026-54233",
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "dateUpdated": "2026-06-22T22:10:45.689Z",
    "dateReserved": "2026-06-12T16:25:43.084Z",
    "datePublished": "2026-06-22T22:10:45.689Z",
    "state": "PUBLISHED"
  },
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M",
        "dateUpdated": "2026-06-22T22:10:45.689Z"
      },
      "title": "vLLM: OOM Denial of Service via Audio Decompression Bomb",
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. This vulnerability is fixed in 0.23.1rc0."
        }
      ],
      "affected": [
        {
          "vendor": "vllm-project",
          "product": "vllm",
          "versions": [
            {
              "version": "< 0.23.1rc0",
              "status": "affected"
            }
          ]
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "lang": "en",
              "description": "CWE-409: Improper Handling of Highly Compressed Data (Data Amplification)",
              "cweId": "CWE-409",
              "type": "CWE"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc",
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc",
          "tags": [
            "x_refsource_CONFIRM"
          ]
        },
        {
          "url": "https://github.com/vllm-project/vllm/pull/44970",
          "name": "https://github.com/vllm-project/vllm/pull/44970",
          "tags": [
            "x_refsource_MISC"
          ]
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "version": "3.1",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
            "attackVector": "NETWORK",
            "attackComplexity": "LOW",
            "privilegesRequired": "LOW",
            "userInteraction": "NONE",
            "scope": "UNCHANGED",
            "confidentialityImpact": "NONE",
            "integrityImpact": "NONE",
            "availabilityImpact": "HIGH",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM"
          }
        }
      ]
    }
  }
}