vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels

vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. This vulnerability is fixed in 0.23.1rc0.

Problem type

CWE-1287: Improper Validation of Specified Type of Input

Affected products

vllm-project

vllm

< 0.23.1rc0 - AFFECTED

References

https://github.com/vllm-project/vllm/security/advisories/GHSA-7h4p-rffg-7823

x_refsource_CONFIRM

https://github.com/vllm-project/vllm/pull/45116

x_refsource_MISC

https://github.com/vllm-project/vllm/commit/d598d239737cfa37bcfcb98886ec3f3557fc7198

x_refsource_MISC

GitHub Security Advisories

GHSA-7h4p-rffg-7823

vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels

https://github.com/advisories/GHSA-7h4p-rffg-7823

Summary

All temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: -Infinity is correctly caught.

Root Cause

sampling_params.py:384:

if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False

sampling_params.py:462:

if self.temperature < 0.0:            # NaN → False; +Inf → False
    raise VLLMValidationError(...)

No math.isnan() or math.isinf() check exists anywhere in sampling_params.py.

Python semantics (verified): float('nan') < 0.0 → False, float('inf') < 0.0 → False.
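
The comparison results quoted above can be reproduced in plain Python. This is a minimal sketch, not vLLM code: `guard_results` is a hypothetical helper, and the value given to `_MAX_TEMP` here is an assumed placeholder because the advisory does not state it.

```python
import math

# Assumed placeholder: the advisory does not give the real value of _MAX_TEMP.
_MAX_TEMP = 0.01


def guard_results(temperature: float) -> dict[str, bool]:
    """Evaluate the two quoted guard conditions plus math.isfinite() for one value."""
    return {
        "0 < t < _MAX_TEMP": 0 < temperature < _MAX_TEMP,
        "t < 0.0": temperature < 0.0,
        "math.isfinite(t)": math.isfinite(temperature),
    }


if __name__ == "__main__":
    samples = [("nan", float("nan")), ("+inf", float("inf")),
               ("-inf", float("-inf")), ("0.7", 0.7)]
    for label, value in samples:
        print(label, guard_results(value))
```

For NaN and +Infinity both guard conditions are False, so neither branch is taken, while `math.isfinite()` returns False for both. For -Infinity the `t < 0.0` condition is True, which matches the note that -Infinity is caught.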

Impact

A NaN or Infinity temperature reaches the softmax input of the GPU sampling kernel and can crash the inference worker, degrading service for all concurrent users.
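
To illustrate how a non-finite temperature corrupts the softmax input, here is a CPU-only sketch in pure Python. It is not the vLLM sampling kernel, and `softmax_with_temperature` is a hypothetical helper; actual GPU behavior depends on the kernel. The masked-logit case assumes that some logits are set to -Infinity before sampling, a common masking practice that the advisory does not describe.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Numerically stabilised softmax over logits divided by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5]
    # NaN temperature: every scaled logit, and so every probability, becomes NaN.
    print(softmax_with_temperature(logits, float("nan")))
    # +Infinity temperature with finite logits: every scaled logit is 0, giving a uniform result.
    print(softmax_with_temperature(logits, float("inf")))
    # +Infinity temperature with a masked (-Infinity) logit: -inf / inf is NaN, which spreads.
    print(softmax_with_temperature([1.0, float("-inf")], float("inf")))
```

A probability vector made entirely of NaN is not a valid distribution, so any sampling step that consumes it has nothing well-defined to draw from.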

Remediation

Add a math.isfinite(self.temperature) check in _verify_args() so that NaN, +Infinity, and -Infinity are rejected before any range comparison runs. Reject non-finite float values with a 400 error.
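
The remediation can be sketched as a standalone function. This is an illustration under stated assumptions, not the merged patch: `verify_temperature` and `TemperatureValidationError` are hypothetical names standing in for the check inside `_verify_args()` and for `VLLMValidationError`, and mapping the exception to an HTTP 400 response is left to the serving layer.

```python
import math


class TemperatureValidationError(ValueError):
    """Hypothetical stand-in for vLLM's VLLMValidationError."""


def verify_temperature(temperature: float) -> float:
    """Return temperature unchanged if it is finite and non-negative, else raise."""
    # The finiteness check comes first because comparisons cannot catch NaN or +Infinity.
    if not math.isfinite(temperature):
        raise TemperatureValidationError(
            f"temperature must be a finite number, got {temperature!r}"
        )
    if temperature < 0.0:
        raise TemperatureValidationError(
            f"temperature must be non-negative, got {temperature!r}"
        )
    return temperature


if __name__ == "__main__":
    for value in (0.7, float("nan"), float("inf"), float("-inf"), -1.0):
        try:
            print("accepted:", verify_temperature(value))
        except TemperatureValidationError as exc:
            print("rejected:", exc)
```

Placing `math.isfinite()` ahead of the range checks means every later comparison only ever sees ordinary finite numbers.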

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116

JSON source

https://cveawg.mitre.org/api/cve/CVE-2026-54235

{
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2",
  "cveMetadata": {
    "cveId": "CVE-2026-54235",
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "dateUpdated": "2026-06-22T21:59:02.710Z",
    "dateReserved": "2026-06-12T16:25:43.084Z",
    "datePublished": "2026-06-22T21:59:02.710Z",
    "state": "PUBLISHED"
  },
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M",
        "dateUpdated": "2026-06-22T21:59:02.710Z"
      },
      "title": "vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels",
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.23.1rc0, all temperature validation gates use comparison operators (<, >), which silently evaluate to False for NaN and for positive Infinity in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. This vulnerability is fixed in 0.23.1rc0."
        }
      ],
      "affected": [
        {
          "vendor": "vllm-project",
          "product": "vllm",
          "versions": [
            {
              "version": "< 0.23.1rc0",
              "status": "affected"
            }
          ]
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "lang": "en",
              "description": "CWE-1287: Improper Validation of Specified Type of Input",
              "cweId": "CWE-1287",
              "type": "CWE"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-7h4p-rffg-7823",
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-7h4p-rffg-7823",
          "tags": [
            "x_refsource_CONFIRM"
          ]
        },
        {
          "url": "https://github.com/vllm-project/vllm/pull/45116",
          "name": "https://github.com/vllm-project/vllm/pull/45116",
          "tags": [
            "x_refsource_MISC"
          ]
        },
        {
          "url": "https://github.com/vllm-project/vllm/commit/d598d239737cfa37bcfcb98886ec3f3557fc7198",
          "name": "https://github.com/vllm-project/vllm/commit/d598d239737cfa37bcfcb98886ec3f3557fc7198",
          "tags": [
            "x_refsource_MISC"
          ]
        }
      ],
      "metrics": [
        {}
      ]
    }
  }
}