Arbitrary Code Execution in NLTK StanfordSegmenter via Untrusted JAR Loading

NLTK versions <=3.9.2 are vulnerable to arbitrary code execution due to improper input validation in the StanfordSegmenter module. The module dynamically loads external Java .jar files without verification or sandboxing. An attacker can supply or replace the JAR file, enabling the execution of arbitrary Java bytecode at import time. This vulnerability can be exploited through methods such as model poisoning, MITM attacks, or dependency poisoning, leading to remote code execution. The issue arises from the direct execution of the JAR file via subprocess with unvalidated classpath input, allowing malicious classes to execute when loaded by the JVM.
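The missing verification step described above can be illustrated with a minimal sketch that pins a JAR to a known SHA-256 digest before its path is handed to anything that launches a JVM. This is one possible mitigation, not the NLTK project's fix: the helper names `sha256_of_file` and `verify_jar_sha256` are hypothetical, and the sketch assumes a trusted digest for the JAR has been obtained out of band.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_jar_sha256(jar_path, expected_sha256):
    """Hypothetical helper: return the resolved JAR path if its digest
    matches the pinned value, otherwise raise ValueError."""
    # strict=True raises FileNotFoundError if the JAR does not exist.
    resolved = Path(jar_path).resolve(strict=True)
    actual = sha256_of_file(resolved)
    if actual != expected_sha256.lower():
        raise ValueError(f"JAR digest mismatch for {resolved}")
    return resolved
```

A caller would pass only the returned path on to the component that starts the JVM. A digest check of this kind does not help if the JAR can be replaced between verification and use, so it is best combined with restrictive file permissions on the directory that holds the JAR.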

Problem type

CWE-20 Improper Input Validation

Affected products

Vendor: nltk

Product: nltk/nltk

Versions: <= latest (AFFECTED)

References

huntr.com

https://huntr.com/bounties/08b109bb-ac24-403f-9422-1c246ce60202

GitHub Security Advisories

GHSA-v2w2-xcg6-53wj

https://github.com/advisories/GHSA-v2w2-xcg6-53wj

nvd.nist.gov

https://nvd.nist.gov/vuln/detail/CVE-2026-0848

JSON source

https://cveawg.mitre.org/api/cve/CVE-2026-0848

{
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2",
  "cveMetadata": {
    "cveId": "CVE-2026-0848",
    "assignerOrgId": "c09c270a-b464-47c1-9133-acb35b22c19a",
    "assignerShortName": "@huntr_ai",
    "dateUpdated": "2026-03-05T20:48:05.364Z",
    "dateReserved": "2026-01-10T23:59:44.115Z",
    "datePublished": "2026-03-05T20:48:05.364Z",
    "state": "PUBLISHED"
  },
  "containers": {
    "cna": {
      "providerMetadata": {
        "orgId": "c09c270a-b464-47c1-9133-acb35b22c19a",
        "shortName": "@huntr_ai",
        "dateUpdated": "2026-03-05T20:48:05.364Z"
      },
      "title": "Arbitrary Code Execution in NLTK StanfordSegmenter via Untrusted JAR Loading",
      "descriptions": [
        {
          "lang": "en",
          "value": "NLTK versions <=3.9.2 are vulnerable to arbitrary code execution due to improper input validation in the StanfordSegmenter module. The module dynamically loads external Java .jar files without verification or sandboxing. An attacker can supply or replace the JAR file, enabling the execution of arbitrary Java bytecode at import time. This vulnerability can be exploited through methods such as model poisoning, MITM attacks, or dependency poisoning, leading to remote code execution. The issue arises from the direct execution of the JAR file via subprocess with unvalidated classpath input, allowing malicious classes to execute when loaded by the JVM."
        }
      ],
      "affected": [
        {
          "vendor": "nltk",
          "product": "nltk/nltk",
          "versions": [
            {
              "version": "unspecified",
              "status": "affected",
              "versionType": "custom",
              "lessThanOrEqual": "latest"
            }
          ]
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "lang": "en",
              "description": "CWE-20 Improper Input Validation",
              "cweId": "CWE-20",
              "type": "CWE"
            }
          ]
        }
      ],
      "references": [
        {
          "url": "https://huntr.com/bounties/08b109bb-ac24-403f-9422-1c246ce60202"
        }
      ],
      "metrics": [
        {
          "cvssV3_0": {
            "version": "3.0",
            "vectorString": "CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H",
            "attackVector": "NETWORK",
            "attackComplexity": "LOW",
            "privilegesRequired": "NONE",
            "userInteraction": "NONE",
            "scope": "CHANGED",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "HIGH",
            "availabilityImpact": "HIGH",
            "baseScore": 10,
            "baseSeverity": "CRITICAL"
          }
        }
      ]
    }
  }
}
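
The `baseScore` of 10 in the record follows from the `vectorString` under the CVSS v3.0 base equations. The sketch below recomputes it as a sanity check. The weight table is taken from the CVSS v3.0 specification but covers only the metric values that appear in this vector, so any other value raises `KeyError`; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.0 specification, limited to the
# values used by this record's vector (assumption: no other values needed).
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85},
    "UI": {"N": 0.85},
    "C": {"H": 0.56},
    "I": {"H": 0.56},
    "A": {"H": 0.56},
}


def roundup(value):
    """Round up to one decimal place, as the v3.0 equations require."""
    return math.ceil(value * 10) / 10


def base_score(vector):
    """Compute the CVSS v3.0 base score for a vector string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.0":
        raise ValueError("expected a CVSS:3.0 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    # Impact sub-score base and exploitability sub-score.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]

    if metrics["S"] == "C":  # scope changed
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
        raw = 1.08 * (impact + exploitability)
    else:  # scope unchanged
        impact = 6.42 * iss
        raw = impact + exploitability

    if impact <= 0:
        return 0.0
    return roundup(min(raw, 10))


print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # prints 10.0
```

With scope changed, the unclamped value is roughly 10.7, so the score is capped at the maximum of 10. The same vector with `S:U` would score 9.8.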