NVIDIA TRT-LLM Unsafe Deserialization Vulnerabilities Allow Code Execution, Data Tampering

Technical description

Three unsafe deserialization vulnerabilities in NVIDIA TensorRT-LLM (TRT-LLM) were published to NVD on May 20, 2026. CVE-2025-33255 (CVSS 7.5, High) affects the MPI server component; CVE-2026-24163 (CVSS 7.5, High) affects RPC testing; CVE-2026-24142 (CVSS 6.3, Medium) involves deserialization and unsafe serialized handles. In each case an attacker can trigger unsafe deserialization, which may lead to code execution, denial of service, data tampering, and information disclosure. TRT-LLM is NVIDIA's widely deployed library for optimized large language model inference and is used in production LLM serving environments. The NVD descriptions list the affected platform as "any platform".

Attack vector

An attacker who can reach the TRT-LLM MPI server or RPC testing interface can send maliciously crafted serialized data to trigger unsafe deserialization. Successful exploitation can execute arbitrary code in the context of the TRT-LLM process, tamper with model outputs or configuration, cause denial of service, or disclose sensitive information (model weights, inference data, credentials). The attack surface depends on how TRT-LLM is deployed: cloud-hosted LLM serving endpoints, on-premises inference servers, edge AI deployments, or research clusters. If TRT-LLM services are exposed to untrusted networks or accept user-supplied serialized input, exploitation risk is elevated.
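To make the risk concrete, the following is a minimal Python sketch of why deserializing untrusted data can run code. It assumes a pickle-style format purely for illustration; the NVD descriptions do not state which serialization format the affected TRT-LLM components use. `DemoPayload` is a hypothetical name, and `eval` on a harmless arithmetic string stands in for an attacker-chosen callable.

```python
import pickle


class DemoPayload:
    """Hypothetical object whose serialized form names a callable to invoke."""

    def __reduce__(self):
        # pickle records this (callable, args) pair in the byte stream.
        # A real attacker would name something like os.system here; this
        # sketch uses a harmless arithmetic expression instead.
        return (eval, ("6 * 7",))


# The sender controls these bytes completely.
blob = pickle.dumps(DemoPayload())

# The receiver only meant to "load data", but loading calls eval("6 * 7").
result = pickle.loads(blob)
print(result)
```

The receiver never gets a `DemoPayload` object back. It gets the return value of a call chosen by whoever produced the bytes, which is why reachability of the deserializing endpoint matters so much.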

Affected systems

Any deployment that uses NVIDIA TensorRT-LLM for LLM inference is potentially affected, including cloud LLM serving platforms, on-premises AI infrastructure, edge AI deployments, and research environments. Organizations that use TRT-LLM for production model serving (e.g., customer-facing chatbots, internal AI agents, code-generation services) should prioritize these vulnerabilities for remediation. AI infrastructure teams should audit three things: whether TRT-LLM services are network-accessible, whether they accept serialized input from untrusted sources, and what privilege level the TRT-LLM process runs at.
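Two of the audit questions above (network reachability and process privilege) can be checked with a few lines of standard-library Python. This is a rough sketch, not an NVIDIA tool: the helper names `is_port_reachable` and `runs_as_root` are made up for illustration, and the host and port to probe depend entirely on the local deployment.

```python
import os
import socket


def is_port_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def runs_as_root() -> bool:
    """Return True if this process has effective UID 0 (POSIX systems only)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0
```

Running `is_port_reachable` from a machine outside the trusted network segment, against the address and port where the inference service listens, gives a quick signal of exposure. Running `runs_as_root` inside the service's own environment shows whether the process holds more privilege than it needs.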

Mitigation

As of May 20, 2026, the NVD records do not include patch details from NVIDIA. Organizations should monitor NVIDIA security bulletins for patches and workarounds. Interim mitigations: restrict network access to the TRT-LLM MPI and RPC interfaces (firewall rules, network segmentation), validate any serialized input before passing it to TRT-LLM and reject input whose origin cannot be verified, run TRT-LLM processes under least-privilege service accounts, and implement runtime monitoring to detect anomalous deserialization behavior. For production LLM serving, consider placing TRT-LLM behind an API gateway that validates and filters requests before they reach the inference layer.
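As one illustration of validating serialized input before it is deserialized, the sketch below combines an HMAC check with an allowlisting unpickler. It rests on assumptions the NVD records do not confirm: that the data is pickle-formatted and that sender and receiver can share a secret key. `ALLOWED_GLOBALS`, `AllowlistUnpickler`, `sign`, and `safe_loads` are hypothetical names and not part of TRT-LLM.

```python
import hashlib
import hmac
import io
import pickle

# Hypothetical allowlist: the only importable names a payload may reference.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the serialized bytes."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def safe_loads(payload: bytes, tag: bytes, key: bytes):
    """Verify the tag first, then deserialize with the allowlist."""
    if not hmac.compare_digest(sign(payload, key), tag):
        raise ValueError("signature mismatch; refusing to deserialize")
    return AllowlistUnpickler(io.BytesIO(payload)).load()
```

The signature check rejects bytes from senders who do not hold the key, and the allowlist limits what even a key-holding sender can make the receiver import. Neither measure replaces a vendor patch or network isolation; they only narrow what reaches the deserializer.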
