## Issue

Currently, logits post-processors must be registered at initialization time through `ExecutorConfig.logits_post_processor_map`. This prevents registering post-processors and schemas at runtime on a per-request basis.
```cpp
using LogitsPostProcessor = std::function<void(IdType, Tensor&, BeamTokens const&, StreamPtr const&, std::optional<IdType>)>;
using LogitsPostProcessorMap = std::unordered_map<std::string, LogitsPostProcessor>;
```
I noticed Disaggregated Serving in the 2025 Roadmap, but I am unsure of its scope and whether it is related to this issue.
## Impact

Using Triton Inference Server with the TensorRT-LLM backend in production:

- Tight coupling between application and model deployment
- The full set of validation schemas must be known and named at model build time
- Application logic changes require model redeployment
- The inference server cannot be schema-agnostic
## Proposed Solution
Add support for single-use logits post-processors scoped to individual requests. The post-processor would be registered for the duration of the request only and automatically cleaned up afterwards, following the pattern established by vLLM's and TGI's grammar implementations for constrained decoding. This functionality would work in addition to the `LogitsPostProcessorMap` declared at initialization time, ensuring backward compatibility.
## Example Use Case

Using LM Format Enforcer with dynamic schemas that depend on request context. The current workaround requires pre-registering all possible schema combinations at initialization time, which can grow exponentially with context complexity.