Modelopt-v0.23.2 not support Qwen2.5 series LLM model? #142

white-wolf-tech · 2025-02-27T06:10:06Z

When I use the Qwen2.5-3B model and perform quantization using the Int8_sq algorithm.
The checkpoint_convert.py script that comes with the TensorrtLLM library (that is, the Int8_sq algorithm implemented by them, without using the Modelopt library), the compiled engine can be used normally by the tritonserver tensorrtllm-backend.

However, when using the Modelopt library and the same algorithm, the compiled engine cannot be used normally by the tritonserver tensorrtllm-backend. Is this because the current version does not support this model? Or what other problems could there be?

kevalmorabia97 · 2025-02-27T06:12:30Z

What error do you see when using ModelOpt's quantized checkpoint with tritonserver?
Note that TensorRT-LLM under the hood also uses ModelOpt library for quantization

white-wolf-tech · 2025-02-27T06:25:03Z

The detailed situation is here.
NVIDIA/TensorRT-LLM#2810

The phenomenon is that, with the same algorithm, when using the conversion script that comes with TensorrtLLM, the output result is normal. However, after compiling with ModelOpt, all the output tokens are 1023, and the structure after decoding is:

"xx.Componentlocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklocklock"

cjluo-nv · 2025-02-27T19:28:40Z

Also have you tried the llm_ptq examples in this repo as well?

white-wolf-tech · 2025-02-28T05:28:53Z

Also have you tried the llm_ptq examples in this repo as well?

the result is the same

kevalmorabia97 assigned cjluo-nv Feb 27, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modelopt-v0.23.2 not support Qwen2.5 series LLM model? #142

Modelopt-v0.23.2 not support Qwen2.5 series LLM model? #142

white-wolf-tech commented Feb 27, 2025

kevalmorabia97 commented Feb 27, 2025

white-wolf-tech commented Feb 27, 2025

cjluo-nv commented Feb 27, 2025 •

edited

Loading

white-wolf-tech commented Feb 28, 2025

Modelopt-v0.23.2 not support Qwen2.5 series LLM model? #142

Modelopt-v0.23.2 not support Qwen2.5 series LLM model? #142

Comments

white-wolf-tech commented Feb 27, 2025

kevalmorabia97 commented Feb 27, 2025

white-wolf-tech commented Feb 27, 2025

cjluo-nv commented Feb 27, 2025 • edited Loading

white-wolf-tech commented Feb 28, 2025

cjluo-nv commented Feb 27, 2025 •

edited

Loading