Issues: NVIDIA/TensorRT-LLM

Issues list

tensorrt_llm_ucx_wrapper.dll and tensorrt_llm_ucx_wrapper.lib does not exist (labels: triaged)
#2834 opened Feb 28, 2025 by LRLVEC
RoBERTa model conversion does not pass the huggingface test (labels: bug)
#2829 opened Feb 26, 2025 by arinaruck
2 of 4 tasks
Fail to build Tensorrt-LLM, error related to build ucxx (labels: bug, triaged)
#2826 opened Feb 26, 2025 by tjliupeng
4 tasks
pytorch backend run error with fp8 hf model (labels: bug)
#2825 opened Feb 26, 2025 by nickole2018
2 of 4 tasks
Baichuan2 model core dumped when running after quantization to FP8 (labels: bug)
#2824 opened Feb 26, 2025 by kanebay
2 of 4 tasks
Qwen 2.5 FP8?
#2819 opened Feb 25, 2025 by jolyons123
NoneType object not subscriptable error in quantize.py (labels: Investigating, Low Precision, triaged)
#2818 opened Feb 25, 2025 by HyungjoonYang
TRT-LLM 16 -> 17 regression: reduce_fusion with user_buffer plugin on fp8 + llama + L4 (labels: bug)
#2817 opened Feb 25, 2025 by michaelfeil
2 of 4 tasks
INT8 KV cache for VLMs
#2815 opened Feb 24, 2025 by ZHITENGLI
[TensorRT-LLM][ERROR] Assertion failed: Do not set crossKvCacheFraction for decoder-only model (labels: bug, triaged)
#2814 opened Feb 24, 2025 by HPUedCSLearner
2 of 4 tasks
(Multi-GPU Triton deployment) MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD with errorcode 1. (labels: bug, triaged)
#2813 opened Feb 23, 2025 by jasonngap1
1 of 4 tasks
Feature Request: Data Parallelism for Executor API (labels: triaged)
#2812 opened Feb 23, 2025 by MahmoudAshraf97
MPI Error when build from the souce code (labels: bug, triaged)
#2811 opened Feb 22, 2025 by yuqie
BUG in W4A8_awq-kv-FP8, W-fp8-A-fp8-kv-fp8, in the 0.17.0.post1 (labels: bug, triaged)
#2810 opened Feb 21, 2025 by white-wolf-tech
4 tasks
speculative decoding not work
#2804 opened Feb 20, 2025 by biaochen
Possible bug in Qwen convert_checkpoint.py (labels: Investigating, Low Precision, triaged)
#2803 opened Feb 19, 2025 by mathijshenquet