tensorrt-llm

Here are 5 public repositories matching this topic...

NetEase-Media / grps

Deep Learning Deployment Framework: Supports tf/torch/trt/trtllm/vllm and other NN frameworks. Support dynamic batching, and streaming modes. It is dual-language compatible with Python and C++, offering scalability, extensibility, and high performance. It helps users quickly deploy models and provide services through HTTP/RPC interfaces.

tensorflow torch tensorrt serving triton-inference-server dynamic-batching vllm tensorrt-llm

Updated Jan 10, 2025
C++

NetEase-Media / grps_trtllm

Star

High-Performance OpenAI LLM Service: A pure C++ high-performance OpenAI LLM service implemented with GPRS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.

openai multi-modal phi function-call qwq ai-agent llm llama-index chatglm tensorrt-llm qwen2 llama3 internvl2 qwen2-vl deepseek-r1 janus-pro

Updated Feb 24, 2025
C++

janhq / cortex.tensorrt-llm

Star

Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU accelerated inference on NVIDIA's GPUs.

nvidia jan tensorrt llm tensorrt-llm

Updated Sep 26, 2024
C++

EdVince / whisper-trtllm

Star

Whisper in TensorRT-LLM

cuda transformers openai whisper asr tensorrt huggingface tensorrt-llm

Updated Sep 21, 2023
C++

nyunAI / TensorRT-LLM

Star

tensorrt-engine tensorrt-llm

Updated Aug 22, 2024
C++

Improve this page

Add a description, image, and links to the tensorrt-llm topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the tensorrt-llm topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tensorrt-llm

Here are 5 public repositories matching this topic...

NetEase-Media / grps

NetEase-Media / grps_trtllm

janhq / cortex.tensorrt-llm

EdVince / whisper-trtllm

nyunAI / TensorRT-LLM

Improve this page

Add this topic to your repo