Inferless
Popular repositories
- triton-co-pilot Public
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments.
- qwq-32b-preview Public template
A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
- whisper-large-v3 Public
State-of-the-art speech recognition model for English, delivering accurate transcription across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>
- deepseek-r1-distill-qwen-32b Public template
A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
Repositories
- Phi-3.5-MoE-instruct-8bit Public
Phi-3.5-MoE is a compact yet powerful model designed for instruction-following tasks. It is part of the Phi-3 family, known for its efficiency and high performance; the Phi-3 Mini-128K-Instruct exhibited robust, state-of-the-art performance among models with fewer than 13B parameters.
- idefics-9b-instruct-8bit Public
IDEFICS (Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS) is an open-access reproduction of Flamingo, a closed-source visual language model developed by DeepMind. Like GPT-4, this multimodal model accepts arbitrary sequences of image and text inputs and produces text outputs.
- Book-Audio-Summary-Generator Public
- TenyxChat-8x7B-v1 Public
- Command-r-v01 Public
A 35B model delivering high performance in reasoning, summarization, and question answering. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
- InternVL2-Llama3-76B-AWQ Public
- Stable-Diffusion-3.5-large Public
- realvis-xl_v4.0_lightning Public
A lightweight, accelerated variant of RealVisXL V4.0, engineered for real-time, high-quality image generation with enhanced efficiency. <metadata> gpu: T4 | collections: ["Diffusers"] </metadata>
- tinyllama-1.1b-chat-vllm-gguf Public
Deploy the GGUF-quantized version of TinyLlama-1.1B with vLLM for efficient inference. <metadata> gpu: A100 | collections: ["Using NFS Volumes", "vLLM"] </metadata>