llama
Here are 43 public repositories matching this topic...
High-speed Large Language Model Serving for Local Deployment
Updated Feb 19, 2025 - C++
Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a Raspberry Pi Zero 2 (or in 298 MB of RAM), as well as Mistral 7B on desktops and servers. Supports ARM, x86, WASM, and RISC-V. Accelerated by XNNPACK.
Updated Apr 10, 2025 - C++
A highly optimized LLM inference acceleration engine for Llama and its variants.
Updated Apr 14, 2025 - C++
Fast Multimodal LLM on Mobile Devices
Updated Mar 21, 2025 - C++
🤘 TT-NN operator library and TT-Metalium low-level kernel programming model.
Updated Apr 14, 2025 - C++
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Updated Jan 21, 2025 - C++
A high-performance inference system for large language models, designed for production environments.
Updated Apr 12, 2025 - C++
CPU inference for the DeepSeek family of large language models in pure C++
Updated Apr 14, 2025 - C++
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Updated Jan 15, 2025 - C++
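For a sense of what "no libraries except for I/O" entails in a project like this: the hot loop of a from-scratch inference engine is hand-written kernels, and matrix-vector multiplication dominates single-batch decoding. Below is a minimal CUDA C++ sketch of such a kernel; all names, sizes, and launch parameters are illustrative assumptions, not code from the repository.

```cpp
// Illustrative sketch only: the kind of hand-written kernel a dependency-free
// C++/CUDA inference engine spends most of its decode time in.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// y = W * x, with W row-major [rows x cols]; one thread block reduces one output row.
__global__ void matvec(const float* W, const float* x, float* y, int rows, int cols) {
    __shared__ float partial[256];
    int row = blockIdx.x;
    float sum = 0.0f;
    for (int j = threadIdx.x; j < cols; j += blockDim.x)
        sum += W[row * cols + j] * x[j];
    partial[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction within the block
        if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) y[row] = partial[0];
}

int main() {
    const int rows = 4, cols = 8;  // toy sizes; real layers are thousands wide
    std::vector<float> hW(rows * cols, 1.0f), hx(cols, 2.0f), hy(rows);
    float *dW, *dx, *dy;
    cudaMalloc(&dW, hW.size() * sizeof(float));
    cudaMalloc(&dx, hx.size() * sizeof(float));
    cudaMalloc(&dy, hy.size() * sizeof(float));
    cudaMemcpy(dW, hW.data(), hW.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dx, hx.data(), hx.size() * sizeof(float), cudaMemcpyHostToDevice);
    matvec<<<rows, 256>>>(dW, dx, dy, rows, cols);  // one block per output row
    cudaMemcpy(hy.data(), dy, hy.size() * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < rows; ++i) printf("y[%d] = %.1f\n", i, hy[i]);  // expect 16.0
    cudaFree(dW); cudaFree(dx); cudaFree(dy);
    return 0;
}
```

The one-block-per-row layout keeps the reduction entirely in shared memory, which is a common starting point before fancier tiling or tensor-core paths.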
LLaVA server (llama.cpp).
Updated Oct 20, 2023 - C++
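For context, llama.cpp's bundled HTTP server exposes a POST /completion endpoint that takes a JSON body, and LLaVA-enabled builds additionally accept base64 image data referenced from the prompt. A minimal client sketch using libcurl follows; the port, prompt, and field values are placeholders, and this targets llama.cpp's server generally rather than this specific repository.

```cpp
// Minimal sketch of querying a llama.cpp-style server over HTTP.
// Build with: g++ query.cpp -lcurl
#include <curl/curl.h>
#include <iostream>
#include <string>

// libcurl write callback: accumulate the response body into a std::string.
static size_t on_body(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    std::string response;
    // For LLaVA, the server also accepts an "image_data" array of base64 images
    // referenced from the prompt as [img-<id>] (omitted here for brevity).
    const char* body =
        R"({"prompt": "Describe this image: [img-10]", "n_predict": 64})";

    struct curl_slist* headers =
        curl_slist_append(nullptr, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/completion");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, on_body);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    if (curl_easy_perform(curl) == CURLE_OK)
        std::cout << response << "\n";  // JSON whose "content" field holds the text

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}
```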
UnrealMCP is here! Automatic blueprint and scene generation from AI. An Unreal Engine plugin for LLM/GenAI models and an MCP UE5 server. Supports the Claude Desktop App, Windsurf, and Cursor; also includes OpenAI's GPT-4o, DeepSeek R1, Claude 3.7 Sonnet, and Grok 3 APIs, with plans to add Gemini, audio, and realtime APIs soon.
Updated Apr 12, 2025 - C++