A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
-
Custom AI Generator: prime your LLM with this automated embedding generator and model Q&A interface. Uses Retrieval-Augmented Generation (RAG) to reduce hallucinations and ground the LLM in a source of truth.
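As a rough sketch of the retrieval step such an app might use (not the repo's actual code): the embedding model name, the toy document store, and `generate_answer` are all illustrative assumptions.

```python
# Minimal RAG retrieval sketch. Assumes `pip install sentence-transformers numpy`;
# model name, documents, and generate_answer() are placeholders, not from the repo.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

documents = [
    "Ray Serve scales Python model-serving applications.",
    "RAG grounds LLM answers in retrieved source documents.",
]
doc_embeddings = model.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_embeddings @ q  # cosine similarity (embeddings are normalized)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def generate_answer(question: str, context: list[str]) -> str:
    # Hypothetical LLM call: prepend retrieved context to ground the answer.
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return prompt  # replace with a real LLM client in practice

print(generate_answer("What does RAG do?", retrieve("What does RAG do?")))
```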
A drop-in replacement for FastAPI that enables scalable, fault-tolerant deployments with Ray Serve, as sketched below.
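A minimal sketch of the standard Ray Serve + FastAPI integration that pattern builds on; the route, replica count, and handler body are illustrative assumptions, not taken from the repo.

```python
# Sketch of wrapping a FastAPI app in a Ray Serve deployment.
# Requires `pip install "ray[serve]" fastapi`; route and replica count are assumed.
from fastapi import FastAPI
from ray import serve

app = FastAPI()

@serve.deployment(num_replicas=2)  # Serve replicates the app for scale and fault tolerance
@serve.ingress(app)                # routes FastAPI traffic through Serve
class QAService:
    @app.get("/answer")
    def answer(self, question: str) -> dict:
        # Placeholder logic; a real app would invoke the RAG pipeline here.
        return {"answer": f"You asked: {question}"}

serve.run(QAService.bind())  # starts Serve locally and deploys the app
```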
Contains the basic structure that a model-serving application should have. This implementation is based on the Ray Serve framework.
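For reference, a minimal version of that structure in plain Ray Serve (no FastAPI): load the model once per replica in `__init__`, handle HTTP requests in `__call__`. The toy model here is a placeholder assumption.

```python
# Minimal Ray Serve deployment structure; the lambda stands in for a real model.
from ray import serve
from starlette.requests import Request

@serve.deployment
class ModelServer:
    def __init__(self):
        # Expensive setup (e.g. model loading) runs once per replica.
        self.model = lambda text: text.upper()  # stand-in for a real model

    async def __call__(self, request: Request) -> dict:
        payload = await request.json()
        return {"prediction": self.model(payload["text"])}

serve.run(ModelServer.bind())
```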