evals
Here are 32 public repositories matching this topic...
AI Observability & Evaluation
Updated Feb 22, 2025 - Jupyter Notebook
Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including CrewAI, Langchain, Autogen, AG2, and CamelAI
Updated Feb 22, 2025 - Python
Laminar - open-source all-in-one platform for engineering AI products. Create a data flywheel for your AI app. Traces, Evals, Datasets, Labels. YC S24.
Updated Feb 22, 2025 - TypeScript
🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite
Updated Feb 23, 2025 - Python
Test your LLM-powered apps with TypeScript. No API key required.
Updated Feb 11, 2025 - TypeScript
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
Updated Feb 22, 2025 - TypeScript
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
Updated Jan 9, 2025 - Jupyter Notebook
Evalica, your favourite evaluation toolkit
Updated Feb 5, 2025 - Python
Benchmarking Large Language Models for FHIR
Updated Nov 29, 2024
An implementation of Anthropic's paper and essay on "A statistical approach to model evaluations"
Updated Jan 27, 2025 - Python
Root Signals Python SDK
Updated Feb 21, 2025 - Python
The OAIEvals Collector: A robust, Go-based metric collector for EVALS data. Supports Kafka, Elastic, Loki, InfluxDB, and TimescaleDB integrations, plus containerized deployment with Docker. Streamlines OAI-Evals data management with a low barrier to entry!
Updated Oct 26, 2023 - Go
Open Source Video Understanding API and Large Vision Model Observability Platform.
Updated Jan 15, 2025 - Python
Develop better LLM apps by testing different models and prompts in bulk.
Updated Jul 29, 2024 - Python