freshcrate

Search results for "llm-evaluation"

Clear filters
7 results found (Python)
opik📁2.0.6🌳 Mature18,767

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

AI-Infra-Guard📁v4.1.4🌿 Growing3,428

A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.

arag📁v0.1.0🌿 Growing247

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

giskard-oss📁giskard-checks/v1.0.2b1🌱 Seedling5,225

🐢 Open-Source Evaluation & Testing library for LLM Agents

mlflow📁v3.11.1🌱 Seedling25,285

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin

AutoRAG📁v0.3.22🌱 Seedling4,693

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation