freshcrate

Search results for "lua"

20 results found (Python)
opik 📁 2.0.6 🌳 Mature 18,767

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

ai-agents-reality-check 📁 0.0.0 🌿 Growing 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

arthur-engine 📁 2.1.529 🌿 Growing 75

Make AI work for Everyone - Monitoring and governing for your AI/ML

arag 📁 v0.1.0 🌿 Growing 247

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

evals 📁 v0.1.15 🌿 Growing 103

A comprehensive evaluation framework for AI agents and LLM applications.

AI-Infra-Guard 📁 v4.1.4 🌿 Growing 3,428

A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.

OpenClawProBench 📁 main@2026-04-15 🌿 Growing 340

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

medusa 📁 v2026.5.5 🌿 Growing 252

AI-first security scanner with 76 analyzers, 9,600+ detection rules, and repo poisoning detection for AI/ML, LLM agents, and MCP servers. Scan any GitHub repo with: medusa scan --git user/repo

giskard-oss 📁 giskard-checks/v1.0.2b1 🌱 Seedling 5,225

🐢 Open-Source Evaluation & Testing library for LLM Agents

trulens 📁 trulens-2.7.2 🌱 Seedling 3,237

Evaluation and Tracking for LLM Experiments and AI Agents

mlflow 📁 v3.11.1 🌱 Seedling 25,285

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin

AutoRAG 📁 v0.3.22 🌱 Seedling 4,693

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

memora 📁 v0.2.27 🌱 Seedling 386

Give your AI agents persistent memory.

codexlens-search 📁 v0.8.0 🌱 Seedling 44

Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion + MCP server for Claude Code

any-agent 📁 1.18.0 🌱 Seedling 1,141

A single interface to use and evaluate different agent frameworks

ragas 📁 v0.4.3 🌱 Seedling 13,329

Supercharge Your LLM Application Evaluations 🚀

RagaAI-Catalyst 📁 v2.2.4 💤 Dormant 16,130

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal