Search results for "llm-as-a-judge"
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Open-source platform for AI engineering: OpenTelemetry-native LLM observability, GPU monitoring, guardrails, evaluations, prompt management, vault, playground. 🚀💻 Integrates with 50+ LLM providers.
Autonomous Agents (LLMs) research papers. Updated Daily.
Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.
Automatically updates LLM-agent papers daily using GitHub Actions (refreshed every 12 hours).
Build and run agents you can see, understand and trust.
A comprehensive evaluation framework for AI agents and LLM applications.
The LLM Evaluation Framework
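The results above all revolve around the LLM-as-a-judge pattern: one model is prompted to grade another model's output. A minimal sketch of the idea, not the API of any listed project; the prompt wording, the 1-5 scale, and the `call_model` stub are illustrative assumptions:

```python
import re

def build_judge_prompt(question: str, answer: str) -> str:
    # Ask the judge model for a single integer score plus a short rationale.
    return (
        "You are an impartial judge. Rate the answer to the question "
        "on a scale of 1 (poor) to 5 (excellent).\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply in the form 'Score: <n>' followed by a one-line rationale."
    )

def parse_score(judge_reply: str):
    # Extract the integer score; return None if the reply is malformed.
    match = re.search(r"Score:\s*([1-5])", judge_reply)
    return int(match.group(1)) if match else None

def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call (an assumption for this sketch).
    return "Score: 4\nRationale: correct and concise."

prompt = build_judge_prompt("What is 2 + 2?", "4")
score = parse_score(call_model(prompt))
```

Production frameworks in the list add the pieces this sketch omits: retries on malformed replies, multiple judges or rubrics, and tracing of every judge call.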
