Search results for "evals"
From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.
The open-source TypeScript framework for building AI workflows and agents. Designed for Claude Code describe what you want, Claude builds it, with all the best practices already in place.
A collection of Agent Skills Standard and Best Practice for Programming Languages, Frameworks that help our AI Agent follow best practies on frameworks and programming laguages
๐ชข Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. ๐YC W23
Cognitive architecture for AI-augmented software development. Specialized agents, structured workflows, and multi-platform deployment. Claude Code ยท Codex ยท Copilot ยท Cursor ยท Factory ยท Warp ยท Windsur
Agent ensembles to design, generate, and select the best code for every task.
Latitude is the open-source agent engineering platform
The app framework built for AI coding agents. Own every line. Your AI already knows how to build on it.
Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and
๐ค Kubernetes for AI Agents. Self-hosted, production-grade runtime for orchestrating LLM swarms and autonomous agents. TypeScript-native.
Framework to build resilient language agents as graphs.
AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework
Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic
