freshcrate

Search results for "evals"

38 results found
jarvis · v1.28.0 · 🌿 Growing · ⭐ 174

Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects

pydantic-ai · v1.84.1 · 🌳 Mature · ⭐ 16,274

AI Agent Framework, the Pydantic way

langchain · langchain-core==1.3.0 · 🌳 Mature · ⭐ 133,178

The agent engineering platform

RAGHub · main@2026-04-17 · 🌳 Mature · ⭐ 1,712

A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.

langfuse · v3.169.0 · 🌿 Growing · ⭐ 24,578

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

logfire · v4.32.1 · 🌿 Growing · ⭐ 4,161

AI observability platform for production LLM and agent systems.

sample-agentic-frameworks-on-aws · main@2026-04-17 · 🌿 Growing · ⭐ 250

Build Agentic AI solutions on AWS, using latest OSS Agentic Frameworks.

fast-agent · v0.6.17 · 🌿 Growing · ⭐ 3,740

Code, Build and Evaluate agents - excellent Model and Skills/MCP/ACP Support

voratiq · main@2026-04-21 · 🌿 Growing · ⭐ 65

Agent ensembles to design, generate, and select the best code for every task.

evals · v0.1.15 · 🌿 Growing · ⭐ 103

A comprehensive evaluation framework for AI agents and LLM applications.

latitude-llm · claude-code-telemetry-0.0.5 · 🌿 Growing · ⭐ 3,955

Latitude is the open-source agent engineering platform

chinese-llm-benchmark · v5.9 · 🌿 Growing · ⭐ 5,841

ReLE Evaluation: a capability benchmark for Chinese AI large language models (continuously updated). It currently covers 359 models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min

vobase · create-vobase@0.6.2 · 🌱 Seedling · ⭐ 43

The app framework built for AI coding agents. Own every line. Your AI already knows how to build on it.

promptfoo · code-scan-action-0.1.5 · 🌿 Growing · ⭐ 19,943

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and

awesome-prompts · main@2026-04-21 · 🌿 Growing · ⭐ 7,572

Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

honcho · main@2026-04-21 · 🌿 Growing · ⭐ 2,030

Memory library for building stateful agents

LLM-Agent-Paper-daily · main@2026-04-21 · 🌱 Seedling · ⭐ 20

Automatically updates LLM-Agent papers daily using GitHub Actions (updates every 12 hours)

Cogitator-AI · main@2026-04-21 · 🌱 Seedling · ⭐ 35

🤖 Kubernetes for AI Agents. Self-hosted, production-grade runtime for orchestrating LLM swarms and autonomous agents. TypeScript-native.

trulens · trulens-2.7.2 · 🌱 Seedling · ⭐ 3,237

Evaluation and Tracking for LLM Experiments and AI Agents

langgraphjs · @langchain/langgraph-sdk@1.8.9 · 🌿 Growing · ⭐ 2,775

Framework to build resilient language agents as graphs.

Awesome-Agent-Memory · main@2026-04-16 · 🌿 Growing · ⭐ 333

Curated systems, benchmarks, papers, and more on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

mastra · @mastra/core@1.24.0 · 🌱 Seedling · ⭐ 22,899

From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.

voltagent · @voltagent/server-elysia@2.0.7 · 🌿 Growing · ⭐ 7,851

AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework

sv-excel-agent · 0.0.0 · 🌱 Seedling · ⭐ 179

An Excel AI agent that uses MCP tools to let LLMs read, edit, and automate Excel spreadsheets.

agent-skills-standard · php-v1.3.2 · 🌱 Seedling · ⭐ 391

A collection of Agent Skills standards and best practices for programming languages and frameworks, helping AI agents follow best practices for those frameworks and languages

everything-claude-code · v1.10.0 · 🌱 Seedling · ⭐ 151,139

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

instructor · v1.15.1 · 🌱 Seedling · ⭐ 12,743

Structured outputs for LLMs

tensorzero · 2026.4.0 · 🌱 Seedling · ⭐ 11,204

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

agent2 · v0.1.0 · 🌱 Seedling · ⭐ 25

The production runtime for AI agents. Schema in, API out. Built on PydanticAI + FastAPI.

membrane · v0.2.0 · 🌱 Seedling · ⭐ 75

A selective learning and memory substrate for agentic systems: typed, revisable, decayable memory with competence learning and trust-aware retrieval.

mattermost-plugin-agents · v1.14.0 · 🌱 Seedling · ⭐ 217

Mattermost Agents plugin supporting multiple LLMs

vassiliylakhonin.github.io · v0.2.0 · 🌱 Seedling · ⭐ 4

AI-indexed portfolio and CV site with machine-readable profile data, evidence-backed case studies, verification signals, and a live MCP endpoint for agent access.

sec-edgar-mcp · v1.0.8 · 🌱 Seedling · ⭐ 245

An SEC EDGAR MCP (Model Context Protocol) Server

ragas · v0.4.3 · 🌱 Seedling · ⭐ 13,329

Supercharge Your LLM Application Evaluations 🚀

gptme · v0.31.0 · 🌱 Seedling · ⭐ 4,266

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

Agentic-AI-Pipeline · v1.0.0 · 💤 Dormant · ⭐ 57

🦾 A production-ready research outreach AI agent that plans, discovers, reasons, uses tools, auto-builds cited briefings, and drafts tailored emails with tool-chaining, memory, tests, and turnkey Dock