Search results for "evals"
Evaluation and Tracking for LLM Experiments and AI Agents
Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!
AI Agent Framework, the Pydantic way
The agent engineering platform
Structured outputs for LLMs
AI observability platform for production LLM and agent systems.
Code, Build and Evaluate agents - excellent Model and Skills/MCP/ACP Support
A comprehensive evaluation framework for AI agents and LLM applications.
A SEC EDGAR MCP (Model Context Protocol) Server
Supercharge Your LLM Application Evaluations
Memory library for building stateful agents
Automatically update LLM-Agent papers daily using GitHub Actions (updates every 12 hours)
Markdown-first work-memory protocol for existing agents, with maintained knowledge, candidate notes, evals, and an example KB.
An Excel AI agent that uses MCP tools to let LLMs read, edit, and automate Excel spreadsheets.
AI skills that turn coding agents into UiPath experts.
One memory layer for every AI agent. Local-first, markdown source of truth, and CLI/HTTP/MCP native. Your agent forgot who you are. Again. Dory fixes that.
The LLM Evaluation Framework
The production runtime for AI agents. Schema in, API out. Built on PydanticAI + FastAPI.
A production-ready research outreach AI agent that plans, discovers, reasons, uses tools, auto-builds cited briefings, and drafts tailored emails with tool-chaining, memory, tests, and turnkey Dock
Framework for large language model evaluations
