freshcrate

Search results for "eval"

7 results found (Docs / Meta)
chinese-llm-benchmark 📁 v5.10 🏛️ Flagship 5,889

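ReLE evaluation: capability evaluation of Chinese AI large language models (continuously updated). It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min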

ai-agent-handbook 📁 0.0.0 🌿 Growing 67

Comprehensive guide to AI agent engineering: how 30+ frameworks actually work under the hood. Context rot, compaction, system prompt assembly, SOUL.md, agent loops, memory systems, tool sprawl, MCP,

awesome-pydantic-ai 📁 0.0.0 🌱 Seedling 58

An opinionated list of awesome Pydantic-AI frameworks, libraries, software and resources.

awesome-prompts 📁 main@2026-04-21 🌿 Growing 7,671

Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

Awesome-Repo-Level-Code-Generation 📁 main@2026-04-10 🌿 Growing 280

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

agent-knowledge-cycle 📁 v2.0.0 🌱 Seedling 3

Memory-centric self-improving harness for AI agents. Six-phase cycle + Security by Absence. ADRs, JSON schemas, and a dependency-free Python reference.