Search results for "eval"
5 results found (JavaScript)
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
Local-first AI agent bootstrap: Playwright Browser MCP + ContextDB for Codex CLI, Claude Code, Gemini CLI, and OpenCode.
Self-evolving Claude Code wrapper — handles any computer work a human can do. 94+ skills, 14 agents, computer use, self-improvement.
Assertion framework for AI agent evaluations - supports skill invocation checks, build validation, and LLM-based judging
