Search results for "assessment"
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
AgentWard - Built for all, hardened for OpenClaw.
Pickle Rick for Claude Code - autonomous PRD-driven coding loops + relentless code review. Ralph Loop toolkit.
One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.
This Guidance demonstrates how to streamline access to numerous large language models (LLMs) through a unified, industry-standard API gateway based on OpenAI API standards
AI Legal Assistant skill for Claude Code. Contract review, risk analysis, NDA generation, compliance auditing, negotiation strategy, and PDF reports - 14 skills, 5 parallel agents. If you want to lear
SDL-MCP (Symbol Delta Ledger MCP Server) is a cards-first context system for coding agents that saves tokens and improves context.
[ARCHIVED] Migrated to the MIXIA-Framework repo
Autonomous Agents (LLMs) research papers. Updated Daily.
Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
Connect people and manage tasks with AILinkX, your all-in-one digital life operating system built on advanced AI technology.
Procedural memory for AI coding agents: transforms scattered session history into persistent, cross-agent memory so every agent learns from every other
Official MCP Servers for AWS
Comprehensive paid advertising audit & optimization skill for Claude Code. 225+ checks across Google, Meta, YouTube, LinkedIn, TikTok, Microsoft & Apple Search Ads with weighted scoring, parallel agen
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
The Self-Growing Karpathy LLM Wiki - grown by an AI agent yoyo from Karpathy's founding prompt
Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt engineering, prompt attacks & prompt protection. Advanced prompt engineering papers.
Automatically Update LLM-Agent Papers Daily using GitHub Actions (Update Every 12 Hours)
It's YOUR data. Take it back. Get your Garmin Connect data into a local SQLite database and AI ready (MCP server)
Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.
Automated security investigation tool using Microsoft MCP Servers, GitHub Copilot, Python Modules and custom copilot-instructions.
A comprehensive evaluation framework for AI agents and LLM applications.
MaverickMCP - Personal Stock Analysis MCP Server
OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.
Pragmatic AI Labs MCP Agent Toolkit - An MCP Server designed to make code with agents more deterministic
MCP server for OpenAI's Deep Research APIs, Gemini Deep Research Agent, and Hugging Face's Open Deep Research
Codingbuddy orchestrates 29 specialized AI agents to deliver code quality comparable to a team of human experts through a PLAN → ACT → EVAL workflow.
Cyber Pilot is a traceable delivery system for requirements, design, plans, and code.
DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework
Enterprise-ready MCP Gateway & Registry that centralizes AI development tools with secure OAuth authentication, dynamic tool discovery, and unified access for both autonomous AI agents and AI coding a
Artificial Ecology For Thought and Emergent Reasoning. The Colony That Builds With You.
Self-hosted AI Agent Memory + Code Intelligence Platform - one MCP endpoint for persistent memory, AST-aware code search, shared knowledge, and quality enforcement across all your AI coding agents.
Declarative framework for orchestrating multi-model LLM pipelines with context engineering and quality gates.
754 structured cybersecurity skills for AI agents · Mapped to 5 frameworks: MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND & NIST AI RMF · agentskills.io standard · Works with Claude Code, GitHub Cop
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
OSCAL tools for AI agents
Model Context Protocol (MCP) server for Kubernetes and OpenShift
Autonomous VAPT platform. Give it a target (FQDN, IP, CIDR) - it hunts, it reports. Inspired by the Obsidian Order.
Security scanner for MCP server configurations. Detects secrets, CVEs, permission issues, and exfiltration vectors across 10 AI tool clients.
Transform any LLM into an autonomous security testing agent with structured prompts for seven-phase vulnerability hunting.
AI agent security scanner. Detect vulnerabilities in agent configurations, MCP servers, and tool permissions. Available as CLI, GitHub Action, ECC plugin, and GitHub App integration.
Build AI agents that actually do things. Synapse is an open-source platform for creating, connecting, and orchestrating AI agents powered by any LLM - local or cloud.
A dotfiles repo that treats AI agent behavior as infrastructure
Watchtower is a simple AI-powered penetration testing automation CLI tool that leverages LLMs and LangGraph to orchestrate agentic workflows that you can use to test your websites locally. Generate us
LightAgent: Lightweight AI agent framework with memory, tools & tree-of-thought. Supports multi-agent collaboration, self-learning, and major LLMs (OpenAI/DeepSeek/Qwen). Open-source with MCP/SSE prot
A Model Context Protocol server that provides task orchestration capabilities for AI assistants
AI-Powered Penetration Testing Framework with automated vulnerability scanning, multi-agent system, and compliance reporting
Discover and evaluate advanced benchmark datasets for Large Language Model agents to enhance performance assessment in real-world tasks.
Benchmark for vector databases.
Multi-agent system for software development
The most comprehensive MCP server for Polymarket - 48 tools spanning direct trading, market discovery, smart money tracking, copy trading, backtesting, risk management, and portfolio optimization. Wor
Enable AI agents to autonomously create, evaluate, and evolve skills across any marketplace without user intervention.
Broken RAG For The Broken Souls
The ultimate native macOS AI Agent. Blends local MLX SLMs with 3D cognitive Metal rendering and autonomous system integrations.
The limbic layer. Personality, register, and internal state monitoring. ALBEDO lives here - the session instrument that governs tone, emotional signal detection, and the felt layer of every response.
TSUKUYOMI is an advanced modular intelligence framework designed for the democratization of Intelligence Analysis via systematic analysis, processing, and reporting across multiple domains. Built on a
These guides are designed to help teams and individuals leverage AI tools like GitHub Copilot, OpenAI, and Claude to build software projects efficiently and effectively
Robust, fast, scalable, and sandboxed open-source online code execution system for humans and AI.
PromptGPT is an open-source framework that enables users to automatically generate high-quality prompts with no installation, coding, or technical knowledge required. PromptGPT follows industry best
