Search results for "hallucination"
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
ARIS (Auto-Research-In-Sleep) - Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in - works wi
Agent! connects any AI to your Mac. 13 LLM providers - cloud, local, or on-device. It writes code, builds Xcode projects, manages git, organizes files, automates Safari, controls any app, and handl
A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.
Custom plugins for hermes-agent - goal management, inter-agent bridge, model selection, cost control
Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. Integrates with 50+ LLM Providers,
Autonomous Agents (LLMs) research papers. Updated Daily.
Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
Make AI work for Everyone - Monitoring and governance for your AI/ML
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
autonomous AI agent that builds full-stack apps. local models. no cloud. no API keys. runs on your hardware.
Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.
OpenBrep: Use natural language to drive the creation, modification, and compilation of ArchiCAD GDL library objects
Automatically Update LLM-Agent Papers Daily using GitHub Actions (updates every 12 hours)
A comprehensive evaluation framework for AI agents and LLM applications.
Open-Source Intelligent Command Layer
The conversational control layer for customer-facing AI agents - Parlant is a context-engineering framework optimized for controlling customer interactions.
The Mind Palace for AI Agents - Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi
DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework
The LLM Evaluation Framework
Droid LLM Hunter is a tool to scan for vulnerabilities in Android applications using Large Language Models (LLMs).
The AI-Native Search Database. Unifies vector, text, structured and semi-structured data in a single engine, enabling hybrid search and in-database AI workflows.
Broken RAG For The Broken Souls
Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate - built to help studen
Universal LLM Gateway: One API, every LLM. OpenAI/Anthropic-compatible endpoints with multi-provider translation and intelligent load-balancing.
Generate a custom newspaper with an AI agent based on your favorite YouTube channels.
The ultimate native macOS AI Agent. Blends local MLX SLMs with 3D cognitive Metal rendering and autonomous system integrations.
Lightweight hallucination detection framework for RAG applications
A structured reasoning and decision architecture for stable, interpretable, and hallucination-resistant AI systems. An open standard for human-AI collaboration and autonomous systems.
TSUKUYOMI is an advanced modular intelligence framework designed for the democratization of Intelligence Analysis via systematic analysis, processing, and reporting across multiple domains. Built on a
Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal
