Search results for "vlm"
Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Open-Source AI Camera Skills Platform, AI NVR & CCTV Surveillance. Local VLM video analysis with Qwen, DeepSeek, SmolVLM, LLaVA, YOLO26. LLM-powered agentic security camera agent — watches, understand
Give any AI agent a full desktop — it sees the screen, clicks, types, and runs apps like a human. Automate anything with a UI: browsers, legacy software, internal tools. No API needed. One Docker comm
A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 91+ formats. Available for Rust, Python
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac
The Mind Palace for AI Agents — Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi
vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
Official Code Release of SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
RAGLight is a modular framework for Retrieval-Augmented Generation (RAG). It makes it easy to plug in different LLMs, embeddings, and vector stores, and now includes seamless MCP integration to connec
"RAG-Anything: All-in-One RAG Framework"
Autonomous Agents (LLMs) research papers. Updated Daily.
AgenticX is a unified, production-ready multi-agent platform — Python SDK + CLI (agx) + Studio server + Machi desktop app. Features Meta-Agent orchestration, 15+ LLM providers, MCP Hub, hierarchical m
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
Your Very Own Agent: The Ultimate, Complete Edition
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
Claude Code skills, architectural principles, and alternative approaches for AI-assisted development
Local AI server with persistent memory, RAG, and multi-backend inference (MLX / llama.cpp / Ollama). Runs entirely on your machine — zero data sent to external services.
Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int
Automatically Update LLM-Agent Papers Daily using Github Actions (Update Every 12th hours)
Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate — built to help studen
The AI agent that sees before it acts — perception-driven, file-based, pluggable personal AI agent framework
