Search results for "cuda"
Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works wi
A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp
Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the CAI2 workshop, jointly held at COLING 2022
The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.
LLM Agent that leverages cheminformatics tools to provide informed responses.
An easy-to-use GUI-based tool that performs live translations using OCR and LLMs (either cloud or local only)
Autonomous Agents (LLMs) research papers. Updated Daily.
Generic RAG framework to apply the power of LLMs on any given dataset
Dragon Brain — persistent long-term memory for AI agents via MCP (Model Context Protocol). Knowledge graph (FalkorDB) + vector search (Qdrant) + CUDA GPU embeddings. Works with Claude, Gemini CLI, Cur
A high-throughput and memory-efficient inference and serving engine for LLMs
A modular MCP server that provides commonly used developer tools for AI coding agents
Enterprise-grade (40m+ lines) codebase intelligence in a zero-setup, private and local Claude Plugin or MCP: managed indexing, hybrid semantic search, polyglot code dependency graphs, and DB/API/infra
🎬 AI-powered YouTube Shorts automation tool using LLMs, real-time search, and text-to-speech. Create engaging short-form videos with automated research, voiceovers, and subtitles.
A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.
A Multi-Agentic AI Assistant/Builder
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Unified framework for building enterprise RAG pipelines with small, specialized models
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
MCP server for OpenAI's Deep Research APIs, Gemini Deep Research Agent, and Hugging Face's Open Deep Research
NextPlaid, ColGREP: Multi-vector search, from database to coding agents.
Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int
OramaCore is the complete runtime you need for your projects, answer engines, copilots, and search. It includes a fully-fledged full-text search engine, vector database, LLM interface, and many more u
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
"RAG-Anything: All-in-One RAG Framework"
High-Performance Engine for Multi-Vector Search
Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion + MCP server for Claude Code
Implement a PyTorch-like DL library in C++ from scratch, step by step
Local-first AI agent framework with GUI, memory, web search, personality constructs, speech i/o, tools, skills, CLI & Telegram features — fully self-hosted via Ollama.
A type-safe, lightweight, modern, and performant Java binding of Microsoft's ONNX Runtime
Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.
Self-hosted AI coding assistant
Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate — built to help studen
Builds an autonomous AI robot with vision, voice, and decision-making capabilities using Python, PyTorch, and CUDA technology.
A code generator for array-based code on CPUs and GPUs
