freshcrate

Search results for "cuda"

40 results found

llama.cpp📁b8864🌳 Mature103,119

LLM inference in C/C++

jarvis📁v1.28.0🌿 Growing174

Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works wi

cyllama📁0.2.11🌱 Seedling22

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)

LRAT📁0.0.0🌱 Seedling34

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

cactus📁0.0.0🌿 Growing50

LLM Agent that leverages cheminformatics tools to provide informed responses.

UGTLive📁0.0.0🌿 Growing73

An easy-to-use GUI-based tool that performs live translations using OCR and LLMs (either cloud or local only)

Autonomous-Agents📁main@2026-04-16🌿 Growing1,211

Autonomous Agents (LLMs) research papers. Updated Daily.

RAGMeUp📁scala-ui🌳 Mature675

Generic RAG framework to apply the power of LLMs on any given dataset

Dragon-Brain📁v1.1.0🌱 Seedling43

Dragon Brain — persistent long-term memory for AI agents via MCP (Model Context Protocol). Knowledge graph (FalkorDB) + vector search (Qdrant) + CUDA GPU embeddings. Works with Claude, Gemini CLI, Cur

vllm📁v0.19.1🌿 Growing76,155

A high-throughput and memory-efficient inference and serving engine for LLMs

mcp-devtools📁v0.59.53🌿 Growing133

A modular MCP server that provides commonly used developer tools for AI coding agents

SocratiCode📁v1.6.1🌿 Growing810

Enterprise-grade (40m+ lines) codebase intelligence in a zero-setup, private and local Claude Plugin or MCP: managed indexing, hybrid semantic search, polyglot code dependency graphs, and DB/API/infra

VideoGraphAI📁0.0.0🌿 Growing54

🎬 AI-powered YouTube Shorts automation tool using LLMs, real-time search, and text-to-speech. Create engaging short-form videos with automated research, voiceovers, and subtitles.

arag📁v0.1.0🌿 Growing247

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

AReaL📁v1.0.3🌿 Growing5,017

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

llmware📁v0.4.6🌿 Growing14,857

Unified framework for building enterprise RAG pipelines with small, specialized models

rag-chatbot📁main@2026-04-14🌿 Growing402

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

deep-research-mcp📁main@2026-04-13🌿 Growing58

MCP server for OpenAI's Deep Research APIs, Gemini Deep Research Agent, and Hugging Face's Open Deep Research

next-plaid📁v1.2.0🌿 Growing331

NextPlaid, ColGREP: Multi-vector search, from database to coding agents.

Open-Sable📁v1.7.0🌱 Seedling18

Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int

oramacore📁v1.2.38🌱 Seedling249

OramaCore is the complete runtime you need for your projects, answer engines, copilots, and search. It includes a fully-fledged full-text search engine, vector database, LLM interface, and many more u

LocalAI📁v4.1.3🌱 Seedling45,254

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

everything-claude-code📁v1.10.0🌱 Seedling151,139

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

spiceai📁v1.11.5🌱 Seedling2,868

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

RAG-Anything📁v1.2.10🌱 Seedling15,557

"RAG-Anything: All-in-One RAG Framework"

fast-plaid📁1.4.5🌱 Seedling239

High-Performance Engine for Multi-Vector Search

codexlens-search📁v0.8.0🌱 Seedling44

Lightweight semantic code search engine — 2-stage vector + FTS + RRF fusion + MCP server for Claude Code

OriginDL📁v1.0.0🌱 Seedling245

Implement a PyTorch-like DL library in C++ from scratch, step by step

Somi📁Mineralization🌱 Seedling21

Local-first AI agent framework with GUI, memory, web search, personality constructs, speech i/o, tools, skills, CLI & Telegram features — fully self-hosted via Ollama.

onnxruntime-java📁v2.1.0🌱 Seedling29

A type-safe, lightweight, modern, and performant Java binding of Microsoft's ONNX Runtime

DreamServer📁v2.0.0🌱 Seedling478

Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.

PAI-RAG📁v0.4.3🌱 Seedling450

An easy-to-use framework for modular RAG

uniAI📁0.0.0🌱 Seedling1

Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate — built to help studen

enton📁main@2026-04-21🌱 Seedling1

Builds an autonomous AI robot with vision, voice, and decision-making capabilities using Python, PyTorch, and CUDA technology.

loopy📁v2025.2💤 Dormant629

A code generator for array-based code on CPUs and GPUs

vllm-cli📁v0.2.5💤 Dormant487

A command-line interface tool for serving LLMs using vLLM.