Search results for "nvidia"
Dynamic versioning based on VCS tags for uv/hatch project
CuPy: NumPy & SciPy for GPU
Faster Whisper transcription with CTranslate2
A framework for elegantly configuring complex applications
SGLang is a fast serving framework for large language models and vision language models.
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropi
A general-purpose coding agent that runs inside an NVIDIA OpenShell sandbox, orchestrated by Deep Agents and powered by NVIDIA Nemotron. The agent writes and executes code in an isolated, policy-gover
MCP-NixOS - Model Context Protocol Server for NixOS resources
Cognithor - Agent OS: Local-first autonomous agent operating system. 16 LLM providers, 17 channels, 112+ MCP tools, 5-tier memory, A2A protocol, knowledge vault, voice, browser automation, Computer-us
AI Observability & Evaluation
Harness Vibe Research with Self-evolving AI Scientists
RESTai is an AIaaS (AI as a Service) open-source platform. Supports many public and local LLMs supported by Ollama, vLLM, etc. Precise embeddings usage, tuning, analytics etc. Built-in image/audio generat
Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects
See your agent think. Real-time observability dashboard for OpenClaw AI agents.
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.
A high-throughput and memory-efficient inference and serving engine for LLMs
Curated directory of terminal-native AI coding agents and the harnesses that orchestrate them. Covers open-source tools (Pi, OpenCode, Aider, Goose), platform agents (Claude Code, Codex, Gemini CLI),
Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes
The agent that grows with you
Monocle is a framework for tracing GenAI app code. This repo contains implementation of Monocle for GenAI apps written in Python.
An open-source AI assistant framework with skills and agent architecture
Wraps any OpenAI API interface as a Responses API with MCP support so that it works with Codex, adding any missing stateful features. Ollama and vLLM compatible.
Official Code Release of SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Oll
A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp
Structured Outputs
Universal LLM Gateway: One API, every LLM. OpenAI/Anthropic-compatible endpoints with multi-provider translation and intelligent load-balancing.
Zero-friction LLM fine-tuning skill for Claude Code, Gemini CLI & any ACP agent. Unsloth on NVIDIA · TRL+MPS/MLX on Apple Silicon. Automates env setup, LoRA training (SFT, DPO, GRPO, vision), post-hoc
One API for 20+ LLM providers, your databases, and your files - self-hosted, open-source AI gateway with RAG, voice, and guardrails.
AgenticX is a unified, production-ready multi-agent platform - Python SDK + CLI (agx) + Studio server + Machi desktop app. Features Meta-Agent orchestration, 15+ LLM providers, MCP Hub, hierarchical m
Unified framework for building enterprise RAG pipelines with small, specialized models
Connect AI models like Claude & GPT with robots using MCP and ROS.
autonomous AI agent that builds full-stack apps. local models. no cloud. no API keys. runs on your hardware.
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2
Dragon Brain - persistent long-term memory for AI agents via MCP (Model Context Protocol). Knowledge graph (FalkorDB) + vector search (Qdrant) + CUDA GPU embeddings. Works with Claude, Gemini CLI, Cur
Memory that remembers the story, not just the facts. Three-layer sentence graph for AI agents -> Facts, Episodes, raw Sentences. One DB. Zero config.
An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.
Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
Lightweight semantic code search engine - 2-stage vector + FTS + RRF fusion + MCP server for Claude Code
A Multi-Agentic AI Assistant/Builder
Auto-Use Computer Use - drives your OS, browser, scours the web, writes your code. One agent, end to end.
A command-line interface tool for serving LLM using vLLM.
Control robots and physical hardware with natural language through Strands Agents.
Use Claude Code CLI for free with NVIDIA's unlimited API. This proxy converts requests to NIM format and integrates with a Telegram bot for remote control.
Local-first AI agent framework with GUI, memory, web search, personality constructs, speech i/o, tools, skills, CLI & Telegram features - fully self-hosted via Ollama.
Local-first AI assistant - 9 specialized agents (code, web, debug, security…), 10M token vector memory, mobile relay via secure tunnel, real-time web search and document processing. Runs 100% on your
Transform speech to text on Windows with fast, local AI processing. Enjoy seamless recording and automatic integration for effective communication.
Builds an autonomous AI robot with vision, voice, and decision-making capabilities using Python, PyTorch, and CUDA technology.
CUDA profiling tools runtime libs.
