Search results for "gpt-oss"
SGLang is a fast serving framework for large language models and vision language models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Cognithor - Agent OS: Local-first autonomous agent operating system. 16 LLM providers, 17 channels, 112+ MCP tools, 5-tier memory, A2A protocol, knowledge vault, voice, browser automation, Computer-use…
A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server support, model switching, streaming responses, tool management, human-in-the-loop…
Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects…
vMLX - Home of JANG_Q - Continuous batching, prefix caching, paged attention, KV cache quantization, vision-language (VL) - Powers MLX Studio. Image generation/editing, OpenAI/Anthropic…
The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.
Universal LLM Gateway: One API, every LLM. OpenAI/Anthropic-compatible endpoints with multi-provider translation and intelligent load-balancing.
CASSIA: A Multi-Agent LLM-Based Single-Cell Cell Type Annotation Framework
Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech…
A coding agent optimized for smaller LLMs
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
OllamaFreeAPI: Free distributed API for Ollama LLMs. Public gateway to our managed Ollama servers with zero-configuration access to 50+ models, automatic load balancing across global nodes, and a free tier…
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI) and automatic hardware detection…
Local AI server with persistent memory, RAG, and multi-backend inference (MLX / llama.cpp / Ollama). Runs entirely on your machine — zero data sent to external services.
A command-line interface tool for serving LLMs with vLLM; a minimal query sketch follows this list.
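
Several of the results above (SGLang, vLLM, the LLM gateways, and the vLLM CLI tool) serve models through OpenAI-compatible chat-completions endpoints, so a locally hosted gpt-oss checkpoint can be queried the same way regardless of which backend runs it. A minimal sketch, assuming a local server already listening on localhost:8000 and serving the openai/gpt-oss-20b checkpoint (the address and model name are assumptions; adjust them to your setup, e.g. SGLang defaults to a different port than vLLM):

```python
# Minimal sketch: query a locally served gpt-oss model over the
# OpenAI-compatible API exposed by backends such as vLLM or SGLang.
# Assumptions: the server is already running at http://localhost:8000/v1
# and was launched with the openai/gpt-oss-20b checkpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local server, not api.openai.com
    api_key="EMPTY",                      # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # must match the model name the server reports
    messages=[{"role": "user", "content": "Explain gpt-oss in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Because the Universal LLM Gateway entry above also advertises OpenAI-compatible endpoints, the same snippet should work against such a gateway by changing only base_url and the API key.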
