freshcrate

Tag: #llm-inference

8 packages • ⭐ 12,331 total stars

plano 0.4.23 🏛️ Flagship ⭐ 6,366

Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing, so you stay focused on your agents' core logic.

spiceai v2.0.0 🌳 Mature ⭐ 2,880

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

neuron-ai 3.15.6 🌳 Mature ⭐ 1,858

The PHP agentic framework for building production-ready, AI-driven applications. Connect components (LLMs, vector DBs, memory) to agents that can interact with your data. With its modular architecture it's

MiniSearch main@2026-06-05 🌿 Growing ⭐ 558

Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space

vllm-cli v0.2.5 💤 Dormant ⭐ 491

A command-line interface tool for serving LLMs using vLLM.

monocle v0.8.3 🌿 Growing ⭐ 79

Monocle is a framework for tracing GenAI app code. This repo contains the implementation of Monocle for GenAI apps written in Python.

sample-genai-on-eks-starter-kit v1.1.1 🌿 Growing ⭐ 53

A comprehensive toolkit for deploying production-ready Generative AI infrastructure on Amazon EKS. Includes pre-configured components for: 🚀 AI Gateway (LiteLLM) 🤖 LLM Serving (vLLM, SGLang, Ollama

llmtrace v0.2.0 🌱 Seedling ⭐ 46

Zero-code LLM security & observability proxy. Real-time prompt injection detection, PII scanning, and cost control for OpenAI-compatible APIs. Built in Rust.