freshcrate

Search results for "llm-inference"

3 results found (Python)
monocle 📁 v0.7.8 🌿 Growing 72

Monocle is a framework for tracing GenAI app code. This repo contains the implementation of Monocle for GenAI apps written in Python.

vllm-cli 📁 v0.2.5 💤 Dormant 491

A command-line interface tool for serving LLMs using vLLM.

torchao 📁 0.17.0 🌱 Seedling

Package for applying ao techniques to GPU models.