Search results for "llm-inference"
3 results found (Python)
Monocle is a framework for tracing GenAI app code. This repo contains the implementation of Monocle for GenAI apps written in Python.
A command-line interface tool for serving LLMs using vLLM.