LocalAI

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

agents ai api audio-generation decentralized distributed go image-generation libp2p

Why this rank:Strong adoptionRecent releaseHealthy release cadence

Description

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

README

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

Drop-in API compatibility — OpenAI, Anthropic, ElevenLabs APIs
36+ backends — llama.cpp, vLLM, transformers, whisper, diffusers, MLX...
Any hardware — NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only
Multi-user ready — API key auth, user quotas, role-based access
Built-in AI agents — autonomous agents with tool use, RAG, MCP, and skills
Privacy-first — your data never leaves your infrastructure

Created and maintained by Ettore Di Giacinto.

📖 Documentation | 💬 Discord | 💻 Quickstart | 🖼️ Models | ❓FAQ

Guided tour

model-fit-canvas-mode.mp4

Click to see more!

User and auth

usersquota-1775167475876.mp4

Agents

agents.mp4

Usage metrics per user

usage.mp4

Fine-tuning and Quantization

quantize-fine-tune.mp4

WebRTC

talk.mp4

Quickstart

macOS

Note: The DMG is not signed by Apple. After installing, run: sudo xattr -d com.apple.quarantine /Applications/LocalAI.app. See #6268 for details.

Containers (Docker, podman, ...)

Already ran LocalAI before? Use docker start -i local-ai to restart an existing container.

CPU only:

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

NVIDIA GPU:

# CUDA 13
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-13

# CUDA 12
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

# NVIDIA Jetson ARM64 (CUDA 12, for AGX Orin and similar)
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64

# NVIDIA Jetson ARM64 (CUDA 13, for DGX Spark)
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13

AMD GPU (ROCm):

docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas

Intel GPU (oneAPI):

docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel

Vulkan GPU:

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan

Loading models

# From the model gallery (see available models with `local-ai models list` or at https://models.localai.io)
local-ai run llama-3.2-1b-instruct:q4_k_m
# From Huggingface
local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
# From the Ollama OCI registry
local-ai run ollama://gemma:2b
# From a YAML config
local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
# From a standard OCI registry (e.g., Docker Hub)
local-ai run oci://localai/phi-2:latest

Automatic Backend Detection: LocalAI automatically detects your GPU capabilities and downloads the appropriate backend. For advanced options, see GPU Acceleration.

For more details, see the Getting Started guide.

Latest News

March 2026: Agent management, New React UI, WebRTC, MLX-distributed via P2P and RDMA, MCP Apps, MCP Client-side
February 2026: Realtime API for audio-to-audio with tool calling, ACE-Step 1.5 support
January 2026: LocalAI 3.10.0 — Anthropic API support, Open Responses API, video & image generation (LTX-2), unified GPU backends, tool streaming, Moonshine, Pocket-TTS. Release notes
December 2025: Dynamic Memory Resource reclaimer, Automatic multi-GPU model fitting (llama.cpp), Vibevoice backend
November 2025: Import models via URL, Multiple chats and history
October 2025: Model Context Protocol (MCP) support for agentic capabilities
September 2025: New Launcher for macOS and Linux, extended backend support for Mac and Nvidia L4T, MLX-Audio, WAN 2.2
August 2025: MLX, MLX-VLM, Diffusers, llama.cpp now supported on Apple Silicon
July 2025: All backends migrated outside the main binary — lightweight, modular architecture

For older news and full release notes, see GitHub Releases and the News page.

Features

Text generation (llama.cpp, transformers, vllm ... and more)
Text to Audio
Audio to Text
Image generation
OpenAI-compatible tools API
Realtime API (Speech-to-speech)
Embeddings generation
Constrained grammars
Download models from Huggingface
Vision API
Object Detection
Reranker API
P2P Inferencing
Distributed Mode — Horizontal scaling with PostgreSQL + NATS
Model Context Protocol (MCP)
Built-in Agents — Autonomous AI agents with tool use, RAG, skills, SSE streaming, and Agent Hub
Backend Gallery — Install/remove backends on the fly via OCI images
Voice Activity Detection (Silero-VAD)
Integrated WebUI

Supported Backends & Acceleration

LocalAI supports 36+ backends including llama.cpp, vLLM, transformers, whisper.cpp, diffusers, MLX, MLX-VLM, and many more. Hardware acceleration is available for NVIDIA (CUDA 12/13), AMD (ROCm), Intel (oneAPI/SYCL), Apple Silicon (Metal), Vulkan, and NVIDIA Jetson (L4T). All backends can be installed on-the-fly from the Backend Gallery.

See the full Backend & Model Compatibility Table and GPU Acceleration guide.

Resources

Autonomous Development Team

LocalAI is helped being maintained by a team of autonomous AI agents led by an AI Scrum Master.

Live Reports: reports.localai.io
Project Board: Agent task tracking
Blog Post: Learn about the experiment

Citation

If you utilize this repository, data in a downstream project, please consider citing it with:

@misc{localai,
  author = {Ettore Di Giacinto},
  title = {LocalAI: The free, Open source OpenAI alternative},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/go-skynet/LocalAI}},

Star history

License

LocalAI is a community-driven project created by Ettore Di Giacinto.

MIT - Author Ettore Di Giacinto mudler@localai.io

Acknowledgements

LocalAI couldn't have been built without the help of great software already available from the community. Thank you!

llama.cpp
https://github.com/tatsu-lab/stanford_alpaca
https://github.com/cornelk/llama-go for the initial ideas
https://github.com/antimatter15/alpaca.cpp
https://github.com/EdVince/Stable-Diffusion-NCNN
https://github.com/ggerganov/whisper.cpp
https://github.com/rhasspy/piper
exo for the MLX distributed auto-parallel sharding implementation

Contributors

This is a community project, a special thanks to our contributors!

Release History

Version	Changes	Urgency	Date
v4.7.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Other Changes * fix(config): only inject llama.cpp serving options on the llama.cpp path by @localai-bot in https://github.com/mudler/LocalAI/pull/10822 Full Changelog: https://github.com/mudler/LocalAI/compare/v4.7.0...v4.7.1	High	7/14/2026
v4.6.0	# 🎉 LocalAI 4.6.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 4.6.0 is out! This is a reliability-focused release: AMD ROCm backends now run on-GPU at full speed, distributed model loads no longer wedge when a worker dies, and realtime sessions warm up predictably. It also brings conversation forking to the built-in chat UI, a Prometheus co	High	7/4/2026
v4.5.0	# 🎉 LocalAI 4.5.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 4.5.0 is out! This release widens what LocalAI can perceive, sharpens the realtime voice API, and makes multi-user serving fast with zero configuration. Four new backends land, the React UI redesign ships in full, and distributed mode gets a robustness pass. Highlights:	High	6/23/2026
v4.4.3	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Other Changes * chore: :arrow_up: Update CrispStrobe/CrispASR to `d745bda4386ae0f9d1d2f23fff8ec95d76428221` by @localai-bot in https://github.com/mudler/LocalAI/pull/10260 * docs: :arrow_up: update docs version mudler/LocalAI by @localai-bot in https://github.com/mudler/LocalAI/pull/10259 * chore: :arrow_up: Update antirez/ds4 to `d881f2a05e8ff6bec001315a36b794b4aa310173` by @locala	High	6/13/2026
v4.4.0	# 🎉 LocalAI 4.4.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 4.4.0 is out! This is a big, multimodal-and-distributed release. Two brand-new audio backends land - parakeet.cpp (NVIDIA NeMo Parakeet ASR) and CrispASR (a multi-architecture ASR and TTS engine) - alongside native object detection + segmentation (`rfdetr-cpp`	High	6/10/2026
v4.3.6	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Other Changes * chore: :arrow_up: Update ggml-org/llama.cpp to `22d66b567eef11cf2e9832f04db64ee0323a0fd0` by @localai-bot in https://github.com/mudler/LocalAI/pull/10080 * security(http): refuse redirects on outbound clients via hardened pkg/httpclient by @richiejp in https://github.com/mudler/LocalAI/pull/10087 * feat(parakeet-cpp): add NVIDIA NeMo Parakeet ASR backend (parakeet.cp	High	5/30/2026
v4.3.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Other Changes * Fix kokoros backend build break from Backend trait drift by @Copilot in https://github.com/mudler/LocalAI/pull/9972 * chore: :arrow_up: Update antirez/ds4 to `f91c12b50a1448527c435c028bfc70d1b00f6c33` by @localai-bot in https://github.com/mudler/LocalAI/pull/9975 * chore: :arrow_up: Update ikawrakow/ik_llama.cpp to `9f7ba245ab41e118f03aa8dd5134d18a81159d02` by @local	High	5/25/2026
v4.2.6	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Other Changes * feat(llama-cpp): bump to MTP-merge SHA and automatically set MTP defaults by @localai-bot in https://github.com/mudler/LocalAI/pull/9852 * docs: :arrow_up: update docs version mudler/LocalAI by @localai-bot in https://github.com/mudler/LocalAI/pull/9853 * chore: :arrow_up: Update antirez/ds4 to `ef0a4905d05263df8e63689f2dd1efac618a752c` by @localai-bot in https://git	High	5/16/2026
v4.2.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Exciting New Features 🎉 * feat: add ds4 backend (DeepSeek V4 Flash) with tool calls, thinking, KV cache by @localai-bot in https://github.com/mudler/LocalAI/pull/9758 ### 👒 Dependencies * chore(deps): bump the go_modules group across 1 directory with 2 updates by @dependabot[bot] in https://github.com/mudler/LocalAI/pull/9759 ### Other Changes * docs: :arrow_up: update docs vers	High	5/11/2026
v4.1.3	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(token): login via legacy api keys by @mudler in https://github.com/mudler/LocalAI/pull/9249 * fix(anthropic): do not emit empty tokens and fix SSE tool calls by @mudler in https://github.com/mudler/LocalAI/pull/9258 * fix(gpu): better detection for MacOS and Thor by @mudler in https://github.com/mudler/LocalAI/pull/9263 ### 👒 Dependencies * chore(deps): bump	High	4/6/2026
v4.1.2	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(autoparser): correctly pass by logprobs by @mudler in https://github.com/mudler/LocalAI/pull/9239 * fix(chat): do not retry if we had chatdeltas or tooldeltas from backend by @mudler in https://github.com/mudler/LocalAI/pull/9244 ### Exciting New Features 🎉 * feat(llama.cpp): wire speculative decoding settings by @mudler in https://github.com/mudler/LocalAI/p	Medium	4/6/2026
v4.1.1	This is a patch release to address few regressions from the last release and the upcoming Gemma4, most importantly: - Fixes Gemma 4 tokenization with llama.cpp - Show login in api key only mode - Small fixes to improve Anthropic API compatibility ## What's Changed ### Other Changes * docs: Update Home Assistant integrations list by @loryanstrant in https://github.com/mudler/LocalAI/pull/9206 * chore: :arrow_up: Update ggml-org/llama.cpp to `a1cfb645307edc61a89e41557f290f441043d3c2` by	Medium	4/5/2026
v4.1.0	# 🎉 LocalAI 4.1.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 4.1.0 is out! 🔥 Just weeks after the landmark 4.0, we're back with another massive drop. This release turns LocalAI into a production-grade AI platform: spin up a distributed cluster with smart routing and autoscaling, lock it down with built-in auth and per-user quotas, fin	Medium	4/2/2026
v4.0.0	<!-- Release notes generated using configuration in .github/release.yml at master --> --- # 🎉 LocalAI 4.0.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 4.0.0 is out! This major release transforms LocalAI into a complete AI orchestration platform. We’ve embedded agentic and hybrid search capabilities directly into the core, completely ov	Low	3/14/2026
v3.12.1	<!-- Release notes generated using configuration in .github/release.yml at master --> This is a patch release to tag the new llama.cpp version which fixes incompatibilities with Qwen 3 coder. ## What's Changed ### Other Changes * docs: :arrow_up: update docs version mudler/LocalAI by @localai-bot in https://github.com/mudler/LocalAI/pull/8611 * feat(traces): Add backend traces by @richiejp in https://github.com/mudler/LocalAI/pull/8609 * chore: :arrow_up: Update ggml-org/llama.cpp to `	Low	2/21/2026
v3.12.0	# 🎉 LocalAI 3.12.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 3.12.0 is out! \| Feature \| Summary \| \|--------\|--------\| \| Multi-modal Realtime \| Send text, images, and audio in real-time conversations for richer interactions. \| \| Voxtral Backend \| New high-quality text-to-speech backend added. \| \| Multi-GPU Support \| Improve	Low	2/20/2026
v3.11.0	# 🎉 LocalAI 3.11.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 3.11.0 is a massive update for Audio and Multimodal capabilities. We are introducing Realtime Audio Conversations, a dedicated Music Generation UI, and a massive expansion of ASR (Speech-to-Text) and TTS backends. Whether you want to talk to your AI, clone vo	Low	2/7/2026
v3.10.1	This is a small patch release intended to provide bugfixes and minor polishment, along, we also added support to Qwen-TTS that was just released yesterday. - Fix reasoning detection on reasoning and instruct models - Support reasoning blocks with openresponses - API fixes to correctly run LTX-2 - Support Qwen3-TTS! ## What's Changed ### Bug fixes :bug: * fix(reasoning): support models with reasoning without starting thinking tag by @mudler in https://github.com/mudler/LocalAI/pull/813	Low	1/23/2026
v3.10.0	# 🎉 LocalAI 3.10.0 Release! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 3.10.0 is big on agent capabilities, multi-modal support, and cross-platform reliability. We've added native Anthropic API support, launched a new Video Generation UI, introduced Open Responses API compatibility, and enhanced performance with a **unified GPU b	Low	1/18/2026
v3.9.0	# Xmas-release :santa: LocalAI 3.9.0! 🚀 <h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> LocalAI 3.9.0 is focused on stability, resource efficiency, and smarter agent workflows. We've addressed critical issues with model loading, improved system resource management, and introduced a new Agent Jobs panel for scheduling and managing background agentic	Low	12/24/2025
v3.8.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> Welcome to LocalAI 3.8.0 ! LocalAI 3.8.0 focuses on smoothing out the user experience and exposing more power to the user without requiring restarts or complex configuration files. This release introduces a new onboarding flow and a universal model loader that handles everything from HF URLs to local files. We’ve al	Low	11/26/2025
v3.7.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> </h1> Welcome to LocalAI 3.7.0 :wave: This release introduces Agentic MCP support with full WebUI integration, a brand-new neutts TTS backend, fuzzy model search, long-form TTS chunking for chatterbox, and a complete WebUI overhaul. We’ve also fixed critical bugs, improved stability, and enhanced co	Low	10/31/2025
v3.6.0	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix: reranking models limited to 512 tokens in llama.cpp backend by @jongames in https://github.com/mudler/LocalAI/pull/6344 ### Exciting New Features 🎉 * feat(kokoro): add support for l4t devices by @mudler in https://github.com/mudler/LocalAI/pull/6322 * feat(chatterbox): support multilingual by @mudler in https://github.com/mudler/LocalAI/pull/6240 ### 🧠 Mod	Low	10/3/2025
v3.5.4	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(python): make option check uniform across backends by @mudler in https://github.com/mudler/LocalAI/pull/6314 ### Other Changes * chore: :arrow_up: Update ggml-org/whisper.cpp to `44fa2f647cf2a6953493b21ab83b50d5f5dbc483` by @localai-bot in https://github.com/mudler/LocalAI/pull/6317 * chore: :arrow_up: Update ggml-org/llama.cpp to `f432d8d83e7407073634c5e4fd81	Low	9/20/2025
v3.5.3	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(diffusers): fix float detection by @mudler in https://github.com/mudler/LocalAI/pull/6313 ### 🧠 Models * chore(model gallery): add mistralai_magistral-small-2509 by @mudler in https://github.com/mudler/LocalAI/pull/6309 * chore(model gallery): add impish_qwen_14b-1m by @mudler in https://github.com/mudler/LocalAI/pull/6310 * chore(model gallery): add aquif-3	Low	9/19/2025
v3.5.2	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### 👒 Dependencies * Revert "feat(nvidia-gpu): bump images to cuda 12.8" by @mudler in https://github.com/mudler/LocalAI/pull/6303 ### Other Changes * docs: :arrow_up: update docs version mudler/LocalAI by @localai-bot in https://github.com/mudler/LocalAI/pull/6305 * chore: :arrow_up: Update ggml-org/llama.cpp to `0320ac5264279d74f8ee91bafa6c90e9ab9bbb91` by @localai-bot in https://gi	Low	9/18/2025
v3.5.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix: make sure to turn down all processes on exit by @mudler in https://github.com/mudler/LocalAI/pull/6200 * fix(p2p): automatically install llama-cpp for p2p workers by @mudler in https://github.com/mudler/LocalAI/pull/6199 * Point to LocalAI-examples repo for llava by @mauromorales in https://github.com/mudler/LocalAI/pull/6241 * fix: runtime capability detecti	Low	9/17/2025
v3.5.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.5.0 </h1> Welcome to LocalAI 3.5.0! This release focuses on expanding backend support, improving usability, refining the overall experience, and keeping reducing footprint of LocalAI, to make it a truly portable, privacy-focused AI stack. We’ve added several new backends, enhanced the WebUI with new features, made signif	Low	9/3/2025
v3.4.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.4.0 </h1> ## What’s New in LocalAI 3.4.0 🎉 - WebUI improvements: now size can be set during image generation - New backends: [KittenTTS](github.com/KittenML/KittenTTS), [kokoro](https://github.com/hexgrad/kokoro) and [dia](https://github.com/nari-labs/dia) now are available as backends and models can be installed di	Low	8/12/2025
v3.3.2	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Exciting New Features 🎉 * feat(backends): install from local path by @mudler in https://github.com/mudler/LocalAI/pull/5962 * feat(backends): allow backends to not have a metadata file by @mudler in https://github.com/mudler/LocalAI/pull/5963 ### 📖 Documentation and examples * fix(docs): Improve responsiveness of tables by @dedyf5 in https://github.com/mudler/LocalAI/pull/5954 #	Low	8/4/2025
v3.3.1	<!-- Release notes generated using configuration in .github/release.yml at master --> This is a minor release, however we have addressed some important bug regarding Intel-GPU Images, and we have changed naming of the container images. This release also adds support for Flux Kontext and Flux krea! ## :warning: Breaking change Intel GPU images has been renamed from `latest-gpu-intel-f32` and `latest-gpu-intel-f16` to a single one, `latest-gpu-intel`, for example: ```bash docker r	Low	8/1/2025
v3.3.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.3.0 </h1> ## What’s New in LocalAI 3.3.0 🎉 - Object detection! From 3.3.0, now LocalAI supports with a new API - also fast object detection! Just install the `rfdetr-base` model - See [the documentation](https://localai.io/features/object-detection/) to learn more - Backends now have defined mirrors for download -	Low	7/28/2025
v3.2.3	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(cuda): be consistent with image tag naming by @mudler in https://github.com/mudler/LocalAI/pull/5916 ### 📖 Documentation and examples * chore(docs): add documentation on backend detection override by @mudler in https://github.com/mudler/LocalAI/pull/5915 ### Other Changes * chore: :arrow_up: Update ggml-org/llama.cpp to `c7f3169cd523140a288095f2d79befb20a0b7	Low	7/26/2025
v3.2.2	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(backends gallery): trim string when reading cap from file by @mudler in https://github.com/mudler/LocalAI/pull/5909 * fix(vulkan): use correct image suffix by @mudler in https://github.com/mudler/LocalAI/pull/5911 * fix(ci): add nvidia-l4t capability to l4t images by @mudler in https://github.com/mudler/LocalAI/pull/5914 ### Exciting New Features 🎉 * feat(ba	Low	7/25/2025
v3.2.1	<!-- Release notes generated using configuration in .github/release.yml at v3.2.1 --> ## What's Changed ### Bug fixes :bug: * fix(install.sh): update to use the new binary naming by @mudler in https://github.com/mudler/LocalAI/pull/5903 * fix(backends gallery): pass-by backend galleries to the model service by @mudler in https://github.com/mudler/LocalAI/pull/5906 ### Other Changes * chore: :arrow_up: Update ggml-org/llama.cpp to `3f4fc97f1d745f1d5d3c853949503136d419e6de` by @localai-bot	Low	7/25/2025
v3.2.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.2.0 </h1> Welcome to LocalAI 3.2.0! This is a release that refactors our architecture to be more flexible and lightweight. The core is now separated from all the backends, making LocalAI faster to download, easier to manage, portable, and much more smaller. ## TL;DR – What’s New in LocalAI 3.2.0 🎉 - 🧩 Mod	Low	7/24/2025
v3.1.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(backends gallery): correctly identify gpu vendor by @mudler in https://github.com/mudler/LocalAI/pull/5739 * fix(backends gallery): meta packages do not have URIs by @mudler in https://github.com/mudler/LocalAI/pull/5740 ### Exciting New Features 🎉 * feat(gallery): automatically install missing backends along models by @mudler in https://github.com/mudler/Loc	Low	6/27/2025
v3.1.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.1 </h1> # 🚀 Highlights ## Support for Gemma 3n! Gemma 3n has been released and it's now available in LocalAI (currently only for text generation, install it with: - `local-ai run gemma-3n-e2b-it` - `local-ai run gemma-3n-e4b-it` ## ⚠️ Breaking Changes Several important changes that reduce image size, s	Low	6/26/2025
v3.0.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> 🚀 LocalAI 3.0 – A New Era Begins </h1> Say hello to LocalAI 3.0 — our most ambitious release yet! We’ve taken huge strides toward making LocalAI not just local, but limitless. Whether you're building LLM-powered agents, experimenting with audio pipelines, or deploying multimodal backends at scale — this release is for you. Let	Low	6/19/2025
v2.29.0	<h1 align="center"> <br> <img height="300" src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png"> <br> <br> v2.29.0 </h1> I am thrilled to announce the release of LocalAI v2.29.0! This update focuses heavily on refining our container image strategy, making default images leaner and providing clearer options for users needing specific features or hardware acceleration. We've also added support for new models like Qwen3, enhanced existing	Low	5/12/2025
v2.28.0	# 🎉 LocalAI v2.28.0: New Look & The Rebirth of LocalAGI! 🎉 <table> <tr> <td align="center"> <img src="https://raw.githubusercontent.com/mudler/LocalAI/refs/heads/master/core/http/static/logo.png" width="300" alt="New LocalAI Logo"> <br> <em>Our fresh new look!</em> </td> </tr> </table> Big news, everyone! Not only does LocalAI have a brand new logo, but we're also celebrating the full rebirth of LocalAGI, our powerful agent framework, now complet	Low	4/15/2025
v2.27.0	<!-- Release notes generated using configuration in .github/release.yml at master --> # :rocket: LocalAI v2.27.0 <h1 align="center"> <br> <img src="https://github.com/user-attachments/assets/b3137b56-5661-4aa4-939f-53bde60a5b27" style="display: block;margin-left: auto;margin-right: auto;width: 50%;"> </h1> Welcome to another exciting release of LocalAI v2.27.0! We've been working hard to bring you a fresh WebUI experience and a host of improvements under the hood. Get ready to	Low	3/31/2025
v2.26.0	<h1 align="center"> <br> <img src="https://github.com/user-attachments/assets/498ac36e-16dd-48f9-b810-131ee4938eb4" style="display: block;margin-left: auto;margin-right: auto;width: 50%;"> </h1> # :llama: LocalAI v2.26.0! Hey everyone - very excited about this release! It contains several cleanups, performance improvements and few breaking changes: old backends that are now superseded have been removed (for example, `vall-e-x`), while new backends have been added to expand the ra	Low	2/15/2025
v2.25.0	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * chore(llava): update clip.patch by @mudler in https://github.com/mudler/LocalAI/pull/4453 ### Exciting New Features 🎉 * feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache by @mudler in https://github.com/mudler/LocalAI/pull/4329 * feat(template): read jinja templates from gguf files by @mudler in https://github.com/mudler/LocalAI/pull/433	Low	1/10/2025
v2.24.2	## What's Changed ### 👒 Dependencies * chore: :arrow_up: Update ggerganov/llama.cpp to `26a8406ba9198eb6fdd8329fa717555b4f77f05f` by @mudler in https://github.com/mudler/LocalAI/pull/4358 Full Changelog: https://github.com/mudler/LocalAI/compare/v2.24.1...v2.24.2	Low	12/10/2024
v2.24.1	<!-- Release notes generated using configuration in .github/release.yml at release/v2.24.1 --> This is a patch release to fix https://github.com/mudler/LocalAI/issues/4334 Full Changelog: https://github.com/mudler/LocalAI/compare/v2.24.0...v2.24.1	Low	12/8/2024
v2.24.0	# LocalAI release v2.24.0! ![b642257566578](https://github.com/user-attachments/assets/eb3bccbc-bd6a-46c3-949e-efc9b4ff1e90) ## 🚀 Highlights - Backend deprecation: We’ve removed `rwkv.cpp` and `bert.cpp`, replacing them with enhanced functionalities in `llama.cpp` for simpler installation and better performance. - New Backends Added: Introducing `bark.cpp` for text-to-audio and `stablediffusion.cpp` for image generation, both powered by the ggml framework. - **Voice Activity De	Low	12/4/2024
v2.23.0	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Breaking Changes 🛠 * feat(templates): use a single template for multimodals messages by @mudler in https://github.com/mudler/LocalAI/pull/3892 ### Bug fixes :bug: * fix(parler-tts): use latest audiotools by @mudler in https://github.com/mudler/LocalAI/pull/3954 * fix(parler-tts): pin grpcio-tools by @mudler in https://github.com/mudler/LocalAI/pull/3960 * fix(gallery): overrides	Low	11/10/2024
v2.22.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(vllm): images and videos are base64 by default by @mudler in https://github.com/mudler/LocalAI/pull/3867 * fix(dependencies): pin pytorch version by @mudler in https://github.com/mudler/LocalAI/pull/3872 * fix(dependencies): move deps that brings pytorch by @mudler in https://github.com/mudler/LocalAI/pull/3873 * fix(vllm): do not set videos if we don't have a	Low	10/21/2024
v2.22.0	<!-- Release notes generated using configuration in .github/release.yml at master --> ## LocalAI v2.22.0 is out :partying_face: ## :bulb: Highlights - Image-to-Text and Video-to-Text Support: The VLLM backend now supports both image-to-text and video-to-text processing. - Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications - Model Management Made Easy: List all your loaded models directly via the /system	Low	10/12/2024
v2.21.1	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(health): do not require auth for /healthz and /readyz by @mudler in https://github.com/mudler/LocalAI/pull/3656 ### 👒 Dependencies * chore(deps): Bump sentence-transformers from 3.1.0 to 3.1.1 in /backend/python/sentencetransformers by @dependabot in https://github.com/mudler/LocalAI/pull/3651 * chore(deps): Bump pydantic from 2.8.2 to 2.9.2 in /examples/lang	Low	9/25/2024
v2.21.0	<!-- Release notes generated using configuration in .github/release.yml at master --> ## :bulb: Highlights! LocalAI v2.21 release is out! - Deprecation of the `exllama` backend - AIO images now have `gpt-4o` instead of `gpt-4-vision-preview` for Vision API - vLLM backend now supports embeddings - New endpoint to list system information (`/system`) - `trust_remote_code` is now respected by `sentencetransformers` - Auto warm-up and load models on start - `coqui` backend switched to	Low	9/24/2024
v2.20.1	![local-ai-release-2 20-shadow4](https://github.com/user-attachments/assets/cdcc0f8f-953b-4346-be50-b7542e15def6) It's that time again—I’m excited (and honestly, a bit proud) to announce the release of LocalAI v2.20! This one’s a biggie, with some of the most requested features and enhancements, all designed to make your self-hosted AI journey even smoother and more powerful. ### TL;DR - 🌍 Explorer & Community: Explore global community pools at [explorer.localai.io](https:/	Low	8/23/2024
v2.20.0	![local-ai-release-2 20-shadow4](https://github.com/user-attachments/assets/cdcc0f8f-953b-4346-be50-b7542e15def6) ### TL;DR - 🌍 Explorer & Community: Explore global community pools at [explorer.localai.io](https://explorer.localai.io) - 👀 Demo instance available: Test out LocalAI at [demo.localai.io](https://demo.localai.io) - 🤗 Integration: Hugging Face Local apps now include LocalAI - 🐛 Bug Fixes: Diffusers and hipblas issues resolved - 🎨 New Feature: FLU	Low	8/22/2024
v2.19.4	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### 🧠 Models * chore(model-gallery): :arrow_up: update checksum by @localai-bot in https://github.com/mudler/LocalAI/pull/3040 * chore(model-gallery): :arrow_up: update checksum by @localai-bot in https://github.com/mudler/LocalAI/pull/3043 * models(gallery): add magnum-32b-v1 by @mudler in https://github.com/mudler/LocalAI/pull/3044 * models(gallery): add lumimaid-v0.2-70b-i1 by @mud	Low	8/1/2024
v2.19.3	<!-- Release notes generated using configuration in .github/release.yml at master --> ## What's Changed ### Bug fixes :bug: * fix(gallery): do not attempt to delete duplicate files by @mudler in https://github.com/mudler/LocalAI/pull/3031 * fix(gallery): do clear out errors once displayed by @mudler in https://github.com/mudler/LocalAI/pull/3033 ### Exciting New Features 🎉 * feat(grammar): add llama3.1 schema by @mudler in https://github.com/mudler/LocalAI/pull/3015 ### 🧠 Models * mo	Low	7/28/2024
v2.19.2	This release is a patch release to fix well known issues from 2.19.x ## What's Changed ### Bug fixes :bug: * fix: pin setuptools 69.5.1 by @fakezeta in https://github.com/mudler/LocalAI/pull/2949 * fix(cuda): downgrade to 12.0 to increase compatibility range by @mudler in https://github.com/mudler/LocalAI/pull/2994 * fix(llama.cpp): do not set anymore lora_base by @mudler in https://github.com/mudler/LocalAI/pull/2999 ### Exciting New Features 🎉 * ci(Makefile): reduce binary size by co	Low	7/24/2024
v2.19.1	![local-ai-release-219-shadow](https://github.com/user-attachments/assets/c5d7c930-656f-410d-aab9-455a466925fe) # LocalAI 2.19.1 is out! :mega: ## TLDR; Summary spotlight - 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options. - 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens. - 🔊 TTS Integration: Text-to-Speech (TTS) is n	Low	7/20/2024
v2.19.0	![local-ai-release-219-shadow](https://github.com/user-attachments/assets/c5d7c930-656f-410d-aab9-455a466925fe) # LocalAI 2.19.0 is out! :mega: ## TLDR; Summary spotlight - 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options. - 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens. - 🔊 TTS Integration: Text-to-Speech (TTS) is n	Low	7/19/2024

Dependencies & License Audit

Loading dependencies...

Similar Packages

mcp-toolboxMCP Toolbox for Databases is an open source MCP server for databases.v1.7.0

mesh-llmDistributed AI/LLM for the people. Share compute privately or publicly to power your agents and chat.v0.73.1

whatsapp-agentkitBuild custom WhatsApp AI agents with Claude Code in under 30 minutes, no coding needed.main@2026-06-12

@ipio-ai/cliIPIO CLI — Distributed AI agent coordination for game development1.8.1

@pajamadot/cliPajama Game Studio CLI — Distributed AI agent coordination for game development1.7.1

More in MCP Servers

supersetCode Editor for the AI Agents Era - Run an army of Claude Code, Codex, etc. on your machine

kreuzbergA polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 91+ formats. Available for Rust, Python

ai-engineering-from-scratchLearn it. Build it. Ship it for others.

CodeGraphContextAn MCP server plus a CLI tool that indexes local code into a graph database to provide context to AI assistants.