
ollama

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

deepseek gemma gemma3 glm go golang gpt-oss llama

Why this rank: Strong adoption, recent release, healthy release cadence

README

Ollama

Start building with open models.

Download

macOS

curl -fsSL https://ollama.com/install.sh | sh

or download manually

Windows

irm https://ollama.com/install.ps1 | iex

or download manually

Linux

curl -fsSL https://ollama.com/install.sh | sh

Manual install instructions

Docker

The official Ollama Docker image ollama/ollama is available on Docker Hub.

Libraries

Community

Get started

ollama

You'll be prompted to run a model or connect Ollama to your existing agents or applications such as Claude Code, OpenClaw, OpenCode, Codex, Copilot, and more.

Coding

To launch a specific integration:

ollama launch claude

Supported integrations include Claude Code, Codex, Copilot CLI, Droid, and OpenCode.

AI assistant

Use OpenClaw to turn Ollama into a personal AI assistant across WhatsApp, Telegram, Slack, Discord, and more:

ollama launch openclaw

Chat with a model

Run and chat with Gemma 3:

ollama run gemma3

See ollama.com/library for the full list.

See the quickstart guide for more details.

REST API

Ollama has a REST API for running and managing models.

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [{
    "role": "user",
    "content": "Why is the sky blue?"
  }],
  "stream": false
}'

See the API documentation for all endpoints.
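
The curl request above can also be sketched with Python's standard library. This is a minimal sketch rather than anything taken from the Ollama documentation: the helper names `build_chat_request`, `extract_reply`, and `chat_once` are hypothetical, and it assumes a non-streaming response is a single JSON object with the reply under `message.content` (the same field the client-library examples read).

```python
import json
import urllib.request

# Assumption: default local address, copied from the curl example.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(model, prompt, url=OLLAMA_CHAT_URL):
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(raw_body):
    """Pull the assistant text out of a non-streaming response body."""
    return json.loads(raw_body)["message"]["content"]


def chat_once(model, prompt):
    """Send the request to a running Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt)) as resp:
        return extract_reply(resp.read())
```

With a local server running and the model pulled, `print(chat_once("gemma3", "Why is the sky blue?"))` should print the reply. Building the request separately from sending it keeps the payload easy to inspect without a server.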

Python

pip install ollama

from ollama import chat

response = chat(model='gemma3', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response.message.content)
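
The call above sends a single user message. For a multi-turn conversation, one approach is to keep the message list and append each reply before the next call. The following is a minimal sketch, assuming model replies are stored under the `assistant` role. `send_turn` and its `chat_fn` parameter are hypothetical helpers introduced here so the logic can be exercised without a running server; by default the function falls back to `ollama.chat`.

```python
def send_turn(history, user_text, model="gemma3", chat_fn=None):
    """Append a user message, call the model, and record its reply.

    `history` is a plain list of {"role", "content"} dicts and is mutated
    in place so that later turns see the earlier ones.
    """
    if chat_fn is None:
        # Imported lazily so the helper can be tested with a stand-in.
        from ollama import chat as chat_fn

    history.append({"role": "user", "content": user_text})
    response = chat_fn(model=model, messages=history)
    reply = response.message.content
    # Assumption: replies are fed back under the "assistant" role.
    history.append({"role": "assistant", "content": reply})
    return reply
```

A typical use would be `history = []`, then `send_turn(history, "Why is the sky blue?")` followed by `send_turn(history, "Explain that to a five-year-old.")`, so the second question is answered with the first exchange as context.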

JavaScript

npm i ollama

import ollama from "ollama";

const response = await ollama.chat({
  model: "gemma3",
  messages: [{ role: "user", content: "Why is the sky blue?" }],
});
console.log(response.message.content);

Supported backends

llama.cpp project founded by Georgi Gerganov.

Documentation

Community Integrations

Want to add your project? Open a pull request.

Chat Interfaces

Web

Open WebUI - Extensible, self-hosted AI interface
Onyx - Connected AI workspace
LibreChat - Enhanced ChatGPT clone with multi-provider support
Lobe Chat - Modern chat framework with plugin ecosystem (docs)
NextChat - Cross-platform ChatGPT UI (docs)
Perplexica - AI-powered search engine, open-source Perplexity alternative
big-AGI - AI suite for professionals
Lollms WebUI - Multi-model web interface
ChatOllama - Chatbot with knowledge bases
Bionic GPT - On-premise AI platform
Chatbot UI - ChatGPT-style web interface
Hollama - Minimal web interface
Chatbox - Desktop and web AI client
chat - Chat web app for teams
Ollama RAG Chatbot - Chat with multiple PDFs using RAG
Tkinter-based client - Python desktop client

Desktop

Dify.AI - LLM app development platform
AnythingLLM - All-in-one AI app for Mac, Windows, and Linux
Maid - Cross-platform mobile and desktop client
Witsy - AI desktop app for Mac, Windows, and Linux
Cherry Studio - Multi-provider desktop client
Ollama App - Multi-platform client for desktop and mobile
PyGPT - AI desktop assistant for Linux, Windows, and Mac
Alpaca - GTK4 client for Linux and macOS
SwiftChat - Cross-platform including iOS, Android, and Apple Vision Pro
Enchanted - Native macOS and iOS client
RWKV-Runner - Multi-model desktop runner
Ollama Grid Search - Evaluate and compare models
macai - macOS client for Ollama and ChatGPT
AI Studio - Multi-provider desktop IDE
Reins - Parameter tuning and reasoning model support
ConfiChat - Privacy-focused with optional encryption
LLocal.in - Electron desktop client
MindMac - AI chat client for Mac
Msty - Multi-model desktop client
BoltAI for Mac - AI chat client for Mac
IntelliBar - AI-powered assistant for macOS
Kerlig AI - AI writing assistant for macOS
Hillnote - Markdown-first AI workspace
Perfect Memory AI - Productivity AI personalized by screen and meeting history

Mobile

Ollama Android Chat - One-click Ollama on Android

SwiftChat, Enchanted, Maid, Ollama App, Reins, and ConfiChat listed above also support mobile platforms.

Code Editors & Development

Cline - VS Code extension for multi-file/whole-repo coding
Continue - Open-source AI code assistant for any IDE
Void - Open source AI code editor, Cursor alternative
Copilot for Obsidian - AI assistant for Obsidian
twinny - Copilot and Copilot chat alternative
gptel Emacs client - LLM client for Emacs
Ollama Copilot - Use Ollama as GitHub Copilot
Obsidian Local GPT - Local AI for Obsidian
Ellama Emacs client - LLM tool for Emacs
orbiton - Config-free text editor with Ollama tab completion
AI ST Completion - Sublime Text 4 AI assistant
VT Code - Rust-based terminal coding agent with Tree-sitter
QodeAssist - AI coding assistant for Qt Creator
AI Toolkit for VS Code - Microsoft-official VS Code extension
Open Interpreter - Natural language interface for computers

Libraries & SDKs

LiteLLM - Unified API for 100+ LLM providers
Semantic Kernel - Microsoft AI orchestration SDK
LangChain4j - Java LangChain (example)
LangChainGo - Go LangChain (example)
Spring AI - Spring framework AI support (docs)
LangChain and LangChain.js with example
Ollama for Ruby - Ruby LLM library
any-llm - Unified LLM interface by Mozilla
OllamaSharp for .NET - .NET SDK
LangChainRust - Rust LangChain (example)
Agents-Flex for Java - Java agent framework (example)
Elixir LangChain - Elixir LangChain
Ollama-rs for Rust - Rust SDK
LangChain for .NET - .NET LangChain (example)
chromem-go - Go vector database with Ollama embeddings (example)
LangChainDart - Dart LangChain
LlmTornado - Unified C# interface for multiple inference APIs
Ollama4j for Java - Java SDK
Ollama for Laravel - Laravel integration
Ollama for Swift - Swift SDK
LlamaIndex and LlamaIndexTS - Data framework for LLM apps
Haystack - AI pipeline framework
Firebase Genkit - Google AI framework
Ollama-hpp for C++ - C++ SDK
PromptingTools.jl - Julia LLM toolkit (example)
Ollama for R - rollama - R SDK
Portkey - AI gateway
Testcontainers - Container-based testing
LLPhant - PHP AI framework

Frameworks & Agents

AutoGPT - Autonomous AI agent platform
crewAI - Multi-agent orchestration framework
Strands Agents - Model-driven agent building by AWS
Cheshire Cat - AI assistant framework
any-agent - Unified agent framework interface by Mozilla
Stakpak - Open source DevOps agent
Hexabot - Conversational AI builder
Neuro SAN - Multi-agent orchestration (docs)

RAG & Knowledge Bases

RAGFlow - RAG engine based on deep document understanding
R2R - Open-source RAG engine
MaxKB - Ready-to-use RAG chatbot
Minima - On-premises or fully local RAG
Chipper - AI interface with Haystack RAG
ARGO - RAG and deep research on Mac/Windows/Linux
Archyve - RAG-enabling document library
Casibase - AI knowledge base with RAG and SSO
BrainSoup - Native client with RAG and multi-agent automation

Bots & Messaging

LangBot - Multi-platform messaging bots with agents and RAG
AstrBot - Multi-platform chatbot with RAG and plugins
Discord-Ollama Chat Bot - TypeScript Discord bot
Ollama Telegram Bot - Telegram bot
LLM Telegram Bot - Telegram bot for roleplay

Terminal & CLI

aichat - All-in-one LLM CLI with Shell Assistant, RAG, and AI tools
oterm - Terminal client for Ollama
gollama - Go-based model manager for Ollama
tlm - Local shell copilot
tenere - TUI for LLMs
ParLlama - TUI for Ollama
llm-ollama - Plugin for Datasette's LLM CLI
ShellOracle - Shell command suggestions
LLM-X - Progressive web app for LLMs
cmdh - Natural language to shell commands
VT - Minimal multimodal AI chat app

Productivity & Apps

AppFlowy - AI collaborative workspace, self-hostable Notion alternative
Screenpipe - 24/7 screen and mic recording with AI-powered search
Vibe - Transcribe and analyze meetings
Page Assist - Chrome extension for AI-powered browsing
NativeMind - Private, on-device browser AI assistant
Ollama Fortress - Security proxy for Ollama
1Panel - Web-based Linux server management
Writeopia - Text editor with Ollama integration
QA-Pilot - GitHub code repository understanding
Raycast extension - Ollama in Raycast
Painting Droid - Painting app with AI integrations
Serene Pub - AI roleplaying app
Mayan EDMS - Document management with Ollama workflows
TagSpaces - File management with AI tagging

Observability & Monitoring

Opik - Debug, evaluate, and monitor LLM applications
OpenLIT - OpenTelemetry-native monitoring for Ollama and GPUs
Lunary - LLM observability with analytics and PII masking
Langfuse - Open source LLM observability
HoneyHive - AI observability and evaluation for agents
MLflow Tracing - Open source LLM observability

Database & Embeddings

pgai - PostgreSQL as a vector database (guide)
MindsDB - Connect Ollama with 200+ data platforms
chromem-go - Embeddable vector database for Go (example)
Kangaroo - AI-powered SQL client

Infrastructure & Deployment

Cloud

Google Cloud
Fly.io
Koyeb
Harbor - Containerized LLM toolkit with Ollama as default backend

Package Managers

Release History

Version	Changes	Urgency	Date
v0.32.3	## What's Changed - Fixed model downloads that stall before sending data. - Improved integrations: restored Claude Code Channels, fixed Anthropic thinking streams, and made Hermes Desktop respect `--force-build`. - Expanded GPU support with CUDA on Windows ARM64, B200 support through CUDA 12, and lower memory use on Linux CUDA/ROCm iGPUs. - Added chat, thinking, and tool calling support for Laguna 2.1 models, including a Metal inference fix. - Fixed GLM tool calls being silently dropped a	High	7/23/2026
v0.32.0	## What's Changed - New interactive agent experience: running `ollama` now launches an agent to help you code and delegate work ``` ❯ ollama Ollama 0.32.0 ▸ Chat, Code, & Work (glm-5.2:cloud) Chat with models, code, search the web, and delegate real work ``` - Renamed the Codex App integration to ChatGPT: use ollama launch chatgpt (and --restore to return to your usual ChatGPT profile) - Simplified integration selection: the ollama launch menu now only offers the most popu	High	7/11/2026
v0.31.1	## Faster Gemma 4 on Apple Silicon <img width="1037" height="485" alt="Screenshot 2026-06-30 at 5 25 29 PM" src="https://github.com/user-attachments/assets/547d5076-090f-43c4-a661-938e11abc955" /> Gemma 4 is now significantly faster in Ollama on Apple Silicon, generating tokens nearly 90% faster on average across a coding-agent benchmark by leveraging multi-token prediction (MTP). Ollama auto-tunes how many tokens to draft as it runs, so the speedup is on by default, requires no configurat	High	6/30/2026
v0.30.11	## What's Changed * launch: add thinking capability detection to opencode by @hoyyeva in https://github.com/ollama/ollama/pull/15434 * launch: auto-install Claude Code by @hoyyeva in https://github.com/ollama/ollama/pull/16802 * launch: auto-install opencode when missing by @hoyyeva in https://github.com/ollama/ollama/pull/16806 * discover: fix inverted iGPU/dGPU Vulkan classification on Windows hybrid graphics by @Sahil170595 in https://github.com/ollama/ollama/pull/16669 * mlxrunner: unif	High	6/25/2026
v0.30.10	## What's Changed * Command A and North family models now run on Apple Silicon with the MLX engine * Updated the underlying llama.cpp engine to build 9672 * Fixed build artifacts for MLX Full Changelog: https://github.com/ollama/ollama/compare/v0.30.9...v0.30.10	High	6/17/2026
v0.30.8	## What's Changed * Fixed `ollama launch` selecting the wrong provider in some cases * Improved prompt caching by decoupling it from context shift for better KV cache reuse * More stable MLX inference with hardened linear and embedding layers * MLX runner now creates snapshots during prompt processing and speculative decoding for improved reliability * Improved recurrent model support with per-boundary states from the gated-delta kernels Full Changelog: https://github.com/ollama/olla	High	6/12/2026
v0.30.7	Ollama Launch now supports Hermes Desktop, a native desktop interface for the Hermes agent. Run it alongside your Hermes agent to get a visual interface for managing conversations, integrations, and messaging apps. ``` ollama launch hermes-desktop ``` <img width="2556" height="1716" alt="image" src="https://github.com/user-attachments/assets/3b2292d8-9f94-4d32-9023-85772e6ab3f8" /> What's Changed - Hermes Desktop is now available via `ollama launch hermes-desktop` with native Windows	High	6/7/2026
v0.30.2	## What's Changed * feat(launch): show and auto-install Cline CLI by @hoyyeva in https://github.com/ollama/ollama/pull/16402 * log template details to aid troubleshooting by @dhiltgen in https://github.com/ollama/ollama/pull/16403 * cmd/launch: add Qwen code integration by @hoyyeva in https://github.com/ollama/ollama/pull/15900 * launch: fix opencode local model limits by @dhiltgen in https://github.com/ollama/ollama/pull/16425 * llm: include cached prompt tokens in llama-server counts by @	High	6/3/2026
v0.24.0	## Codex App Ollama 0.24 includes support for the Codex App, OpenAI's desktop experience for working on Codex threads in parallel with built-in worktree support and git functionality. ```bash ollama launch codex-app ``` <img width="2088" height="1404" alt="CleanShot 2026-05-14 at 15 04 18@2x" src="https://github.com/user-attachments/assets/53bd7997-19fd-4809-b8f2-b6ed284369c9" /> ### Built-in browser Codex can load local servers and sites in its built-in browser, enabling you to	High	5/14/2026
v0.23.4	## What's Changed * `ollama launch opencode` now supports vision models with image inputs * Fixed formatting of Claude tool results when using local image paths Full Changelog: https://github.com/ollama/ollama/compare/v0.23.3...v0.23.4	High	5/13/2026
v0.30.0	Ollama 0.30 is now available, with improved compatibility and performance using [llama.cpp](https://github.com/ggml-org/llama.cpp). This augments the MLX engine on Apple Silicon, bringing support to a wider range of hardware. This release brings support for a wider range of models, including GGUF-based models from Hugging Face and your own fine-tuned models along with faster performance on NVIDIA hardware. ## Known issues: * `laguna-xs.2` is not yet supported on Windows/Linux. * `llama	Medium	5/13/2026
v0.23.2	## What's Changed * `ollama launch` no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models. * Use `ollama launch claude-desktop --restore` to restore Claude Desktop to its normal state. * `/api/show` responses are now cached, improving median latency by ~6.7x which will increase load speed for integrations like VS Code. * Improved backup workflow when managing launch integrations * Cleaner image generation layout in the MLX runner **	High	5/7/2026
v0.23.0	## Claude Desktop Claude Desktop is now supported with Ollama Launch. Claude Cowork and Claude Code are supported within the Claude Desktop App. ``` ollama launch claude-desktop ``` ### Claude Cowork <img width="1272" height="872" alt="ca1" src="https://github.com/user-attachments/assets/1d550e3f-0272-4429-8cb2-06d32344cb77" /> ### Claude Code <img width="1272" height="872" alt="ca2" src="https://github.com/user-attachments/assets/f2a5ed5f-3069-4975-bb22-ada82914a01c" />	High	5/3/2026
v0.22.0	## New models * NVIDIA's [Nemotron 3 Omni](https://ollama.com/library/nemotron3) * Poolside's first open-weight coding model - [Laguna XS.2](https://ollama.com/library/laguna-xs.2) Full Changelog: https://github.com/ollama/ollama/compare/v0.21.2...v0.22.0	High	4/28/2026
v0.21.2	## What's Changed * Improved reliability of the OpenClaw onboarding flow in `ollama launch` * Recommended models in `ollama launch` now appear in a fixed, canonical order * OpenClaw integration now bundles Ollama's web search plugin in OpenClaw ## New Contributors * @madflow made their first contribution in https://github.com/ollama/ollama/pull/15733 Full Changelog: https://github.com/ollama/ollama/compare/v0.21.1...v0.21.2	High	4/23/2026
v0.21.1	## What's Changed ### Kimi CLI You can now install and run the Kimi CLI through Ollama. ``` ollama launch kimi --model kimi-k2.6:cloud ``` Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tasks through a multi-agent system. * MLX runner adds logprobs support for compatible models * Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler * Improved MLX prompt tokenization by moving tokenizati	High	4/22/2026
v0.21.0	## Hermes Agent ``` ollama launch hermes ``` Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks. <img width="1329" height="946" alt="image" src="https://github.com/user-attachments/assets/771d3383-95ed-4652-81e5-cf89514d25cc" /> ## What's Changed - Gemma 4 on MLX. Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked u	High	4/16/2026
v0.20.7	## What's Changed * Fix quality of gemma:e2b and gemma:e4b when thinking is disabled * ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in https://github.com/ollama/ollama/pull/15483 Full Changelog: https://github.com/ollama/ollama/compare/v0.20.6...v0.20.7	High	4/13/2026
v0.20.6	## What's Changed * Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes * Parallel tool calling improved for streaming responses * [Hermes agent](https://docs.ollama.com/integrations/hermes) Ollama integration guide is now available * Ollama app is updated to fix image attachment errors ## New Contributors @matteocelani made their first contribution in [#15272](https://github.com/ollama/ollama/pull/15272) Full Changelog: https://github	Medium	4/12/2026
v0.20.5	## OpenClaw channel setup with Ollama Launch <img width="2292" height="1694" alt="CleanShot 2026-04-09 at 15 45 10@2x" src="https://github.com/user-attachments/assets/3a6882c4-5c6e-4724-8f6e-56ff2df39f6f" /> ## What's Changed - OpenClaw channel setup: connect WhatsApp, Telegram, Discord, and other messaging channels through `ollama launch openclaw` - Enable flash attention for Gemma 4 on compatible GPUs - ollama launch openclaw now detects curl-based OpenCode installs at ~/.opencode	High	4/9/2026
v0.20.4	## What's Changed * mlx: Improve M5 performance with NAX * gemma4: enable flash attention Full Changelog: https://github.com/ollama/ollama/compare/v0.20.3...v0.20.4	High	4/7/2026
v0.20.3	## What's Changed * Gemma 4 Tool Calling improvements * Added latest models to Ollama App * OpenClaw fixes for launching TUI Full Changelog: https://github.com/ollama/ollama/compare/v0.20.2...v0.20.3	Medium	4/7/2026
v0.20.2	## What's Changed * app: default app home view to new chat instead of launch by @jmorganca in https://github.com/ollama/ollama/pull/15312 Full Changelog: https://github.com/ollama/ollama/compare/v0.20.1...v0.20.2	Medium	4/4/2026
v0.20.1	## What's Changed * bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in https://github.com/ollama/ollama/pull/15158 * model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in https://github.com/ollama/ollama/pull/15254 * ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in https://github.com/ollama/ollama/pull/15301 * gemma4: enable flash attention by @dhiltgen in https://github.com/ollama/ollama/pull/15296 *	Medium	4/3/2026
v0.20.0	<img width="3748" height="1290" alt="Gemma 4" src="https://github.com/user-attachments/assets/c4727579-47b1-4c7b-8aa2-28eda15b71f5" /> ## Gemma 4 Effective 2B (E2B) ``` ollama run gemma4:e2b ``` Effective 4B (E4B) ``` ollama run gemma4:e4b ``` 26B (Mixture of Experts model with 4B active parameters) ``` ollama run gemma4:26b ``` 31B (Dense) ``` ollama run gemma4:31b ``` ## What's Changed * docs: update pi docs by @ParthSareen in https://github.com/	Medium	4/2/2026
v0.19.0	<img width="480" alt="image" src="https://github.com/user-attachments/assets/1b5ca980-b9d5-490e-99b9-f0f7b9af2c32" /> ## Ollama is now powered by MLX on Apple Silicon in preview Ollama on Apple silicon [is now built](https://ollama.com/blog/mlx) on top of Apple’s machine learning framework, MLX, to take advantage of its unified memory architecture. https://github.com/user-attachments/assets/600297b0-3167-46a5-8e3a-fefda3a51b84 Read more: https://ollama.com/blog/mlx ## What's Chang	Medium	3/27/2026
v0.18.4-rc0	## What's Changed * ggml: force flash attention off for grok by @rick-github in https://github.com/ollama/ollama/pull/15050 * mlx: fix KV cache snapshot memory leak by @jessegross in https://github.com/ollama/ollama/pull/15065 * mlxrunner: schedule periodic snapshots during prefill by @jessegross in https://github.com/ollama/ollama/pull/15058 * doc: update vscode doc by @hoyyeva in https://github.com/ollama/ollama/pull/15064 Full Changelog: https://github.com/ollama/ollama/compare/v0.	Medium	3/26/2026
v0.18.3	## Visual Studio Code Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot. If you have Ollama installed, any local or cloud model from Ollama can be selected for use within Visual Studio Code. <img width="3410" height="2076" alt="Ollama screenshot 2026-03-26 at 01 43 57@2x" src="https://github.com/user-attachments/assets/4f9edc73-af3d-475c-924e-93334a4d88c5" /> ## What's Changed * GLM parser improvements for tool calls * OpenClaw integration improv	Medium	3/25/2026
v0.18.2	## What's Changed * Add extra check to ensure `npm` and `git` are installed before installing OpenClaw * Claude Code will now be faster when run locally, due to preventing cache breakages * Fix to correctly support `ollama launch openclaw --model <model>` * Register Ollama's websearch package correctly for OpenClaw Full Changelog: https://github.com/ollama/ollama/compare/v0.18.1...v0.18.2	Low	3/18/2026
v0.18.1	### Web Search and Fetch in OpenClaw Ollama now ships with web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news. This also allows OpenClaw with Ollama to be able to fetch the web and extract readable content for processing. This feature does not execute JavaScript. When using local models with web search in OpenClaw, ensure you are signed into Ollama with `ollama signin` ``` ollama launch openclaw	Low	3/17/2026
v0.18.0	Ollama 0.18 includes improved performance for OpenClaw and Ollama’s [cloud models](https://ollama.com/search?c=cloud), including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks. ### Improved OpenClaw performance with Kimi-K2.5 This release of Ollama improves performance of cloud models and their reliability. - Up to 2x faster speeds with Kimi-K2.5 - Tool calling accuracy has been improved ``` ollama launch openclaw --model kimi-k2.5 `	Low	3/14/2026
v0.17.8-rc4	## What's Changed * parsers: repair unclosed arg_value tags in GLM tool calls by @BruceMacD in https://github.com/ollama/ollama/pull/14656 * Reapply "don't require pulling stubs for cloud models" again by @jmorganca in https://github.com/ollama/ollama/pull/14608 * docs: format compat docs by @mxyng in https://github.com/ollama/ollama/pull/14678 * create: fix localhost handling by @dhiltgen in https://github.com/ollama/ollama/pull/14681 * build: smarter docker parallelism by @dhiltgen in htt	Low	3/10/2026
v0.17.7	## What's Changed * Allow thinking levels such as `"medium"` to be correctly interpreted in Ollama's API for all thinking models * Add context length to support compaction when using `ollama launch` Full Changelog: https://github.com/ollama/ollama/compare/v0.17.6...v0.17.7	Low	3/5/2026
v0.17.6	## What's Changed * Fixed issue where GLM-OCR would not work due to incorrect prompt rendering * Fixed tool calling parsing and rendering for Qwen 3.5 models ## New Contributors * @Victor-Quqi made their first contribution in https://github.com/ollama/ollama/pull/14584 Full Changelog: https://github.com/ollama/ollama/compare/v0.17.5...v0.17.6	Low	3/4/2026
v0.17.5	## New models - [Qwen3.5](https://ollama.com/library/qwen3.5): the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes. ## What's Changed * Fixed crash in Qwen 3.5 models when split over GPU & CPU * Fixed issue where Qwen 3.5 models would repeat themselves due to no presence penalty (note: you may have to redownload the `qwen3.5` models: `ollama pull qwen3.5:35b` for example) * `ollama run --verbose` will now show peak memory usage when using Ollama's MLX	Low	3/2/2026
v0.17.4	## New models - [Qwen 3.5](https://ollama.com/library/qwen3.5): a family of open-source multimodal models that delivers exceptional utility and performance. - [LFM 2](https://ollama.com/library/lfm2): LFM2 is a family of hybrid models designed for on-device deployment. LFM2-24B-A2B is the largest model in the family, scaling the architecture to 24 billion parameters while keeping inference efficient. > Note: for users on 0.17.1, this version will not automatically update. [Re-downloading](h	Low	2/27/2026
v0.17.3	## What's Changed * Fixed issue where tool calls in the Qwen 3 and Qwen 3.5 model families would not be parsed correctly if emitted during thinking Full Changelog: https://github.com/ollama/ollama/compare/v0.17.2...v0.17.3	Low	2/27/2026
v0.17.2	## What's Changed * Fixed issue where Ollama's app on Windows would crash when a new update has been downloaded Full Changelog: https://github.com/ollama/ollama/compare/v0.17.1...v0.17.2	Low	2/26/2026
v0.17.1	## What's Changed * Nemotron architecture support in Ollama's engine * MLX engine now has improved memory usage * Ollama's app will now allow models that support tools to use web search capabilities * Improved LFM2 and LFM2.5 models in Ollama's engine * `ollama create` will no longer default to affine quantization for unquantized models when using the MLX engine * Added configuration for disabling automatic update downloading Full Changelog: https://github.com/ollama/ollama/compar	Low	2/24/2026
v0.17.0	## OpenClaw OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5. ### Get started `ollama launch openclaw` <img width="2368" height="1830" alt="oc1" src="https://github.com/user-attachments/assets/cb9443d6-92cc-4c13-b26b-87e5f6c09b4e" /> ### Web search in OpenClaw When using cloud models, websearch is enabled - allowing OpenClaw to search the	Low	2/21/2026
v0.16.3	## What's Changed * New `ollama launch cline` added for the Cline CLI * `ollama launch <integration>` will now always show the model picker * Added Gemma 3, Llama and Qwen 3 architectures to MLX runner ## New Contributors * @hellosaumil made their first contribution in https://github.com/ollama/ollama/pull/14271 Full Changelog: https://github.com/ollama/ollama/compare/v0.16.2...v0.16.3	Low	2/19/2026
v0.16.2	## What's Changed * `ollama launch claude` now supports searching the web when using `:cloud` models * Fixed rendering issue when running `ollama` in PowerShell * New setting in Ollama's app makes it easier to disable cloud models for sensitive and private tasks where data cannot leave your computer. For Linux or when running `ollama serve` manually, set `OLLAMA_NO_CLOUD=1`. * Fixed issue where experimental image generation models would not run in 0.16.0 and 0.16.1 Full Changelog: h	Low	2/14/2026
v0.16.1	## What's Changed * Installing Ollama via the `curl` install script on macOS will now only prompt for your password if it's required * Installing Ollama via the `irm` install script in Windows will now show progress * Image generation models will now respect the `OLLAMA_LOAD_TIMEOUT` variable Full Changelog: https://github.com/ollama/ollama/compare/v0.16.0...v0.16.1	Low	2/12/2026
v0.16.0	## New models * [GLM-5](https://ollama.com/library/glm-5): A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks. * [MiniMax-M2.5](https://ollama.com/library/minimax-m2.5): a new state-of-the-art large language model designed for real-world productivity and coding tasks. ## New `ollama` The new `ollama` command makes it easy to launch your favorite apps with models using Ollama <img width="1	Low	2/12/2026
v0.15.6	## What's Changed * Fixed context limits when running `ollama launch droid` * `ollama launch` will now download missing models instead of erroring * Fixed bug where `ollama launch claude` would cause context compaction when providing images Full Changelog: https://github.com/ollama/ollama/compare/v0.15.5...v0.15.6	Low	2/7/2026
v0.15.5	## New models - [Qwen3-Coder-Next](https://ollama.com/library/qwen3-coder-next): a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development. - [GLM-OCR](https://ollama.com/library/glm-ocr): GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. ## Improvements to `ollama launch` * `ollama launch` can now be provided arguments, for example `ollama launch claude -- --re	Low	2/3/2026
v0.15.4	## What's Changed * `ollama launch openclaw` will now enter the standard OpenClaw onboarding flow if this has not yet been completed. Full Changelog: https://github.com/ollama/ollama/compare/v0.15.3...v0.15.4	Low	2/1/2026
v0.15.3	## What's Changed * Renamed `ollama launch clawdbot` to `ollama launch openclaw` to reflect the project's new name * Improved tool calling for Ministral models * docs: add clawdbot by @ParthSareen in https://github.com/ollama/ollama/pull/13925 * cmd/config: Use envconfig.Host() for base API in launch config packages by @gabe-l-hart in https://github.com/ollama/ollama/pull/13937 * `ollama launch` will now use the value of `OLLAMA_HOST` when running it ## New Contributors * @MBerguer made	Low	2/1/2026
v0.15.2	<img width="1916" height="976" alt="Ollama screenshot 2026-01-26 at 17 53 40@2x (1)" src="https://github.com/user-attachments/assets/7937e607-e16e-46f5-a280-5cf9ce088a68" /> ## What's Changed * New `ollama launch clawdbot` command for launching Clawdbot using Ollama models Full Changelog: https://github.com/ollama/ollama/compare/v0.15.1...v0.15.2	Low	1/27/2026
v0.15.1	## What's Changed * GLM-4.7-Flash performance and correctness improvements, fixing repetitive answers and tool calling quality * Fixed performance issues on macOS and arm64 Linux * Fixed issue where `ollama launch` would not detect `claude` and would incorrectly update `opencode` configurations ## New Contributors * @stillhart made their first contribution in https://github.com/ollama/ollama/pull/13855 Full Changelog: https://github.com/ollama/ollama/compare/v0.15.0...v0.15.1	Low	1/24/2026
v0.15.0	<img width="4502" height="2222" alt="An image of Ollama building rapidly on the computer. Build with Ollama!" src="https://github.com/user-attachments/assets/0810fb5c-6727-400a-b711-4ffc349d0bb5" /> ## `ollama launch` A new `ollama launch` command to use Ollama's models with Claude Code, Codex, OpenCode, and Droid without separate configuration. ## What's Changed * New `ollama launch` command for Claude Code, Codex, OpenCode, and Droid * Fixed issue where creating multi-line strings	Low	1/21/2026
v0.14.3	<img width="2904" height="1420" alt="Ollama screenshot 2026-01-20 at 23 41 54@2x" src="https://github.com/user-attachments/assets/ae16dbc5-5b2b-45fd-ae03-ba15f2721c3c" /> * [Z-Image Turbo](https://ollama.com/x/z-image-turbo): 6 billion parameter text-to-image model from Alibaba’s Tongyi Lab. It generates high-quality photorealistic images. * [Flux.2 Klein](https://ollama.com/x/flux2-klein): Black Forest Labs’ fastest image-generation models to date. ## New models * [GLM-4.7-Flash](https:	Low	1/16/2026
v0.14.2	## New models - [TranslateGemma](https://ollama.com/library/translategemma): A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages. ## What's Changed * <kbd>Shift</kbd> + <kbd>Enter</kbd> (or <kbd>Ctrl</kbd> + <kbd>j</kbd>) will now enter a newline in Ollama's CLI * Improve `/v1/responses` API to better conform to the OpenResponses specification ## New Contributors * @yuhongsun96 made their first contribution in https://github.com/ol	Low	1/16/2026
v0.14.1	## Image generation models (experimental) Experimental image generation models are available for macOS and Linux (CUDA) in Ollama: ### Available models - [Z-Image-Turbo](https://ollama.com/x/z-image-turbo) ``` ollama run x/z-image-turbo ``` > Note: [`x`](https://ollama.com/x) is a username on ollama.com where experimental models are uploaded More models coming soon: 1. Qwen-Image-2512 2. Qwen-Image-Edit-2511 3. GLM-Image ## What's Changed * fix macOS auto	Low	1/14/2026
v0.14.0	## What's Changed * `ollama run --experimental` CLI will now open a new Ollama CLI that includes an agent loop and the `bash` tool * Anthropic API compatibility: support for the `/v1/messages` API * A new `REQUIRES` command for the `Modelfile` allows declaring which version of Ollama is required for the model * For older models, Ollama will avoid an integer underflow on low VRAM systems during memory estimation * More accurate VRAM measurements for AMD iGPUs * Ollama's app will now highlig	Low	1/10/2026
v0.13.5	## New Models * Google's [FunctionGemma](https://ollama.com/library/functiongemma), a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling. ## What's Changed * `bert` architecture models now run on Ollama's engine * Added built-in renderer & tool parsing capabilities for DeepSeek-V3.1 * Fixed issue where nested properties in tools may not have been rendered properly ## New Contributors * @familom made their first contribution in https://github.c	Low	12/18/2025
v0.13.4	## New Models * [Nemotron 3 Nano](https://ollama.com/library/nemotron-3-nano): A new Standard for Efficient, Open, and Intelligent Agentic Models * [Olmo 3](https://ollama.com/library/olmo-3) and [Olmo 3.1](https://ollama.com/library/olmo-3.1): A series of Open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets. ## What's Changed * Enable Flash Attention automatically for models by d	Low	12/13/2025
v0.13.3	## New models * [Devstral-Small-2](https://ollama.com/library/devstral-small-2): 24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. * [rnj-1](https://ollama.com/library/rnj-1): Rnj-1 is a family of 8B parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models. * [nomic-embed-text-v2](https://ollama.com/library/nomic-embed-text-	Low	12/9/2025
v0.13.2	## New models - [Qwen3-Next](https://ollama.com/library/qwen3-next): The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed. ## What's Changed * Flash attention is now enabled by default for vision models such as `mistral-3`, `gemma3`, `qwen3-vl` and more. This improves memory utilization and performance when providing images as input. * Fixed GPU detection on multi-GPU CUDA machines * Fixed issue where `deepseek-v3	Low	12/4/2025
v0.13.1	## New models - [Ministral-3](https://ollama.com/library/ministral-3): The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. - [Mistral-Large-3](https://ollama.com/library/mistral-large-3): A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads. ## What's Changed * `nomic-embed-text` will now use Ollama's engine by default * Tool calling support for `cogito-v2.1` * Fixed issues with CUDA	Low	11/27/2025
v0.13.0	## New models * [DeepSeek-OCR](https://ollama.com/library/deepseek-ocr): DeepSeek-OCR uses optical 2D mapping to compress long contexts, achieving high OCR precision with reduced vision tokens and demonstrating practical value in document processing. * [Cogito-V2.1](https://ollama.com/library/cogito-2.1): instruction tuned generative models, currently the best open-weight LLM by a US company ## DeepSeek-OCR DeepSeek-OCR is now available on Ollama. Example inputs: ``` ollama run deeps	Low	11/19/2025
v0.12.11	## Logprobs Ollama's API and OpenAI-compatible API now support [log probabilities](https://cookbook.openai.com/examples/using_logprobs). Log probabilities of output tokens indicate the likelihood of each token occurring in the sequence given the context. This is useful for different use cases: 1. Classification tasks 2. Retrieval (Q&A) evaluation 3. Autocomplete 4. Token highlighting and outputting bytes 5. Calculating perplexity To enable Logprobs, provide `"logprobs": true` to Ol	Low	11/12/2025
v0.12.10	## `ollama run` now works with embedding models `ollama run` can now run embedding models to generate vector embeddings from text: ``` ollama run embeddinggemma "Hello world" ``` Content can also be provided to `ollama run` via standard input: ``` echo "Hello world" | ollama run embeddinggemma ``` ## What's Changed * Fixed errors when running `qwen3-vl:235b` and `qwen3-vl:235b-instruct` * Enable flash attention for Vulkan (currently needs to be built from source) * Add Vulk	Low	11/5/2025
v0.12.9	## What's Changed * Fix performance regression on CPU-only systems Full Changelog: https://github.com/ollama/ollama/compare/v0.12.8...v0.12.9	Low	10/31/2025
v0.12.8	## What's Changed * `qwen3-vl` performance improvements, including flash attention support by default * `qwen3-vl` will now output less leading whitespace in the response when thinking * Fixed issue where `deepseek-v3.1` thinking could not be disabled in Ollama's new app * Fixed issue where `qwen3-vl` would fail to interpret images with	Low	10/30/2025
v0.12.7	## New models - [Qwen3-VL](https://ollama.com/library/qwen3-vl): Qwen3-VL is now available in all parameter sizes ranging from 2B to 235B - [MiniMax-M2](https://ollama.com/library/minimax-m2): a 230 Billion parameter model built for coding & agentic workflows available on Ollama's cloud ## Add files and adjust thinking levels in Oll	Low	10/29/2025
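
The v0.12.11 entry describes log probabilities and lists perplexity among their use cases. As a rough sketch only: the release note is truncated before it names where `"logprobs": true` is passed, so sending it in the body of the `/api/chat` endpoint shown in the README is an assumption here. `build_chat_request` and `perplexity` are hypothetical helper names, not part of Ollama or its Python library, and the response shape is deliberately not modeled; the perplexity helper just takes a plain list of per-token natural-log probabilities.

```python
import json
import math


def build_chat_request(model: str, prompt: str) -> str:
    """Return a JSON request body that asks for log probabilities.

    Assumption: the "logprobs" flag from the v0.12.11 notes is accepted
    alongside the /api/chat fields shown in the README.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "logprobs": True,
    }
    return json.dumps(body)


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


if __name__ == "__main__":
    # Only builds and prints the body; nothing is sent to a server.
    print(build_chat_request("gemma3", "Why is the sky blue?"))
    # Illustrative values, not real model output.
    print(perplexity([-0.1, -0.2, -0.3]))
```

Keeping the request builder separate from the perplexity calculation means the arithmetic can be checked without a running Ollama server.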

Similar Packages

helix: ♾️ Private Agent Fleet with Spec Coding. Each agent gets their own GPU-accelerated desktop. Run Claude, Codex, Gemini and open models on a full private AI Stack ♾️ (2.11.57)

gorm-query: A strongly-typed query builder and generic repository library (v1.4.5)

vibescan: Security scanner for AI-generated ("vibe-coded") code. Runs SAST, DAST, and sandboxed exploit simulation across 15+ languages using 30+ tools. Catches what LLMs introduce before it ships — wit (0.0.0)

vllm: A high-throughput and memory-efficient inference and serving engine for LLMs (v0.26.0)

alef: Generate fully-typed, lint-clean language bindings for Rust libraries across 11 languages (v0.45.0)

More in Uncategorized

TradingAgents-CN: A Chinese-language financial trading framework based on multi-agent LLMs - the enhanced Chinese edition of TradingAgents

RAGEN: RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

arthur-engine: Make AI work for Everyone - Monitoring and governing for your AI/ML

gh-aw-firewall: GitHub Agentic Workflows Firewall