# cyllama

> A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

- **URL**: https://www.freshcrate.ai/projects/cyllama
- **Author**: shakfu
- **Category**: RAG & Memory
- **Latest version**: `0.3.1` (2026-06-04)
- **License**: MIT
- **Source**: https://github.com/shakfu/cyllama
- **Homepage**: https://shakfu.github.io/cyllama/
- **Language**: Python
- **GitHub**: 25 stars, 18 forks
- **Registry**: github
- **Tags**: `agents`, `cython`, `cython-wrapper`, `llama-cpp`, `python`, `python3`, `rag`, `stable-diffusion-cpp`, `whisper-cpp`

## Description

A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `0.3.1` | 2026-06-04 | High | ## Changes since the last Release  ### Added  - **`VaeFormat` enum and `SDContextParams.vae_format` / `.stream_layers` properties** -- exposes the two `sd_ctx_params_t` fields added in stable-diffusion.cpp master-669-2d40a8b. `VaeFormat(IntEnum)` mirrors the C `sd_vae_format_t` (`AUTO=-1`, `FLUX=0`, `SD3=1`, `FLUX2=2`); `vae_format` forces the VAE format (default `AUTO` = auto-detect from the model) and `stream_layers` toggles residency+prefetch layer streaming (inert unless `max_vram` is se |
| `0.3.0` | 2026-05-28 | High | ## Changes since the last Release  ### Changed  - **BREAKING: distribution switched to abi3-only wheels; minimum Python raised to 3.12** -- from this release, published wheels are built against the CPython stable ABI (abi3) and tagged `cp312-abi3-<plat>`, so a single wheel per platform/backend covers CPython 3.12, 3.13, and 3.14. This collapses the former five-per-version wheel set (~5x fewer wheels) to keep the project under PyPI's 10 GB size limit. `requires-python` is raised `>=3.10` -> ` |
| `0.2.18` | 2026-05-17 | High | ## Changes since the last Release  ### Added  - **stable-diffusion.cpp updated (new C-surface fields and sample methods)** -- `src/cyllama/sd/stable_diffusion.pxd` and `src/cyllama/sd/stable_diffusion.pyx` mirror the upstream header changes: two new `sample_method_t` values (`EULER_CFG_PP_SAMPLE_METHOD`, `EULER_A_CFG_PP_SAMPLE_METHOD`); `sd_ctx_params_t` gains `backend` / `params_backend` (`const char *`) fields; `sd_sample_params_t` gains `extra_sample_args` (`const char *`); `new_upscaler_ |
| `0.2.17` | 2026-05-13 | High | This is a bug-fix release, issued quickly to correct a bug in 0.2.16 that breaks GPU variants for `stable-diffusion.cpp`. The fix is very simple; see the [0.2.16 release](https://github.com/shakfu/cyllama/releases/tag/0.2.16) for workarounds and a deeper treatment of the issue, or better, just install this release, which is 0.2.16 with the fix. |
| `0.2.15` | 2026-05-03 | High | cyllama is a no-dependencies Python library for local AI inference built on the `.cpp` inference stack:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  ## Changes since the last Release  ### Added  - **`cylla |
| `0.2.14` | 2026-04-27 | High | cyllama is a no-dependencies Python library for local AI inference built on the `.cpp` inference stack:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech  - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation  - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  ## Changes since the last Release  ### Added  - **st |
| `0.2.12` | 2026-04-23 | High | cyllama is a comprehensive no-dependencies Python library for local AI inference built on the state-of-the-art `.cpp` ecosystem:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  ## Changes since the last Release |
| `0.2.11` | 2026-04-19 | High | cyllama is a comprehensive no-dependencies Python library for local AI inference built on the state-of-the-art `.cpp` ecosystem:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  ## Changes since the last Release |
| `0.2.10` | 2026-04-17 | High | cyllama is a comprehensive no-dependencies Python library for local AI inference built on the state-of-the-art `.cpp` ecosystem:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  ## Changes since the last Release |
| `0.2.9` | 2026-04-16 | High | cyllama is a comprehensive no-dependencies Python library for local AI inference built on the state-of-the-art `.cpp` ecosystem:  - **[llama.cpp](https://github.com/ggml-org/llama.cpp)** - Text generation, chat, embeddings, and text-to-speech - **[whisper.cpp](https://github.com/ggerganov/whisper.cpp)** - Speech-to-text transcription and translation - **[stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)** - Image and video generation  **NOTE**: In the last release, it w |
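
The `0.3.1` notes above describe `VaeFormat` as an `IntEnum` mirroring the C `sd_vae_format_t` values. The following is a minimal standalone sketch of that shape, using only the four values listed in the release notes. It does not import cyllama, and `parse_vae_format` is a hypothetical helper added for illustration, not part of the library's API.

```python
from enum import IntEnum


class VaeFormat(IntEnum):
    """Stand-in enum using the values quoted in the 0.3.1 release notes.

    This is an illustrative copy, not the class shipped by cyllama.
    """

    AUTO = -1   # auto-detect the VAE format from the model (the stated default)
    FLUX = 0
    SD3 = 1
    FLUX2 = 2


def parse_vae_format(value):
    """Hypothetical helper: accept a member name (any case) or an integer.

    Raises KeyError for an unknown name and ValueError for an unknown integer.
    """
    if isinstance(value, str):
        return VaeFormat[value.upper()]
    return VaeFormat(value)


if __name__ == "__main__":
    for raw in ("auto", "flux2", 1):
        fmt = parse_vae_format(raw)
        print(f"{raw!r} -> {fmt.name} ({int(fmt)})")
```

Because `IntEnum` members compare equal to plain integers, a value such as `VaeFormat.SD3` can be passed wherever the underlying C integer is expected, which is presumably why the notes describe the Python enum as mirroring the C one.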

## Citation

- HTML: https://www.freshcrate.ai/projects/cyllama
- Markdown: https://www.freshcrate.ai/projects/cyllama.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/cyllama/deps

_Generated by freshcrate.ai. Indexes GitHub releases for AI-agent ecosystem packages._
