Description
<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>

# Cython BLIS: Fast BLAS-like operations from Python and Cython, without the tears

This repository provides the [Blis linear algebra](https://github.com/flame/blis) routines as a self-contained Python C-extension.

Currently, we only support single-threaded execution, as this is actually best for our workloads (ML inference).

[Tests](https://github.com/explosion/cython-blis/actions/workflows/tests.yml) [PyPI](https://pypi.python.org/pypi/blis) [conda-forge](https://anaconda.org/conda-forge/cython-blis) [Wheels](https://github.com/explosion/wheelwright/releases)

## Installation

You can install the package via pip, first making sure that `pip`, `setuptools`, and `wheel` are up-to-date:

```bash
pip install -U pip setuptools wheel
pip install blis
```

Wheels should be available, so installation should be fast. If you want to install from source and you're on Windows, you'll need to install LLVM.

### Building BLIS for alternative architectures

The provided wheels should work on x86_64 architectures. Unfortunately we do not currently know a way to provide different wheels for alternative architectures, and we cannot provide a single binary that works everywhere. So if the wheel doesn't work for your CPU, you'll need to install from the source distribution, and tell Blis your CPU architecture using the `BLIS_ARCH` environment variable.

#### a) Installing with generic arch support

```bash
BLIS_ARCH="generic" pip install spacy --no-binary blis
```

#### b) Building specific support

In order to compile Blis, `cython-blis` bundles makefile scripts for specific architectures, which are produced by running the Blis build system and logging the commands. We do not yet have logs for every architecture, as there are some architectures we have not had access to.

[See here](https://github.com/flame/blis/blob/0.5.1/config_registry) for the list of architectures.
For example, here's how to build support for the ARM architecture `cortexa57`:

```bash
git clone https://github.com/explosion/cython-blis && cd cython-blis
git pull && git submodule init && git submodule update && git submodule status
python3 -m venv env3.6
source env3.6/bin/activate
pip install -r requirements.txt
./bin/generate-make-jsonl linux cortexa57
BLIS_ARCH="cortexa57" python setup.py build_ext --inplace
BLIS_ARCH="cortexa57" python setup.py bdist_wheel
```

Fingers crossed, this will build you a wheel that supports your platform. You could then [submit a PR](https://github.com/explosion/cython-blis/pulls) with the `blis/_src/make/linux-cortexa57.jsonl` and `blis/_src/include/linux-cortexa57/blis.h` files so that you can run:

```bash
BLIS_ARCH=cortexa57 pip install blis --no-binary=blis
```

## Usage

Two APIs are provided: a high-level Python API, and direct [Cython](http://cython.org) access, which provides fused-type, nogil Cython bindings to the underlying Blis linear algebra library. Fused types are a simple template mechanism, allowing just a touch of compile-time generic programming:

```python
cimport blis.cy
from libc.stdlib cimport calloc  # calloc is used for the buffers below

# The dimensions (nN, nI, nO, nr_b0, nr_b1) are assumed to be defined elsewhere.
A = <float*>calloc(nN * nI, sizeof(float))
B = <float*>calloc(nO * nI, sizeof(float))
C = <float*>calloc(nr_b0 * nr_b1, sizeof(float))
blis.cy.gemm(blis.cy.NO_TRANSPOSE, blis.cy.NO_TRANSPOSE,
             nO, nI, nN,
             1.0, A, nI, 1, B, nO, 1,
             1.0, C, nO, 1)
```

Bindings have been added as we've needed them. Please submit pull requests if the library is missing some functions you require.

## Development

To build the source package, run the following command:

```bash
./bin/update-vendored-source
```

This populates the `blis/_src` folder for the various architectures, using the `flame-blis` submodule.

## Updating the build files

In order to compile the Blis sources, we use jsonl files that provide the explicit compiler flags. We build these jsonl files by running Blis's build system, and then converting the log.
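
As a rough illustration of how a jsonl build log can drive compilation, the sketch below turns each JSON line into one compiler invocation. The field names (`compiler`, `source`, `target`, `flags`, `include`, `macros`) and the helpers `build_command` and `replay` are hypothetical and chosen for this example only; the real files under `blis/_src/make/` and the logic in `setup.py` may differ.

```python
import json
import subprocess


def build_command(spec):
    """Turn one build record into an argv list (field names are hypothetical)."""
    cmd = [spec["compiler"], "-c", spec["source"], "-o", spec["target"]]
    cmd += spec.get("flags", [])
    cmd += ["-I" + path for path in spec.get("include", [])]
    cmd += ["-D" + macro for macro in spec.get("macros", [])]
    return cmd


def replay(jsonl_lines, run=subprocess.check_call):
    """Run one command per non-empty JSON line and return the commands that were run."""
    commands = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        cmd = build_command(json.loads(line))
        run(cmd)
        commands.append(cmd)
    return commands


# Dry run: pass a no-op runner so that no compiler is actually invoked.
example_record = {
    "compiler": "gcc",
    "source": "frame/a.c",
    "target": "a.o",
    "flags": ["-O3"],
    "include": ["include"],
    "macros": ["BLIS_IS_BUILDING_LIBRARY"],
}
example_lines = [json.dumps(example_record)]
dry_run = replay(example_lines, run=lambda cmd: None)
print(dry_run[0])
```

Passing the runner as a parameter keeps the sketch testable without a compiler installed; with the default `subprocess.check_call`, a failing command would raise and stop the build.
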
Replaying the jsonl avoids having to replicate the build system within Python: we just use it to make a bunch of subprocess calls. To support a new OS/architecture combination, we have to provide the jsonl file and the header.

### Linux

The Linux build files need to be produced from within the manylinux1 docker container, so that they will be compatible with the wheel building process.

First, install docker. Then do the following to start the container:

    sudo docker run -it quay.io/pypa/manylinux1_x86_64:latest

Once within the container, the following commands should
Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| 1.3.3 | Imported from PyPI (1.3.3) | Low | 4/21/2026 |
| release-v1.3.3 | Release release-v1.3.3 | Low | 11/17/2025 |
| release-v1.3.2 | Release release-v1.3.2 | Low | 11/12/2025 |
| release-v1.3.1 | Add wheels for Python 3.14 and drop support for Python 3.9. | Low | 11/12/2025 |
| release-v1.2.0 | * Resolve Windows instability by reverting back to the v0.7 version of the vendored Blis library we'd been using. This was surprisingly hard to do, because I had a lot of trouble getting it to compile due to conflicts between cpuid.h and intrin.h on Windows. * Widen the runtime numpy compatibility range to allow usage alongside numpy 1 and numpy 2. We still build against numpy 2 though. | Low | 1/13/2025 |
| release-v1.1.0 | The 0.9 version of Blis vendored in this package's v1.0 had memory access errors on Windows. I've had trouble rolling back to v0.7 to address this, as it isn't building on Github Actions (some environment/compiler toolchain version issue). I've therefore updated the vendored Blis to v1.0, which seems to address the crashes. I've also widened the numpy compatibility across v1 and v2. I had thought we had to use the same numpy version we built against, but this isn't the case. | Low | 12/16/2024 |
| prerelease-v1.1.0a0 | Experimental release updating vendored Blis to 1.0. | Low | 12/12/2024 |
| release-v1.0.2 | Release release-v1.0.2 | Low | 12/10/2024 |
| release-v1.0.1 | MacOS and Linux ARM wheels were missing from the v1.0.0 release, as we've migrated to a more streamlined process to build wheels using only Github-hosted Actions runners. This release restores support for MacOS ARM wheels. Support for Linux ARM runners is coming soon, but currently only available for private repos. Unfortunately QEMU emulation is too slow for the build. | Low | 9/13/2024 |
| release-v1.0.0 | Depend on numpy v2, instead of v1. This introduces binary incompatibilities, so libraries won't easily be able to depend across this version and previous ones. There are therefore no other changes. Incrementing to 1.0 as we've been stable a while, so no reason to be 0.x. | Low | 7/27/2024 |
| prerelease-v1.0.0a1 | Depend on numpy v2, instead of v1. This introduces binary incompatibilities, so libraries won't easily be able to depend across this version and previous ones. There are therefore no other changes. Incrementing to 1.0 as we've been stable a while, so no reason to be 0.x. | Low | 7/25/2024 |
| v0.7.11 | * Package updates and binary wheels for Python 3.12. | Low | 9/22/2023 |
| v0.7.10 | - Restrict build to Cython 0.29.x due to issues with Cython 3 on Windows. | Low | 7/21/2023 |
| v0.7.9 | * Package updates and binary wheels for Python 3.11. | Low | 10/16/2022 |
| v0.9.1 | * Make BLIS symbols visible when compiled with gcc/clang (#71) * Use uninitialized array for `gemm` when beta is zero (#72) * Fix OOB reads in certain `sgemm` Haswell kernels (#73, #74) | Low | 8/4/2022 |
| v0.7.8 | * Add cdef nogil sgemm and saxpy functions (#70) * Make BLIS symbols visible when compiled with gcc/clang (#71) * Use uninitialized array for gemm when beta is zero (#72) | Low | 6/22/2022 |
| v0.9.0 | * Update to [BLIS v0.9.0](https://github.com/flame/blis/releases/tag/0.9.0) * Add `cdef nogil` `sgemm` and `saxpy` functions (#70) | Low | 5/10/2022 |
| v0.7.7 | - Add support for Windows `generic` target. Thanks to @nsait-linaro for this contribution! | Low | 3/22/2022 |
| v0.7.6 | * Fix issue #61: Support multi-token compiler options such as "ccache gcc". * Fix issue #65: Check shape compatibility in gemm. | Low | 3/1/2022 |
| v0.7.5 | * Fall back to Zen 2 kernels on Zen 3 CPU | Low | 10/27/2021 |
| v0.7.0 | This version updates the underlying Blis library to v0.7.0, which brings a number of performance improvements, including better support for AMD architectures. The new version requires a relatively recent compiler such as gcc v9 to compile the kernels for the more recent architectures. You can set `BLIS_ARCH=generic` or `BLIS_ARCH=x86_64_no_skx` to avoid the need for this. I haven't yet added build logs for ARM systems, so you'll need to use `BLIS_ARCH=generic` for now to build for ARM hard | Low | 8/15/2020 |
| v0.4.0 | * Add support for ARM-64 cortexa53, used in e.g. Raspberry Pi. To build a non-x86_64 architecture, set the environment variable `BLIS_ARCH=cortexa53` prior to installation. * Support for read-only numpy arrays, via const declarations. * Fix build system to allow building on conda-forge. | Low | 8/23/2019 |
| v0.3.1 | * Change license to BSD. * Update vendored `blis` to v0.5.1. | Low | 8/18/2019 |
