Description
<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>

# spaCy: Industrial-strength NLP

spaCy is a library for **advanced Natural Language Processing** in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.

spaCy comes with [pretrained pipelines](https://spacy.io/models) and currently supports tokenization and training for **70+ languages**. It features state-of-the-art speed and **neural network models** for tagging, parsing, **named entity recognition**, **text classification** and more, multi-task learning with pretrained **transformers** like BERT, as well as a production-ready [**training system**](https://spacy.io/usage/training) and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the [MIT license](https://github.com/explosion/spaCy/blob/master/LICENSE).

💫 **Version 3.8 out now!** [Check out the release notes here.](https://github.com/explosion/spaCy/releases)

## 📖 Documentation

| Documentation | |
| --- | --- |
| ⭐️ **[spaCy 101]** | New to spaCy? Here's everything you need to know! |
| 📚 **[Usage Guides]** | How to use spaCy and its features. |
| 🚀 **[New in v3.0]** | New features, backwards incompatibilities and migration guide. |

Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| 3.8.14 | Imported from PyPI (3.8.14) | Low | 4/21/2026 |
| release-v3.8.14 | - Fix `spacy download` failing in environments where `pip` is not on PATH but is available as a Python module (e.g., some virtual environments and containers) | Medium | 3/29/2026 |
| release-v3.8.13 | The v3.8.12 release didn't update the confection pin, which meant that models wouldn't load after an upgrade-install. | Medium | 3/23/2026 |
| release-v3.8.12 | Use confection v1.3 and Thinc v8.3.13, which implement custom validation logic in place of Pydantic, allowing us to properly adopt Pydantic v2 and provide full Python 3.14 support. Our dependency tree used Pydantic v1 in unusual ways, and relied on behaviours that Pydantic v2 reformed. In the time since Pydantic v2 was released there were a few attempts to migrate over to it, but the task has been complicated by the fact that the confection library has a fairly tangled implementation and I ha | Medium | 3/23/2026 |
| release-v3.8.11 | Add wheels for Python 3.11, 3.12, 3.13 and 3.14 for Windows ARM. Windows ARM wheels for Python 3.10 and earlier are not available in numpy, so aren't provided. | Low | 11/17/2025 |
| release-v3.8.10 | Release release-v3.8.10 | Low | 11/17/2025 |
| release-v3.8.9 | Add wheels for Python 3.14 | Low | 11/13/2025 |
| release-v3.8.8 | * Fix deprecation warnings from click imports * Update requirements, including switch to `typer-slim` to reduce dependency footprint * Drop support for Python 3.9 (end-of-life) Other dependencies in spaCy's tree have also been updated to widen the numpy compatibility pin, which should reduce installation problems for some users. | Low | 11/7/2025 |
| release-v3.8.7 | In order to support Python 3.13, spaCy is now compiled with Cython 3. This brings a change to the way types are handled at runtime (Cython 3 uses the `from __future__ import annotations` semantics, which stores types as strings at runtime). This difference caused problems for components registered within Cython files, as we rely on building Pydantic models from factory function signatures to do validation. To support Python 3.13 we therefore create a new module, `spacy.pipeline.factories`, whi | Low | 5/23/2025 |
| release-v3.8.6 | Restores support for wheels for ARM platforms, while correctly noting compatibility range. | Low | 5/19/2025 |
| release-v3.8.3 | Fix bug in memory zones when non-transient strings were added to the StringStore inside a memory zone. This led to string-not-found errors in the morphological analyser when it was applied inside a memory zone. | Low | 12/11/2024 |
| release-v3.8.2 | # Optional memory management for persistent services Support a new context manager method `Language.memory_zone()`, to allow long-running services to avoid growing memory usage from cached entries in the `Vocab` or `StringStore`. Once the memory zone block ends, spaCy will evict `Vocab` and `StringStore` entries that were added during the block, freeing up memory. `Doc` objects created inside a memory zone block should not be accessed outside the block. The current implementation disables | Low | 10/1/2024 |
| prerelease-v3.8.0.dev0 | Support a new context manager method `Language.memory_zone()`, to allow long-running services to avoid growing memory usage from cached entries in the `Vocab` or `StringStore`. Once the memory zone block ends, spaCy will evict `Vocab` and `StringStore` entries that were added during the block, freeing up memory. `Doc` objects created inside a memory zone block should not be accessed outside the block. The current implementation disables population of the tokenizer cache inside the memory zone | Low | 9/9/2024 |
| prerelease-v3.7.6a | Release prerelease-v3.7.6a | Low | 8/20/2024 |
| v3.7.5 | # ✨ New features and improvements * Sanitize direct download for `spacy download` (#13313). * Convert Cython properties to decorator syntax (#13390). * Bump Weasel pin to allow v0.4.x (#13409). * Improvements to the test suite (#13469, #13470). * Bump Typer pin to allow v0.10.0 and above (#13471). * Allow `typing-extensions<5.0.0` for Python < 3.8 (#13516). ## 🔴 Bug fixes * #13400: Fix `use_gold_ents` behaviour for EntityLinker. ## 📖 Documentation and examples * Make the | Low | 6/5/2024 |
| v3.7.4 | ## ✨ New features and improvements * Improve NumPy 2.0 compatibility (#13103). * Added language extensions for Faroese and Norwegian Nynorsk (#13116). * Add new [`TextCatReduce.v1`](https://spacy.io/api/architectures#TextCatReduce) layer for text classification (#13181). * Add new [`TextCatParametricAttention.v1`](https://spacy.io/api/architectures#TextCatParametricAttention) layer for text classification (#13201). * Use `build` module for creating model packages by default (#13109). * | Low | 2/15/2024 |
| v3.7.2 | ## ✨ New features and improvements - Update `__all__` fields (#13063). ## 🔴 Bug fixes - #13035: Remove Pathy requirement. - #13053: Restore `spacy.cli.project` API. - #13057: Support `Any` comparisons for `Token` and `Span`. ## 📖 Documentation and examples - Many updates for `spacy-llm` including Azure OpenAI, PaLM, and Mistral support. - Various documentation corrections. ## 👥 Contributors @adrianeboyd, @honnibal, @ines, @rmitsch, @svlandeg | Low | 10/16/2023 |
| v3.7.1 | ## 🔴 Bug fixes - Revert lazy loading of CLI module for `spacy.info` to fix availability of `spacy.cli` following `import spacy` (#13040). ## 👥 Contributors @adrianeboyd, @honnibal, @ines, @svlandeg | Low | 10/5/2023 |
| v3.7.0 | This release drops support for Python 3.6 and adds support for Python 3.12. ## ✨ New features and improvements - Add support for Python 3.12 (#12979). - Use the new library [Weasel](https://github.com/explosion/weasel) for spaCy projects functionality (#12769). - All `spacy project` commands should run as before, just now they're using Weasel under the hood. - ⚠️ Remote storage is not yet supported for Python 3.12. Use Python 3.11 or earlier for remote storage. - Extend to Thi | Low | 10/2/2023 |
| v3.6.1 | ## ✨ New features and improvements - Allow Pydantic v2 using transitional v1 support (#12888). - Add `find-function` CLI for finding locations of registered functions (#12757). - Add extra `spacy[cuda12x]` for `cupy-cuda12x` (#12890). - Extend tests for `init config` and `train` CLI (#12173). - Switch from `distutils` to `setuptools`/`sysconfig` (#12853). ## 🔴 Bug fixes - #12817: Escape annotated HTML tags in displaCy span renderer. - #12857: Display model's full base version stri | Low | 8/8/2023 |
| v3.6.0 | ## ✨ New features and improvements - **NEW**: [`span_finder` pipeline component](https://spacy.io/api/spanfinder) to identify overlapping, unlabeled spans (#12507). - Language updates: - Add initial support for Malay (#12602). - Update Latin defaults to support noun chunks, update lexical/tokenizer defaults and add example sentences (#12538). - Add option to return scores separately keyed by component name with `spacy evaluate --per-component`, `Language.evaluate(per_component=Tru | Low | 7/7/2023 |
| v3.5.4 | ## ✨ New features and improvements - Extend Typer support to v0.9 (#12631). ## 🔴 Bug fixes - #12701: Fix issues with component names and listeners for sourced components. - #12623: Support overrides for registered functions in configs. ## 👥 Contributors @adrianeboyd, @bdura, @honnibal, @ines, @svlandeg | Low | 6/28/2023 |
| v3.2.6 | This bug fix release is primarily to address Pydantic incompatibility with `typing_extensions>=4.6.0`. ## ✨ New features and improvements - Huge speed improvements for `spancat`, in particular on GPU (~10x-30x faster) (#12577). ## 🔴 Bug fixes - Add `typing_extensions` requirement due to Pydantic incompatibility with `typing_extensions>=4.6.0`. - Remove `#egg` from download URLs due to future deprecation in `pip`. ## 👥 Contributors @adrianeboyd, @honnibal, @ines, @kadarakos, | Low | 5/25/2023 |
| v3.3.3 | This bug fix release is primarily to address Pydantic incompatibility with `typing_extensions>=4.6.0`. ## ✨ New features and improvements - Huge speed improvements for `spancat`, in particular on GPU (~10x-30x faster) (#12577). ## 🔴 Bug fixes - Add `typing_extensions` requirement due to Pydantic incompatibility with `typing_extensions>=4.6.0`. - Remove `#egg` from download URLs due to future deprecation in `pip`. ## 👥 Contributors @adrianeboyd, @honnibal, @ines, @kadarakos, | Low | 5/25/2023 |
| v3.5.3 | ## ✨ New features and improvements - Huge speed improvements for `spancat`, in particular on GPU (~10x-30x faster) (#12577). - Improve speed for child operators (`>+`, `>-`, `>++`, `>--`) for the dependency matcher (#12528). - Improve loading speed for tokenizers with a large number of exceptions (#12553). - Support `doc.spans` for displaCy output in `spacy benchmark accuracy` / `spacy evaluate` (#12575). - Add `MorphAnalysis.get(default=)` argument for user-provided default values simila | Low | 5/15/2023 |
| v3.5.2 | ## ✨ New features and improvements - Add support for floret vectors in `spacy pretrain` (#12435). - Save final model as `model-last.bin` for `spacy pretrain` (#12459). - Support `Span` input for `displacy.parse_deps` (#12477). - Extend support to CuPy 12.0 for `cupy` install extras. ## 🔴 Bug fixes - #12398: Fix entity linker failure on sentence-crossing entities. - #12405: Fix sentence indexing bug in `Span.sents`. - #12469: Fix scores attribute for `spancat_singlelabel`. - #1248 | Low | 4/12/2023 |
| v3.5.1 | **We'd love to hear more about your experience with spaCy!** [Take our survey here.](https://form.typeform.com/to/aMel9q9f) ## ✨ New features and improvements - **NEW**: `spancat_singlelabel` pipeline component for multi-class and non-overlapping span classification. The `spancat_singlelabel` component predicts at most one label for each suggested span and adds a new setting `allow_overlap` to restrict the output to non-overlapping spans (#11365). - Extend to mypy v1.0 (#12245). - Use | Low | 3/10/2023 |
| v3.5.0 | ## ✨ New features and improvements - **NEW:** New `apply` [CLI command](https://spacy.io/api/cli#apply) to annotate new documents with a trained pipeline (#11376). - **NEW:** New `benchmark` [CLI command](https://spacy.io/api/cli#benchmark) to benchmark pipelines. The new `benchmark speed` subcommand measures the speed of a pipeline, the `benchmark accuracy` subcommand is a new alias for `evaluate` (#11902). - **NEW:** New `find-threshold` [CLI command](https://spacy.io/api/cli#find-th | Low | 1/20/2023 |
| v2.3.9 | This release addresses future compatibility with NumPy v1.24+. ## 🔴 Bug fixes - #11940: Update for compatibility with NumPy v1.24+ integer conversions. ## 👥 Contributors @adrianeboyd, @honnibal, @ines, @svlandeg | Low | 12/16/2022 |
| v3.0.9 | This bug fix release is primarily to avoid deprecation warnings and future incompatibility with NumPy v1.24+. ## 🔴 Bug fixes - #11331, #11701: Clean up warnings in spaCy and its test suite. - #11845: Don't raise an error in displaCy for unset spans keys. - #11864: Add `smart_open` requirement and update deprecated options. - #11899: Fix `spacy init config --gpu` for environments without `spacy-transformers`. - #11933: Update for compatibility with NumPy v1.24+ integer conversions. - | Low | 12/16/2022 |
| v3.1.7 | This bug fix release is primarily to avoid deprecation warnings and future incompatibility with NumPy v1.24+. ## 🔴 Bug fixes - #10573: Remove Click pin following Typer updates. - #11331, #11701: Clean up warnings in spaCy and its test suite. - #11845: Don't raise an error in displaCy for unset spans keys. - #11860: Fix `spancat` for docs with zero suggestions. - #11864: Add `smart_open` requirement and update deprecated options. - #11899: Fix `spacy init config --gpu` for environment | Low | 12/16/2022 |
| v3.2.5 | This bug fix release is primarily to avoid deprecation warnings and future incompatibility with NumPy v1.24+. ## 🔴 Bug fixes - #10573: Remove Click pin following Typer updates. - #11331, #11701: Clean up warnings in spaCy and its test suite. - #11845: Don't raise an error in displaCy for unset spans keys. - #11860: Fix `spancat` for docs with zero suggestions. - #11864: Add `smart_open` requirement and update deprecated options. - #11899: Fix `spacy init config --gpu` for environment | Low | 12/16/2022 |
| v3.3.2 | This bug fix release is primarily to avoid deprecation warnings and future incompatibility with NumPy v1.24+. ## 🔴 Bug fixes - #10911, #11194: Improve speed in `precomputable_biaffine` by avoiding concatenation. - #11276, #11331, #11701: Clean up warnings in spaCy and its test suite. - #11845: Don't raise an error in displaCy for unset spans keys. - #11860: Fix `spancat` for docs with zero suggestions. - #11864: Add `smart_open` requirement and update deprecated options. - #11899: Fi | Low | 12/16/2022 |
| v3.4.4 | This bug fix release is primarily to avoid deprecation warnings and future incompatibility with NumPy v1.24+. ## 🔴 Bug fixes - #11845: Don't raise an error in displaCy for unset spans keys. - #11860: Fix `spancat` for docs with zero suggestions. - #11864: Add `smart_open` requirement and update deprecated options. - #11899: Fix `spacy init config --gpu` for environments without `spacy-transformers`. - #11933: Update for compatibility with NumPy v1.24+ integer conversions. - #11934: A | Low | 12/14/2022 |
| v3.4.3 | ## ✨ New features and improvements - Extend Typer support to v0.7.x (#11720). ## 🔴 Bug fixes - #11640: Handle docs with no entities in `EntityLinker`. - #11688: Restore custom doc extension values in `Doc.to_json()` for attributes set by getters. - #11706: Remove incorrect warning for `pipeline_package.load()`. - #11735: Improve `spacy project` requirements checks for unsupported specifiers and requirements lines. - #11745: Revert modifications to `spacy.load(disable=)` that could | Low | 11/10/2022 |
| v3.4.2 | ## ✨ New features and improvements - **NEW:** Luganda language support (#10847). - **NEW:** Latin language support (#11349). - **NEW:** `spacy.ConsoleLogger.v2` optionally saves training logs to JSONL (#11214). - **NEW:** New [operators](https://spacy.io/api/dependencymatcher#operators) for the `DependencyMatcher` to include matching parents or children to the left or the right of the node (#10371). - Prebuilt Python 3.11 wheels are now available for all spaCy dependencies distributed by | Low | 10/20/2022 |
| v2.3.8 | ## ✨ New features and improvements * Updates and binary wheels for Python 3.10 and 3.11. ## 👥 Contributors @adrianeboyd, @honnibal, @ines | Low | 10/19/2022 |
| v3.4.1 | ## 🔴 Bug fixes - Fix issue #11137: Fix compatibility with CuPy v9.x. ## 📖 Documentation and examples - spaCy universe additions: - [BERTopic](https://spacy.io/universe/project/bertopic): Leveraging BERT and c-TF-IDF to create easily interpretable topics. - [English Interpretation Sentence Pattern](https://spacy.io/universe/project/sent-pattern): English interpretation for accurate translation from English to Japanese. ## 👥 Contributors @adrianeboyd, @danieldk, @honnib | Low | 7/26/2022 |
| v3.4.0 | ## ✨ New features and improvements - Support for mypy 0.950+ and pydantic v1.9 (#10786). - Prebuilt linux aarch64 wheels are now available for all spaCy dependencies distributed by [@explosion](https://github.com/explosion). - Min/max `{n,m}` operator for `Matcher` patterns (#10981). - Language updates: - Improve tokenization for Cyrillic combining diacritics (#10837). - Improve English tokenizer exceptions for contractions with this/that/these/those (#10873). - Improved speed o | Low | 7/12/2022 |
| v3.3.1 | ## ✨ New features and improvements * Add the [SpanRuler](https://spacy.io/api/spanruler) component. This component saves a list of matched spans to [`Doc.spans[spans_key]`](https://spacy.io/api/doc#spans). * Support for JSON [serialization](https://spacy.io/api/doc#to_json) and [deserialization](https://spacy.io/api/doc#from_json) of [`Doc`](https://spacy.io/api/doc) objects. * Add span analysis to [`debug data`](https://spacy.io/api/cli#debug-data). * Allow [data assets](https://spacy.io/ | Low | 6/7/2022 |
| v3.3.0 | ## ✨ New features and improvements - Improved speeds for many components, see [speed benchmarks for trained pipelines](https://spacy.io/usage/v3-3#speed): - Speed up parser and NER by using constant-time head lookups (#10048). - Support unnormalized softmax probabilities in `spacy.Tagger.v2` to speed up inference for the tagger, morphologizer, senter and trainable lemmatizer (#10197). - Speed up parser projectivization functions (#10241). - Replace `Ragged` with faster `Al | Low | 4/29/2022 |
| v3.1.6 | ## 🔴 Bug fixes * Fix issue #10564: Restrict supported Click versions as a workaround for incompatibilities between Click v8.1.0 and Typer v0.4.0. ## 👥 Contributors @adrianeboyd, @honnibal, @ines | Low | 3/30/2022 |
| v3.2.4 | ## 🔴 Bug fixes * Fix issue #10564: Restrict supported Click versions as a workaround for incompatibilities between Click v8.1.0 and Typer v0.4.0. ## 👥 Contributors @adrianeboyd, @honnibal, @ines | Low | 3/29/2022 |
| v3.2.3 | ## 🔴 Bug fixes * Fix issue #10324: Fix `Tok2Vec` for empty batches. ## 👥 Contributors @adrianeboyd, @honnibal, @ines | Low | 3/1/2022 |
| v3.1.5 | ## 🔴 Bug fixes * Fix issue #9593: Use metaclass to subclass errors for easier pickling. * Fix issue #9654: Fix `spancat` for empty docs and zero suggestions. * Fix issue #9979: Fix type of `Lexeme.rank`. * Fix issue #10324: Fix `Tok2Vec` for empty batches. ## 👥 Contributors @adrianeboyd, @BramVanroy, @brucewlee, @danieldk, @honnibal, @ines, @ljvmiranda921, @polm, @svlandeg, @vgautam, @xxyzz | Low | 3/1/2022 |
| v3.0.8 | ## 🔴 Bug fixes * Fix issue #10324: Fix `Tok2Vec` for empty batches. ## 👥 Contributors @adrianeboyd, @danieldk, @honnibal, @ines | Low | 3/1/2022 |
| v3.2.2 | ## ✨ New features and improvements - Improved `parser` and `ner` speeds on long documents (see technical details in #10019). - Support for `spancat` components in `debug data`. - Support for `ENT_IOB` as a `Matcher` token pattern key. - Extended and improved types for many classes. ## 🔴 Bug fixes - Fix issue #9735: Make floret murmurhash endian-neutral. - Fix issue #9738: Support string IOB values for `ENT_IOB`. - Fix issue #9746: Updates to avoid "dictionary size changed dur | Low | 2/11/2022 |
| v3.2.1 | ## ✨ New features and improvements - **NEW**: `doc_cleaner` component for removing `doc.tensor`, `doc._.trf_data` or other `Doc` attributes at the end of the pipeline to reduce size of output docs. - **NEW**: `ENT_ID` and `ENT_KB_ID` to `Matcher` pattern attributes. - Support `kb_id` for entities in displaCy from `Doc` input. - Add `Span.sents` property for spans spanning over more than one sentence. - Add `EntityRuler.remove` to remove patterns by `id`. - Make the `Tagger` `neg_prefix` | Low | 12/7/2021 |
| v3.2.0 | ## ✨ New features and improvements - **NEW:** Registered scoring functions for each component in the config. - **NEW:** `nlp()` and `nlp.pipe()` accept `Doc` input, which simplifies setting custom tokenization or extensions before processing. - **NEW:** Support for [floret](https://github.com/explosion/floret) vectors, which combine fastText subwords with Bloom embeddings for compact, full-coverage vectors. - `overwrite` config settings for `entity_linker`, `morphologizer`, `tagger`, `sen | Low | 11/5/2021 |
| v3.1.4 | ## ✨ New features and improvements - **NEW:** Binary wheels for Python 3.10. - **NEW:** Improve performance on Apple M1 with [`AppleOps`](https://github.com/explosion/thinc-apple-ops): `pip install spacy[apple]`. - GPU profiling with [`spacy.models_with_nvtx_range.v1`](https://spacy.io/api/top-level#models_with_nvtx_range). - Full `mypy` integration in the CI and many type fixes across the code base. - Added custom `Protocol` classes in `ty.py` to define behavior of pipeline components. | Low | 10/29/2021 |
| v3.1.3 | ## ✨ New features and improvements * The `v3` of [`WandbLogger`](https://spacy.io/api/top-level#WandbLogger) now supports optional `run_name` and `entity` parameters. * Improved UX when providing invalid `pos` values for a `Doc` or `Token`. ## 🔴 Bug fixes * Fix issue #9001: Pass alignments to `Matcher` callbacks. * Fix issue #9009: Include component factories in third-party dependencies resolver. * Fix issue #9012: Correct type of `config` in `create_pipe`. * Fix issue #9014: Allow [`t | Low | 9/20/2021 |
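
The release-v3.8.14 entry in the table above concerns environments where no `pip` executable is on PATH even though `pip` is importable as a Python module. One common way to handle that situation is to run pip through the current interpreter with `-m pip`. The sketch below illustrates that general technique using only the standard library; it is not spaCy's actual implementation, and the helper names `pip_available_as_module` and `build_pip_command` are hypothetical.

```python
import importlib.util
import shutil
import sys


def pip_available_as_module() -> bool:
    """Return True if the current interpreter can import pip."""
    return importlib.util.find_spec("pip") is not None


def build_pip_command(args):
    """Build a pip command line bound to the current interpreter.

    `python -m pip` does not depend on a `pip` script being on PATH,
    and it targets the environment of the running interpreter.
    (Hypothetical helper for illustration, not a spaCy function.)
    """
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    # Compare the two ways pip can be reachable in this environment.
    print("pip script on PATH:", shutil.which("pip") is not None)
    print("pip importable as module:", pip_available_as_module())
    print("command:", build_pip_command(["--version"]))
```

The resulting list can be passed to `subprocess.run` as-is. Binding the command to `sys.executable` also avoids installing into a different environment than the one that is currently running.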

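The release-v3.8.7 entry notes that Cython 3 follows the `from __future__ import annotations` semantics, so type annotations are stored as strings at runtime. The plain-Python sketch below shows what that means for code that inspects function signatures. The function `make_component` is a made-up stand-in for a factory function, and the source is run through `exec` only so that the `__future__` import stays the first statement of its own module body wherever the snippet is pasted.

```python
import typing

# Hypothetical factory function, defined under postponed-annotation semantics.
SOURCE = '''
from __future__ import annotations

def make_component(name: str, threshold: float = 0.5) -> dict:
    return {"name": name, "threshold": threshold}
'''

namespace = {}
exec(SOURCE, namespace)
make_component = namespace["make_component"]

# Under these semantics the raw annotations are plain strings ...
raw_annotations = dict(make_component.__annotations__)

# ... and have to be resolved explicitly to get the actual types back.
resolved = typing.get_type_hints(make_component)

if __name__ == "__main__":
    print("raw:", raw_annotations)
    print("resolved:", resolved)
```

Any validation layer that builds models from signatures needs a resolution step like `typing.get_type_hints`, or it sees strings instead of types.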