# transformers

> Transformers: the model-definition framework for state-of-the-art text, vision, audio, and multimodal machine learning models, for both inference and training.

- **URL**: https://www.freshcrate.ai/projects/transformers
- **Author**: The Hugging Face team with the help of all our contributors (https://github.com/hu
- **Category**: Frameworks
- **Latest version**: `v5.10.1` (2026-06-03)
- **License**: Apache 2.0 License
- **Source**: https://github.com/huggingface/transformers
- **Language**: Python
- **GitHub**: 159,705 stars, 32,961 forks
- **Registry**: pypi (`transformers`)
- **Tags**: `deep-learning`, `llm`, `machine-learning`, `nlp`, `pypi`, `python`, `pytorch`, `transformer`, `vlm`

## Description

<!---
Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-dark.svg">
    <source media="(prefers-color-scheme: light)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-light.svg">
    <img alt="Hugging Face Transformers Library" src="https://huggingface.co/datasets/huggingface/documentation-images/raw/main/transformers-logo-light.svg" width="352" height="59" style="max-width: 100%;">
  </picture>
  <br/>
  <br/>
</p>

<p align="center">
    <a href="https://huggingface.co/models"><img alt="Checkpoints on Hub" src="https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen"></a>
    <a href="https://circleci.com/gh/huggingface/transformers"><img alt="Build" src="https://img.shields.io/circleci/build/github/huggingface/transformers/main"></a>
    <a href="https://github.com/huggingface/transformers/blob/main/LICENSE"><img alt="GitHub" src="https://img.shields.io/github/license/huggingface/transformers.svg?color=blue"></a>
    <a href="https://huggingface.co/docs/transformers/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers/index.svg?down_color=red&down_message=offline&up_message=online"></a>
    <a href="https://github.com/huggingface/transformers/releases"><img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/transformers.svg"></a>
    <a href="https://github.com/huggingface/transformers/blob/main/CODE_OF_CONDUCT.md"><img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg"></a>
    <a href="https://zenodo.org/badge/latestdoi/155220641"><img src="https://zenodo.org/badge/155220641.svg" alt="DOI"></a>
</p>

<h4 align="center">
    <p>
        <b>English</b> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hans.md">简体中文</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hant.md">繁體中文</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ko.md">한국어</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_es.md">Español</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ja.md">日本語</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_hd.md">हिन्दी</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ru.md">Русский</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_pt-br.md">Português</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_te.md">తెలుగు</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_fr.md">Français</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_de.md">Deutsch</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_it.md">Italiano</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_vi.md">Tiếng Việt</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ar.md">العربية</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ur.md">اردو</a> |
        <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_bn.md">বাংলা</a>
    </p>
</h4>

<h3 align="center">
    <p>State-of-the-art pretrained models for inference and training</p>
</h3>

<h3 align="center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/transformers_as_a_model_definition.png"/>
</h3>

Transformers acts as the model-definition framework for state-of-the-art machine learning with text, computer
vision, audio, video, and multimodal models, for both inference and training.
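
As a concrete illustration of the inference side, here is a minimal sketch built on the library's `pipeline` helper. The helper name `generate_text`, the checkpoint `Qwen/Qwen2.5-1.5B`, and the default token budget are assumptions picked for illustration; any text-generation checkpoint from the Hub should work in the same way.

```python
def generate_text(prompt: str,
                  model_id: str = "Qwen/Qwen2.5-1.5B",  # assumed example checkpoint
                  max_new_tokens: int = 40) -> str:
    """Generate a continuation of `prompt` with a text-generation pipeline."""
    # Imported lazily so that defining this helper does not require the
    # library (or a model download) until the function is actually called.
    from transformers import pipeline

    generator = pipeline(task="text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # For a single string prompt the pipeline returns a list of dicts,
    # each holding the prompt plus its continuation under "generated_text".
    return outputs[0]["generated_text"]
```

Calling `generate_text("The secret to baking a good cake is")` downloads the checkpoint on first use, so it needs network access and enough memory for the chosen model.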

It centralizes the model definition so that this definition is agreed upon across the ecosystem. `transformers` is the
pivot across frameworks: if a model definition is supported, it will be compatible with the majority of training
frameworks (Axolotl, Unsloth, DeepSpeed, FSDP, PyTorch-Lightning, ...), inference engines (vLLM

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `v5.10.1` | 2026-06-03 | High | **Release v5.10.1.** v5.10.0 was yanked because we published from a corrupted branch. Sorry everyone, this happens when we rush a release! New model additions: **Gemma4 unified + Gemma4 MTP.** Gemma 4 12B Unified is an **encoder-free** multimodal model with pretrained and instruction-tuned variants. Unlike [standard Gemma 4](./gemma4), which uses dedicated encoder |
| `v5.9.0` | 2026-05-20 | High | **Release v5.9.0.** New model additions: **Cohere2Moe.** Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model incorporates both shared and routed experts and supports a very large context window for processing extensive text sequences. **Links:** [Documentation](https://huggingface.co/docs/transformers/main/en/model_doc/cohere2_moe) - Add new cohere2_moe model (#4611 |
| `v5.8.1` | 2026-05-13 | High | **Patch release v5.8.1.** This release is mainly to fix the Deepseek V4 integration! - [fix] Add `fatal_error` to `ContinuousBatchingManager` so the serving... by @qgallouedec, @remi-or - Fix `WeightConverter` regex incorrectly matching `shared_experts` as `experts` by @silencelamb, @claude - Fix deepseek v4 by @ArthurZucker (#45892) - Deepseek v4 csa mask collaps |
| `v5.8.0` | 2026-05-05 | High | **Release v5.8.0.** New model additions: **DeepSeek-V4.** DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The architecture replaces Multi-head Latent Attention (MLA) with a hybrid local + long-range attention design, swaps residual connections fo |
| `v5.7.0` | 2026-04-28 | High | **Release v5.7.0.** New model additions: **Laguna.** Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing different decoder layers to have different query-head counts while sharing the same KV cache shape, and implements a sigmoid Mo |
| `v5.6.2` | 2026-04-23 | High | **Patch release v5.6.2.** Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8; they should work again with this release. - Fix configuration reading and error handling for kernels (https://github.com/huggingface/transformers/pull/45610) by @hmellor **Full Changelog**: https://github.com/huggingface/transformers/compare/v5.6.1...v5.6.2 |
| `v5.6.0` | 2026-04-22 | High | **Release v5.6.0.** New model additions: **OpenAI Privacy Filter.** OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a model that they can run on-premises that is fast, context-aware, and tunable. The model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi pr |
| `5.5.4` | 2026-04-21 | Low | Imported from PyPI (5.5.4) |
| `v5.5.4` | 2026-04-13 | Medium | **Patch release v5.5.4.** Mostly fixes that are good to have as soon as possible, mainly for tokenizers: - Fix Kimi-K2.5 tokenizer regression and `_patch_mistral_regex` Attribute… (#45305) by ArthurZucker. For training: - Fix #45305 + add regression test GAS (#45349) by florian6973, SunMarc - Fix IndexError with DeepSpeed ZeRO-3 when kernels rotary is active (#…) by ArthurZucker. And for Qwen2.5-VL: - Fix Qwen2.5-VL temporal RoPE scaling applied to still images (#45330) by Kash6, zucchi |
| `v5.5.3` | 2026-04-09 | Medium | Small patch release to fix `device_map` support for Gemma4! It contains the following commit:  - [gemma4] Fix device map auto (#45347) by @Cyrilvallez |
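
The tags in this table mix `v`-prefixed and bare forms (`v5.5.4` and `5.5.4`), and they do not sort correctly as plain strings (`v5.9.0` would sort after `v5.10.1`). The sketch below shows one way to compare them numerically and to check a local install against the newest listed tag. The helper names and the hard-coded `LISTED_TAGS` list are illustrative assumptions, not part of freshcrate.ai or `transformers`; only the standard-library `importlib.metadata` is used.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Tags copied by hand from the table above (illustrative, not fetched live).
LISTED_TAGS = ["v5.10.1", "v5.9.0", "v5.8.1", "v5.8.0", "v5.7.0",
               "v5.6.2", "v5.6.0", "5.5.4", "v5.5.4", "v5.5.3"]


def parse_tag(tag: str) -> tuple[int, ...]:
    """Turn 'v5.10.1' or '5.5.4' into a tuple of integers for comparison."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def newest(tags: list[str]) -> str:
    """Return the tag with the highest numeric version."""
    return max(tags, key=parse_tag)


def installed_is_behind(package: str, tags: list[str]) -> bool | None:
    """Report whether the installed package is older than the newest tag.

    Returns None when the package is not installed, or when its version
    string is not a plain dotted-integer release (for example a dev build).
    """
    try:
        current = parse_tag(version(package))
    except (PackageNotFoundError, ValueError):
        return None
    return current < parse_tag(newest(tags))
```

For example, `installed_is_behind("transformers", LISTED_TAGS)` returns `True` for an environment pinned to an older release and `None` when the package is absent. A production check would more likely rely on a full version parser, since this sketch deliberately ignores pre-release and dev suffixes.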

## Citation

- HTML: https://www.freshcrate.ai/projects/transformers
- Markdown: https://www.freshcrate.ai/projects/transformers.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/transformers/deps

_Generated by freshcrate.ai. Indexes pypi releases for AI-agent ecosystem packages._
