freshcrate

xgrammar

Efficient, Flexible and Portable Structured Generation

Description

<div align="center" id="top">
<img src="https://raw.githubusercontent.com/mlc-ai/xgrammar/main/assets/logo.svg" alt="logo" width="400" margin="10px"></img>

[![Documentation](https://img.shields.io/badge/docs-latest-green)](https://xgrammar.mlc.ai/docs/)
[![License](https://img.shields.io/badge/license-apache_2-blue)](https://github.com/mlc-ai/xgrammar/blob/main/LICENSE)
[![PyPI](https://img.shields.io/pypi/v/xgrammar)](https://pypi.org/project/xgrammar)
[![PyPI Downloads](https://static.pepy.tech/badge/xgrammar)](https://pepy.tech/projects/xgrammar)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/mlc-ai/xgrammar)

**Efficient, Flexible and Portable Structured Generation**

[Get Started](#get-started) | [Documentation](https://xgrammar.mlc.ai/docs/) | [Blogpost](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar) | [Technical Report](https://arxiv.org/abs/2411.15100)
</div>

## News

- [2025/12] XGrammar has been officially integrated into [Mirai](https://github.com/trymirai/uzu).
- [2025/09] XGrammar has been officially integrated into [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai).
- [2025/02] XGrammar has been officially integrated into [Modular's MAX](https://docs.modular.com/max/serve/structured-output).
- [2025/01] XGrammar has been officially integrated into [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).
- [2024/12] XGrammar has been officially integrated into [vLLM](https://github.com/vllm-project/vllm).
- [2024/12] We presented research talks on XGrammar at CMU, UC Berkeley, MIT, THU, SJTU, Ant Group, LMSys, Qingke AI, and Camel AI. The slides can be found [here](https://docs.google.com/presentation/d/1iS7tu2EV4IKRWDaR0F3YD7ubrNqtGYUStSskceneelc/edit?usp=sharing).
- [2024/11] XGrammar has been officially integrated into [SGLang](https://github.com/sgl-project/sglang).
- [2024/11] XGrammar has been officially integrated into [MLC-LLM](https://github.com/mlc-ai/mlc-llm).
- [2024/11] We officially released XGrammar v0.1.0!

## Overview

XGrammar is an open-source library for efficient, flexible, and portable structured generation. It leverages constrained decoding to ensure **100% structural correctness** of the output. It supports general context-free grammars to enable a broad range of structures, including **JSON**, **regex**, and **custom context-free grammars**.

XGrammar uses careful optimizations to achieve extremely low overhead in structured generation. It achieves **near-zero overhead** in JSON generation, making it one of the fastest structured generation engines available.

XGrammar features **universal deployment**. It supports:

* **Platforms**: Linux, macOS, Windows
* **Hardware**: CPU, NVIDIA GPU, AMD GPU, Apple Silicon, TPU, etc.
* **Languages**: Python, C++, and JavaScript APIs
* **Models**: Qwen, Llama, DeepSeek, Phi, Gemma, etc.

XGrammar is easy to integrate with LLM inference engines. It is the default structured generation backend for most LLM inference engines, including [**vLLM**](https://github.com/vllm-project/vllm), [**SGLang**](https://github.com/sgl-project/sglang), [**TensorRT-LLM**](https://github.com/NVIDIA/TensorRT-LLM), and [**MLC-LLM**](https://github.com/mlc-ai/mlc-llm), and is used by many other companies. You can also try out their structured generation modes!

## Get Started

Install XGrammar:

```bash
pip install xgrammar
```

For use with MPS on Apple Silicon, install with:

```bash
pip install "xgrammar[metal]"
```

Import XGrammar:

```python
import xgrammar as xgr
```

Please visit our [documentation](https://xgrammar.mlc.ai/docs/) to get started with XGrammar.
- [Installation](https://xgrammar.mlc.ai/docs/start/installation)
- [Quick start](https://xgrammar.mlc.ai/docs/start/quick_start)

## Third-Party Bindings

- **Rust**: [xgrammar-rs](https://github.com/trymirai/xgrammar-rs) — Community Rust bindings for XGrammar.

## Collaborators

XGrammar has been widely adopted in industry, open-source projects, and academia. Our collaborators include NVIDIA, Databricks, Meta, Google, xAI, and DeepSeek.
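The constrained-decoding idea behind XGrammar can be sketched in a few lines: before each sampling step, every token that cannot extend a grammar-valid prefix has its logit masked to negative infinity, so the decoded output is structurally correct by construction. The snippet below is an illustrative toy in plain Python, not XGrammar's actual API; the bracket "grammar", the four-token vocabulary, and all helper names are invented for the example. (XGrammar itself does this with compiled grammars and bit-packed token bitmasks to keep the per-step overhead near zero.)

```python
import math

# Hypothetical tiny vocabulary; the real vocabulary would be the model tokenizer's.
VOCAB = ["[", "]", "x", "<eos>"]

def allowed_tokens(prefix: str) -> set[str]:
    """Toy 'grammar': only balanced bracket strings are valid.

    Returns the tokens that keep the prefix extendable to a valid string.
    """
    depth = prefix.count("[") - prefix.count("]")
    allowed = {"["}                      # opening a bracket is always legal
    if depth > 0:
        allowed |= {"]", "x"}            # inside brackets: close or emit content
    if depth == 0 and prefix:
        allowed.add("<eos>")             # balanced and non-empty: may stop
    return allowed

def apply_mask(logits: list[float], prefix: str) -> list[float]:
    """Set logits of grammar-invalid tokens to -inf before sampling."""
    ok = allowed_tokens(prefix)
    return [l if t in ok else -math.inf for t, l in zip(VOCAB, logits)]

def greedy_decode(fake_logits: list[list[float]]) -> str:
    """Greedy decoding over pre-supplied 'model' logits, one list per step."""
    out = ""
    for step_logits in fake_logits:
        masked = apply_mask(step_logits, out)
        tok = VOCAB[max(range(len(VOCAB)), key=lambda i: masked[i])]
        if tok == "<eos>":
            break
        out += tok
    return out

# Even though the model "prefers" an invalid "]" at step one (logit 5.0),
# the mask forces a valid opening bracket, and decoding yields "[]".
print(greedy_decode([[1.0, 5.0, 0.0, 0.0],
                     [0.0, 3.0, 1.0, 9.0],
                     [0.0, 0.0, 0.0, 5.0]]))  # → []
```

This is the same mask-then-sample loop that inference engines run with XGrammar's bitmasks; the engineering work the library does is making `allowed_tokens` fast for arbitrary context-free grammars over real vocabularies.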

Release History

Each entry lists the version, its changes, update urgency, and release date.
**0.1.33** (4/21/2026, urgency: Low)

Imported from PyPI (0.1.33).
**v0.1.33** (3/27/2026, urgency: Medium)

**What's Changed**

* refactor: simplify TagDispatch by removing stop_eos and stop_str by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/554
* feat: provide `PlusFormat`, `OptionalFormat`, `StarFormat` to enhance `StructuralTag` by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/557
* feat: support `structural_tag`-level cache by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/553
* feat: token-level grammar support with Token/ExcludeToken/TokenTagDispatch edges by …
**v0.1.32** (3/4/2026, urgency: Low)

**What's Changed**

* Third-party rust support integration by @eugenebokhan in https://github.com/mlc-ai/xgrammar/pull/531
* feat: support crossing-grammar cache by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/526
* refactor: refactor the structure of structural_tag for better extensibility by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/528
* fix: fix the doc building by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/533
* refactor: refactor the struc…
**v0.1.31** (1/19/2026, urgency: Low)

v0.1.30 will soon be yanked due to issues in `apply_token_bitmask_inplace` and Windows compatibility problems with crossing-grammar caching. Please use v0.1.31 instead.

**What's Changed**

* fix the apply_token_bit_mask with auto by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/521
* Revert "feat: support crossing-grammar cache. (#508)" by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/522
* Bump to v0.1.31 by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/523
**v0.1.30** (1/17/2026, urgency: Low)

**What's Changed**

* [Feature] Allow empty separator string for `tags_with_separator` by @ricohasgit in https://github.com/mlc-ai/xgrammar/pull/503
* feat: provide excluded strs for any_text and tagdispatch by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/502
* chore: update cibuildwheel to build more wheels by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/509
* [BugFix] Fix error `TypeError: apply_token_bitmask_inplace_cpu(): incompatible function arguments` by @wju…
**v0.1.29** (12/19/2025, urgency: Low)

**What's Changed**

* Add TraverseDraftTree for speculative decoding bitmask generation by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/490
* [Web] Support structural tag in web-xgrammar package, upgrade version to 0.1.27 by @akaashrp in https://github.com/mlc-ai/xgrammar/pull/482
* Fix warp_size in triton kernel for AMD GPUs by @divakar-amd in https://github.com/mlc-ai/xgrammar/pull/476
* [Fix] Fix the tokenizer VocabType detection for Kimi-K2-Instruct. (#483) by @eraser00 in https://gi…
**v0.1.28** (12/9/2025, urgency: Low)

**What's Changed**

* Fix web emscripten bindings, bump to 0.1.26 by @loganzartman in https://github.com/mlc-ai/xgrammar/pull/456
* fix(build): Disable LTO on riscv64 to fix linker error by @ihb2032 in https://github.com/mlc-ai/xgrammar/pull/458
* remove ninja from dependencies by @dotlambda in https://github.com/mlc-ai/xgrammar/pull/459
* [CI] Add a CI to run unit tests every day by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/463
* [Fix] Fix the efficiency issue in repetition …
**v0.1.27** (11/4/2025, urgency: Low)

**What's Changed**

* [Fix] Fix JsonSchemaConverter for numbers "-0. ..." by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/462
* [test] Add a build test for Web-XGrammar by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/457
* [CI] Fix web test by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/464
* Bump to v0.1.27 by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/466

**Full Changelog**: https://github.com/mlc-ai/xgrammar/compare/v0.1.26...v0.1.27
**v0.1.26** (10/20/2025, urgency: Low)

v0.1.26 brings a series of batched methods for token mask generation. This version also fixes several issues and improves efficiency.

**What's Changed**

* Update the README section by @Ubospica
* [Refac] Refactor the pipeline by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/428
* [Feature] Support openGPT-x tokenizer by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/446
* [Refac] Refactor repetition expressions to reduce the uncertainty by @Seven-Streams…
**v0.1.25** (9/21/2025, urgency: Low)

v0.1.25 brings the structural tag. It also fixes a couple of problems regarding JSON schema and py.typed, and speeds up grammar compilation and mask generation.

**What's Changed**

* Json schema generation, limit the number of whitespaces by @ExtReMLapin in https://github.com/mlc-ai/xgrammar/pull/414
* [Fix] Fix the failure to build the website by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/415
* Update README.md to announce that OpenVINO GenAI uses XGrammar by @apa…
**v0.1.24** (9/4/2025, urgency: Low)

This version brings many bug fixes. It also speeds up the repeat grammar, especially when the repeat count is large.

**What's Changed**

* [Fix] Fix the print function of AdaptiveTokenMask by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/404
* [Fix] Fix the broken link of the website by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/408
* [Optim] Optimize repetition expressions when the max repetition time is unbounded by @Seven-Streams in https://github.c…
**v0.1.23** (8/15/2025, urgency: Low)

**Highlights**

* Significant speedup for grammars with repeat, e.g. `a{1, 100}`; its preprocessing is now O(1) instead of O(n)
* Release the new serialization library
* Fix bugs about max_rollback_tokens
* Fix bugs in CUDA kernels
* Refactor: migrate the grammar backend to FSMs

**What's Changed**

* [Refactor] JSON serializer and MemorySize by @DarkSharpness in https://github.com/mlc-ai/xgrammar/pull/380
* [Feature] Add a new expression to represent repetition to speed up by @Sev…
**v0.1.22** (7/27/2025, urgency: Low)

**Highlights**

* Enhanced Earley parser with FSM support, and improved the TagDispatch intrinsic
* Support non-contiguous logits and bitmasks
* JSON Schema enhancements: support `minProperties`, `maxProperties`, `patternProperties`, `propertyNames`
* Earley parser bug fixes: resolved boundary-check issues and behavioral inconsistencies that could affect parsing accuracy
* Improved debugging experience

**What's Changed**

* …
**v0.1.21** (7/10/2025, urgency: Low)

v0.1.21 brings significant performance improvements and resolves previous issues with infinite loops and system freezes. It introduces the Earley parser. Preprocessing is now up to six times faster than before (see #308), and it no longer suffers from the previous problems of exponential growth in the number of states or infinite (or very deep) recursion. It also fixes the cache-corruption problem that occurred when the input to GrammarCompiler was invalid.

**What's Ch**…
**v0.1.20** (6/30/2025, urgency: Low)

**Summary**

* Fix Windows build flags for Torch extensions
* Refactor the Parser and FSM libraries for improved maintainability
* Expose accept_string and print_internal_states with configurable max recursion depth
* Correct acceptance of certain invalid characters in JSON strings
* Make tiktoken and sentencepiece optional dependencies
* Support int64 values in JSON Schema conversion
* Add a C++ reflection-based JSON serialization method

**What's Changed**

* Fix torch extension build cf…
**v0.1.19** (5/8/2025, urgency: Low)

**What's Changed**

* Support minItems, maxItems for array by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/296
* [fix] fix the handling for '\?' by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/305
* [CI] Add a check in benchmark.yaml by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/301
* [CI Fix] Fix the check for benchmark by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/309
* [Feature] Rewrite the FSM.h to support some regex grammar by @Seven-…
**v0.1.18** (4/8/2025, urgency: Low)

**Highlights**

* Provides an MLX kernel to support Mac devices
* Fixes bugs in the JSON schema converter
* Fixes the compilation failure of Torch kernels
* Supports float range and format in JSON schema
* Adds a function testing._is_single_token_bitmask for spec decoding
* Releases the GIL for most methods

**What's Changed**

* Bump to v0.1.17 by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/271
* [Feature] LRU cache for grammar compiler by @DarkSharpness in https://github.com/mL…
**v0.1.17** (3/25/2025, urgency: Low)

**What's Changed**

* Update README.md by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/246
* [Feature] Add CI to close inactive issues and check formats by @Seven-Streams in https://github.com/mlc-ai/xgrammar/pull/241
* Enable editable installation by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/247
* Cleanup useless build scripts by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/248
* fix(python): use correct name for pybind11 FindPython mode by @henryiii in https://gi…
**v0.1.16** (3/15/2025, urgency: Low)

**What's Changed**

* [Fix] Postpone cuda import to the calling site by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/231
* [Feature] Support model vocab size being less than tokenizer by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/237
* [Style] Remove unused headers by @DarkSharpness in https://github.com/mlc-ai/xgrammar/pull/219
* Fallback to triton if we fail to compile for CUDA by @zbowling in https://github.com/mlc-ai/xgrammar/pull/223
* [Feature] Build and run C++ Python…
**v0.1.15** (3/5/2025, urgency: Low)

**What's Changed**

* Update README.md by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/225
* [Feature] Expose Grammar.union by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/227
* [Feature] Support Deepseek R series models by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/226
* Bump to v0.1.15 by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/228

**Full Changelog**: https://github.com/mlc-ai/xgrammar/compare/v0.1.14...v0.1.15
**v0.1.14** (2/26/2025, urgency: Low)

This version supports XGrammar on Linux Arm64, removes the glibc 2.28 restriction, and adds a source distribution. It fixes the self-recursion error and extends apply_token_mask_inplace to support some corner cases. It changes the StructuralTag API, renaming the field "start" to "begin".

**What's Changed**

* update GenerateRangeRegex by @zanderjiang in https://github.com/mlc-ai/xgrammar/pull/182
* [Fix] Fix compatibility for apply_token_bitmask_inplace by @Ubospica in h…
**v0.1.13** (2/13/2025, urgency: Low)

This version enhances XGrammar's compatibility across platforms and provides full support for regex; most features are now supported. It also makes the token bitmask application kernel more efficient.

**What's Changed**

* [Fix] Fix popcount for windows by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/167
* [Fix] Rollback safely when token acceptance fails by @benchislett in https://github.com/mlc-ai/xgrammar/pull/164
* fix(fsm): fix error of dangli…
**v0.1.11** (2/13/2025, urgency: Low)

This release adds the structural tag, a new feature that supports strict function calling (and many more flexible patterns).

**What's Changed**

* [Feature] Structural tag by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/162
* [Fix] Fix #162 by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/163
* [Fix] Fix broken rollback char count after accepting stop token by @benchislett in https://github.com/mlc-ai/xgrammar/pull/161
* [Feature] Optional Token…
**v0.1.10** (2/13/2025, urgency: Low)

This version enhances JSON schema and EBNF support and provides APIs for grammar concatenation and union.

**What's Changed**

* [Feature] Support regex and repetition range by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/144
* [Refactor] Rename internal classes for better structure by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/145
* [FunctionCalling] Support TagDispatch by @Ubospica in https://github.com/mlc-ai/xgrammar/pull/146
* [Fix] Fix doc dependen…
**v0.1.8** (12/25/2024, urgency: Low)

**Features**

* Enhance JSON Schema converter by @Ubospica in #134
* Enhance EBNF Parser by @observerw in #125
* Support sentencepiece tokenizer by @zanderjiang in #120
* Enhance ApplyMask kernels to provide better support in mixed structured-and-unstructured cases by @Ubospica in #128
**v0.1.6** (12/7/2024, urgency: Low)

**Features**

* The JSON Schema converter now supports integer ranges and regex patterns. Thanks @joennlae!

**Bug Fixes**

* Fixes strict JSON formatting degrading LLM output quality.
**v0.1.4** (11/25/2024, urgency: Low)

This is the stable release version of XGrammar. It provides an efficient and portable API for LLM structured generation.


Similar Packages

- **apache-tvm-ffi**: tvm ffi (0.1.10)
- **outlines-core**: Structured Text Generation in Rust (0.2.14)
- **magika**: A tool to determine the content type of a file with deep learning (1.0.2)
- **nvidia-cuda-cupti-cu12**: CUDA profiling tools runtime libs. (12.9.79)
- **gymnasium**: A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym). (1.2.3)