pymupdf-layout

PyMuPDF Layout turns PDFs into structured data 10× faster than vision-based tools using AI trained on PDF internals, not images. CPU-only. No GPU required.

Description

# PyMuPDF Layout

**PyMuPDF Layout** is a fast, lightweight layout analysis Python package integrated with PyMuPDF for clean, structured data output from PDFs. It is fast and accurate, and unlike vision-based models it doesn't need GPUs.

While other tools train machine learning models on rendered page images, PyMuPDF Layout trains Graph Neural Networks directly on PDF internals. This gives us accuracy at 10× the speed using only CPU resources.

[![License PolyForm Noncommercial](https://img.shields.io/badge/license-Polyform_Noncommercial-purple)](https://polyformproject.org/licenses/noncommercial/1.0.0/)
[![Python version](https://img.shields.io/badge/python-3.11+-blue)](https://pypi.org/project/pymupdf-layout/)
[![Docs](https://img.shields.io/badge/docs-RTD-green)](https://pymupdf.readthedocs.io/en/latest/pymupdf-layout/index.html)
[![Discord](https://img.shields.io/discord/770681584617652264?color=6A7EC2&logo=discord&logoColor=ffffff)](https://discord.gg/ppTFv8uJ46)

## Features

- 📚 Structured data extraction from your documents in Markdown, JSON or TXT format
- 🧐 Advanced document page layout understanding, including semantic markup for titles, headings, headers, footers, tables, images and text styling
- 🔍 Detect and isolate header and footer patterns on each page

## Usage

**PyMuPDF Layout** works alongside PyMuPDF4LLM's `to_markdown` method. Once PyMuPDF Layout is activated, just call `to_markdown` and PyMuPDF Layout will work behind the scenes to analyse documents and deliver improved results.

You can also get the data in `JSON` or `TXT` format with `to_json` or `to_text`.

### Extract structured data

```python
import pymupdf.layout  # importing this module activates PyMuPDF Layout
import pymupdf4llm

source = "your.pdf"

doc = pymupdf.open(source)

# The same document can be exported in each supported format.
md = pymupdf4llm.to_markdown(doc)
json = pymupdf4llm.to_json(doc)
txt = pymupdf4llm.to_text(doc)
```

## Try It!

Try **PyMuPDF Layout** on [our PyMuPDF website](https://pymupdf.io).

## Documentation

See the [PyMuPDF Layout documentation page](https://pymupdf.readthedocs.io/en/latest/pymupdf-layout/index.html) for more.
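
The extraction calls from the Usage section can be wrapped so that all three formats are written to files next to the source PDF. The following is a minimal sketch, not part of the package: the helper names `output_paths` and `convert` and the file-naming scheme are hypothetical, and it assumes that `to_markdown`, `to_json` and `to_text` each return a string.

```python
from pathlib import Path


def output_paths(source: str) -> dict[str, Path]:
    """Map each output format to a file next to the source PDF.

    Hypothetical naming scheme: same stem, different suffix.
    """
    base = Path(source)
    return {
        "markdown": base.with_suffix(".md"),
        "json": base.with_suffix(".json"),
        "text": base.with_suffix(".txt"),
    }


def convert(source: str) -> dict[str, Path]:
    """Write Markdown, JSON and TXT versions of `source`; return their paths."""
    # Imported here so that output_paths() works without the packages installed.
    import pymupdf.layout  # importing this module activates PyMuPDF Layout
    import pymupdf4llm

    doc = pymupdf.open(source)
    paths = output_paths(source)

    # Assumption: each of these calls returns a string.
    paths["markdown"].write_text(pymupdf4llm.to_markdown(doc), encoding="utf-8")
    paths["json"].write_text(pymupdf4llm.to_json(doc), encoding="utf-8")
    paths["text"].write_text(pymupdf4llm.to_text(doc), encoding="utf-8")
    return paths
```

Calling `convert("your.pdf")` would then leave `your.md`, `your.json` and `your.txt` beside the input file. Keeping the path logic in its own function makes the naming scheme easy to change without touching the extraction calls.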

Release History

| Version | Changes | Urgency | Date |
|---|---|---|---|
| 1.27.2.2 | Imported from PyPI (1.27.2.2) | Low | 4/21/2026 |
