MinerU

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

README

MinerU: high-accuracy document parsing engine for LLM · RAG · Agent workflows
Converts PDF · Word · PPT · Images · Web pages into structured Markdown / JSON · VLM + OCR dual engine · 109 languages
MCP Server · LangChain / Dify / FastGPT native integration · 10+ domestic AI chip support

🔍 Core Parsing Capabilities

  • Formulas → LaTeX · Tables → HTML, accurate layout reconstruction
  • Supports scanned docs, handwriting, multi-column layouts, cross-page table merging
  • Output follows human reading order with automatic header/footer removal
  • VLM + OCR dual engine, 109-language OCR recognition

🔌 Integration

Use Case | Solution
AI Coding Tools | MCP Server: Cursor · Claude Desktop · Windsurf
RAG Frameworks | LangChain · LlamaIndex · RAGFlow · RAG-Anything · Flowise · Dify · FastGPT
Development | Python / Go / TypeScript SDK · CLI · REST API · Docker
No-Code | mineru.net online · Gradio WebUI · Desktop client

🖥️ Deployment (Private · Fully Offline)

Inference Backend | Best For
pipeline | Fast & stable, no hallucination, runs on CPU or GPU
vlm-engine | High accuracy, supports vLLM / LMDeploy / mlx ecosystem
hybrid-engine | High accuracy, native text extraction, low hallucination

Domestic AI chips: Ascend · Cambricon · Enflame · MetaX · Moore Threads · Kunlunxin · Iluvatar · Hygon · Biren · T-Head

Changelog

  • 2026/03/29 3.0.0 Released

    This release delivers a systematic upgrade centered on parsing capability, system architecture, and engineering usability. The main updates include:

    • Native DOCX parsing
      • Official support for native DOCX parsing, delivering high-precision results without hallucinations.
      • Compared with the traditional workflow of converting DOCX to PDF before parsing, end-to-end processing is tens of times faster, which suits scenarios that demand both accuracy and throughput.
    • pipeline backend upgrade
      • The pipeline backend achieves a score of 86.2 on OmniDocBench (v1.5), surpassing the accuracy of the previous-generation mainstream VLM MinerU2.0-2505-0.9B.
      • Added support for images and formulas inside tables, seal text, vertical text, and interline formula numbering, improving parsing quality for complex documents.
      • While maintaining high accuracy, it keeps resource usage extremely low and continues to support inference in pure CPU environments.
    • API / CLI / Router orchestration upgrade
      • mineru now runs as an orchestration client based on mineru-api; when --api-url is not provided, it will automatically start a local temporary service.
      • mineru-api adds a new asynchronous task endpoint POST /tasks, supporting task submission, status querying, and result retrieval; meanwhile, it retains the synchronous parsing endpoint POST /file_parse for compatibility with legacy plugins.
      • Added mineru-router, designed for unified entry deployment and task routing across multiple services and multiple GPUs; its interfaces are fully compatible with mineru-api and support automatic task load balancing.
    • Deployment and usability improvements
      • Resolved compatibility issues with torch >= 2.8; the base image has been upgraded to vllm0.11.2 + torch2.9.0, unifying installation paths across different Compute Capabilities.
      • Optimized the parsing pipeline with a sliding-window mechanism, significantly reducing peak memory usage in long-document scenarios, so documents with tens of thousands of pages no longer need to be split manually.
      • Batch inference in pipeline now supports streaming writes to disk, so completed parsing results are written out as soon as they are ready, which improves the experience for long-running tasks.
      • Completed thread-safety optimization and now fully supports multi-threaded concurrent inference; together with mineru-router, this enables one-click multi-GPU deployment and makes it easy to build high-concurrency, high-throughput parsing systems.
      • Completely removed the use of two AGPLv3 models (doclayoutyolo and mfd_yolov8) and one CC-BY-NC-SA 4.0 model (layoutreader).
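
The asynchronous flow described above (submit a task, query its status, retrieve the result) usually comes down to a polling loop on the client side. The following is a minimal sketch of that loop only. The changelog names the `POST /tasks` endpoint but not its request or response schema, so the status strings and the `fetch_status` callable here are hypothetical stand-ins, not the mineru-api contract.

```python
import time
from typing import Callable

# Hypothetical status names; the real mineru-api values may differ.
TERMINAL_STATES = {"done", "failed"}


def wait_for_task(
    fetch_status: Callable[[], str],
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal state or polls run out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(interval_s)
    raise TimeoutError(f"task still running after {max_polls} polls")


# Stand-in for a status query against a task created via POST /tasks.
_fake_statuses = iter(["pending", "running", "done"])
final = wait_for_task(lambda: next(_fake_statuses), interval_s=0.0)
print(final)
```

In a real client, `fetch_status` would wrap whatever HTTP call the mineru-api documentation specifies for status queries. Injecting it as a callable keeps the loop testable without a running service.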

    This update is more than a set of feature enhancements; it strengthens MinerU's overall system capabilities. Sliding windows and streaming writes to disk address peak memory usage in long-document parsing, so ultra-long documents no longer require manual splitting and careful handling and can be processed as stable, scalable production workloads. Thread-safety optimization enables multi-threaded concurrent inference, improving single-machine resource utilization and runtime stability under high concurrency. With mineru-router and the new API / CLI orchestration framework, MinerU supports one-click multi-GPU deployment, unified access across multiple services, and automatic task load balancing, which lowers the difficulty of large-scale deployment. MinerU is thus evolving from a standalone data production tool into a large-scale document parsing foundation for high-concurrency, high-throughput scenarios, giving enterprise-grade document processing a more stable, efficient, and scalable infrastructure.
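
To illustrate why windowed processing plus streaming writes bounds peak memory, here is a simplified sketch. It is not MinerU's implementation: the source does not describe the actual window logic, so this version uses fixed-size, non-overlapping page windows, and `parse_page` is a hypothetical placeholder for whatever parses a single page.

```python
import io
import json
from typing import Callable, Iterator, TextIO


def iter_windows(num_pages: int, window_size: int) -> Iterator[range]:
    """Yield consecutive, non-overlapping page ranges of at most window_size pages."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, num_pages, window_size):
        yield range(start, min(start + window_size, num_pages))


def parse_streaming(
    num_pages: int,
    window_size: int,
    parse_page: Callable[[int], dict],
    sink: TextIO,
) -> int:
    """Parse one window at a time and write each result out before moving on."""
    written = 0
    for window in iter_windows(num_pages, window_size):
        # Only the results of the current window are held in memory.
        results = [parse_page(i) for i in window]
        for result in results:
            sink.write(json.dumps(result) + "\n")
            written += 1
        sink.flush()
    return written


# Hypothetical page parser and an in-memory sink standing in for a file on disk.
buf = io.StringIO()
count = parse_streaming(5, 2, lambda i: {"page": i, "text": f"page {i}"}, buf)
print(count, [list(w) for w in iter_windows(5, 2)])
```

Because results leave memory after each window, peak usage depends on the window size rather than the document length, which is the property the paragraph above describes.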

📝 View the complete Changelog for more historical version information

MinerU

Project Introduction

MinerU is a document parsing tool that converts PDF, image, and DOCX inputs into machine-readable formats such as Markdown and JSON for downstream retrieval, extraction, and processing. MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. Compared to well-known commercial products, MinerU is still young. If you encounter any issues or the results are not as expected, please open an issue and attach the relevant document or sample file.

Key Features

  • Support PDF, image, and DOCX inputs.
  • Remove headers, footers, footnotes, page numbers, etc., to ensure semantic coherence.
  • Output text in human-readable order, suitable for single-column, multi-column, and complex layouts.
  • Preserve the structure of the original document, including headings, paragraphs, lists, etc.
  • Extract images, image descriptions, tables, table titles, and footnotes.
  • Automatically recognize and convert formulas in the document to LaTeX format.
  • Automatically recognize and convert tables in the document to HTML format.
  • Automatically detect scanned PDFs and garbled PDFs and enable OCR functionality.
  • OCR supports detection and recognition of 109 languages.
  • Supports multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and rich intermediate formats.
  • Supports various visualization results, including layout visualization and span visualization, for efficient confirmation of output quality.
  • Includes a built-in CLI, FastAPI service, and Gradio WebUI for local orchestration and multi-service deployment.
  • Supports running in a pure CPU environment, with optional GPU (CUDA) / NPU (CANN) / MPS acceleration.
  • Compatible with Windows, Linux, and Mac platforms.
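
Since tables are emitted as HTML inside the Markdown output, downstream code can recover rows with a standard HTML parser. The sketch below uses only the Python standard library. The sample Markdown string is made up for illustration and is not real MinerU output, and the parser ignores `rowspan`/`colspan` and nested tables.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside table cells (headings, prose, formulas) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample: a heading, an HTML table, and an inline LaTeX formula.
sample_markdown = """# Inventory

<table><tr><th>Item</th><th>Qty</th></tr><tr><td>Bolt</td><td>12</td></tr></table>

Energy is $E = mc^2$.
"""

parser = TableRows()
parser.feed(sample_markdown)
print(parser.rows)
```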

Quick Start

If you encounter any installation issues, please first consult the FAQ.
If the parsing results are not as expected, refer to the Known Issues.

Online Experience

Official online web application

The official online version has the same functionality as the client, with a polished interface and rich features. Login is required.

  • OpenDataLab

Gradio-based online demo

A Gradio-based WebUI with a simple interface and only the core parsing functionality. No login is required.

  • ModelScope
  • HuggingFace

Local Deployment

Warning

Pre-installation Notice—Hardware and Software Environment Support

To ensure the stability and reliability of the project, we only optimize and test for specific hardware and software environments during development. This ensures that users deploying and running the project on recommended system configurations will get the best performance with the fewest compatibility issues.

By focusing resources on the mainline environment, our team can more efficiently resolve potential bugs and develop new features.

In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.

Parsing Backend | pipeline | *-auto-engine (hybrid / vlm) | *-http-client (hybrid / vlm)
Backend Features | Good Compatibility | High Hardware Requirements | For OpenAI Compatible Servers [2]
Accuracy [1] | 86+ | 90+
Operating System | Linux [3] / Windows [4] / macOS [5]
Pure CPU Support
GPU Acceleration | Volta and later architecture GPUs or Apple Silicon | Not Required
Min VRAM | 4GB | 8GB | 8GB | 2GB
RAM | Min 16GB, Recommended 32GB or more | Min 16GB
Disk Space | Min 20GB, SSD Recommended | Min 2GB
Python Version | 3.10-3.13

[1] Accuracy metrics are the End-to-End Evaluation Overall scores from OmniDocBench (v1.5), based on the latest version of MinerU.
[2] Servers compatible with OpenAI API, such as local model servers or remote model services deployed via inference frameworks like vLLM/SGLang/LMDeploy.
[3] Linux only supports distributions from 2019 and later.
[4] Since the key dependency ray does not support Python 3.13 on Windows, only versions 3.10~3.12 are supported.
[5] macOS requires version 14.0 or later.
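
The Python version constraints above (3.10 to 3.13 in general, 3.10 to 3.12 on Windows) can be checked before installation. This is a small sketch; the helper name `python_supported` is made up for this example and is not part of MinerU.

```python
import sys
from typing import Tuple


def python_supported(version: Tuple[int, int], platform: str) -> bool:
    """Return True if (major, minor) falls in the range listed in the table above."""
    # Windows is capped at 3.12 because ray does not support 3.13 there.
    upper = (3, 12) if platform.startswith("win") else (3, 13)
    return (3, 10) <= version <= upper


# Check the interpreter that is running this script.
print(python_supported(sys.version_info[:2], sys.platform))
```

`sys.platform` is `"win32"` on Windows, `"linux"` on Linux, and `"darwin"` on macOS, so the `startswith("win")` test only matches Windows.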

Install MinerU

Install MinerU using pip or uv

pip install --upgrade pip
pip install uv
uv pip install -U "mineru[all]"

Install MinerU from source code

git clone https://github.com/opendatalab/MinerU.git
cd MinerU
uv pip install -e ".[all]"

Tip

mineru[all] includes all core features, compatible with Windows / Linux / macOS systems, suitable for most users. If you need to specify the inference framework for the VLM model, or only intend to install a lightweight client on an edge device, please refer to the documentation Extension Modules Installation Guide.


Deploy MinerU using Docker

MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues. You can get the Docker Deployment Instructions in the documentation.


Using MinerU

If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing:

mineru -p <input_path> -o <output_path>

If your device does not meet the GPU acceleration requirements, you can specify the backend as pipeline to run in a pure CPU environment:

mineru -p <input_path> -o <output_path> -b pipeline

mineru currently supports local PDF, image, and DOCX file or directory inputs, and can be used for document parsing through the CLI, API, WebUI, and mineru-router. For detailed instructions, please refer to the Usage Guide.
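
When driving the CLI from a script, building the argument list explicitly avoids shell-quoting problems with paths. The sketch below uses only the `-p`, `-o`, and `-b` flags shown above. The helper name `build_mineru_command` and the sample paths are made up for illustration.

```python
import shlex
from typing import List, Optional


def build_mineru_command(
    input_path: str, output_path: str, backend: Optional[str] = None
) -> List[str]:
    """Assemble the argv list for the mineru CLI invocation shown above."""
    cmd = ["mineru", "-p", input_path, "-o", output_path]
    if backend is not None:
        # For example "pipeline" to run in a pure CPU environment.
        cmd += ["-b", backend]
    return cmd


cmd = build_mineru_command("docs/report.pdf", "out", backend="pipeline")
print(shlex.join(cmd))

# To actually run it (requires MinerU to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` without `shell=True` means paths containing spaces need no extra quoting.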

TODO

  • Reading order based on the model
  • Recognition of index and list in the main text
  • Table recognition
  • Heading Classification
  • Handwritten Text Recognition
  • Vertical Text Recognition
  • Latin Accent Mark Recognition
  • Code block recognition in the main text
  • Chemical formula recognition (mineru.net)
  • Geometric shape recognition

Known Issues

  • Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
  • Limited support for vertical text.
  • Tables of contents and lists are recognized through rules, and some uncommon list formats may not be recognized.
  • Code blocks are not yet supported in the layout model.
  • Comic books, art albums, primary school textbooks, and exercises cannot be parsed well.
  • Table recognition may result in row/column recognition errors in complex tables.
  • OCR recognition may produce inaccurate characters in PDFs of lesser-known languages (e.g., diacritical marks in Latin script, easily confused characters in Arabic script).
  • Some formulas may not render correctly in Markdown.

FAQ

  • If you encounter any issues during usage, you can first check the FAQ for solutions.
  • If your issue remains unresolved, you may also use DeepWiki to interact with an AI assistant, which can address most common problems.
  • If you still cannot resolve the issue, you are welcome to join our community via Discord or WeChat to discuss with other users and developers.

License Information

LICENSE.md

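The source code in this repository was licensed under AGPLv3 through the 3.0.x releases. According to the 3.1.0 release notes in the Release History, MinerU has since moved to the MinerU Open Source License, a custom license based on Apache 2.0; see LICENSE.md for the current terms.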

Citation

@article{dong2026minerudiffusion,
  title={MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding},
  author={Dong, Hejun and Niu, Junbo and Wang, Bin and Zeng, Weijun and Zhang, Wentao and He, Conghui},
  journal={arXiv preprint arXiv:2603.22458},
  year={2026}
}

@article{niu2025mineru2,
  title={MinerU2.5: A decoupled vision-language model for efficient high-resolution document parsing},
  author={Niu, Junbo and Liu, Zheng and Gu, Zhuangcheng and Wang, Bin and Ouyang, Linke and Zhao, Zhiyuan and Chu, Tao and He, Tianyao and Wu, Fan and Zhang, Qintong and others},
  journal={arXiv preprint arXiv:2509.22186},
  year={2025}
}

@article{wang2024mineru,
  title={MinerU: An open-source solution for precise document content extraction},
  author={Wang, Bin and Xu, Chao and Zhao, Xiaomeng and Ouyang, Linke and Wu, Fan and Zhao, Zhiyuan and Xu, Rui and Liu, Kaiwen and Qu, Yuan and Shang, Fukai and others},
  journal={arXiv preprint arXiv:2409.18839},
  year={2024}
}

@article{he2024opendatalab,
  title={OpenDataLab: Empowering general artificial intelligence with open datasets},
  author={He, Conghui and Li, Wei and Jin, Zhenjiang and Xu, Chao and Wang, Bin and Lin, Dahua},
  journal={arXiv preprint arXiv:2407.13773},
  year={2024}
}

Release History

Version | Changes | Urgency | Date
mineru-3.2.2-released | #5033 fix: Enhance PDF processing and improve concurrency management by @myhloli in https://github.com/opendatalab/MinerU/pull/5062; #5061 fix: add functionality to skip broken PDF pages during rewrite process by @myhloli in https://github.com/opendatalab/MinerU/pull/5064. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.2.1-released...mineru-3.2.2-released | High | 6/2/2026
mineru-3.2.0-released | MinerU 3.2.0 is now available. This release focuses on UI improvements, dependency management, a VLM model upgrade, and stability fixes: Improved the Gradio interface for a smoother upload, preview, and result-viewing experience; Optimized dependency management and trimmed unnecessary dependencies, lowering the cost of maintaining installation and runtime environments; Updated the VLM model to version 2605, improving VLM-based parsing capability and stability; Fixed several known issues, improving overall stability and compatibility. | High | 5/26/2026
mineru-3.1.15-released | Improved Gradio preview and upload experience, including Office source-file preview links, clipboard file upload, clearer processing status, better i18n rendering, and extracted Gradio CSS/JS/header resources; Fixed Gradio Markdown/HTML image previews to use served file URLs instead of embedded base64, improving preview compatibility without changing exported artifacts; Improved Office parsing robustness, including DOCX table alignment, safer XML tag-name handling, e… | High | 5/19/2026
mineru-3.1.14-released | Accuracy improvements: Optimized the `pdf_classify` classification pipeline; Tuned the contrast threshold boundary for span OCR. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.13-released...mineru-3.1.14-released | High | 5/15/2026
mineru-3.1.11-released | perf: optimize table parsing performance in pipeline mode. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.10-released...mineru-3.1.11-released | High | 5/9/2026
mineru-3.1.7-released | feat: add Windows CUDA acceleration troubleshooting section to documentation; feat: add MINERU_TASK_RESULT_TIMEOUT_SECONDS for configurable task process timeout; feat: add MINERU_TASK_RESULT_DOWNLOAD_TIMEOUT_SECONDS for configurable task result download timeout; feat: add Ascend NPU support for router multi-card deployment; feat: add sheet title as text in markdown output for xlsx multi-sheets #4897. New Contributors: @crescenth made their first contributi… | High | 5/6/2026
mineru-3.1.6-released | fix some office docs bugs. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.5-released...mineru-3.1.6-released | High | 4/28/2026
mineru-3.1.5-released | feat: implement asynchronous model retrieval and enhance timeout handling in API client by @myhloli in https://github.com/opendatalab/MinerU/pull/4857; fix: specify maximum version for mlx dependency in pyproject.toml by @myhloli in https://github.com/opendatalab/MinerU/pull/4860. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.4-released...mineru-3.1.5-released | High | 4/27/2026
mineru-3.1.2-released | fix: prevent abnormal server termination caused by excessively long PDF rendering time in router mode. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.1-released...mineru-3.1.2-released | High | 4/22/2026
mineru-3.1.1-released | fix: Mitigate potential inference hangs on Ascend NPU platforms. by @myhloli in https://github.com/opendatalab/MinerU/pull/4821. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.1.0-released...mineru-3.1.1-released | High | 4/20/2026
mineru-3.1.0-released | 2026/04/18 3.1.0 Released. This release focuses on licensing openness, parsing accuracy, and full-format native support. The main updates include: License upgrade: MinerU has officially moved from `AGPLv3` to the MinerU Open Source License (https://github.com/opendatalab/MinerU/blob/master/LICENSE.md), a custom license based on `Apache 2.0`; This change significantly reduces adoption friction for both community users and commercial deployments,… | High | 4/17/2026
mineru-3.0.9-released | fix #4742: add function to identify disallowed control Unicode characters by @myhloli in https://github.com/opendatalab/MinerU/pull/4743; fix #4744 by @myhloli in https://github.com/opendatalab/MinerU/pull/4745: enhance table merging logic with improved row metrics and state management; add aspect ratio checks and character count limits for PDF processing. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.8-released...mineru-… | High | 4/7/2026
mineru-3.0.8-released | fix: #4728 #4730 implement process management and shutdown mechanisms for MinerU by @myhloli in https://github.com/opendatalab/MinerU/pull/4731. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.7-released...mineru-3.0.8-released | Medium | 4/3/2026
mineru-3.0.7-released | fix: strip newline characters from paragraph text in office_middle_json_mkcontent by @myhloli in https://github.com/opendatalab/MinerU/pull/4717. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.6-released...mineru-3.0.7-released | Medium | 4/1/2026
mineru-3.0.6-released | #4708: feat: add underscore thematic break escaping to Markdown processing; fix: correct paragraph text extraction by removing unnecessary stripping; feat: enhance paragraph text extraction to include inline content controls. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.5-released...mineru-3.0.6-released | Medium | 4/1/2026
mineru-3.0.5-released | fix: improve shutdown handling for FastAPI child process on Windows at 3.0.4; fix: add custom JSON schema for file upload in Swagger UI to support `fastapi>=0.130.0`; fix: update the `sys_platform` identifier for Windows in `pyproject.toml` to resolve the issue where installing `[all]` on Windows does not automatically install `lmdeploy`; feat: add albumentations dependency to pyproject.toml #4701. Full Changelog: https://github.com/opendatalab/MinerU/compa… | Medium | 3/31/2026
mineru-3.0.4-released | feat: add --enable-vlm-preload option to CLI for VLM model preloading during startup by @myhloli in https://github.com/opendatalab/MinerU/pull/4693. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.3-released...mineru-3.0.4-released | Medium | 3/30/2026
mineru-3.0.3-released | fix: enhance PDF rendering with persistent executor and recycling logic by @myhloli in https://github.com/opendatalab/MinerU/pull/4688. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.1-released...mineru-3.0.3-released | Medium | 3/30/2026
mineru-3.0.1-released | fix: refactor OCR processing to improve span handling and reduce code duplication by @myhloli in https://github.com/opendatalab/MinerU/pull/4675. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-3.0.0-released...mineru-3.0.1-released | Medium | 3/29/2026
mineru-3.0.0-released | 2026/03/29 3.0.0 Released. This release delivers a systematic upgrade centered on parsing capability, system architecture, and engineering usability. The main updates include: Native `DOCX` parsing: Official support for native `DOCX` parsing, delivering high-precision results without hallucinations; Compared with the traditional workflow of first converting `DOCX` to `PDF` and then parsing it, end-to-end speed is improved by tens of times, mak… | Medium | 3/28/2026
mineru-2.7.6-released | 2026/02/06 2.7.6 Release: Added support for the domestic computing platforms Kunlunxin and Tecorigin. Domestic computing platforms currently adapted and supported by the official team and vendors include: Ascend (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Ascend), T-Head (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/THead), METAX (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/METAX), Hygon… | Low | 2/6/2026
mineru-2.7.5-released | Fix the issue where PDF rendering timeout detection fails under certain conditions. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-2.7.4-released...mineru-2.7.5-released | Low | 2/2/2026
mineru-2.7.4-released | 2026/01/30 2.7.4 Release: Added support for domestic computing platforms IluvatarCorex and Cambricon. Domestic computing platforms currently adapted and supported by the official team include: Ascend (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Ascend), T-Head (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/THead), METAX (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/METAX), Hygon (ht… | Low | 1/30/2026
mineru-2.7.3-released | Fix bug: #4415. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-2.7.2-released...mineru-2.7.3-released | Low | 1/26/2026
mineru-2.7.2-released | 2026/01/23 2.7.2 Release: Cross-page table merging optimization, improving merge success rate and merge quality; Added support for the domestic computing platforms Hygon, Enflame, and Moore Threads. Domestic computing platforms currently adapted and supported by the official team include: Ascend (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Ascend), T-Head (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/THead), METAX (https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/METAX), … | Low | 1/23/2026
mineru-2.7.1-released | 2026/01/06 2.7.1 Release: fix bug: #4300; Updated pdfminer.six dependency version to resolve CVE-2025-64512 (https://github.com/advisories/GHSA-wf5f-4jwr-ppcp); Support automatic correction of input image exif orientation to improve OCR recognition accuracy #4283. New Contri… | Low | 1/6/2026
mineru-2.7.0-released | 2025/12/30 2.7.0 Release: Simplified installation process. No need to separately install `vlm` acceleration engine dependencies. Using `uv pip install mineru[all]` during installation will install all optional backend dependencies; Added new `hybrid` backend, which combines the advantages of `pipeline` and `vlm` backends. Built on vlm, it integrates some capabilities of pipeline, adding extra extensibility on top of high accuracy: Directly extracts text fro… | Low | 12/30/2025
mineru-2.6.8-released | Bug fix: #4189. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-2.6.7-released...mineru-2.6.8-released | Low | 12/15/2025
mineru-2.6.7-released | Bug fix: #4168. Full Changelog: https://github.com/opendatalab/MinerU/compare/mineru-2.6.6-released...mineru-2.6.7-released | Low | 12/12/2025
mineru-2.6.6-released | 2025/12/02 2.6.6 Release: `mineru-api` tool optimizations: Added descriptive text to `mineru-api` interface parameters to improve API documentation readability; You can use the environment variable `MINERU_API_ENABLE_FASTAPI_DOCS` to control whether the auto-generated interface documentation page is enabled (enabled by default); Added concurrency configuration options for the `vlm-vllm-async-engine`, `vlm-lmdeploy-engine`, and `vlm-http-client` back… | Low | 12/1/2025
mineru-2.6.5-released | 2025/11/26 2.6.5 Release: Added support for a new backend `vlm-lmdeploy-engine`. Its usage is similar to `vlm-vllm-(async)engine`, but it uses `lmdeploy` as the inference engine and additionally supports native inference acceleration on Windows platforms compared to `vllm`; Added support for the domestic computing platforms `Ascend/npu`, `T-Head/ppu`, and `METAX/maca`; on the corresponding platforms, users can use `pipeline… | Low | 11/26/2025
mineru-2.6.4-released | 2025/11/04 2.6.4 Release: Added timeout configuration for PDF image rendering, default is 300 seconds, can be configured via environment variable `MINERU_PDF_RENDER_TIMEOUT` to prevent long blocking of the rendering process caused by some abnormal PDF files; Added CPU thread count configuration options for ONNX models, default is the system CPU core count, can be configured via environment variables `MINERU_INTRA_OP_NUM_THREADS` and `MINERU_INTER_OP_NUM_THREADS` to … | Low | 11/4/2025
mineru-2.6.3-released | 2025/10/31 2.6.3 Release: Added support for a new backend `vlm-mlx-engine`, enabling MLX-accelerated inference for the MinerU2.5 model on Apple Silicon devices. Compared to the `vlm-transformers` backend, `vlm-mlx-engine` delivers a 100%–200% speed improvement; Bug fixes: #3849, #3859. Backend comparison table (truncated): Parsing Backend; pipeline (Accuracy [1] 82+); … | Low | 10/31/2025
mineru-2.6.2-released | 2025/10/24 2.6.2 Release: `pipeline` backend optimizations: Added experimental support for Chinese formulas, which can be enabled by setting the environment variable `export MINERU_FORMULA_CH_SUPPORT=1`. This feature may cause a slight decrease in MFR speed and failures in recognizing some long formulas. It is recommended to enable it only when parsing Chinese formulas is needed. To disable this feature, set the environment variable to `0`; `OCR` speed sig… | Low | 10/24/2025
mineru-2.5.4-released## What's Changed - Fixed an issue where some `PDF` files were mistakenly identified as `AI` files, causing parsing failures. #3605 #3583 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.5.3-released...mineru-2.5.4-releasedLow9/25/2025
mineru-2.5.3-released## What's Changed - 2025/09/20 2.5.3 发布 - 依赖版本范围调整,使得Turing及更早架构显卡可以使用vLLM加速推理MinerU2.5模型。 - `pipeline`后端对torch 2.8.0的一些兼容性修复。 - 降低vLLM异步后端默认的并发数,降低服务端压力以避免高压导致的链接关闭问题。 - 更多兼容性相关内容详见[公告](https://github.com/opendatalab/MinerU/discussions/3547) - 2025/09/20 2.5.3 Released - Dependency version range adjustment to enable Turing and earlier architecture GPUs to use vLLM acceleration for MinerU2.5 model inference. - `pipeline` backend compatibility fixes for torch 2.8.0. - RLow9/20/2025
mineru-2.5.2-released## What's Changed * Fix formatting in vlm_middle_json_mkcontent.py to ensure proper line breaks in list items by @myhloli in https://github.com/opendatalab/MinerU/pull/3532 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.5.1-released...mineru-2.5.2-releasedLow9/19/2025
mineru-2.5.1-released## What's Changed Fix the issue where VLM tends to produce infinite repetitive outputs in certain cases. #3524 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.5.0-released...mineru-2.5.1-releasedLow9/19/2025
mineru-2.5.0-released## What's Changed - 2025/09/19 2.5.0 Released We are officially releasing MinerU2.5, currently the most powerful multimodal large model for document parsing. With only 1.2B parameters, MinerU2.5's accuracy on the OmniDocBench benchmark comprehensively surpasses top-tier multimodal models like Gemini 2.5 Pro, GPT-4o, and Qwen2.5-VL-72B. It also significantly outperforms leading specialized models such as dots.ocr, MonkeyOCR, and PP-StructureV3. The model has been released on [HugginLow9/19/2025
mineru-2.2.2-released## What's Changed - Fixed the issue where the new table recognition model would affect the overall parsing task when some table parsing failed #3446 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.2.1-released...mineru-2.2.2-releasedLow9/10/2025
mineru-2.2.1-released## What's Changed * fix: add new models to download list by @myhloli in https://github.com/opendatalab/MinerU/pull/3435 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.2.0-released...mineru-2.2.1-releasedLow9/8/2025
mineru-2.2.0-released## What's Changed - 2025/09/05 2.2.0 Released - Major Updates - In this version, we focused on improving table parsing accuracy by introducing a new [wired table recognition model](https://github.com/RapidAI/TableStructureRec) and a brand-new hybrid table structure parsing algorithm, significantly enhancing the table recognition capabilities of the `pipeline` backend. - We also added support for cross-page table merging, which is supported by both `pipeline` and `vlm` backends, fLow9/5/2025
mineru-2.1.11-released## What's Changed - The current batch det logic is incompatible with PyTorch 2.8.0, causing a significant performance drop. To address this issue, batch inference for det has been temporarily disabled in this update for PyTorch 2.8.0 and later versions. #3285 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.10-released...mineru-2.1.11-releasedLow8/14/2025
mineru-2.1.10-released## What's Changed - Fixed an issue in the `pipeline` backend where block overlap caused the parsing results to deviate from expectations #3232 ## New Contributors * @SirlyDreamer made their first contribution in https://github.com/opendatalab/MinerU/pull/3222 **Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.9-released...mineru-2.1.10-releasedLow8/1/2025
mineru-2.1.9-released · Low · 7/30/2025
## What's Changed
- `transformers` 4.54.1 version adaptation

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.8-released...mineru-2.1.9-released
mineru-2.1.8-released · Low · 7/28/2025
## What's Changed
- 2025/07/28 2.1.8 Released
  - `sglang` 0.4.9.post5 version adaptation

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.7-released...mineru-2.1.8-released
mineru-2.1.7-released · Low · 7/27/2025
## What's Changed
- 2025/07/27 2.1.7 Released
  - `transformers` 4.54.0 version adaptation

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.6-released...mineru-2.1.7-released
mineru-2.1.6-released · Low · 7/25/2025
## What's Changed
- 2025/07/26 2.1.6 Released
  - Fixed table parsing issues in some handwritten documents when using the `vlm` backend
  - Fixed visualization box position drift when the document is rotated #3175

## New Contributors
* @jinghuan-Chen made their first contribution in https://github.com/opendatalab/MinerU/pull/3175

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.…
mineru-2.1.5-released · Low · 7/24/2025
## What's Changed
- 2025/07/24 2.1.5 Released
  - `sglang` 0.4.9 version adaptation; the dockerfile base image is upgraded to sglang 0.4.9.post3 at the same time

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/mineru-2.1.4-released...mineru-2.1.5-released
mineru-2.1.4-released · Low · 7/23/2025
## What's Changed
- 2025/07/23 2.1.4 Released
  - Bug Fixes
    - Fixed excessive GPU memory consumption during the `MFR` step in the `pipeline` backend under certain scenarios #2771
    - Fixed inaccurate matching between `image`/`table` and `caption`/`footnote` under certain conditions #3129

## New Contributors
* @huazZeng…
mineru-2.1.1-released · Low · 7/16/2025
## What's Changed
- 2025/07/16 2.1.1 Released
  - Bug fixes
    - Fixed text block content loss that could occur in certain cases in `pipeline` #3005
    - Fixed the issue where `sglang-client` required unnecessary packages such as `torch` to be installed #2968
    - Updated the `dockerfile` to fix incomplete parsed text caused by missing fonts on Linux #2915
  - Usability updates
    - Updated `compose.yaml` so that users can directly start the `sglang-server`, `mineru-api`, and `mineru-gradio` services
    - Launched a brand-new [online documentation site](https://opendatalab.github.io/MinerU/zh/) and simplified the README for a better documentation experience
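mineru-2.1.0-released · Low · 7/4/2025
## What's Changed
- 2025/07/05 2.1.0 Released
  - This is the first major update of MinerU 2. It brings a large number of new features and improvements, including many performance optimizations, usability improvements, and bug fixes. The details are as follows:
  - Performance optimizations:
    - Greatly improved preprocessing speed for documents at certain resolutions (long side around 2000 pixels)
    - Greatly improved post-processing speed when the `pipeline` backend batch-processes a large number of documents with few pages (<10)
    - Layout analysis in the `pipeline` backend is about 20% faster
  - Usability improvements:
    - Built-in, ready-to-use `fastapi service` and `gradio webui`; see the [documentation](https://github.com/opendatalab/MinerU/blob/mineru-2.1.0-released/README_zh-CN.md#3-api-%E8%B0%83%E7%94%A8-%E6%88%96-%E5%8F%AF%E8%A7%86%E5%8C%96%E8%B0%83%E7%94%A8) for detailed usage
    - `sglang` adapted to `0.4.8`…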
mineru-2.0.6-released · Low · 6/20/2025
## What's Changed
- 2025/06/20 2.0.6 Released
  - Fixed occasional parsing interruptions caused by invalid block content in `vlm` mode #2687 #2749
  - Fixed parsing interruptions caused by incomplete table structures in `vlm` mode #2690

## New Contributors
* @Carkham made their first contribution in https://github.com/opendatalab/MinerU/pull/2729

**Full Changelog**: https://g…
mineru-2.0.5-released · Low · 6/17/2025
## What's Changed
- 2025/06/17 2.0.5 Released
  - Fixed the issue where models were still required to be downloaded in the `sglang-client` mode
  - Fixed the issue where the `sglang-client` mode unnecessarily depended on packages like `torch` that are not needed at runtime
  - Fixed the issue where only the first instance would take effect when attempting to start multiple `sglang-client` instances via multiple URLs within the same process
mineru-2.0.3-released · Low · 6/15/2025
## What's Changed
- 2025/06/15 2.0.3 Released
  - Fixed a configuration file key-value update error that occurred when the download model type was set to `all` #2643
  - Fixed the issue where the formula and table feature toggles did not take effect in command line mode, so the features could not be disabled #2641
  - Fixed compatibility issues with sglang 0.4.7 in `sglang-engine` mode #2651
  - Updated the Dockerfile and related installation documentation for deploying the full version of MinerU in an sglang environment
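mineru-2.0.0-released · Low · 6/13/2025
- 2025/06/13 2.0.0 Released
  - MinerU 2.0 is a comprehensive refactoring and upgrade from architecture to functionality, bringing a cleaner design, stronger performance, and a more flexible user experience.
  - **New architecture**: MinerU 2.0 has been deeply restructured in code organization and interaction methods, significantly improving usability, maintainability, and extensibility.
  - **Removal of third-party dependency limitations**: The dependency on `pymupdf` has been completely removed, moving the project toward a more open and compliant open-source direction.
  - **Ready to use, easy to configure**: There is no need to manually edit JSON configuration files; most parameters can now be set directly via the command line or API.
  - **Automatic model management**: Added automatic model download and update mechanisms, so model deployment requires no manual intervention.
  - **Offline deployment friendly**: A built-in model download command supports deployment in fully offline environments.
  - **Streamlined code structure**: Thousands of lines of redundant code were removed and class inheritance logic was simplified, significantly improving code readability and development efficiency.
  - **Unified intermediate format output**: Adopts the standardized `middle_json` format, compatible with most secondary development scenarios based on this format, ensuring ecosystem business…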
magic_pdf-1.3.12-released · Low · 5/24/2025
## What's Changed
- 2025/05/24 1.3.12 Released
  - Added support for ppocrv5 model, updated `ch_server` model to `PP-OCRv5_rec_server` and `ch_lite` model to `PP-OCRv5_rec_mobile` (model update required)
  - In testing, we found that ppocrv5(server) shows some improvement for handwritten documents, but slightly lower accuracy than v4_server_doc for other document types. Therefore, the default ch model remains unchanged as `PP-OCRv4_server_rec_doc`.
  - Since ppocrv5 enhances recognitio…
magic_pdf-1.3.11-released · Low · 5/14/2025
## What's Changed
- Limit Python version `<3.14`
- Support `torch==2.7`
- Update `pdfminer-six` to the latest version

## New Contributors
* @CharlesKeeling65 made their first contribution in https://github.com/opendatalab/MinerU/pull/2411

**Full Changelog**: https://github.com/opendatalab/MinerU/compare/magic_pdf-1.3.10-released...magic_pdf-1.3.11-released
magic_pdf-1.3.10-released · Low · 4/29/2025
## What's Changed
- 2025/04/29 1.3.10 Released
  - Custom formula delimiters are now supported; set them by modifying the `latex-delimiter-config` item in the `magic-pdf.json` file under the user directory.
  - Pinned `pdfminer.six` to version `20250324` to prevent parsing failures caused by new versions.

**Full Changelog**: https://g…
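The 1.3.10 note on custom formula delimiters could be applied with a small script like the one below. This is a sketch under stated assumptions: the release note names only the `latex-delimiter-config` item and the `magic-pdf.json` file, so the nested `display`/`inline` keys with `left`/`right` values are an assumed shape that should be checked against the project's configuration template. The helper names `with_latex_delimiters` and `update_config_file` are hypothetical.

```python
import json
from pathlib import Path


def with_latex_delimiters(
    config: dict,
    display: tuple[str, str],
    inline: tuple[str, str],
) -> dict:
    """Return a copy of config with the `latex-delimiter-config` item replaced."""
    updated = dict(config)
    # Assumed shape: one left/right pair each for display and inline formulas.
    updated["latex-delimiter-config"] = {
        "display": {"left": display[0], "right": display[1]},
        "inline": {"left": inline[0], "right": inline[1]},
    }
    return updated


def update_config_file(
    path: Path,
    display: tuple[str, str],
    inline: tuple[str, str],
) -> None:
    """Rewrite the JSON config at path, leaving all other items untouched."""
    config = json.loads(path.read_text(encoding="utf-8"))
    updated = with_latex_delimiters(config, display, inline)
    path.write_text(
        json.dumps(updated, indent=4, ensure_ascii=False),
        encoding="utf-8",
    )
```

A call such as `update_config_file(Path.home() / "magic-pdf.json", ("\\[", "\\]"), ("\\(", "\\)"))` would switch to bracket-style delimiters, assuming the file lives directly in the home directory. Keeping the dictionary update separate from file I/O makes the change easy to inspect before anything is written.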

Similar Packages

ComfyUI-LoaderUtils · 🔄 Optimize model loading in ComfyUI with flexible node connections and controlled sequences for better performance and memory management. · main@2026-06-06
ragtable-extract · Extract tables precisely from PDFs and convert them to clean HTML for RAG pipelines, running fast on CPU without external dependencies. · main@2026-06-04
Mini-o3 · 🧠 Enhance visual search with Mini-o3, providing state-of-the-art multi-turn reasoning and easy-to-use training code for advanced AI applications. · main@2026-06-06
awesome-opensource-ai · Curated list of the best truly open-source AI projects, models, tools, and infrastructure. · main@2026-06-06
vllm · A high-throughput and memory-efficient inference and serving engine for LLMs · v0.22.1

More in RAG & Memory

vllm · A high-throughput and memory-efficient inference and serving engine for LLMs
spiceai · A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
awesome-opensource-ai · Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
antfly · No description