Tag: #pdf
9 packages • ⭐ 124,320 total stars
Transforms complex documents like PDFs into LLM-ready Markdown/JSON for your agentic workflows.
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.
A wrapper around the pdftoppm and pdftocairo command line tools to convert PDF to a PIL Image list.
PDF file reader/writer library
Tools for stamping and signing PDF files
The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs.
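An Office assistant plugin designed for AstrBot. It gives large language models (LLMs) the ability to operate on files directly: it can read and analyze files in multiple formats, generate Office documents, and convert between Office formats and PDF.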
Open-source Cloudflare Browser Rendering proxy — 10 MCP tools for Claude Code (content, screenshot, PDF, markdown, scrape, JSON AI extraction, links, a11y, crawl)
Extract tables precisely from PDFs and convert them to clean HTML for RAG pipelines, running fast on CPU without external dependencies.
