| v0.3.57 | ### Added - **`TextChar::rendered_advance`**: per-glyph cursor advance to the next character's origin, including character spacing (Tc) and word spacing (Tw) per the PDF Tx formula, distinct from the shape-only `advance_width`. Enables accurate word-boundary detection and cursor reconstruction. Thanks @haberman. (#602) - **Separation plate rendering**: `render_separations(page, dpi)` / `render_separation(page, ink_name, dpi)` (Rust + Python) emit one grayscale image per ink, pixel value = ink | High | 5/30/2026 |
| v0.3.54 | ### Fixed - **Hebrew / RTL visual-vs-logical detection ([#537](https://github.com/yfedoseev/pdf_oxide/issues/537))**: Hebrew PDFs that store text in visual order (the PDF content stream draws glyphs left-to-right even though the script reads right-to-left) now extract in correct logical order. New per-RTL-run X-coordinate-monotonicity detector gates the existing UAX #9 `bidi::reorder_visual_to_logical` pass; logical-order PDFs (the pdfium `hebrew_mirrored.pdf` test fixture and | High | 5/23/2026 |
| v0.3.49 | ### Fixed - **Linearized PDFs with a non-zero `%PDF-` header offset ([#509](https://github.com/yfedoseev/pdf_oxide/issues/509))**: files whose `%PDF-` header is preceded by leading bytes (e.g. a captive-portal HTML redirect injected ahead of a Linearized PDF) are now read instead of rejected with `Trailer missing /Root entry`. The xref-offset shift for header-offset PDFs no longer requires the final trailer to carry `/Root`; xref reconstruction now rejects a parsed-but-`/Root` | High | 5/16/2026 |
| v0.3.46 | ### Added - **Raw RGBA pixel buffer, SIMD downscaling, and thread-safe rendering ([#446](https://github.com/yfedoseev/pdf_oxide/issues/446), [#481](https://github.com/yfedoseev/pdf_oxide/issues/481))**: `page.render_pixmap()` (Python), `renderToPixmap()` (Node.js / Go), and `Page.RenderToRgba()` (C#) expose the premultiplied RGBA8888 buffer directly from `tiny_skia::Pixmap::data()`, eliminating the encode-decode roundtrip for callers that need raw pixels (PIL, sharp, `System.Draw | High | 5/11/2026 |
| v0.3.44 | ### Highlights - **`pdf_oxide::crypto::CryptoProvider` trait**: new abstraction that decouples PDF encryption and signature paths from any one cryptography crate. Two providers ship out of the box: - **`RustCryptoProvider`** (default): pure-Rust stack as before (`sha2`, `aes`, `rsa`, `p256`, `p384`, `getrandom`, `md-5`, `sha1`). Permits every algorithm PDF specs reference, including the legacy MD5+RC4 path required by ISO 32000-1 R≤4 documents. - **`AwsLcProvider`** (opt-i | High | 5/6/2026 |
| v0.3.40 | ### Community contributors This release exists because of the community. Special thanks to: - **[@sparkyandrew](https://github.com/sparkyandrew)**: six detailed bug reports (#382, #385, #386, #397, #401, #425) that drove the CJK font subsetter, encryption, font-name handling, and now the image rendering overhaul. Every report came with a reproduction case. Issue #425 specifically identified four separate rendering bugs and raised the API design question that led to `ImageContent::f | High | 4/29/2026 |
| v0.3.38 | This release closes the "Rust-only `DocumentBuilder` gap": the fluent write-side builder, embedded fonts, the HTML+CSS pipeline, annotations, form-field creation, and low-level graphics primitives are now reachable from **Python, WASM, C#, Go, and Node/TypeScript**; the Rust implementation is the single source of truth and every binding is a thin translation layer. On top of that it lands the first cryptographic signature-verification path (RSA-PKCS#1 v1.5) across every binding and a pdf.js-par | High | 4/23/2026 |
| v0.3.37 | ### API: `Pdf::from_html_css` (#248) ```rust let font = std::fs::read("DejaVuSans.ttf")?; let mut pdf = Pdf::from_html_css( "<h1>Hello</h1><p>World</p>", "h1 { color: blue; font-size: 24pt }", font, )?; pdf.save("out.pdf")?; ``` The whole feature: pass HTML + CSS + font bytes, get a paginated PDF back. Pure Rust, MIT/Apache only (no MPL transitive deps), `extract_text` round-trips byte-equal so produced PDFs participate in the existing test infrastructure. End-to-end test suite a | High | 4/21/2026 |
| v0.3.36 | ### Markdown structural extraction (#377) The headline change of this release. `to_markdown()` previously consumed only the MCID *order* from `/StructTreeRoot` and then re-derived heading levels from font-size heuristics and list markers from glyph detection. For Word/Acrobat tagged PDFs whose body and heading text share a point size, this dropped every heading; for tagged lists where `LI → LBody → MCR` nests the actual content under a Span/P, this dropped every bullet; for tagged paragraphs wh | High | 4/20/2026 |
| v0.3.35 | ### Text extraction correctness - **Adjacent narrow-glyph doublets no longer collapsed at small font sizes (#378, PR #379).** `TextExtractor::deduplicate_overlapping_chars` and `deduplicate_overlapping_spans` used a hardcoded 2 pt absolute threshold to detect duplicate glyphs from stroke+fill render passes. For narrow glyphs (`l`, `r`, `I`, `i`) in compact fonts at small sizes the per-glyph advance width drops to ≤ 2 pt (Helvetica `l` ≈ 2.5 pt at 9 pt), so legitimate adjacent double | High | 4/19/2026 |
| v0.3.34 | ### API: Page abstraction (#371) All four language bindings now expose a page object so callers can iterate a document and call extraction methods on the page directly. Named consistently as `Page` in Python, Node.js, C#, and Go. ```python with PdfDocument("paper.pdf") as doc: for page in doc: # len(doc), doc[i], doc[-1] also work text = page.text md = page.markdown(detect_headings=True) ``` - **Python**: `Page` with lazy properties: `text`, `chars`, `words`, | High | 4/18/2026 |
| v0.3.33 | ### Text extraction correctness - **ToUnicode CMap miss returns U+FFFD instead of ASCII ciphertext (#363).** Subset Type0 fonts whose ToUnicode CMap doesn't cover a CID now emit the replacement character instead of falling through to the Identity-H `cid-as-Unicode` path that produced strings like `%B+$%8A//$2*%01*1%6APP`. - **Intra-word TJ kerning no longer splits words (#365).** Letter-pair kerning of 0.10–0.20 em inside single words (`[(diffe) -150 (rent)]`) no longer triggers space insertion | High | 4/17/2026 |
| v0.3.32 | ### Release pipeline - **Fix `x86_64-pc-windows-gnu` native-lib build failing the v0.3.31 release.** The new `scripts/shrink-staticlib.sh` introduced in v0.3.31 ran `objcopy --strip-debug` on every archive member. The MinGW cross-compile toolchain emits split-debug `.dwo` members that contain *only* DWARF sections; after stripping those sections the member has no sections left and objcopy aborted the whole archive with `'...rcgu.dwo' has no sections`, failing the job that produces the Go Window | High | 4/16/2026 |
| v0.3.30 | --- ### Installation **Rust (crates.io)** ```bash cargo add pdf_oxide ``` **Python (PyPI)** ```bash pip install pdf_oxide ``` **JavaScript/WASM (npm)** ```bash npm install pdf-oxide-wasm ``` **CLI (Homebrew)** ```bash brew install yfedoseev/tap/pdf-oxide ``` **CLI (Scoop, Windows)** ```powershell scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide scoop install pdf-oxide ``` **CLI (Shell installer)** ```bash curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_ | High | 4/12/2026 |
| v0.3.29 | --- ### Installation **Rust (crates.io)** ```bash cargo add pdf_oxide ``` **Python (PyPI)** ```bash pip install pdf_oxide ``` **JavaScript/WASM (npm)** ```bash npm install pdf-oxide-wasm ``` **CLI (Homebrew)** ```bash brew install yfedoseev/tap/pdf-oxide ``` **CLI (Scoop, Windows)** ```powershell scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide scoop install pdf-oxide ``` **CLI (Shell installer)** ```bash curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_ | Medium | 4/12/2026 |
| v0.3.28 | --- ### Installation **Rust (crates.io)** ```bash cargo add pdf_oxide ``` **Python (PyPI)** ```bash pip install pdf_oxide ``` **JavaScript/WASM (npm)** ```bash npm install pdf-oxide-wasm ``` **CLI (Homebrew)** ```bash brew install yfedoseev/tap/pdf-oxide ``` **CLI (Scoop, Windows)** ```powershell scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide scoop install pdf-oxide ``` **CLI (Shell installer)** ```bash curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_ | Medium | 4/12/2026 |
| v0.3.27 | ### Language Bindings - **Go: migrate from cdylib to staticlib for self-contained binaries (#334)**: `pdf_oxide` now produces `libpdf_oxide.a` alongside the cdylib (new `staticlib` entry in `Cargo.toml`'s `crate-type`), and `go/pdf_oxide.go` links the archive directly via per-platform `#cgo ... LDFLAGS` with the exact system-library list rustc needs. The resulting Go binary is fully self-contained, with no `LD_LIBRARY_PATH` / `DYLD_LIBRARY_PATH` / `PATH` configuration required. Windows x64 is prod | Medium | 4/12/2026 |
| v0.3.24 | This release ships official bindings for JavaScript/TypeScript, Go, and C#, built on a shared C FFI layer. 100% Rust FFI parity across all three. ### Features - **JavaScript / TypeScript bindings** (`pdf-oxide` on npm): N-API native module with `Buffer`/`Uint8Array` input, `openWithPassword()`, worker thread pool, `Symbol.dispose`, rich error hierarchy, and complete API coverage: document editor, forms, rendering, signatures/TSA, compliance, annotations, extraction with bbox. Full TypeScript | Medium | 4/11/2026 |
| v0.3.23 | ### Bug Fixes - **Text extraction: SIGABRT on pages with degenerate CTM coordinates (#308)**: extracting text from certain rotated dvips-generated pages (e.g., arXiv papers with `Page rot: 90`) caused a 38 petabyte allocation and SIGABRT. Degenerate CTM transforms produced text spans with bounding boxes ~19 quadrillion points wide, which blew up the column detection histogram in `detect_page_columns()`. Per PDF 32000-1:2008 §8.3.2.3, the visible page region is defined by MediaBox/CropBox, not | High | 4/10/2026 |
| v0.3.22 | ### Breaking Changes None. All changes are backward-compatible. ### Features - **Thread-safe `PdfDocument` (Send + Sync) (#302)**: replaced all 16 `RefCell<T>` with `Mutex<T>` and `Cell<usize>` with `AtomicUsize`. `PdfDocument` can now safely cross thread boundaries. Removes `unsendable` from `PdfDocument`, `FormField`, and `PdfPage` Python classes. Enables `asyncio.to_thread()`, free-threaded Python (cp314t), and thread pool usage without `RuntimeError`. Reported by @FireMasterK (#298). - * | Medium | 4/9/2026 |
| v0.3.21 | ### Bug Fixes - **Log level now fully respected in Python (#283)**: `extract_log_debug!` / `extract_log_trace!` / etc. were printing to stderr directly via `eprintln!`, bypassing the `log` crate and therefore ignoring `pdf_oxide.set_log_level(...)` and Python's `logging.basicConfig(level=...)`. Messages like `[DEBUG] Parsing content stream for text extraction` and `[TRACE] Detected document script: Latin` leaked through at ERROR level. The macros now forward to `log::debug!` / `log::trace!` / | High | 4/6/2026 |
| v0.3.20 | ### Table Extraction Engine Major rewrite of the table detection system, implementing the universal `Edges → Snap/Merge → Intersections → Cells → Groups` pipeline, the gold-standard approach used by Tabula, pdfplumber, and PyMuPDF, now in pure Rust. #### New Detection Capabilities - **Intersection-based table detection**: Finds H×V line crossings, builds cells from 4-corner rectangles, groups into tables via union-find. The gold-standard approach used by Tabula/pdfplumber/PyMuPDF, now in pur | Medium | 4/4/2026 |
| v0.3.19 | ### Features - **`extract_page_text()` Single-Call DTO** (#268): New `PageText` struct returns spans, characters, and page dimensions from a single extraction pass, eliminating redundant content stream parsing. Available across Rust, Python, and WASM. - **Column-Aware Reading Order** (#270): New `extract_spans_with_reading_order()` method accepts a `ReadingOrder` parameter. `ReadingOrder::ColumnAware` uses XY-Cut spatial partitioning to detect columns and read each column top-to-bottom, fixin | Medium | 4/3/2026 |
| v0.3.18 | ### Rendering Engine: Visual Parity Major rendering improvements achieving near-perfect visual fidelity across academic papers, government documents, CJK content, presentations, forms, and complex multi-layer PDFs. #### Font Rendering - **Correct Character Spacing**: Fixed proportional width resolution for CID, CFF, and TrueType subset fonts. Documents that previously rendered with monospace-like spacing now display with correct kerning and proportional widths. - **Embedded Font Support**: | Medium | 4/2/2026 |
| v0.3.17 | ### Features - **Refined Table Detection**: The spatial table detector now requires at least **2 columns** to identify a region as a table. This significantly reduces false positives where single-column lists or bullet points were incorrectly wrapped in ASCII boxes. - **Optimized Text Extraction**: Refactored the internal extraction pipeline to eliminate redundant work when processing Tagged PDFs. The structure tree and page spans are now extracted once and shared across the detection and ren | Low | 3/9/2026 |
| v0.3.16 | ### Features - **Smart Hybrid Table Extraction** (#206): Introduced a robust, zero-config visual detection engine that handles both bordered and borderless tables. - **Localized Grid Detection:** Uses Union-Find clustering to group vector paths into discrete table regions, enabling multiple tables per page. - **Visual Line Analysis:** Detects cell boundaries from actual drawing primitives (lines and rectangles), significantly improving accuracy for untagged PDFs. - **Visual Spans:* | Low | 3/8/2026 |
| v0.3.15 | ### Features - **PDF Header/Footer Management API** (#207): Added a dedicated API for managing page artifacts across Rust, Python, and WASM. - **Add:** Ability to insert custom headers and footers with styling and placeholders via `PageTemplate`. - **Remove:** Heuristic detection engine to automatically identify and strip repeating artifacts. Includes modular methods: `remove_headers()`, `remove_footers()`, and `remove_artifacts()`. Prioritizes ISO 32000 spec-compliant `/Artifact` tags | Low | 3/6/2026 |
| v0.3.14 | ### Features - **High-Level Rendering API** (#185, #190): added `Pdf::render_page()` to Rust, Python, and WASM. Supports rendering any page to `Image` (Png/Jpeg). Restored backward compatibility for Rust by maintaining the 1-argument `render_page` and adding `render_page_with_options`. - **Word and Line Extraction** (#185, #189): added `extract_words()` and `extract_text_lines()` to all bindings. Provides semantic grouping of characters with bounding boxes, font info, and styling (parity with | Low | 3/4/2026 |
| v0.3.13 | ### Bug Fixes: Character Extraction (#186) Reported by **@cole-dda**: garbled output when using `extract_chars()` on PDFs with multi-byte encodings (CJK text, Type0 fonts). - **Multi-byte decoding in show_text**: fixed `extract_chars()` to correctly handle 2-byte and variable-width encodings (Identity-H/V, Shift-JIS, etc.). Previously, characters were processed byte-by-byte, causing multi-byte characters to be split and garbled. Now uses the same robust decoding logic as `extract_spans()`. | Low | 3/3/2026 |
| v0.3.12 | ### Bug Fixes: Text Extraction (#181) Reported by **@Goldziher**: systematic evaluation across 10 PDFs covering word merging, encoding failures, and RTL text. - **CID font width calculation**: fixed text-to-user space conversion for CID fonts. Glyph widths were not correctly scaled, causing word boundary detection to merge adjacent words (`destinationmachine` → `destination machine`, `helporganizeas` → `help organize as`). - **Font-change word boundary detection**: when PDF font changes m | Low | 3/2/2026 |
| v0.3.11 | ### New Features - **CLI with 22 subcommands and interactive REPL** (#176): standalone `pdf-oxide` binary with text/markdown/html extraction, merge, split, compress, encrypt/decrypt, search, images, rotate, crop, watermark, forms, bookmarks, and more. Interactive REPL with session persistence and autocomplete. OS-specific install scripts: `curl -fsSL oxide.fyi/install.sh \| sh` (Linux/macOS) and `irm oxide.fyi/install.ps1 \| iex` (Windows). - **MCP server for AI assistants** (#177): `pdf-oxide | Low | 3/1/2026 |
| v0.3.10 | ### New Features - **WASM build support** (#151): WebAssembly bindings via wasm-bindgen. New `PdfDocument::open_from_bytes()` constructor enables browser-side PDF extraction. Internal reader changed from `BufReader<File>` to `BufReader<Cursor<Vec<u8>>>` for portability. - **Parallel page extraction** (#168): New `parallel` feature flag with rayon-based multi-threaded extraction. `ParallelExtractor` distributes pages across worker threads, each opening its own PdfDocument instance. Global fon | Low | 2/28/2026 |
| v0.3.9 | ### Performance - **O(n²) string concat fix** (#135): Replaced `String::push_str()` accumulation in span merging with pre-allocated `Vec<&str>` joined at the end. Eliminates quadratic growth on pages with thousands of merged spans. - **Image-only content stream parser** (#113): New `parse_content_stream_images_only()` fast path that only extracts image operators (`Do`, `BI`), skipping text and graphics entirely. Used by `extract_images()` for 3-5× faster image extraction. - **Fingerprint-ba | Low | 2/25/2026 |
| v0.3.8 | ### Performance - **Text-only content stream parser** (#110): New `parse_content_stream_text_only()` fast path skips graphics operators outside BT/ET blocks using byte-level scanning instead of full nom parsing. Only text-affecting operators are returned. - **Byte-level graphics scanner** (#112): Replaced nom-based operand loop with raw index arithmetic in `scan_graphics_region()`. Processes digits, dots, and whitespace at near-memcpy speed, skipping path coordinates without constructing any | Low | 2/21/2026 |
| v0.3.7 | ### Verified: 3,829-PDF Corpus (v0.3.6 → v0.3.7) Clean rate: 95.7% → **99.6%** (3,812 of 3,829 PDFs). Dirty PDFs: 165 → **17** (**-90%**). Systematic benchmark testing across 3,829 real-world PDFs identified and fixed 13 text extraction issues. ### Added: Parser & Decoders - **BrotliDecode stream filter** (PDF 2.0, ISO 32000-2:2020): New `BrotliDecoder` for PDFs using Brotli-compressed streams (#95 | Low | 2/20/2026 |
| v0.3.6 | ### Performance - **Bulk page tree cache**: On first page access, the entire page tree is walked once and all pages are cached. Previously `get_page()` traversed from root for every uncached page: O(n) per page, O(n²) total for sequential access. Now O(1) per page after a single O(n) walk. - **isartor-6-1-12-t01-fail-a.pdf (10,000 pages): 55,667ms → 332ms (168× faster)** - Eliminates the last >5s PDF in the entire 3,830-file corpus - **Scan-for-object offset cache** (#44): When objects | Low | 2/16/2026 |
| v0.3.5 | ## v0.3.5 Release This release delivers **major performance improvements**, **100% pass rate on 3,830 PDFs**, comprehensive error recovery for 28+ real-world PDF failures, and spec-correct rendering; the biggest stability release to date. ### Performance - **Font caching across pages**: Document-level cache keyed by `ObjectRef` avoids re-parsing shared fonts on every page. For a 1000-page document sharing 20 fonts, this reduces font parsing from 40,000 operations to 20 - **Page object cach | Low | 2/16/2026 |
| v0.3.4 | ## v0.3.4 Release This release delivers **PDF parsing robustness** for real-world malformed PDFs, **character-level text extraction** API, and **XObject path extraction**, driven by community bug reports. ### Fixed: PDF Parsing Robustness (Issue #41) - **Header offset support**: PDFs with binary prefixes or BOM headers now open successfully - Searches first 1024 bytes for `%PDF-` marker (PDF spec compliant) - Supports UTF-8 BOM, email headers, and other leading binary data - `parse | Low | 2/13/2026 |
| v0.3.1 | ## v0.3.1 Release This release delivers **95% form field coverage** across Read/Create/Modify operations, comprehensive multimedia annotation support, and Python 3.8-3.14 compatibility via ABI3. ### Form Field Coverage (95%) - **Hierarchical Fields**: Parent/child field structures (`address.street`, `address.city`) - `add_parent_field()`, `add_child_field()`, `add_form_field_hierarchical()` - Property inheritance between parent and child fields (FT, V, DV, Ff, DA, Q) - **Property Modifi | Low | 1/14/2026 |
| v0.3.0 | ## v0.3.0 Release This release introduces the **Unified Pdf API** - one seamless interface for extracting, creating, and editing PDFs. Plus comprehensive PDF creation capabilities with tables, graphics, forms, and security. ### Unified Pdf API (Extract + Create + Edit) - **Single API for all operations** - `Pdf::open("input.pdf")` - Open existing PDF for reading and editing - `Pdf::from_markdown(content)` - Create new PDF from Markdown - `Pdf::from_html(content)` - Create new PDF from | Low | 1/12/2026 |
| v0.2.6 | ## v0.2.6 Release This release adds comprehensive CJK (Chinese, Japanese, Korean) language support and enhances structure tree handling for Tagged PDFs. ### CJK Language Support - **Predefined CMap support for CJK fonts** (PDF Spec Section 9.7.5.2) - Adobe-GB1 (Simplified Chinese) - ~500 common character mappings - Adobe-Japan1 (Japanese) - Hiragana, Katakana, Kanji mappings - Adobe-CNS1 (Traditional Chinese) - Bopomofo and CJK mappings - Adobe-Korea1 (Korean) - Hangul and Hanja map | Low | 1/10/2026 |
| v0.2.5 | ## v0.2.5 Release This release adds flexible image handling with embedding and export capabilities for HTML and Markdown conversion. ### Image Handling Features - **Image Embedding** - Embed images as base64 in output (default) - HTML: `<img src="data:image/png;base64,...">` - Markdown: `` (works in Obsidian, Typora, VS Code, Jupyter) - Portable - no external file dependencies - **Image File Export** - Save images as separate files - Set `embed_ima | Low | 1/10/2026 |
| v0.2.4 | ## v0.2.4 Release This release fixes a critical text positioning bug and adds formula extraction capabilities. ### Bug Fixes - **CTM (Current Transformation Matrix) handling** - Issue #11 - CTM now correctly applied to text positions per PDF Spec Section 9.4.4 - This fix affects text positioning across the entire library - Critical for production use with complex PDFs ### New Features - **Structure Tree Enhancements** - `/Alt` (alternate description) parsing for accessibility tex | Low | 1/10/2026 |
| v0.2.3 | ## v0.2.3 Release This release fixes critical text positioning bugs and adds intelligent text processing for better extraction quality. ### Bug Fixes - **BT/ET matrix reset** - Per PDF spec Section 9.4.1 (PR #10 by @drahnr) - Text matrices weren't being reset between text blocks, causing positions to accumulate - Now correctly resets transformation matrix at text block boundaries - **Geometric spacing detection** - Markdown converter now uses proper spacing analysis (#5) - **Verbose log | Low | 1/7/2026 |
| v0.2.2 | ## v0.2.2 Release This release improves package discoverability with optimized keywords and metadata. ### Discoverability Improvements - **Crate Keywords Optimization** - Better search results on crates.io - Enhanced metadata for common PDF operations - Improved categorization - Better findability for PDF extraction use cases ### Verification - Metadata validation passed - Search indexing updated - Package discovery improved ### Installation **Rust (crates.io)** ```bash cargo | Low | 12/15/2025 |
| v0.2.1 | ## v0.2.1 Release This release fixes critical encrypted PDF support issues and improves CI/CD pipeline reliability. ### Bug Fixes - **Encrypted stream decoding** (PR #2 and #3) - Fixed decryption ordering - must happen before decompression - Fixed encryption handler initialization timing - Added Form XObject encryption support - PDFOxide now works with password-protected PDFs in production - **CI/CD pipeline fixes** - Improved build reliability ### Community Contributors **@t | Low | 12/15/2025 |
| v0.1.4 | ## v0.1.4 Release This release improves encrypted PDF handling and fixes documentation issues. ### Encryption Improvements - **Encrypted stream decoding refinements** (PR #2) - Improved stream cipher handling - Better compatibility with various PDF encryption methods - **Documentation and doctest fixes** - All examples updated and verified - docs.rs build fixed ### Verification - All encryption tests pass - Documentation verified - Cross-platform compatibility confirmed ### I | Low | 12/12/2025 |
| v0.1.3 | ## v0.1.3 Release This release refines encrypted PDF handling with additional improvements for stream decryption. ### Encryption Refinements - **Encrypted stream decoding improvements** - Enhanced algorithm for stream object decryption - Better compatibility with different PDF security handlers - Improved error handling for malformed encryption dictionaries ### Verification - Encrypted PDF tests validated - Stream decryption accuracy verified - Multiple encryption method support co | Low | 12/12/2025 |
| v0.1.2 | ## v0.1.2 Release This release adds Python 3.13 support and includes GitHub sponsor configuration. ### Python Support - **Python 3.13 Support** - Extended language version compatibility - PyO3 bindings updated for Python 3.13 - Full feature parity with previous Python versions - Comprehensive testing on new Python version ### Community - **GitHub Sponsor Configuration** - Support the project - Sponsor link added to repository - Community funding mechanism available ### Veri | Low | 11/27/2025 |
| v0.1.1 | ## v0.1.1 Release This release introduces cross-platform binary builds for Linux, macOS, and Windows. ### Platform Support - **Cross-Platform Binaries** - Pre-built executables for all major platforms - **Linux**: x86_64 (glibc and musl), ARM64 support - **macOS**: Intel (x86_64) and Apple Silicon (ARM64) - **Windows**: x86_64 support - One-click installation - no build required ### Binary Tools - `export_to_markdown` - PDF to Markdown conversion - `export_to_text` - PDF to plai | Low | 11/26/2025 |
| v0.1.0 | ## v0.1.0 Release - Initial Release Welcome to PDFOxide - **The Complete PDF Toolkit for Rust**. This initial release brings spec-compliant PDF text extraction with intelligent reading order detection, Python bindings, and support for encrypted PDFs. ### Core Features - **PDF Text Extraction** - Spec-compliant Unicode mapping per PDF Section 9.10 - Intelligent reading order detection - Character-level positioning metadata - Support for embedded fonts and encoding - **Form Field Extrac | Low | 11/6/2025 |
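
The v0.3.57 entry distinguishes a glyph's shape-only advance from its rendered advance "per the PDF Tx formula". As a rough illustration of that distinction, the sketch below evaluates the horizontal displacement formula from ISO 32000-1 §9.4.4, `tx = ((w0 - Tj/1000) * Tfs + Tc + Tw) * Th`. The function and parameter names here are hypothetical and are not pdf_oxide's API; the library's actual `rendered_advance` may be computed differently.

```python
def shape_advance(glyph_width, font_size, horizontal_scaling=1.0):
    """Advance from the glyph's own width only (no Tc / Tw).

    glyph_width is w0 in text-space units (font units / 1000 for most fonts).
    """
    return glyph_width * font_size * horizontal_scaling


def rendered_advance(glyph_width, font_size, char_spacing=0.0, word_spacing=0.0,
                     horizontal_scaling=1.0, tj_adjustment=0.0, is_word_space=False):
    """Horizontal displacement tx per ISO 32000-1 section 9.4.4.

    char_spacing is Tc, word_spacing is Tw, horizontal_scaling is Th as a
    fraction (1.0 == 100%), and tj_adjustment is the TJ array number in
    thousandths of text space. Tw is applied only when the glyph is the
    single-byte space character (code 32), signalled by is_word_space.
    """
    tw = word_spacing if is_word_space else 0.0
    return ((glyph_width - tj_adjustment / 1000.0) * font_size + char_spacing + tw) * horizontal_scaling


if __name__ == "__main__":
    # A space glyph 0.25 text-space units wide, set at 12 pt with Tc = 0.5 and Tw = 2.0.
    print("shape-only:", shape_advance(0.25, 12.0))
    print("rendered:  ", rendered_advance(0.25, 12.0, char_spacing=0.5,
                                          word_spacing=2.0, is_word_space=True))
```

Comparing the two values for consecutive glyphs is one way a caller might tell a real inter-word gap from ordinary letter spacing, which is the use case the entry mentions.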