quarkus-docling

Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem

README

This is a Quarkus extension for the Docling project. Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem.

Docling Features

  • 🗂️ Parsing of multiple document formats incl. PDF, DOCX, XLSX, HTML, images, and more
  • 📑 Advanced PDF understanding incl. page layout, reading order, table structure, code, formulas, image classification, and more
  • 🧬 Unified, expressive DoclingDocument representation format
  • ↪️ Various export formats and options, including Markdown, HTML, and lossless JSON
  • 🔒 Local execution capabilities for sensitive data and air-gapped environments
  • 🤖 Plug-and-play integrations incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
  • 🔍 Extensive OCR support for scanned PDFs and images
  • 🥚 Support for several Visual Language Models (SmolDocling)
  • 💻 Simple and convenient CLI

Quarkus Docling Features

Currently, this extension is a set of wrappers around the Docling Java project, which communicates with a Docling Serve instance via a REST API. This extension also provides a Dev Service and Dev UI integrations.
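
To make the REST exchange concrete, here is a minimal sketch of the kind of call that could be sent to a Docling Serve instance, written with the JDK's built-in HTTP client rather than the extension's own wrappers. The base URL, the /v1/convert/source path, and the JSON field names are assumptions about a typical Docling Serve deployment and should be checked against the version you run. The class and method names are hypothetical and are not part of quarkus-docling or docling-java.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class DoclingServeSketch {

    // Assumption: a Docling Serve instance reachable at this address.
    static final String BASE_URL = "http://localhost:5001";

    // Assumption: conversion endpoint path; verify against your Docling Serve version.
    static final String CONVERT_PATH = "/v1/convert/source";

    // Builds a JSON body asking for one remote document to be converted to Markdown.
    // The field names ("sources", "kind", "url", "options", "to_formats") are assumptions.
    static String buildBody(String documentUrl) {
        // Escape backslashes and double quotes so the URL is a valid JSON string.
        String escaped = documentUrl.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"sources\":[{\"kind\":\"http\",\"url\":\"" + escaped + "\"}],"
                + "\"options\":{\"to_formats\":[\"md\"]}}";
    }

    // Builds the POST request without sending it.
    static HttpRequest buildRequest(String documentUrl) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + CONVERT_PATH))
                .timeout(Duration.ofMinutes(2))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(documentUrl)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: DoclingServeSketch <document-url>");
            return;
        }
        // Sends the request and prints the raw response; no parsing is attempted here.
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(args[0]), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

In a Quarkus application this plumbing would normally stay hidden behind the extension. The sketch is only meant to show roughly what travels over the wire.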

The eventual goal is to unify the DoclingDocument format with LangChain4j's Document abstraction so that Docling can be used in a LangChain4j RAG pipeline for ingesting data.

Take a look at the documentation for more information.

Or you can see an example with a video at: https://github.com/lordofthejars-ai/mission-impossible-rag

Contributors ✨

Thanks go to these wonderful people (emoji key):

  • Eric Deandrea: 💻 🚧 ⚠️ 🤔 🖋 📖
  • Alex Soto: 💻 🚧 🖋 📖 🤔
  • Alina Yurenko: 🐛

This project follows the all-contributors specification. Contributions of any kind welcome!

Release History

Version: 1.3.0
Urgency: Medium
Date: 3/20/2026
Changes (What's Changed):
  • Update to docling-java 0.5.0 by @edeandrea in https://github.com/quarkiverse/quarkus-docling/pull/109
  • Release 1.3.0 by @edeandrea in https://github.com/quarkiverse/quarkus-docling/pull/110
Full Changelog: https://github.com/quarkiverse/quarkus-docling/compare/1.2.3...1.3.0

Similar Packages

  • pdf_oxide (v0.3.37): The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs.
  • Medical-Research (main@2026-04-21): Search and analyze medical literature across PubMed, ClinicalTrials.gov, and Europe PMC using AI to support clinical and research decisions.
  • Discord-Alternatives (main@2026-04-21): Explore alternatives to Discord with a curated list of early-stage apps, evaluating features, hosting, and encryption to guide your choice.
  • awesome-opensource-ai (main@2026-04-20): Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
  • vespa (v8.675.23): AI + Data, online. https://vespa.ai