jdocmunch-mcp

The leading, most token-efficient MCP server for documentation exploration and retrieval via structured section indexing

claude claude-code docs documentation llm markdown mcp mcp-server python

README

Stop Feeding Documentation Trees to Your AI

Most AI agents still explore documentation the expensive way:

open file → skim hundreds of irrelevant paragraphs → open another file → repeat

That burns tokens, floods context windows with noise, and forces models to reason through a lot of text they never needed in the first place.

jDocMunch-MCP lets AI agents navigate documentation by section instead of reading files by brute force.
It indexes a documentation set once, then retrieves exactly the section the agent actually needs, with byte-precise extraction from the original file.

Task	Traditional approach	With jDocMunch
Find a configuration section	~12,000 tokens	~400 tokens
Browse documentation structure	~40,000 tokens	~800 tokens
Explore a full doc set	~100,000 tokens	~2,000 tokens

Index once. Query cheaply forever.
Precision context beats brute-force context.
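
As a quick check on the table above, the ratios it implies can be computed directly. The token counts below are the approximate figures quoted in the table, and `reduction` is a helper named here only for illustration.

```python
# Approximate token counts quoted in the table: (traditional, with jDocMunch).
TASKS = {
    "Find a configuration section": (12_000, 400),
    "Browse documentation structure": (40_000, 800),
    "Explore a full doc set": (100_000, 2_000),
}


def reduction(before: int, after: int) -> tuple[float, float]:
    """Return (times fewer tokens, percent saved) for one task."""
    return before / after, 100 * (before - after) / before


for task, (before, after) in TASKS.items():
    factor, percent = reduction(before, after)
    print(f"{task}: {factor:.0f}x fewer tokens, {percent:.1f}% saved")
```

On these figures each task uses 30 to 50 times fewer tokens, which is where the "index once, query cheaply" claim comes from.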

jDocMunch MCP

AI-native documentation navigation for serious agents

Commercial licenses

jDocMunch-MCP is free for non-commercial use.

Commercial use requires a paid license.

jDocMunch-only licenses

Builder — $29 — 1 developer

Studio — $99 — up to 5 developers

Platform — $499 — org-wide internal deployment

Want both code and docs retrieval?

Munch Duo Builder Bundle — $89

Munch Duo Studio Bundle — $399

Munch Duo Platform Bundle — $2,249

Stop dumping documentation files into context windows. Start navigating docs structurally.

jDocMunch indexes documentation once by heading hierarchy and section structure, then gives MCP-compatible agents precise access to the explanations they actually need instead of forcing them to brute-read files.

It is built for workflows where token efficiency, context hygiene, and agent reliability matter.

Why this exists

Large context windows do not fix bad retrieval.

Agents waste money and reasoning bandwidth when they:

open entire documents to find one configuration block
repeatedly re-read headings, boilerplate, and unrelated sections
lose important explanations inside oversized context payloads
consume documentation as flat text instead of structured knowledge

jDocMunch fixes that by changing the unit of access from file to section.

Instead of handing an agent an entire document, it can retrieve exactly:

an installation section
a configuration section
an API explanation
a troubleshooting section
a specific subtree of related headings

That makes documentation exploration cheaper, faster, and more stable.

What makes it different

Section-first retrieval

Search and retrieve documentation by section, not just file path or keyword match.

Byte-precise extraction

Full content is pulled on demand from exact byte offsets into the original file.
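
A minimal sketch of what byte-offset extraction looks like, assuming each section is stored as a start and end offset into a UTF-8 file. `read_section` and its parameters are illustrative names, not jDocMunch's internal API.

```python
import tempfile
from pathlib import Path


def read_section(path: Path, byte_start: int, byte_end: int) -> str:
    """Return the text stored in [byte_start, byte_end) of a UTF-8 file."""
    with path.open("rb") as handle:  # binary mode, so offsets count bytes, not characters
        handle.seek(byte_start)
        raw = handle.read(byte_end - byte_start)
    return raw.decode("utf-8")


# Demo: the accented character before the section is two bytes long in UTF-8,
# which is why the offsets are measured in bytes rather than characters.
data = "# Café\n\nIntro.\n\n## Install\n\npip install example\n".encode("utf-8")
start = data.index(b"## Install")

with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "guide.md"
    doc.write_bytes(data)
    print(read_section(doc, start, len(data)))
```

Only the requested span is read from disk, so the cost of a fetch depends on the section size rather than the file size.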

Stable section IDs

Sections retain durable identities across re-indexing when the file path, heading text, heading level, and parent heading chain remain unchanged.

Local-first architecture

Indexes and raw docs are stored locally. No hosted dependency required.

MCP-native workflow

Works with Claude Desktop, Claude Code, Google Antigravity, and other MCP-compatible clients.

What gets indexed

Every section stores:

title and heading level
one-line summary
extracted tags and references
SHA-256 content hash for drift detection
byte offsets into the original file

This allows agents to discover documentation structurally, then request only the specific section they need.
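
To make the stored fields concrete, here is a hedged sketch of a section record and a hash-based drift check. The field names and the `has_drifted` helper are assumptions for illustration; the real index schema may differ.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SectionRecord:
    """Hypothetical index entry; field names are illustrative, not the real schema."""
    title: str
    level: int
    summary: str
    byte_start: int
    byte_end: int
    content_hash: str
    tags: list[str] = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def has_drifted(record: SectionRecord, file_bytes: bytes) -> bool:
    """True when the bytes at the recorded offsets no longer match the stored hash."""
    current = file_bytes[record.byte_start:record.byte_end]
    return sha256_hex(current) != record.content_hash


original = b"# Setup\n\nRun the installer.\n"
record = SectionRecord("Setup", 1, "How to install", 0, len(original), sha256_hex(original))

print(has_drifted(record, original))
print(has_drifted(record, b"# Setup\n\nRun the new installer.\n"))
```

A mismatch means the file changed after indexing, so the stored offsets can no longer be trusted and the file should be re-indexed.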

Why agents need this

Traditional doc retrieval methods all break in different ways:

File scanning loads far too much irrelevant text
Keyword search finds terms but often loses context
Chunking breaks authored hierarchy and separates explanations from examples

jDocMunch preserves the structure the human author intended:

heading hierarchy
parent/child relationships
section boundaries
coherent explanatory units

Agents do not need bigger context windows.
They need better navigation.

How it works

jDocMunch implements jMRI-Full — the open specification for structured retrieval MCP servers. jMRI-Full covers the full stack: discover, search, retrieve, and metadata operations with batch retrieval, hash-based drift detection, byte-offset addressing, and a complete _meta envelope on every call.

Stage	What happens
Discovery	GitHub API or local directory walk
Security filtering	Traversal protection, secret exclusion, binary detection
Parsing	Format-aware section splitting: heading-based (Markdown/MDX/HTML/RST/AsciiDoc), structure-based (OpenAPI tags, JSON keys, XML elements), or cell-based (Jupyter)
Hierarchy wiring	Parent/child relationships established
Summarization	Heading text → AI batch summaries → title fallback
Storage	JSON index + raw files stored locally under ~/.doc-index/
Retrieval	O(1) byte-offset seeking via stable section IDs
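
The parsing and hierarchy-wiring stages can be sketched for the simplest case, ATX Markdown headings. This is an assumption-laden toy, not the shipped parser: it ignores setext headings and fenced code blocks, skips any text before the first heading, and `split_sections` is a name invented here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(rb"^(#{1,6})[ \t]+(.+?)[ \t]*$")


@dataclass
class Section:
    title: str
    level: int
    byte_start: int         # offset of the heading line
    byte_end: int           # offset where the next heading (or end of file) begins
    parent: Optional[int]   # index of the parent section in the result list


def split_sections(data: bytes) -> list[Section]:
    sections: list[Section] = []
    open_ancestors: list[int] = []  # indices of sections that can still gain children
    offset = 0
    for line in data.splitlines(keepends=True):
        match = HEADING.match(line.rstrip(b"\r\n"))
        if match:
            level = len(match.group(1))
            if sections:
                sections[-1].byte_end = offset  # previous section ends at this heading
            # Close every open section at the same or a deeper level.
            while open_ancestors and sections[open_ancestors[-1]].level >= level:
                open_ancestors.pop()
            parent = open_ancestors[-1] if open_ancestors else None
            title = match.group(2).decode("utf-8")
            sections.append(Section(title, level, offset, len(data), parent))
            open_ancestors.append(len(sections) - 1)
        offset += len(line)
    return sections


doc = b"# Guide\n\nIntro.\n\n## Install\n\nRun pip.\n\n## Configure\n\nEdit the file.\n"
for section in split_sections(doc):
    print(section.level, section.title, section.byte_start, section.byte_end, section.parent)
```

Each section here spans its own heading and body up to the next heading; whether child sections are included in a parent's byte range is a design choice this sketch does not settle.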

Stable section IDs

{repo}::{doc_path}::{ancestor-chain/slug}#{level}

The slug is prefixed with the ancestor heading chain, making IDs both readable and stable. A new heading inserted in one branch of a document never renumbers IDs in another branch.

Examples:

owner/repo::docs/install.md::installation#1
owner/repo::docs/install.md::installation/prerequisites#3
owner/repo::README.md::usage/configuration/advanced-configuration#4
local/myproject::guide.md::configuration#2

IDs remain stable across re-indexing when the file path, heading text, heading level, and parent heading chain do not change.
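
The ID format can be reproduced with a few lines of string handling. The slug rule below (lowercase, alphanumeric runs joined by hyphens) is an assumption that happens to match the examples above; `slugify`, `section_id`, and `parse_section_id` are illustrative helpers, not part of the package.

```python
import re


def slugify(heading: str) -> str:
    """Lowercase a heading and join its alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", heading.lower()))


def section_id(repo: str, doc_path: str, ancestors: list[str], heading: str, level: int) -> str:
    """Build {repo}::{doc_path}::{ancestor-chain/slug}#{level}."""
    chain = "/".join(slugify(text) for text in [*ancestors, heading])
    return f"{repo}::{doc_path}::{chain}#{level}"


def parse_section_id(value: str) -> tuple[str, str, list[str], int]:
    """Split an ID back into repo, doc path, slug chain, and heading level."""
    repo, doc_path, tail = value.split("::", 2)
    chain, _, level = tail.rpartition("#")
    return repo, doc_path, chain.split("/"), int(level)


print(section_id("owner/repo", "docs/install.md", ["Installation"], "Prerequisites", 3))
# → owner/repo::docs/install.md::installation/prerequisites#3
print(parse_section_id("local/myproject::guide.md::configuration#2"))
# → ('local/myproject', 'guide.md', ['configuration'], 2)
```

Because the ID is derived only from the path, the heading chain, and the level, adding a heading elsewhere in the document leaves every other ID untouched.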

Installation

Prerequisites

Python 3.10+
pip

Install

pip install jdocmunch-mcp

Verify:

jdocmunch-mcp --help

Configure an MCP client

PATH note: MCP clients often run with a restricted environment where jdocmunch-mcp may not be found even if it works in your shell. Using uvx is the recommended approach because it resolves the package on demand without relying on your system PATH. If you prefer pip install, use the absolute path to the executable instead.

Common executable paths

Linux: /home/<username>/.local/bin/jdocmunch-mcp
macOS: /Users/<username>/.local/bin/jdocmunch-mcp
Windows: C:\Users\<username>\AppData\Roaming\Python\Python3xx\Scripts\jdocmunch-mcp.exe

Claude Desktop / Claude Code

Config file location:

OS	Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Linux	`~/.config/claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

Minimal config

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}

With optional AI summaries and GitHub auth

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "env": {
        "GITHUB_TOKEN": "ghp_...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

For Anthropic or Gemini, the base uvx jdocmunch-mcp command is enough once the corresponding API key is present. For OpenAI-compatible providers such as OpenAI, MiniMax, or GLM-5, include the optional dependency in the launcher command:

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["--with", "openai", "jdocmunch-mcp"],
      "env": {
        "MINIMAX_API_KEY": "mx-...",
        "JDOCMUNCH_SUMMARIZER_PROVIDER": "minimax"
      }
    }
  }
}

After saving the config, restart Claude Desktop / Claude Code.
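
If you would rather script the edit than hand-edit JSON, something like the following would merge the entry without disturbing other servers. `register_server` is a hypothetical helper written for this example; `jdocmunch-mcp init` already performs this kind of patching for you.

```python
import json
import tempfile
from pathlib import Path


def register_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one entry under "mcpServers", keeping everything else as-is."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file rather than a real client config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}', encoding="utf-8")
    register_server(path, "jdocmunch", {"command": "uvx", "args": ["jdocmunch-mcp"]})
    print(path.read_text(encoding="utf-8"))
```

Back up the real config file before pointing a script like this at it.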

Claude Code hooks (recommended)

jDocMunch ships enforcement hooks that keep your agent honest:

PreToolUse — warns when Claude tries to Read a large doc file, suggesting search_sections + get_section
PostToolUse — auto-reindexes doc files after Edit/Write so the index never goes stale
PreCompact — injects a session snapshot before context compaction so doc orientation survives

Install everything in one command:

jdocmunch-mcp init

This detects your MCP clients, patches their config, installs a Doc Exploration Policy into CLAUDE.md, sets up enforcement hooks, and indexes your current directory. Use --dry-run to preview, --demo for a benefit summary, or --yes for non-interactive mode.

For hooks only:

jdocmunch-mcp init --hooks

If you also use jCodeMunch, run both:

jcodemunch-mcp init
jdocmunch-mcp init

CLI subcommands

Subcommand	Purpose
`serve` (default)	Run the MCP server (stdio)
`init`	One-command onboarding: detect clients, write config, install policy, hooks, index
`claude-md`	Print or install the Doc Exploration Policy (`--install global|project`)
`index-local --path <dir>`	Index a local folder (CLI, no MCP session needed)
`index-file <path>`	Re-index a single file within an existing index
`hook-pretooluse`	PreToolUse hook handler (reads JSON from stdin)
`hook-posttooluse`	PostToolUse hook handler (reads JSON from stdin)
`hook-precompact`	PreCompact hook handler (reads JSON from stdin)
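
A rough sketch of the kind of check a PreToolUse handler could make. It assumes the JSON on stdin carries `tool_name` and `tool_input.file_path`; the extension list and the 20,000-byte threshold are invented for illustration, and the shipped `hook-pretooluse` may decide differently.

```python
import json
import os
from typing import Optional

DOC_EXTENSIONS = {".md", ".markdown", ".mdx", ".rst", ".adoc", ".txt"}  # assumed subset
LARGE_DOC_BYTES = 20_000  # assumed threshold, not the real default


def pretooluse_warning(payload: dict, size_bytes: int) -> Optional[str]:
    """Return a warning for a Read of a large doc file, otherwise None."""
    if payload.get("tool_name") != "Read":
        return None
    file_path = payload.get("tool_input", {}).get("file_path", "")
    extension = os.path.splitext(file_path)[1].lower()
    if extension not in DOC_EXTENSIONS or size_bytes < LARGE_DOC_BYTES:
        return None
    return (
        f"{file_path} is {size_bytes} bytes; "
        "consider search_sections + get_section instead of Read."
    )


# Demo with a hand-written payload; a real handler would read it from stdin
# and look up the file size itself.
payload = json.loads('{"tool_name": "Read", "tool_input": {"file_path": "docs/guide.md"}}')
print(pretooluse_warning(payload, 48_000))
```

Keeping the decision in a pure function makes the rule easy to test without touching the file system.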

Google Antigravity

Open the Agent pane
Click the ⋯ menu → MCP Servers → Manage MCP Servers
Click View raw config to open mcp_config.json
Add the entry below, save, then restart the MCP server

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"]
    }
  }
}

OpenClaw

Option A — CLI (one command):

openclaw mcp set jdocmunch '{"command":"uvx","args":["jdocmunch-mcp"]}'

Option B — Edit config directly:

Add the entry to ~/.openclaw/openclaw.json under mcpServers:

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "transport": "stdio"
    }
  }
}

With optional AI summaries:

{
  "mcpServers": {
    "jdocmunch": {
      "command": "uvx",
      "args": ["jdocmunch-mcp"],
      "transport": "stdio",
      "env": {
        "ANTHROPIC_API_KEY": "${ANTHROPIC_API_KEY}"
      }
    }
  }
}

Restart the gateway and verify:

openclaw gateway restart
openclaw mcp list

Per-agent routing (optional):

{
  "agents": {
    "researcher": {
      "mcpServers": ["jdocmunch", "brave-search", "fetch"]
    }
  }
}

Tell your OpenClaw agent to use it

Without explicit instructions, your agent will ignore jDocMunch even though it's connected. Create a system prompt file (e.g. ~/.openclaw/agents/researcher.md) with:

## Documentation Policy
Always use jDocMunch-MCP tools for documentation exploration.
- Before reading a doc file: use search_sections or get_toc
- To retrieve specific content: use get_section with the section ID
- To index local docs: use index_local with the docs folder path
- Never open documentation files directly — navigate by section.

Point your agent at it in ~/.openclaw/openclaw.json:

{
  "agents": {
    "named": {
      "researcher": {
        "systemPromptFile": "~/.openclaw/agents/researcher.md"
      }
    }
  }
}

Usage examples

index_local:          { "path": "/path/to/docs" }
index_repo:           { "url": "owner/repo" }

get_toc:              { "repo": "owner/repo" }
get_toc_tree:         { "repo": "owner/repo" }
get_document_outline: { "repo": "owner/repo", "doc_path": "docs/config.md" }
search_sections:      { "repo": "owner/repo", "query": "authentication" }
get_section:          { "repo": "owner/repo", "section_id": "owner/repo::docs/config.md::authentication#1" }
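
Under the hood an MCP client wraps each of these argument objects in a JSON-RPC `tools/call` request. Your client normally builds these for you; the sketch below only shows the shape, using the arguments from the examples above. `tool_call` is an illustrative helper, not something the package exports.

```python
import json
from itertools import count

_next_id = count(1)  # JSON-RPC request ids only need to be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Typical two-step flow: search for summaries first, then fetch one section by ID.
search = tool_call("search_sections", {"repo": "owner/repo", "query": "authentication"})
fetch = tool_call(
    "get_section",
    {"repo": "owner/repo", "section_id": "owner/repo::docs/config.md::authentication#1"},
)

print(json.dumps(search, indent=2))
print(json.dumps(fetch, indent=2))
```

The search step returns summaries only, so the full section text is paid for once, in the second call.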

Tool surface

Tool	Purpose
`index_local`	Index a local documentation folder
`index_repo`	Index a GitHub repository’s docs
`list_repos`	List indexed documentation sets
`get_toc`	Flat section list in document order
`get_toc_tree`	Nested section tree per document
`get_document_outline`	Section hierarchy for one document
`search_sections`	Weighted search returning summaries only
`get_section`	Full content of one section
`get_sections`	Batch content retrieval
`get_section_context`	Section + ancestor headings + child summaries
`delete_index`	Remove a doc index
`get_broken_links`	Detect internal links/anchors that no longer resolve
`get_doc_coverage`	Which jcodemunch symbols have matching doc sections

Search and retrieval tools include a _meta envelope with timing, token savings, and cost avoided.

Example:

"_meta": {
  "latency_ms": 12,
  "sections_returned": 5,
  "tokens_saved": 1840,
  "total_tokens_saved": 94320,
  "cost_avoided": { "claude_opus": 0.0276, "gpt5_latest": 0.0184 },
  "total_cost_avoided": { "claude_opus": 1.4148, "gpt5_latest": 0.9432 }
}

total_tokens_saved and total_cost_avoided accumulate across tool calls and persist to ~/.doc-index/_savings.json.
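
The numbers in the example envelope are internally consistent, which a short calculation shows. The per-token rates below are inferred from the example values alone; they are not documented pricing, and `implied_rate_per_million` is a name made up for this check.

```python
# The example _meta values shown above, as a Python dict.
META = {
    "tokens_saved": 1840,
    "total_tokens_saved": 94320,
    "cost_avoided": {"claude_opus": 0.0276, "gpt5_latest": 0.0184},
    "total_cost_avoided": {"claude_opus": 1.4148, "gpt5_latest": 0.9432},
}


def implied_rate_per_million(tokens: int, cost: float) -> float:
    """Cost per million tokens implied by one (tokens, cost) pair."""
    return cost / tokens * 1_000_000


for model, cost in META["cost_avoided"].items():
    per_call = implied_rate_per_million(META["tokens_saved"], cost)
    lifetime = implied_rate_per_million(
        META["total_tokens_saved"], META["total_cost_avoided"][model]
    )
    print(f"{model}: {per_call:.2f} per call, {lifetime:.2f} lifetime (per million tokens)")
```

Both the per-call and lifetime figures work out to the same rate for each model, about 15 per million tokens for `claude_opus` and 10 for `gpt5_latest`, so cost avoided appears to be tokens saved multiplied by a fixed per-model rate.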

Check your token savings

Tool responses carry a _meta block with tokens_saved (this call) and total_tokens_saved (lifetime). Since v1.7.1 the JDOCMUNCH_META_FIELDS environment variable controls which _meta fields appear, and leaving it unset strips _meta entirely, so set it (for example to all) before checking. Then ask your agent to call any jDocMunch tool (e.g. get_toc or search_sections) and look at the _meta envelope. Lifetime stats persist in ~/.doc-index/_savings.json across sessions.

Supported formats

Format	Extensions	Notes
Markdown	`.md`, `.markdown`	ATX (`# Heading`) and setext headings
MDX	`.mdx`	JSX tags, frontmatter, import/export stripped before parsing
Plain text	`.txt`	Paragraph-block section splitting
reStructuredText	`.rst`	Adornment-based heading detection
AsciiDoc	`.adoc`	`=` and `==` heading hierarchy
Jupyter Notebook	`.ipynb`	Markdown cells used as sections; code cells attached as content
HTML	`.html`	`<h1>`–`<h6>` headings; boilerplate stripped
OpenAPI / Swagger	`.yaml`, `.yml`, `.json`, `.jsonc`	OpenAPI 3.x and Swagger 2.x; operations grouped by tag as sections
JSON / JSONC	`.json`, `.jsonc`	Top-level keys as sections; JSONC comments stripped before parsing
XML / SVG / XHTML	`.xml`, `.svg`, `.xhtml`	Element hierarchy used for section structure

See ARCHITECTURE.md for parser details.

Security

Built-in protections include:

path traversal prevention
symlink escape protection
secret file exclusion (.env, *.pem, and similar)
binary file detection
configurable file size limits
storage path injection prevention via _safe_content_path()
atomic index writes

See SECURITY.md for details.
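
For readers unfamiliar with the first few protections, here is a generic sketch of traversal and symlink-escape checks plus secret-file exclusion. It is not the code behind `_safe_content_path()`; `resolve_inside`, `is_secret_file`, and the two-pattern list are assumptions made for illustration.

```python
from fnmatch import fnmatch
from pathlib import Path

SECRET_PATTERNS = (".env", "*.pem")  # only the two patterns named above


def resolve_inside(root: Path, relative: str) -> Path:
    """Resolve a path under root, rejecting anything that escapes it.

    resolve() follows symlinks and collapses "..", so both traversal and
    symlink escapes end up outside root and are caught by the same test.
    """
    root_resolved = root.resolve()
    candidate = (root_resolved / relative).resolve()
    if not candidate.is_relative_to(root_resolved):
        raise ValueError(f"path escapes index root: {relative}")
    return candidate


def is_secret_file(path: str) -> bool:
    """True when the file name matches one of the excluded patterns."""
    name = Path(path).name
    return any(fnmatch(name, pattern) for pattern in SECRET_PATTERNS)


root = Path("docs")
for requested in ("guide/install.md", "../../etc/passwd", "/etc/passwd"):
    try:
        print("ok:", resolve_inside(root, requested))
    except ValueError as error:
        print("rejected:", error)

print(is_secret_file("config/.env"), is_secret_file("README.md"))
```

Checking the resolved path rather than the raw string is the important part, since string checks for ".." miss symlinks and absolute paths.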

Best use cases

agent-driven documentation exploration
finding configuration and API reference sections
onboarding to unfamiliar frameworks
token-efficient multi-agent documentation workflows
large documentation sets with dozens of files

Not intended for

source code symbol indexing (use jCodeMunch for that)
real-time file watching
cross-repository global search
semantic/vector similarity search as a standalone product (semantic search is supported as an enhancement when embeddings are enabled via use_embeddings=true, but the core workflow is structure-first)

Environment variables

Variable	Purpose	Required
`GITHUB_TOKEN`	GitHub API auth	No
`ANTHROPIC_API_KEY`	Section summaries via Claude Haiku	No
`GOOGLE_API_KEY`	Section summaries via Gemini Flash; also Gemini embeddings	No
`OPENAI_API_KEY`	OpenAI embeddings (text-embedding-3-small)	No
`JDOCMUNCH_EMBEDDING_PROVIDER`	Force provider: `gemini`, `openai`, `sentence-transformers`, `none`	No
`JDOCMUNCH_ST_MODEL`	sentence-transformers model (default: `all-MiniLM-L6-v2`)	No
`DOC_INDEX_PATH`	Custom cache path	No
`JDOCMUNCH_SHARE_SAVINGS`	Set to `0` to disable anonymous community token savings reporting	No
`MINIMAX_API_KEY`	Section summaries via MiniMax (launch with `--with openai`)	No
`JDOCMUNCH_SUMMARIZER_PROVIDER`	Select the summarizer provider, for example `minimax`	No
`JDOCMUNCH_META_FIELDS`	Which `_meta` fields tool responses include; unset strips `_meta` entirely	No

Community savings meter

Each tool call can contribute an anonymous delta to a live global counter at j.gravelle.us. Only two values are sent:

tokens saved
a random anonymous install ID

No content, file paths, repo names, or identifying material are sent.

The anonymous install ID is generated once and stored in ~/.doc-index/_savings.json.

To disable reporting, set:

JDOCMUNCH_SHARE_SAVINGS=0

Contributing

PRs welcome! All contributors must sign the Contributor License Agreement before their PR can be merged — CLA Assistant will prompt you automatically. See CONTRIBUTING.md for details.

License (dual use)

This repository is free for non-commercial use under the terms below. Commercial use requires a paid commercial license.

Copyright and license text

1. Non-commercial license grant (free)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to use, copy, modify, merge, publish, and distribute the Software for personal, educational, research, hobby, or other non-commercial purposes, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Any modifications made to the Software must clearly indicate that they are derived from the original work, and the name of the original author (J. Gravelle) must remain intact. He's kinda full of himself.
Redistributions of the Software in source code form must include a prominent notice describing any modifications from the original version.

2. Commercial use

Commercial use of the Software requires a separate paid commercial license from the author.

“Commercial use” includes, but is not limited to:

use of the Software in a business environment
internal use within a for-profit organization
incorporation into a product or service offered for sale
use in connection with revenue generation, consulting, SaaS, hosting, or fee-based services

For commercial licensing inquiries: j@gravelle.us https://j.gravelle.us

Until a commercial license is obtained, commercial use is not permitted.

3. Disclaimer of warranty

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.

IN NO EVENT SHALL THE AUTHOR OR COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Release History

Version	Changes	Urgency	Date
v1.105.0	New optional extra: pip install jdocmunch-mcp[office] adds PDF, Word, PowerPoint, and EPUB ingestion to local indexing (index_local / index-file / the watch daemon). Documents are converted to Markdown on-machine at index time via Microsoft's MIT-licensed markitdown, then sectioned, searched, and health-checked like any other doc. Disclosure, in full: - Conversion is 100% local. markitdown's optional cloud converters (LLM image description, Azure Document Intelligence, YouTube transcription) ar	High	7/19/2026
v1.93.0	Every tool advertises ToolAnnotations(readOnlyHint=...) at the list_tools chokepoint, so MCP clients that gate execution (Claude Code plan mode) run jDoc's query tools silently while still prompting on the handful that index / mutate / delete a doc index. Write-set: index_local, doc_index_repo, delete_index, define_repo_group, tune_weights, check_embedding_drift. Additive and 1.x-compatible (new tools/list field only; no tool add/rename/removal). Suite parity with jcodemunch-mcp and jdatamunch-m	High	7/7/2026
v1.92.0	## v1.92.0 - live-source freshness compares in the indexed (preprocessed) domain (#74) Reported by @mmashwani — a regression in v1.91.0's #71 live-source mode. The index stores `file_hashes`, section `content_hash`, and byte offsets over preprocessed content (transformed formats — `.json` / `.jsonc` / `.svg` / `.xml` / `.html` / `.mdx` / `.ipynb` / `.tscn` / `.tres` — are converted by `preprocess_content` before storage; the cached "raw files" mirror is really the preprocessed representati	Medium	6/19/2026
v1.86.0	## Cross-suite repo identity: `local/` name round-trip + bridge handle clarity (#67, #68) Two reports from @mmashwani — the last two friction points in heavy dual-suite (jCodeMunch + jDocMunch) autonomous use. ### #67 — `index_local(name="local/<name>")` round-trip `doc_list_repos` returns local handles as `local/<name>`, but `index_local(name=...)` validated `name` as a single storage component, so reusing a discovered handle as the refresh name raised `Invalid name: 'local/example-docs'` eve	High	6/18/2026
v1.70.0	tune_weights now learns from a recency window of the ranking ledger (new max_age_days parameter, default 90 days) instead of the lifetime history. Stale events no longer anchor semantic_weight proposals to a query distribution that no longer exists; set max_age_days=0 to restore the lifetime read. ranking_db_query gains a window_seconds filter. Mirrors jcodemunch-mcp v1.108.53. Additive per the 1.x contract: new defaulted kwargs and response keys only. Full notes in CHANGELOG.md.	High	6/11/2026
v1.69.0	## v1.69.0 — GitHub `ref` selection for versioned doc snapshots Adds an optional `ref` argument to `doc_index_repo`/`index_repo` so a GitHub doc index can be built from a specific branch, tag, or commit-ish. Without `ref`, behavior is unchanged (HEAD). - Resolves explicit refs to a concrete 40-hex commit SHA before fetching tree/content; `ref` is selection input only and is never persisted. - Durable lookup/citation handles stay commit-SHA based (`repo_at_sha`, `source_repo_at_sha`). The SHA	High	6/4/2026
v1.67.0	Adds an immutable `owner/repo@40hexsha` handle so downstream workflows can cite the exact documentation snapshot used for retrieval. Contributed by @DevItBetter (PR #23, closes #22; follow-up to #17). ## What's new - New `DocIndex` metadata: `head_sha`, `source_dirty`, `sha_certified`, `source_root`, plus a derived `repo_at_sha` handle (never stored; emitted only when the SHA is 40-hex, the corpus is clean, and the index is certified). - Surfaced in `list_repos`, `search_sections`, `get_doc_hea	High	6/2/2026
v1.66.3	Patch release. Closes the silent-corruption window in the openai-compatible provider shipped in v1.66.0. ## The bug `_OpenAICompatibleProvider` returned `(f"{url}::{model}", None)` from `_provider_identity()` because the embedding dim was unknown without calling the endpoint. The on-disk cache treats `dim=None` as a wildcard. That composes correctly when the backing model stays put. But if a user keeps `JDOCMUNCH_OPENAI_COMPAT_MODEL` constant and swaps the backing model behind the same endpoi	High	5/16/2026
v1.60.0	## Phase-1 sibling-parity batch — section safety + dedup Three new tools, all inspired by jcodemunch-mcp's leverage patterns ported into the jDoc idiom. Each composes existing primitives — no new persisted state, no INDEX_VERSION bump, fully 1.x-compatible. ### `check_section_delete_safe` Composite preflight: can I safely remove this section? Fuses four channels into a single verdict plus up to five ranked blockers and a one-line `recommended_action`. - Tutorial-path membership — secti	High	5/11/2026
v1.59.1	## Fixed Per-section responses no longer leak the raw embedding vector (#11). Reported by @tetiz123. When the index was built with embeddings, five tools were passing the 384-dim float vector straight through to callers, inflating each response by ~2,000 tokens — 5–20× the size of the section content itself, directly contradicting the token-savings purpose of the server. Affected tools (now fixed): - get_section - get_sections - describe_section - get_section_summary - get_section_summar	High	5/3/2026
v1.58.0	## Highlights New `get_doc(repo, doc_path)` MCP tool — single-doc detail view pairing with v1.55's `list_docs` (cross-doc inventory). ```python get_doc(repo='docs', doc_path='api/auth.md') # → {repo, doc_path, format, byte_size, section_count, # sections: [{id, title, level, byte_start, byte_end}, ...], # role_distribution: [{role, section_count}, ...], # tag_distribution: [{tag, section_count}, ...], # indexed_at} ``` The 'tell me everything about this one doc' answer in a single	High	4/27/2026
v1.9.0	## Hybrid BM25 + semantic search `search_sections` now fuses lexical and semantic scores when the index has embeddings. New params match jcodemunch-mcp's shape: - `semantic` — `null`/omit (auto — hybrid when embeddings exist), `true` (force hybrid), `false` (force lexical-only) - `semantic_only` — skip lexical entirely; rank purely by embedding cosine similarity - `semantic_weight` — 0.0–1.0 weight of the semantic channel in fusion (default 0.5) Each channel is min-max-normalized	High	4/20/2026
v1.8.1	### Documentation - Added "Works with" section to README with Hermes Agent integration config - Submitted optional skill PR to [NousResearch/hermes-agent#10413](https://github.com/NousResearch/hermes-agent/pull/10413)	High	4/15/2026
v1.8.0	Three new tools for the LLM Wiki pattern (inspired by Karpathy's llm-wiki): - get_backlinks -- inverse reference graph: given a doc_path, find every section that links to it. When a source changes, instantly discover which wiki pages need updating. - get_stale_pages -- frontmatter-based source provenance. Wiki pages declare their sources in YAML frontmatter (sources: [raw/article.md, ...]). This tool flags pages whose sources have been modified, deleted, or are untracked. - **get_wiki	High	4/12/2026
v1.7.1	### New features - `meta_fields` support — control which `_meta` fields appear in tool responses via `JDOCMUNCH_META_FIELDS` env var. Matches jcodemunch-mcp's `meta_fields` affordance. - Unset / `[]` = strip `_meta` entirely (default, maximum token savings) - `null` / `all` / `*` = include all fields - Comma-separated list = include only those fields (e.g. `timing_ms,powered_by`) ### Tests - 11 new tests (358 total) `pip install --upgrade jdocmunch-mcp`	High	4/9/2026
v1.7.0	## What's New ### Full `init` onboarding `jdocmunch-mcp init` now matches jcodemunch-mcp's one-command UX: - Detects installed MCP clients (Claude Code, Claude Desktop, Cursor, Windsurf, Continue) - Patches each client's config JSON to register jdocmunch as an MCP server - Installs a Doc Exploration Policy into CLAUDE.md (global or project scope) - Installs Cursor rules and Windsurf rules - Installs enforcement hooks (PreToolUse, PostToolUse, PreCompact) - Indexes the current working directory	Medium	4/9/2026
v1.6.0	## CLI hook system for Claude Code Adds full hook parity with [jcodemunch-mcp](https://github.com/jgravelle/jcodemunch-mcp), adapted for documentation files. ### New CLI subcommands - `hook-pretooluse` — warns when Claude tries to `Read` a large doc file, suggesting `search_sections` + `get_section` - `hook-posttooluse` — auto-reindexes doc files after `Edit`/`Write` (fire-and-forget background process) - `hook-precompact` — injects a session snapshot (indexed repos, doc/section c	Medium	4/9/2026
v1.5.3	### Changed - Switch MCP tool responses from pretty-printed JSON to compact JSON — saves 30-40% tokens per response (jcodemunch-mcp#219) PyPI: https://pypi.org/project/jdocmunch-mcp/1.5.3/	Medium	4/7/2026
v1.5.2	### Added - `contrib/build-deb.sh` — Community-contributed Debian/Ubuntu packaging script for Proxmox and other Linux deployments. Includes venv isolation, systemd unit, and streamable HTTP wrapper. Contributed by @Tikilou. Closes #7. PyPI: https://pypi.org/project/jdocmunch-mcp/1.5.2/	Medium	4/6/2026
v1.5.1	## What's new `max_files` parameter on `index_local` The 500-file cap on `index_local` is now configurable. Pass `max_files` to raise (or lower) the limit for your doc tree. MCP tool call: `{ "path": "/your/docs", "max_files": 2000 }` Python: `index_local("/your/docs", max_files=2000)` The default remains 500; no behaviour change for existing users.	Medium	4/4/2026
v1.5.0	## New tools get_broken_links(repo) Scan all indexed doc sections for internal cross-references that no longer resolve. - Checks markdown [text](target) links with relative paths - Checks RST :ref: and :doc: directives - Checks anchor-only links (#heading) within the same document - External links (http/https/mailto) are skipped - Each broken entry: source_file, source_section, target, reason - reason: 'file_not_found' | 'section_not_found' | 'anchor_not_found' Pure index scan -- no re-read	Medium	4/2/2026
v1.4.6	## Housekeeping - Added `LICENSE` file (dual-use: free for non-commercial, paid for commercial) Full changelog: https://github.com/jgravelle/jdocmunch-mcp/blob/master/CHANGELOG.md	Medium	3/31/2026
v1.4.5	## What's new ### Multi-provider AI summarization The summarizer module has been refactored and extended with three new providers (provider, env var, extra): Anthropic Claude Haiku, `ANTHROPIC_API_KEY`, `pip install jdocmunch-mcp[anthropic]`; Google Gemini 2.0 Flash, `GOOGLE_API_KEY`, `pip install jdocmunch-mcp[gemini]`; OpenAI (GPT-4o-mini), `OPENAI_API_KEY`, `pip install jdocmunch-mcp[openai]`; MiniMax, `MINIMAX_API_KEY`, `pip install jd	Medium	3/29/2026
v1.4.4	## What's New ### Fix: index_local name collision (reported by Kristof) When two libraries both have a folder named 'docs', both were indexed as `local/docs` and the second would overwrite the first. Fix: pass the new optional `name` parameter to give each a distinct identifier. Before (both clobber each other): ``` index_local path="/path/to/requests/docs" index_local path="/path/to/flask/docs" ``` After (separate indices): ``` index_local path="/path/to/requests/docs" name="requests-docs"	Medium	3/23/2026
v1.4.3	## What's new ### Fuzzy/prefix matching in lexical search The scoring algorithm now recognizes prefix matches (minimum 3 chars) alongside exact word matches. Partial queries like 'authenticat' now hit 'authentication', 'config' hits 'configuration', etc. Improves recall on truncated or in-progress queries. ### search_mode + tip in _meta search_sections now always reports search_mode: 'semantic' or 'lexical' in _meta. When lexical mode is active (no embeddings indexed), a tip is included pointi	Medium	3/23/2026
v1.4.2	Adds mcp-name verification comment to README and server.json for submission to the official MCP Registry (registry.modelcontextprotocol.io).	Low	3/21/2026
v1.4.1	## Security Supply-chain integrity check: jdocmunch-mcp now verifies at startup that it is running from the official distribution. If the code is found to be running under a differently-named package (e.g. a re-published fork), a SECURITY WARNING is printed to stderr with instructions to install from the canonical source. This is a direct response to the class of attack where MCP servers are mass-forked and republished under different names — dangerous given that MCP servers have filesystem an	Low	3/19/2026
v1.3.1	## What's new Godot scene and resource file support (.tscn, .tres) — closes #4, requested by @TrustNoOneElse. Agents can now call `get_document_outline` or `search_sections` on Godot projects instead of brute-reading entire scene files. ### What gets indexed - Scene Tree — nodes with name, type, and hierarchy depth (h3-h6 matching nesting level) - External Resources — script, texture, audio, and other ext_resource entries with paths and IDs - Sub-Resources — inline resources	Low	3/12/2026
v1.3.0	## Semantic Search jdocmunch-mcp now supports meaning-based section search via embeddings. ### What was broken Short-titled sections (e.g. a heading like 'Emotional Consequences' followed by a bullet list) would not match queries like 'what are the emotional consequences' because the lexical scorer found no token overlap. ### What changed index_local / index_repo: new `use_embeddings` parameter (default: false). When true, each section gets a vector embedding at index time. The section t	Low	3/11/2026
v1.2.1	## What's New - Windsurf compatibility — added empty `resources` and `prompts` handlers to prevent connection errors in Windsurf and other strict MCP clients See [CHANGELOG.md](CHANGELOG.md) for full history.	Low	3/10/2026
v1.2.0	## What's New - JSON / JSONC indexing — `.json` and `.jsonc` files are now indexed as structured documents (previously only OpenAPI-sniffed JSON was supported) - XML / SVG / XHTML indexing — `.xml`, `.svg`, and `.xhtml` files indexed via element hierarchy parsing - Brings supported formats to 11 formats / 14 extensions See [CHANGELOG.md](CHANGELOG.md) for full history.	Low	3/10/2026
v1.1.0	## What's new OpenAPI 3.x / Swagger 2.x parser jDocMunch now indexes API reference specs (`.yaml`, `.yml`, `.json`). Files are content-sniffed — only files containing an `openapi:` or `swagger:` key are indexed; plain YAML/JSON config files are skipped automatically. Indexed structure: - `# API Title` — top-level section from `info.title` - `## Tag` — one section per tag group - `### METHOD /path — Summary` — one subsection per endpoint, with parameters, request body, responses - `## Sche	Low	3/8/2026
v1.0.1	Adds \ to every tool response's \ envelope and prints a startup attribution line to stderr. Fixes \ to match pyproject.toml. No functional changes.	Low	3/8/2026
v1.0.0	## jDocMunch MCP v1.0.0 First stable release. API is now frozen under semantic versioning. ### 11 document formats (14 extensions) - Markdown (.md, .markdown, .mdx) - Plain text (.txt) - reStructuredText (.rst) - AsciiDoc (.adoc, .asciidoc, .asc) - Jupyter Notebooks (.ipynb) - HTML (.html, .htm) - OpenAPI 3.x / Swagger 2.x (.yaml, .yml, .json) — content-sniffed ### Architecture highlights - Incremental indexing: only changed files re-parsed, atomic save - O(1) section lookup via id dict built	Low	3/7/2026
v0.1.5	## What's new ### OpenAPI/Swagger support - Index OpenAPI 3.x and Swagger 2.x specs as structured documentation - Content sniffing: only specs with `openapi:` or `swagger:` keys are indexed; plain config YAML/JSON is ignored - Operations grouped by tag into `## Tag` sections; each operation becomes a `### METHOD /path` subsection - Renders parameters, request bodies, response codes, and a `## Schemas` section - Requires `pyyaml>=6.0` (new hard dependency); falls back to stdlib `json` if	Low	3/7/2026
v0.1.4	## Incremental indexing `index_local` and `index_repo` now default to `incremental=True`. On subsequent runs, only changed, new, or deleted files are re-parsed — unchanged files are skipped entirely. When nothing has changed, the call returns immediately. ### How it works - File hashes are stored in the index at save time - On re-index, current file hashes are compared against stored ones - Only the diff (changed + new files) is parsed and summarized - Deleted files have their sections removed	Low	3/7/2026
v0.1.3	## Performance & code quality - `DocIndex.get_section()` is now O(1) via a lookup dict built at load time (was O(n)) - `get_section` / `get_sections` no longer parse the index JSON twice per call - Token savings calculation now uses `os.path.getsize` on the cached raw file instead of summing encoded content across all sections in the document (much cheaper for large docs) - `get_sections` caches raw file sizes per doc_path — no repeated `stat` calls when fetching multiple sections from the same	Low	3/7/2026
v0.1.1	## Fixes & Hardening ### Security - Fix `_resolve_repo` glob injection: sanitize bare repo name before use in glob pattern - Fix `_repo_slug` collision: replace flat `owner-name` slug with `owner/name.json` directory layout (unambiguous, no overwrite risk) - Fix `_should_skip` false positives: use path-boundary matching so `rebuild/` is not caught by `build/` pattern ### Correctness - Fix `get_sections`: was loading the full index N times (once per section); now loads once - Fix `fetch_gitigno	Low	3/5/2026
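
The incremental indexing steps listed in the v0.1.4 notes (store file hashes, compare on re-index, parse only the diff, drop deleted files) can be sketched roughly as follows. This is an illustrative sketch, not jDocMunch's actual code: the function names `hash_file` and `diff_hashes`, the `{path: hash}` dictionary layout, and the choice of SHA-256 are all assumptions, since the release notes do not say how hashes are computed or stored.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return a hex digest of the file's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_hashes(stored: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare stored and current {path: hash} maps (hypothetical layout).

    Returns which paths changed, which are new, and which were deleted.
    Paths whose hash is unchanged appear in none of the lists.
    """
    changed = sorted(p for p in current if p in stored and stored[p] != current[p])
    new = sorted(p for p in current if p not in stored)
    deleted = sorted(p for p in stored if p not in current)
    return {"changed": changed, "new": new, "deleted": deleted}
```

Under these assumptions, an indexer would re-parse only the `changed` and `new` paths, remove sections belonging to `deleted` paths, and return immediately when all three lists are empty.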

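The v0.1.1 notes mention switching `_should_skip` to path-boundary matching so that `rebuild/` is not caught by a `build/` pattern. One way to get that behavior, shown here as a minimal sketch rather than the project's real implementation, is to compare whole path components instead of substrings. The function name `should_skip` and the `SKIP_DIRS` set are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical skip list; the real patterns are not given in the release notes.
SKIP_DIRS = frozenset({"build", "node_modules"})


def should_skip(rel_path: str, skip_dirs: frozenset = SKIP_DIRS) -> bool:
    """Return True if any directory component of rel_path is in skip_dirs.

    Matching whole components (not substrings) means "rebuild/" does not
    match "build". The final component is treated as the file name and is
    not checked.
    """
    parts = PurePosixPath(rel_path).parts
    return any(part in skip_dirs for part in parts[:-1])
```

A substring test such as `"build/" in path` would wrongly skip `rebuild/guide.md`; the component comparison above avoids that false positive.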
