# crawl-mcp

> Crawl4AI MCP Server: Extract content from web pages, PDFs, Office docs, YouTube videos with AI-powered summarization. 17 tools, token reduction, production-ready.

- **URL**: https://www.freshcrate.ai/projects/crawl-mcp
- **Author**: walksoda
- **Category**: MCP Servers
- **Latest version**: `v0.3.3` (2026-06-06)
- **License**: MIT
- **Source**: https://github.com/walksoda/crawl-mcp
- **Language**: Python
- **GitHub**: 34 stars, 11 forks
- **Registry**: github
- **Tags**: `python`

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `v0.3.3` | 2026-06-06 | High | **Bug Fixes:** Fix `AttributeError` from `.strip()` on None content fields returned by crawl4ai (#24, #25); harden `search_and_crawl` failed-page detection so None content no longer crashes it; accept markdown-only pages instead of treating them as failures; guard content truncation `len()` against None in both the main and fallback paths. **Internal:** Resolve `__version__` dynamically so the reported version no longer drifts from the release. **Contributors:** @sota |
| `v0.3.2` | 2026-05-17 | High | **Bug Fixes:** Fix `enhanced_process_large_content` crash when `content` is None (read `markdown` field first); surface YouTube Restricted Mode as `success=True` with structured warning instead of cryptic error; add version upper bound to markitdown dependency (`<0.2`). **Security:** Pin urllib3>=2.7.0 to resolve CVE-2026-44431 and CVE-2026-44432. **Dependencies:** Bump markitdown to 0.1.5 with `[pdf]` extra (replaces separate pdfminer-six pin); bump crawl4ai to `>=0. |
| `v0.3.1` | 2026-04-29 | High | **New Features:** Support local file processing via `file://` URIs and absolute paths; add `is_file_uri`, `is_local_path`, `file_uri_to_local_path` validators; `process_file` tool now accepts local file paths in addition to URLs. **Bug Fixes:** Normalize `CRAWL4AI_BROWSER_TYPE` env var with `strip().lower()`; use `CRAWL4AI_BROWSER_TYPE` to override default browser list. **Security:** Add minimum version constraint for litellm dependency (>=1.83.7); resolve known crit |
| `v0.3.0` | 2026-04-12 | High | **Release v0.3.0 - Output Persistence and Reliability Improvements.** **Overview:** This release adds a new `output_path` option for persisting tool results to disk. It also includes reliability fixes for CrawlResponse handling, batch execution, and pagination. **New Features:** `output_path` Option: new `output_path` parameter available across all MCP tools; persist tool results as files to disk for downstream processing; useful for integrating crawl results into automated pipelines. readOn |
| `v0.2.0` | 2026-03-01 | Low | **Release v0.2.0 - YouTube Comments Tool and Codebase Modularization.** **Overview:** This release adds a new YouTube comment extraction tool. It also includes a major codebase refactoring for better maintainability. Security, reliability, and test coverage are improved across the project. **New Features:** `extract_youtube_comments` Tool: extract YouTube video comments without API key using youtube-comment-downloader; pagination support via `comment_offset` parameter for retrieving large comme |
| `v0.1.7` | 2026-01-12 | Low | **Bug Fixes:** Fix `extract_media=True` Pydantic validation error; fix Docker build compatibility for Debian bookworm/trixie. **New Features:** Add Ollama LLM support for web content summarization; add Anthropic and Ollama LLM support for file processing; restore batch tools with rate limits (`batch_crawl`: max 5 URLs; `multi_url_crawl`: max 5 URL configurations; `batch_search_google`: max 3 queries; `batch_extract_youtube_transcripts`: max 3 URLs). Dep |
| `v0.1.6` | 2025-10-18 | Low | **Release v0.1.6 - Token Optimization and MCP Interface Refinement.** **Overview:** This release focuses on optimizing token usage for Claude Code MCP integration and refining the MCP tool interface by removing batch operations that provide limited value in sequential processing contexts. **Major Updates:** *Token Usage Optimization:* **Increased token limit**: Response token limit raised from 20000 to 25000 for all crawling tools; **Markdown-only response**: New `include_cleaned_html |
| `v0.1.5` | 2025-09-28 | Low | **Overview:** This maintenance release focuses on performance optimizations, code quality improvements, and dependency updates to enhance the overall stability and reliability of the crawl-mcp server. **Major Updates:** *Performance Enhancements:* **Core server optimizations**: Improved response handling and resource management; **Web crawling efficiency**: Enhanced crawling performance and reliability; **Tool utilities refinement**: Optimized utility functions for better performanc |
| `v0.1.4` | 2025-09-21 | Low | **Release v0.1.4 - Enhanced Search Filtering with Date-Based Filtering.** **Overview:** This release introduces enhanced search filtering capabilities with date-based filtering support, AI summarization improvements, and better API parameter standardization across all search tools. **Major Updates:** *Enhanced Search Filtering:* **Date-based filtering**: New `recent_days` parameter for time-sensitive searches; **Improved search accuracy**: Removed deprecated 'recent' search genre for |
| `v0.1.2` | 2025-08-25 | Low | **Overview:** This release includes significant improvements and fixes, including FastMCP version adjustment for stability, enhanced crawling features, and a critical config loading fix that ensures proper execution from any directory. **Major Updates:** *FastMCP Version Adjustment & Project Restructure:* Downgraded FastMCP from 2.x to FastMCP 2.11.0 for improved stability; reorganized project structure for better maintainability: enhanced tool registration and MCP protocol handli |
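
The `v0.3.2` and `v0.3.3` notes above describe guarding against None content fields: reading `markdown` before `content`, avoiding `.strip()` and `len()` on None, and accepting markdown-only pages. The following is a minimal sketch of that pattern, not the project's actual code. The helper names (`clean_text`, `pick_content`, `truncate`, `is_failed_page`) and the plain-dict result shape are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def clean_text(value: Optional[str]) -> str:
    """Return stripped text, treating None as an empty string (hypothetical helper)."""
    return (value or "").strip()


def pick_content(result: Mapping[str, Any]) -> str:
    """Prefer the `markdown` field and fall back to `content`.

    The dict-shaped result is an assumption; the real crawl result type may differ.
    """
    return clean_text(result.get("markdown")) or clean_text(result.get("content"))


def truncate(value: Optional[str], limit: int) -> str:
    """Cut text to `limit` characters without calling len() on None."""
    text = value or ""
    return text if len(text) <= limit else text[:limit]


def is_failed_page(result: Mapping[str, Any]) -> bool:
    """Treat a page as failed only when neither field carries any text."""
    return not pick_content(result)
```

With these helpers, a markdown-only page such as `{"markdown": "# Title", "content": None}` counts as a success, while a page whose fields are both None is reported as failed instead of raising `AttributeError`.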

## Citation

- HTML: https://www.freshcrate.ai/projects/crawl-mcp
- Markdown: https://www.freshcrate.ai/projects/crawl-mcp.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/crawl-mcp/deps

_Generated by freshcrate.ai. Indexes GitHub releases for AI-agent ecosystem packages._