freshcrate
Autosearch

Self-evolving deep research system for Claude Code. Zero API keys.

README

AutoSearch

Open-source deep research tool for AI coding developers.
Structured coverage across English and Chinese sources, cited markdown reports.

Status • Quick Start • Architecture • Interfaces • Delivery Status


Status

AutoSearch is undergoing a full v2 rewrite (legacy-v1 tag preserves the v1 state). v2 replaces the monolithic v1 with a modular pipeline (M0–M8) behind strict source-mapped reuse of proven deep-research projects. Channel adapters (the real data sources) are on the roadmap; the current release ships a DemoChannel placeholder so the end-to-end pipeline is exercisable.

See docs/delivery-status.md for a module-by-module checklist.

Quick Start

Dev install (current; no PyPI release yet):

git clone https://github.com/0xmariowu/Autosearch
cd Autosearch
uv venv --python 3.12
uv pip install -e . --python .venv/bin/python
.venv/bin/autosearch query "your topic"

After the first tagged v2 release:

pipx install autosearch
autosearch query "your topic"

Requirements: Python 3.12+. Set one of ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, or have the claude CLI on PATH; the LLM layer auto-detects the first available provider.
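
To make "first available provider" concrete, here is a minimal sketch of one possible detection order. It assumes the providers are checked in the order listed above, with the claude CLI last; the function name `detect_provider` and the returned labels are hypothetical, not AutoSearch's actual API.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Order follows the Requirements line above; the labels are hypothetical.
PROVIDER_ENV_VARS = (
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
)


def detect_provider(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a label for the first available provider, or None."""
    env = os.environ if env is None else env
    for label, var in PROVIDER_ENV_VARS:
        if env.get(var):  # variable is set and non-empty
            return label
    if which("claude"):  # assumed fallback: claude CLI found on PATH
        return "claude-cli"
    return None
```

Passing `env` and `which` explicitly keeps the check easy to exercise without touching the real environment.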

Streaming progress to stderr is default (--stream); suppress with --no-stream or use --json for a machine-readable envelope:

autosearch query "retrieval-augmented generation survey" --json
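
For scripting, the JSON mode can be wrapped in a few lines. This is a sketch only: it assumes the envelope arrives as a single JSON document on stdout, and because the envelope's fields are not documented here, the result is returned untyped. The helper names `build_command` and `run_query` are hypothetical.

```python
import json
import subprocess
from typing import Any, List


def build_command(topic: str, executable: str = "autosearch") -> List[str]:
    """Build the argv for a JSON-mode query, using the flags shown above."""
    return [executable, "query", topic, "--json"]


def run_query(topic: str, executable: str = "autosearch") -> Any:
    """Run the CLI and decode stdout as JSON.

    Assumption: stdout carries exactly one JSON document. Streaming
    progress goes to stderr, so it does not interfere with parsing.
    """
    proc = subprocess.run(
        build_command(topic, executable),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```

With the dev install, pass `executable=".venv/bin/autosearch"`.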

Architecture

query
  ↓
M0 Knowledge Recall (known facts + gaps)
M1 Goal Crystallization + Clarify (rubrics + mode)
M2 Search Strategy (subqueries)
M3 Iteration Controller (reflect-on-gaps loop across channels)
M4 Material Cleaner (trafilatura)
M5 Evidence Processor (URL dedup + SimHash + BM25)
M7 Report Synthesizer (outline + per-section + citation remap)
M8 Quality Gate (rubric evaluation, one retry on fail)
  ↓
markdown + References + Sources breakdown
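
As an illustration of the M5 stage, the sketch below combines URL dedup with SimHash near-duplicate detection. It is not AutoSearch's implementation: the normalization rules, the MD5-based token hash, the 64-bit width, and the distance threshold of 3 are all assumptions, and BM25 ranking is left out.

```python
import hashlib
import re
from typing import Iterable, List, Set, Tuple
from urllib.parse import urlsplit, urlunsplit

BITS = 64  # fingerprint width (assumed)


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def simhash(text: str) -> int:
    """64-bit SimHash over lowercase word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if weights[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def dedupe(docs: Iterable[Tuple[str, str]], max_distance: int = 3) -> List[Tuple[str, str]]:
    """Keep the first (url, text) pair of each duplicate group."""
    seen_urls: Set[str] = set()
    fingerprints: List[int] = []
    kept: List[Tuple[str, str]] = []
    for url, text in docs:
        key = normalize_url(url)
        if key in seen_urls:
            continue  # same page reached through a different URL spelling
        fp = simhash(text)
        if any(hamming(fp, other) <= max_distance for other in fingerprints):
            continue  # near-identical text published under another URL
        seen_urls.add(key)
        fingerprints.append(fp)
        kept.append((url, text))
    return kept
```

The linear scan over fingerprints is fine for the few hundred documents a single query is likely to collect; a larger corpus would call for bucketed lookup.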

Observability (CostTracker) and persistence (SessionStore, three-table SQLite schema) are available as Pipeline constructor args.

Each module traces to a 1:1 source in a well-known deep-research project β€” see docs/delivery-status.md for the mapping.
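
The "one retry on fail" behavior of the M8 Quality Gate can be pictured as a small control loop. In this sketch the `synthesize` and `evaluate` callables are hypothetical stand-ins for M7 and M8, and feeding the evaluator's feedback into the retry is an assumption about how the two stages interact.

```python
from typing import Callable, Tuple


def synthesize_with_gate(
    synthesize: Callable[[str], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    max_retries: int = 1,
) -> Tuple[str, bool]:
    """Synthesize a report, re-running at most `max_retries` times on failure.

    `synthesize` takes feedback text (empty on the first pass) and returns
    a report; `evaluate` returns (passed, feedback) for a report.
    """
    report = synthesize("")
    passed, feedback = evaluate(report)
    retries = 0
    while not passed and retries < max_retries:
        retries += 1
        report = synthesize(feedback)  # second attempt sees the rubric feedback
        passed, feedback = evaluate(report)
    return report, passed
```

Returning the `passed` flag alongside the report lets a caller decide whether to surface a report that still fails the rubric after the retry.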

Interfaces

AutoSearch runs as:

  • CLI: autosearch query "..."
  • HTTP + SSE: autosearch serve, where POST /search streams typed events (phase / iteration / gap / quality / finished)
  • MCP server: autosearch mcp (or autosearch-mcp console script), which exposes a research tool to Claude Code, Cursor, and other MCP clients. See docs/mcp-clients.md for per-client config samples.
  • Claude Code slash command: /autosearch (ships in commands/autosearch.md)
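
A client of the HTTP + SSE interface has to split the stream into typed events. The sketch below follows the standard Server-Sent Events framing (`event:` and `data:` fields, a blank line ends an event, lines starting with `:` are comments). It assumes the server sets the SSE `event:` field to the event types listed above; the payload format is not documented here, so `data` is returned as a raw string.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, data) pairs from the text lines of an SSE stream."""
    event = "message"  # default event type in the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Any HTTP client that can iterate over response lines can feed this generator; a caller would typically stop once it sees the finished event.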

Supported Channels

Generated from autosearch/skills/channels/*/SKILL.md. Run .venv/bin/python scripts/generate_channels_table.py after adding or changing a channel.

Tier 0 - always-on (21)

| Channel | Languages | Description | Typical yield |
| --- | --- | --- | --- |
| arxiv | en | Use for academic preprint searches in CS/ML/physics when query is English or mixed and expects peer-reviewed or preprint papers. | medium-high |
| crossref | en | Cross-publisher scholarly search via the Crossref DOI registry, useful for journal articles, book chapters, and citation-linked research metadata. | medium |
| dblp | en | Computer science bibliography search: technical papers, proceedings, and journal articles indexed by venue, author, and year. | medium |
| ddgs | en, mixed | DuckDuckGo Search: free general web search with no auth, use as broad default for any English or mixed query. | medium |
| devto | en, mixed | Developer blog articles tagged by technology topic, via the public dev.to API. | medium |
| github | en | Use for code-level, issue-level, and repository discovery when query involves a library, framework, or implementation detail. | high |
| google_news | en, mixed | Current news headlines aggregated across publishers via Google News RSS (English US feed). | high |
| hackernews | en | Use for real-time developer discussion, tooling opinions, and early-stage product signals from the HN community. | medium-high |
| huggingface_hub | en, mixed | Discover open machine learning models on Hugging Face Hub via the public model search API. | high |
| infoq_cn | zh, mixed | Chinese engineering articles covering architecture, AI, and enterprise tech from InfoQ 中文, via public RSS feed. | medium |
| kr36 | zh, mixed | Chinese tech business news, startup funding, and industry analysis from 36kr. | medium |
| openalex | en, mixed | Search scholarly works through OpenAlex's public works search API with open-access URL fallback. | high |
| package_search | en, mixed | Discover packages across PyPI (exact-name lookup) and npm (full-text search) registries. | medium |
| papers | en, mixed | Multi-source academic paper search (arxiv, pubmed, biorxiv, medrxiv, google_scholar) via paper-search-mcp. | high |
| podcast_cn | zh, mixed | Chinese-language podcasts searchable via the Apple iTunes store public API. | low |
| reddit | en, mixed | Reddit community discussions, user experience reports, and topic debates via the public search.json endpoint. | medium |
| sec_edgar | en | US public company filings (10-K, 10-Q, 8-K) via SEC EDGAR full-text search: financial and regulatory disclosures for research. | medium |
| sogou_weixin | zh, mixed | Chinese WeChat Official Account articles via the public Sogou WeChat search SERP. | high |
| stackoverflow | en, mixed | Programming Q&A with community-voted answers across 200+ technical tags via api.stackexchange.com. | high |
| wikidata | en, mixed | Structured entity data (people, places, concepts) from Wikidata knowledge graph. | medium |
| wikipedia | en, mixed | Authoritative encyclopedia articles via the Wikipedia Action API (English edition). | high |

Tier 1 - env-gated (1)

| Channel | Languages | Required env | Description | Typical yield |
| --- | --- | --- | --- | --- |
| youtube | en, zh, mixed | env:YOUTUBE_API_KEY | Use for video tutorial discovery, conference talks, technical walkthroughs, and product demos. | medium |

Tier 2 - BYOK paid (8)

| Channel | Languages | Required env | Description | Typical yield |
| --- | --- | --- | --- | --- |
| bilibili | zh, mixed | env:TIKHUB_API_KEY | Chinese tech video platform with tutorials, conference recordings, and uploader-authored articles, via TikHub. | medium |
| douyin | zh | env:TIKHUB_API_KEY | Chinese short-video content with product demos, tech reviews, and viral trends, via TikHub. | medium |
| kuaishou | zh, mixed | env:TIKHUB_API_KEY | Chinese short-video platform with lifestyle, humor, regional culture, and product demos, via TikHub. | medium |
| tiktok | en, mixed | env:TIKHUB_API_KEY | Global short-video platform with creator content, product demos, viral trends, and topical reactions, via TikHub. | medium |
| twitter | en, mixed | env:TIKHUB_API_KEY | Real-time public discourse including product launches, tech announcements, and breaking news, via TikHub. | medium |
| weibo | zh, mixed | env:TIKHUB_API_KEY | Chinese microblog platform for real-time opinion, trending topics, and event-level commentary in Chinese discourse, via TikHub. | medium |
| xiaohongshu | zh, mixed | env:TIKHUB_API_KEY | Chinese lifestyle + experience-sharing notes with strong product/beauty/travel/food coverage, via TikHub. | high |
| zhihu | zh, mixed | env:TIKHUB_API_KEY | Chinese Q&A platform with deep technical discussions and user experience reports; use when query is Chinese or mixed and targets developer opinions, comparisons, or tutorials. | medium-high |
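
The tiering above amounts to a simple rule: a channel is usable when it needs no environment variable or when its required variable is set. The sketch below shows that rule on a hand-copied subset of the tables. The `CHANNEL_REQUIREMENTS` mapping and `enabled_channels` function are hypothetical; the project presumably derives this information from each channel's SKILL.md.

```python
import os
from typing import Dict, List, Mapping, Optional

# Subset of the tables above: channel name -> required env var.
# None marks a Tier 0 (always-on) channel.
CHANNEL_REQUIREMENTS: Dict[str, Optional[str]] = {
    "arxiv": None,
    "ddgs": None,
    "github": None,
    "youtube": "YOUTUBE_API_KEY",
    "bilibili": "TIKHUB_API_KEY",
    "zhihu": "TIKHUB_API_KEY",
}


def enabled_channels(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List channels whose required variable is absent from the table or set."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in CHANNEL_REQUIREMENTS.items()
        if var is None or env.get(var)  # empty values count as unset
    ]
```

Since every Tier 2 channel in the table shares TIKHUB_API_KEY, setting that one variable enables the whole tier.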

Contributing

The v1 architecture (skills/, channels/*/SKILL.md, AVO self-evolution loop) was retired in v2. Contributions targeting the v2 architecture are welcome; the module map in docs/delivery-status.md is the starting point.

License

MIT.

Release History

v2026.04.06.6 | Urgency: Medium | Date: 4/6/2026
Changes:
- Chinese topics now automatically select 5+ Chinese channels (csdn, juejin, bilibili, zhihu, 36kr, etc.); previously the pipeline only picked 2-4 generic channels
- Channel cap increased: Quick=10, Standard=15, Deep=all 34 (was flat 10 for all depths)
- Quick mode now always searches (minimum 4 queries) instead of skipping when Claude's knowledge seems sufficient
- Zero-query situations auto-generate freshness-check queries instead of asking the user
- E2E test harness auto-instal…

v2026.04.06.4 | Urgency: Medium | Date: 4/6/2026
Changes:
- You can now get full article content for 90% of search results (was 2%): enrichment pipeline no longer blocked by score threshold
- 12 Chinese channels upgraded to native platform APIs (CSDN, Juejin, 36kr, Zhihu, Weibo, Douyin, Xiaohongshu, Xueqiu, and more)
- 5 Chinese channels switched from unreliable Baidu Kaifa fallback to DuckDuckGo site-search
- HuggingFace search now shows rich snippets with tags, download counts, and like counts
- YouTube/conference-talks now show author…

v2026.04.06.1 | Urgency: Medium | Date: 4/6/2026
Changes:
- You can now install via npm: `npm install -g @0xmariowu/autosearch`
- You can now use one API key (`SCRAPECREATORS_API_KEY`) to unlock full Reddit comments and Twitter engagement data
- Reddit and Twitter channels now use three-tier fallback: ScrapeCreators (optional) → native API → DuckDuckGo
- Twitter GraphQL query IDs auto-refresh from X's JS bundles (24h cache), no more brittle hardcoded IDs
- Reddit enrichment tries ScrapeCreators for comments before .json endpoint, with clea…

v2026.04.05.3 | Urgency: Medium | Date: 4/5/2026
Changes:
- You can now get Reddit comment insights: top comments with scores, authors, and excerpts are fetched for the highest-scoring Reddit results
- You can now search X/Twitter with full engagement data: cookie-based GraphQL search returns likes, reposts, replies, and author handles (falls back to DuckDuckGo when no credentials)
- You can now run two-phase search: Phase 1 extracts entities (subreddits, X handles, authors), Phase 2 does targeted follow-up searches
- HN channel now inc…

v2026.04.05.1 | Urgency: Medium | Date: 4/4/2026
Changes:
- You can now search HuggingFace models and datasets: new `huggingface` channel searches by downloads
- You can now find trending AI papers: new `papers-with-code` channel via HuggingFace Daily Papers
- 34 search channels total (was 32)
- Cleaned up 18 orphan skills from legacy architecture (1,828 lines removed)
- Removed outdated platform methodology docs, moved evidence principles to root

v2026.04.04.9 | Urgency: Medium | Date: 4/4/2026
Changes:
- Added CI hygiene workflow: checks author identity, internal codenames, personal paths, and container image pins on every PR and push
- Git history fully rewritten: removed all PII (real names, Tailscale domains) from 270 commits
- Pre-commit hook now enforces author whitelist, personal path scan, and internal codename scan
Fixes:
- Removed literal PII from .gitleaks.toml detection rules (was leaking the values it was detecting)
- Removed internal project references from met…

v2026.04.04.7 | Urgency: Medium | Date: 4/4/2026
Changes:
- You can now benefit from engine-level health tracking: when Baidu or DuckDuckGo goes down, all dependent channels suspend together instead of failing one-by-one
- Transient failures (timeouts, network blips) now automatically retry once before giving up: timeout waits 5s, network errors wait 2s
- Channel health data now persists immediately after every success/failure, so a crash no longer loses health state from the current run
Fixes:
- Removed redundant internal retry fro…

v2026.04.04.5 | Urgency: Medium | Date: 4/4/2026
Fixes:
- You can now search Chinese topics without losing Chinese channels: channel scoring now tracks language separately, so English-session zero-yield no longer poisons Chinese channel selection
- Removed hardcoded Chinese channel exclusion list from select-channels skill

v2026.04.04.4 | Urgency: Medium | Date: 4/4/2026
Changes:
- You can now run pre-release tests with `./scripts/release-test.sh`: validates unit tests + real pipeline scenarios before tagging
- 6 user scenarios (Quick/Standard/Deep × EN/ZH, cold topic) with `--scenario s1|s2|s3|s4|s5|s7|all`
- Docker support for clean-environment testing (`docker-compose.test.yml`)
- Each scenario checks: timeout, block completion, judge score, delivery file, WebSearch bypass, Chinese content ratio

v2026.04.04.3 | Urgency: Medium | Date: 4/4/2026
What's Changed:
- fix: Block 4 hang prevention + test coverage by @0xmariowu in https://github.com/0xmariowu/Autosearch/pull/38
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.04.04.2...v2026.04.04.3

v2026.04.04.2 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.04.04.1...v2026.04.04.2

v2026.04.04.1 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.4.8...v2026.04.04.1

v2026.4.8 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.4.7...v2026.4.8

v2026.4.7 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.4.6...v2026.4.7

v2026.4.6 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.4.5...v2026.4.6

v2026.4.5 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/compare/v2026.4.4-1...v2026.4.5

v2026.4.4-1 | Urgency: Medium | Date: 4/4/2026
**Full Changelog**: https://github.com/0xmariowu/Autosearch/commits/v2026.4.4-1

v2026.4.4 | Urgency: Medium | Date: 4/4/2026

v2026.4.3 | Urgency: Medium | Date: 4/3/2026
- **AutoSearch is now a Claude Code Plugin.** Install with `/plugin marketplace add 0xmariowu/autosearch` + `/plugin install autosearch@autosearch`. Full plugin structure: commands, agents, skills, hooks, marketplace.json. Why: making AutoSearch distributable to any Claude Code user.
- **32 search channels as independent plugins.** Each channel is a directory with SKILL.md (capability profile) + search.py (search implementation). Channels auto-discovered by convention-based loader. Why: each cha…

Similar Packages

  • Agent-Reach (main@2026-04-21): Equip AI agents with internet access to gather real-time data from restricted or hard-to-reach online sources.
  • LLM-Wiki (main@2026-04-18): Autonomous knowledge base plugin for Claude Code - captures research, ideas, and decisions into an interlinked wiki with research-on-miss, semantic search, and a Wikipedia-style web UI. Knowledge compou…
  • AgentLint (v0.8.5): Lint your repo for AI agent compatibility.
  • open-cowork (v3.3.0): Open-source AI agent desktop app for Windows & macOS. One-click install Claude Code, MCP tools, and Skills, with sandbox isolation, multi-model support, and Feishu/Slack integration.
  • flow-next (flow-next-v0.29.4): Plan-first AI workflow plugin for Claude Code, OpenAI Codex, and Factory Droid. Zero-dep task tracking, worker subagents, Ralph autonomous mode, cross-model reviews.