Open-source deep research tool for AI coding developers.
Structured coverage across English and Chinese sources, cited markdown reports.
Status • Quick Start • Architecture • Interfaces
Delivery Status
AutoSearch is undergoing a full v2 rewrite (the legacy-v1 tag preserves the v1 state). v2 replaces the monolithic v1 with a modular pipeline (M0–M8) built on strict, source-mapped reuse of proven deep-research projects. Channel adapters (the real data sources) are on the roadmap; the current release ships a DemoChannel placeholder so the end-to-end pipeline can be exercised.
See docs/delivery-status.md for a module-by-module checklist.
Dev install (current; no PyPI release yet):
git clone https://github.com/0xmariowu/Autosearch
cd Autosearch
uv venv --python 3.12
uv pip install -e . --python .venv/bin/python
.venv/bin/autosearch query "your topic"
After the first tagged v2 release:
pipx install autosearch
autosearch query "your topic"
Requirements: Python 3.12+. Set one of ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY, or have the claude CLI on PATH; the LLM layer auto-detects the first available provider.
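The auto-detection described above can be pictured with a minimal sketch. This is not AutoSearch's actual implementation: the function name `detect_provider`, the returned labels, and the precedence order (taken from the order in which the variables are listed) are all assumptions made for illustration.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Hypothetical precedence: mirrors the order the variables are listed in the
# README. The real order inside AutoSearch may differ.
_ENV_PROVIDERS = (
    ("ANTHROPIC_API_KEY", "anthropic"),
    ("OPENAI_API_KEY", "openai"),
    ("GOOGLE_API_KEY", "google"),
)


def detect_provider(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return a label for the first available provider, or None if none is found."""
    env = os.environ if env is None else env
    for var, label in _ENV_PROVIDERS:
        # Treat empty or whitespace-only values as "not set".
        if env.get(var, "").strip():
            return label
    # Fall back to the claude CLI if it is on PATH.
    if which("claude") is not None:
        return "claude-cli"
    return None


if __name__ == "__main__":
    print(detect_provider() or "no provider available")
```

Passing `env` and `which` as parameters keeps the check easy to exercise without touching the real environment.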
Streaming progress to stderr is the default (--stream); suppress it with --no-stream, or use --json for a machine-readable envelope:
autosearch query "retrieval-augmented generation survey" --json
query
↓
M0 Knowledge Recall (known facts + gaps)
M1 Goal Crystallization + Clarify (rubrics + mode)
M2 Search Strategy (subqueries)
M3 Iteration Controller (reflect-on-gaps loop across channels)
M4 Material Cleaner (trafilatura)
M5 Evidence Processor (URL dedup + SimHash + BM25)
M7 Report Synthesizer (outline + per-section + citation remap)
M8 Quality Gate (rubric evaluation, one retry on fail)
↓
markdown + References + Sources breakdown
Observability (CostTracker) and persistence (SessionStore, three-table SQLite schema) are available as Pipeline constructor args.
Each module traces 1:1 to a source in a well-known deep-research project; see docs/delivery-status.md for the mapping.
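To make the M5 step more concrete, here is a minimal sketch of URL deduplication followed by SimHash near-duplicate filtering. It is an illustration under stated assumptions, not the AutoSearch code: the function names, the MD5-based token hash, the 64-bit fingerprint, and the Hamming threshold of 3 are all hypothetical choices, and the BM25 ranking stage is left out.

```python
import hashlib
import re
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint from word tokens (each token has weight 1)."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64 bits of the token hash
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the votes for that position are positive.
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def dedup(docs: Iterable[Tuple[str, str]], max_distance: int = 3) -> List[Tuple[str, str]]:
    """Keep the first (url, text) pair per normalized URL and per near-duplicate text."""
    seen_urls = set()
    fingerprints: List[int] = []
    kept: List[Tuple[str, str]] = []
    for url, text in docs:
        key = normalize_url(url)
        if key in seen_urls:
            continue
        fp = simhash(text)
        if any(hamming(fp, other) <= max_distance for other in fingerprints):
            continue
        seen_urls.add(key)
        fingerprints.append(fp)
        kept.append((url, text))
    return kept
```

The linear scan over earlier fingerprints is fine for a few hundred documents per query; larger collections usually index fingerprints by bit blocks instead.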
AutoSearch runs as:
- CLI: autosearch query "..."
- HTTP + SSE: autosearch serve; POST /search streams typed events (phase/iteration/gap/quality/finished)
- MCP server: autosearch mcp (or the autosearch-mcp console script), which exposes a tool named research to Claude Code, Cursor, and other MCP clients. See docs/mcp-clients.md for per-client config samples.
- Claude Code slash command: /autosearch (ships in commands/autosearch.md)
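A client of the HTTP + SSE interface has to split the stream into typed events. The sketch below parses standard Server-Sent Events framing from an iterable of text lines. It assumes that the event type (phase, iteration, gap, quality, finished) travels in the `event:` field and that each `data:` payload is JSON; neither detail is confirmed by this README, and `iter_sse_events` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, payload) pairs from SSE-framed text lines.

    Assumption: every data payload is a JSON object.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))


if __name__ == "__main__":
    sample = [
        "event: phase",
        'data: {"name": "M2"}',
        "",
        "event: finished",
        'data: {"ok": true}',
        "",
    ]
    for kind, payload in iter_sse_events(sample):
        print(kind, payload)
```

In practice the lines would come from a streaming HTTP response to POST /search; any HTTP client that can iterate over response lines can feed this parser.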
Generated from autosearch/skills/channels/*/SKILL.md. Run .venv/bin/python scripts/generate_channels_table.py after adding or changing a channel.
| Channel | Languages | Description | Typical yield |
|---|---|---|---|
| arxiv | en | Use for academic preprint searches in CS/ML/physics when query is English or mixed and expects peer-reviewed or preprint papers. | medium-high |
| crossref | en | Cross-publisher scholarly search via the Crossref DOI registry, useful for journal articles, book chapters, and citation-linked research metadata. | medium |
| dblp | en | Computer science bibliography search: technical papers, proceedings, and journal articles indexed by venue, author, and year. | medium |
| ddgs | en, mixed | DuckDuckGo Search β free general web search with no auth, use as broad default for any English or mixed query. | medium |
| devto | en, mixed | Developer blog articles tagged by technology topic, via the public dev.to API. | medium |
| github | en | Use for code-level, issue-level, and repository discovery when query involves a library, framework, or implementation detail. | high |
| google_news | en, mixed | Current news headlines aggregated across publishers via Google News RSS (English US feed). | high |
| hackernews | en | Use for real-time developer discussion, tooling opinions, and early-stage product signals from the HN community. | medium-high |
| huggingface_hub | en, mixed | Discover open machine learning models on Hugging Face Hub via the public model search API. | high |
| infoq_cn | zh, mixed | Chinese engineering articles covering architecture, AI, and enterprise tech from InfoQ 中文, via public RSS feed. | medium |
| kr36 | zh, mixed | Chinese tech business news, startup funding, and industry analysis from 36kr. | medium |
| openalex | en, mixed | Search scholarly works through OpenAlex's public works search API with open-access URL fallback. | high |
| package_search | en, mixed | Discover packages across PyPI (exact-name lookup) and npm (full-text search) registries. | medium |
| papers | en, mixed | Multi-source academic paper search (arxiv, pubmed, biorxiv, medrxiv, google_scholar) via paper-search-mcp. | high |
| podcast_cn | zh, mixed | Chinese-language podcasts searchable via the Apple iTunes store public API. | low |
| reddit | en, mixed | Reddit community discussions, user experience reports, and topic debates via the public search.json endpoint. | medium |
| sec_edgar | en | US public company filings (10-K, 10-Q, 8-K) via SEC EDGAR full-text search; financial and regulatory disclosures for research. | medium |
| sogou_weixin | zh, mixed | Chinese WeChat Official Account articles via the public Sogou WeChat search SERP. | high |
| stackoverflow | en, mixed | Programming Q&A with community-voted answers across 200+ technical tags via api.stackexchange.com. | high |
| wikidata | en, mixed | Structured entity data (people, places, concepts) from Wikidata knowledge graph. | medium |
| wikipedia | en, mixed | Authoritative encyclopedia articles via the Wikipedia Action API (English edition). | high |
| Channel | Languages | Required env | Description | Typical yield |
|---|---|---|---|---|
| youtube | en, zh, mixed | env:YOUTUBE_API_KEY | Use for video tutorial discovery, conference talks, technical walkthroughs, and product demos. | medium |
| bilibili | zh, mixed | env:TIKHUB_API_KEY | Chinese tech video platform with tutorials, conference recordings, and uploader-authored articles, via TikHub. | medium |
| douyin | zh | env:TIKHUB_API_KEY | Chinese short-video content with product demos, tech reviews, and viral trends, via TikHub. | medium |
| kuaishou | zh, mixed | env:TIKHUB_API_KEY | Chinese short-video platform with lifestyle, humor, regional culture, and product demos, via TikHub. | medium |
| tiktok | en, mixed | env:TIKHUB_API_KEY | Global short-video platform with creator content, product demos, viral trends, and topical reactions, via TikHub. | medium |
|  | en, mixed | env:TIKHUB_API_KEY | Real-time public discourse including product launches, tech announcements, and breaking news, via TikHub. | medium |
|  | zh, mixed | env:TIKHUB_API_KEY | Chinese microblog platform for real-time opinion, trending topics, and event-level commentary in Chinese discourse, via TikHub. | medium |
| xiaohongshu | zh, mixed | env:TIKHUB_API_KEY | Chinese lifestyle + experience-sharing notes with strong product/beauty/travel/food coverage, via TikHub. | high |
| zhihu | zh, mixed | env:TIKHUB_API_KEY | Chinese Q&A platform with deep technical discussions and user experience reports; use when query is Chinese or mixed and targets developer opinions, comparisons, or tutorials. | medium-high |
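The Languages and Required env columns suggest a simple availability check: a channel is usable for a query when it supports the query language and its key, if any, is set. The sketch below illustrates that idea with a handful of rows copied from the tables. The `Channel` dataclass and `usable_channels` function are hypothetical and do not reflect how AutoSearch selects channels internally.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Channel:
    name: str
    languages: Tuple[str, ...]
    required_env: Optional[str] = None  # None means no key is needed


# A few rows from the tables above, for illustration only.
CHANNELS: Tuple[Channel, ...] = (
    Channel("arxiv", ("en",)),
    Channel("ddgs", ("en", "mixed")),
    Channel("infoq_cn", ("zh", "mixed")),
    Channel("youtube", ("en", "zh", "mixed"), "YOUTUBE_API_KEY"),
    Channel("zhihu", ("zh", "mixed"), "TIKHUB_API_KEY"),
)


def usable_channels(
    language: str,
    env: Mapping[str, str],
    channels: Sequence[Channel] = CHANNELS,
) -> List[str]:
    """Return names of channels that support `language` and have their key set."""
    return [
        c.name
        for c in channels
        if language in c.languages
        and (c.required_env is None or env.get(c.required_env))
    ]


if __name__ == "__main__":
    import os

    print(usable_channels("zh", os.environ))
```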
The v1 architecture (skills/, channels/*/SKILL.md, AVO self-evolution loop) was retired in v2. Contributions targeting the v2 architecture are welcome; the module map in docs/delivery-status.md is the starting point.
MIT.

