Built for Qwen 9B & Gemma 4B on a gaming laptop. No cloud required.
Quick Start · Why Small Models · Interfaces · Tool Search · Tools · Skills · MCP · Telegram · Doctor
A personal AI agent designed to squeeze maximum capability out of small local models (4-9B parameters). Chat via terminal, browser, or Telegram, with tools, semantic memory, browser control, MCP integration, scheduled tasks, and a customizable personality.
Optimized for Qwen 3.5 9B and Gemma 4 E4B running on a single consumer GPU (4-8GB VRAM). Cloud providers supported as fallback, but the architecture, prompts, and tool system are built for the constraints of small models.
Philosophy: every token is expensive. Don't make the model smarter; make the system around it smarter. Tool search, compact prompts, retry loops, JSON repair, and self-checks compensate for what the model lacks.
| | Cloud (GPT, Claude) | Local (Qwen 9B) |
|---|---|---|
| Latency | 2-10s network + inference | 1-5s local inference |
| Privacy | Data leaves your machine | Everything stays local |
| Cost | $20-200/month | Free after GPU purchase |
| Offline | No | Works without internet |
| Customization | System prompt only | Full control over everything |
| Reliability | API outages, rate limits | Always available |
qwe-qwe makes the trade-off worth it by working with the model's limitations instead of fighting them.
- Python 3.11+
- LM Studio or Ollama with a loaded model
- Recommended models:
  - Qwen 3.5 9B Q4_K_M (~5.5GB): best for tool calling and agents
  - Gemma 4 E4B-IT (~4GB): fast, good for simple tasks
- Embeddings: FastEmbed (ONNX, local) with multilingual-MiniLM (384d, 50+ languages) + SPLADE++

Runs natively on Linux, macOS (Intel & Apple Silicon), and Windows 10/11. A single `pip install -e .` pulls every runtime dependency (including MarkItDown, python-docx/pptx, openpyxl, pdfminer.six, pypdf, fastembed, qdrant-client, uvicorn).
```bash
curl -fsSL https://raw.githubusercontent.com/deepfounder-ai/qwe-qwe/main/install.sh | bash
```

This clones the repo, creates a venv, installs everything, verifies critical deps, pre-downloads the embedding model, and drops `qwe-qwe` on your `$PATH`.
```bash
git clone https://github.com/deepfounder-ai/qwe-qwe.git
cd qwe-qwe
setup.bat
```

On Windows, shell commands are routed through Git Bash (auto-detected at install time; install Git for Windows if missing). Falls back to cmd.exe if not found.
```bash
git clone https://github.com/deepfounder-ai/qwe-qwe.git
cd qwe-qwe

# Create venv
python3 -m venv .venv       # or `python -m venv .venv` on Windows
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows PowerShell / cmd

# Install package + all runtime deps
pip install -e .

# Verify everything is wired
qwe-qwe --doctor
```
```bash
# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/deepfounder-ai/qwe-qwe/main/install.sh | bash

# Any platform, inside the checkout:
git pull && pip install -e . --upgrade
```

The update script is idempotent: re-running it detects an existing checkout and refreshes deps.
```bash
qwe-qwe           # terminal chat
qwe-qwe --web     # web UI at http://localhost:7860
qwe-qwe --doctor  # check everything works
```

LM Studio / Ollama are auto-detected on localhost during setup. If your server is on another machine:

```bash
export QWE_LLM_URL=http://<your-ip>:1234/v1
```

| Component | Minimum | Recommended |
|---|---|---|
| GPU | 4GB VRAM (4B Q4) | 8GB VRAM (9B Q4_K_M) |
| RAM | 8GB | 16GB |
| Storage | 10GB | 20GB (models + memory) |
Works on: gaming laptops, desktop GPUs (RTX 3060+), Mac M1+ (via Ollama).
```
CLI (terminal)   <--\          +-- Qdrant (semantic memory, hybrid search)
Web UI (browser) <---+--Agent--+-- RAG (file indexing & search)
Telegram bot     <--/   Loop   +-- SQLite (history, threads, state)
                         |     +-- Tools (8 core + tool_search)
                         |     +-- Skills (7 built-in, user-creatable)
                         |     +-- Browser (Playwright/Chromium)
                         |     +-- MCP (external tool servers)
                         |     +-- Scheduler (cron tasks)
                         |     +-- Vault (encrypted secrets)
                         v
                 LLM (local or cloud)
                 7 providers supported
```
- Tool Search: only 8 core tools loaded by default (~750 tokens); the model calls `tool_search("keyword")` to activate more. Saves 75% of tool tokens vs loading all 46 tools
- Compact system prompt (~1200 tokens): no redundant tool descriptions
- JSON repair engine: fixes malformed tool calls (trailing commas, unclosed brackets, single quotes)
- Anti-hedge nudge: if the model talks instead of acting, it gets pushed to use tools
- Self-check validation: validates tool args before execution, with required-field checks
- Smart compaction: summarizes old messages when context fills up, saves them to memory
- Stuck detection: warns the model after 5+ tool errors per turn
- Experience learning: the agent remembers past task outcomes and adapts strategies
- Shell via Git Bash: UNIX commands work on Windows, auto-detected
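The JSON repair idea above can be illustrated with a toy version: normalize quotes, close open brackets, then strip trailing commas. This is a minimal sketch, not the actual agent.py implementation:

```python
import json
import re

def repair_json(raw: str) -> dict:
    """Best-effort repair of malformed JSON from a small model.

    Handles the three failure modes named above: single quotes,
    unclosed brackets/braces, and trailing commas. (Illustrative only.)
    """
    s = raw.strip()
    # 1. Convert single-quoted strings to double-quoted ones.
    s = re.sub(r"'([^']*)'", r'"\1"', s)
    # 2. Append closers for any brackets/braces left open.
    stack = []
    for ch in s:
        if ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()
    s += "".join(reversed(stack))
    # 3. Drop trailing commas that now sit before a closer.
    s = re.sub(r",\s*([}\]])", r"\1", s)
    return json.loads(s)

print(repair_json("{'name': 'shell', 'args': {'cmd': 'ls',}"))
# -> {'name': 'shell', 'args': {'cmd': 'ls'}}
```

A real repair engine also has to cope with quotes inside strings and prose wrapped around the JSON, which this sketch deliberately ignores.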
```bash
qwe-qwe --web                   # http://localhost:7860
qwe-qwe --web --ssl --port 7861 # HTTPS (needed for mic/camera)
```

Premium single-file SPA with zero runtime JS dependencies (no React, no CDN build). Linear / Vercel / Anthropic-Console aesthetic with a Geist + Instrument Serif + Geist Mono type stack.
Shell
- 56-px icon rail (left): chat / memory / scheduler / presets / settings
- 264-px thread list with inline rename + delete actions
- Editorial chat canvas (centered, 780 px)
- Right-side Inspector: context-window gauge, INPUT / OUTPUT token cards, sparkbars (tokens-per-turn), recalled memories (`/api/knowledge/search` on the last user prompt), active tools, latency bars
- ⌘K command palette + Gmail-style Alt+letter nav shortcuts
- Keyboard cheatsheet modal (`Shift+?`)
Chat fidelity
- Streaming without flicker: in-place DOM patches, targeted updates, never a full re-render during a turn
- Tool calls grouped by 11 categories (memory / knowledge / files / shell / browser / web / vision / voice / automation / skills / orchestration), each expandable for full JSON input + output
- Markdown rendering (H1–H6, bold / italic / strike, inline code, blockquote, lists, links)
- Code blocks with line-number gutter, filename + language label, copy button
- Thinking block as a collapsible `<details>` after the turn ends
- Regenerate = clean restart: the server deletes the last user + assistant turn so the model has no idea it's a regeneration
- Persistent attachments: images + files saved to message meta, survive server restart
Memory / Knowledge
- Drag-drop upload supporting 50+ formats (see Knowledge ingest)
- URL scraping via MarkItDown
- Folder scan: preview + batch index
- Interactive knowledge graph (force-directed SVG) with hover edge highlights + search filter
Mobile
- iPhone safe-area insets on all 4 sides
- Bottom tab bar replaces rail
- Slide-in drawer for thread list
- Composer textarea at 16 px (no iOS auto-zoom)
- `100dvh` viewport, honors URL bar + home indicator
Settings: 17 tabs grouped into Agent / I/O / Automation / System (Model, Soul, Tools, Memory, Voice, Camera, Telegram, MCP, Heartbeat, Inference, Network, Privacy, Appearance, Advanced, Account). Advanced sub-tabs expose all 30+ EDITABLE_SETTINGS as forms. The Abort button stops runaway turns; a login modal handles password-protected installs.
```bash
qwe-qwe
```

Rich-formatted terminal chat with 20+ slash commands: `/soul`, `/skills`, `/memory`, `/model`, `/thread`, `/cron`, `/logs`, `/stats`, `/doctor`, and more.
Full mobile access: streaming responses, slash commands, topic-to-thread mapping, image support, formatted messages. Setup guide below.
qwe-qwe uses a meta-tool architecture to minimize token usage. Only 8 core tools are loaded by default:
| Core Tool | Purpose |
|---|---|
| `memory_search` | Search saved memories |
| `memory_save` | Save to long-term memory |
| `read_file` | Read file contents |
| `write_file` | Write/create files |
| `shell` | Run bash commands |
| `http_request` | HTTP requests to any API |
| `spawn_task` | Run tasks in background |
| `tool_search` | Discover & activate more tools |
When the model needs more capabilities, it calls `tool_search("browser")` or `tool_search("notes")`, which activates the relevant tools for that turn.
Keywords: browser, notes, schedule, secret, mcp, profile, rag, skill, soul, timer, model, cron
This saves ~3000 tokens per request compared to loading all 46 tools.
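A minimal sketch of the meta-tool pattern (tool names mirror the tables in this README; the activation logic is illustrative, not the actual tools.py implementation):

```python
# Core tool names always exposed to the model (from the table above).
CORE = {"memory_search", "memory_save", "read_file", "write_file",
        "shell", "http_request", "spawn_task", "tool_search"}

# Keyword -> extension tools; a subset for illustration.
EXTENSIONS = {
    "browser": ["browser_open", "browser_snapshot", "browser_click"],
    "notes": ["create_note", "list_notes", "read_note"],
    "cron": ["schedule_task", "list_cron", "remove_cron"],
}

def tool_search(keyword: str, active: set) -> list:
    """Activate the extension tools matching a keyword for this turn."""
    found = EXTENSIONS.get(keyword, [])
    active.update(found)
    return found

active = set(CORE)               # only core schemas are in the prompt...
tool_search("browser", active)   # ...until the model asks for more
assert "browser_open" in active
```

Because only the activated subset is serialized into the prompt, the token cost scales with what the model actually uses rather than with the full catalog.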
46 tools total across core + extensions + skills:
| Category | Tools | Loaded |
|---|---|---|
| Memory | `memory_search`, `memory_save`, `memory_delete` | Core |
| Files & Shell | `read_file`, `write_file`, `shell` | Core |
| HTTP | `http_request` | Core |
| Tasks | `spawn_task`, `schedule_task`, `list_cron`, `remove_cron` | Core + Search |
| Vault | `secret_save`, `secret_get`, `secret_list`, `secret_delete` | Search |
| RAG | `rag_index`, `rag_search`, `rag_status` | Search |
| Browser | `browser_open`, `browser_snapshot`, `browser_screenshot`, `browser_click`, `browser_fill`, `browser_eval`, `browser_close` | Search |
| Notes | `create_note`, `list_notes`, `read_note`, `edit_note`, `delete_note` | Search |
| Model | `switch_model` | Search |
| Profile | `user_profile_update`, `user_profile_get` | Search |
Pluggable skill system: built-in skills, plus the ability to create your own from chat:
| Skill | Description |
|---|---|
| `browser` | Web browsing via Playwright (open, read, click, screenshot) |
| `mcp_manager` | Manage MCP tool servers (add, remove, restart) |
| `skill_creator` | Create new skills from chat (multi-step LLM pipeline) |
| `soul_editor` | AI-assisted personality tuning |
| `notes` | Note management |
| `timer` | Countdown timers |
| `weather` | Weather reports via wttr.in |
```
You: create a skill for tracking my daily habits
Agent: Skill 'habit_tracker' generation started...
       plan -> tools -> code -> validate -> Created and enabled! (3 tools, 45s)
```
Built-in browser control via Playwright + headless Chromium:
```
You: open google.com and search for "qwen 3.5 benchmarks"
Agent: [tool_search("browser")] -> [browser_open] -> [browser_snapshot]
       Found results: ...
```
Tools: browser_open, browser_snapshot, browser_screenshot, browser_click, browser_fill, browser_eval, browser_close
Activated via `tool_search("browser")`. The agent can navigate pages, read content, fill forms, click buttons, and take screenshots.
Model Context Protocol: connect external tool servers to extend the agent's capabilities:
```
You: add MCP server for filesystem access
Agent: [tool_search("mcp")] -> [mcp_add_server] Added 'filesystem' (14 tools)
```
Supports stdio (subprocess) and HTTP transports. Configured via Settings > System > MCP Servers or through chat using the mcp_manager skill.
MCP tools appear as `mcp__servername__toolname` and are automatically available through `tool_search`.
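Given that naming convention, splitting a namespaced MCP tool name back into its server and tool parts is a one-liner (an illustrative helper, not part of the codebase):

```python
def split_mcp_name(name: str) -> tuple:
    """Split the mcp__<server>__<tool> naming convention into parts.

    maxsplit=2 keeps tool names that contain "__" intact.
    """
    prefix, server, tool = name.split("__", 2)
    if prefix != "mcp":
        raise ValueError(f"not an MCP tool name: {name!r}")
    return server, tool

print(split_mcp_name("mcp__filesystem__read_file"))
# -> ('filesystem', 'read_file')
```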
Primary target is local models via LM Studio or Ollama. Cloud providers supported as fallback:
| Provider | Type | Notes |
|---|---|---|
| LM Studio | Local | Primary target. Auto-loads models |
| Ollama | Local | Standard Ollama API |
| OpenAI | Cloud | GPT-4o, GPT-4.1, etc. |
| OpenRouter | Cloud | Multi-model gateway |
| Groq | Cloud | Fast inference |
| Together | Cloud | Open-source models |
| DeepSeek | Cloud | DeepSeek models |
Switch on the fly via /model (CLI/Telegram) or Settings (Web UI). Auto-discovers available models.
The knowledge base ingests 50+ formats via Microsoft MarkItDown (primary) with stdlib fallbacks (pinned as hard deps, so no silent degradation on fresh installs):
| Category | Formats |
|---|---|
| Documents | PDF · DOCX · PPTX · XLSX · EPUB · ODT · RTF · Jupyter notebooks (.ipynb) |
| Web | HTML · any https://… URL (MarkItDown handles fetch + markdown conversion) |
| Data | JSON · CSV · TSV · YAML · TOML · XML · INI · ENV |
| Code | Python, JS/TS, Go, Rust, Java/Kotlin/Scala, C/C++, Ruby, PHP, SQL, GraphQL, 40+ extensions total |
| Markup | Markdown · reStructuredText · AsciiDoc · TeX |
| Images | PNG · JPG · WEBP (via vision pipeline) |
- Drop or pick files: Memory tab upload zone, batch upload + index
- Paste URL: `POST /api/knowledge/url` fetches, converts to markdown, indexes under the `source:url` tag
- Scan folder: preview first (lists indexable files with size/method), then index all in one pass
Each source is stored under `~/.qwe-qwe/uploads/kb/<slug>_<name>`, chunked into ~800-char pieces, embedded and vector-indexed in Qdrant, and queued for the nightly synthesis job that extracts entities + wiki pages from the content.
Three-layer knowledge system in a single Qdrant collection:
```
Layer 1: RAW           Layer 2: ENTITIES        Layer 3: WIKI
(saved immediately)    (night synthesis)        (night synthesis)

"FastAPI uses          [FastAPI] --uses-->      "FastAPI is a modern
 Pydantic for          [Pydantic]                Python framework that
 validation..."        [Python]                  uses Pydantic for
                       [Starlette]               automatic validation..."
```
During the day (fast, no LLM cost):
- Agent saves facts and knowledge via `memory_save`
- Long texts (>1000 chars) auto-chunked into ~800-char pieces
- Each chunk tagged `synthesis_status=pending`
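The auto-chunking step might look roughly like this (the ~800-char target and overlap are sketch assumptions based on the sizes quoted in this README; the real splitter may differ):

```python
import re

def chunk_text(text: str, target: int = 800, overlap: int = 100) -> list:
    """Split long text on sentence boundaries into ~target-char chunks,
    carrying the tail of each chunk into the next as overlap.
    """
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + len(sent) + 1 > target:
            chunks.append(current)
            current = current[-overlap:]  # overlap preserves cross-chunk context
        current = (current + " " + sent).strip()
    if current:
        chunks.append(current)
    return chunks

parts = chunk_text("All work and no play makes Jack a dull boy. " * 40)
```

Overlap matters because a fact split across a chunk boundary would otherwise be unfindable by either chunk's embedding.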
At night (configurable cron, default 03:00):
- Synthesis worker processes pending queue
- LLM extracts entities + relations from chunks
- Creates entity nodes with typed relations (uses, built_on, part_of, etc.)
- Generates wiki summaries stored as searchable chunks
- Writes markdown to `~/.qwe-qwe/wiki/` as a human-readable backup
During search (enriched context):
- Wiki chunks found first (synthesized = higher quality embeddings)
- Entity relations expanded (follow links to related knowledge)
- Raw chunks provide specifics
- Result: synthesized + structured + raw knowledge in one query
- Hybrid search: FastEmbed dense (384d, 50+ languages) + SPLADE++ sparse, fused via RRF
- Auto-chunking: long texts split on sentence boundaries with overlap
- Knowledge graph: entities with typed relations, built automatically
- Wiki pages: synthesized markdown, searchable and human-readable
- Graph visualization: interactive force-directed graph in Web UI (Knowledge > Graph tab)
- Thread isolation: each conversation has its own memory context
- Smart compaction: old messages summarized and saved to memory when context fills
- Auto-context: wiki + entities + memories injected into each turn
- Experience learning: past task outcomes inform future strategies
- Modes: in-memory (testing), disk (default), or remote Qdrant server
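Reciprocal Rank Fusion (RRF), the fusion step named above, can be sketched in a few lines (k=60 is the conventional constant from the RRF literature, not necessarily what qwe-qwe uses):

```python
def rrf_fuse(dense: list, sparse: list, k: int = 60) -> list:
    """Merge two ranked ID lists into one with Reciprocal Rank Fusion.

    Each hit contributes 1/(k + rank); documents found by both
    retrievers accumulate score from both and rise to the top.
    """
    scores = {}
    for ranking in (dense, sparse):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears in both rankings, so it outranks every single-source hit.
print(rrf_fuse(["a", "b", "c"], ["b", "d"]))
# -> ['b', 'a', 'd', 'c']
```

RRF needs no score normalization, which is exactly why it suits fusing dense cosine scores with sparse SPLADE scores that live on different scales.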
Cron-like task scheduling with natural syntax:
"in 5m" -> run once in 5 minutes
"every 2h" -> repeat every 2 hours
"daily 09:00" -> every day at 09:00
"14:30" -> once today/tomorrow at 14:30
Results delivered to Telegram and Web UI. Simple reminders bypass LLM for instant delivery.
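A sketch of how that natural schedule syntax could be parsed (illustrative only; the actual scheduler.py parser may accept more forms):

```python
import re
from datetime import datetime, timedelta

def parse_when(spec: str, now: datetime):
    """Parse the natural syntax above into (next_run, repeat_interval).

    repeat_interval is None for one-shot tasks.
    """
    units = {"s": "seconds", "m": "minutes", "h": "hours"}
    if m := re.fullmatch(r"in (\d+)([smh])", spec):
        return now + timedelta(**{units[m[2]]: int(m[1])}), None
    if m := re.fullmatch(r"every (\d+)([smh])", spec):
        step = timedelta(**{units[m[2]]: int(m[1])})
        return now + step, step
    if m := re.fullmatch(r"(?:daily )?(\d{1,2}):(\d{2})", spec):
        run = now.replace(hour=int(m[1]), minute=int(m[2]),
                          second=0, microsecond=0)
        if run <= now:
            run += timedelta(days=1)  # time already passed -> tomorrow
        step = timedelta(days=1) if spec.startswith("daily") else None
        return run, step
    raise ValueError(f"unrecognized schedule: {spec!r}")

now = datetime(2025, 1, 1, 12, 0)
assert parse_when("in 5m", now) == (datetime(2025, 1, 1, 12, 5), None)
```

Note how the bare `"14:30"` form reuses the `daily` branch: same time parsing, but without the repeat interval.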
1. Create a bot via @BotFather and copy the token
2. Set the token: `/telegram token <TOKEN>` (CLI) or Settings -> Telegram (Web)
3. Start the bot: `/telegram start`
4. Generate an activation code: `/telegram activate`
5. Send the 6-digit code to your bot in Telegram
- One-time 6-digit codes, expire in 10 minutes
- 3 wrong attempts -> permanent ban (by Telegram user ID)
- Only verified owner can chat with the bot
- Streaming responses via editMessageText
- Topic isolation: supergroup topics -> separate threads
- Formatted messages: MarkdownV2 with HTML fallback
- Image support: send images for vision analysis
- Cron results: scheduled task output delivered to chat
- 12 slash commands: `/status`, `/model`, `/soul`, `/skills`, `/memory`, `/threads`, `/stats`, `/cron`, `/thinking`, `/doctor`, `/clear`, `/help`
8 adjustable traits (low / moderate / high):
| Trait | Low | High |
|---|---|---|
| humor | serious | jokes around |
| honesty | diplomatic | brutally honest |
| curiosity | answers questions | asks follow-ups |
| brevity | verbose | concise |
| formality | casual | formal |
| proactivity | waits for requests | suggests ideas |
| empathy | rational | empathetic |
| creativity | practical | unconventional |
Plus custom traits, agent name, and language selection. Edit via `/soul` (CLI), Settings (Web), or `/soul` (Telegram).
```bash
qwe-qwe --doctor
```

Checks 20+ system components: Python, dependencies, SQLite, Qdrant, provider, LLM API, model loaded, embeddings, inference latency, agent loop v2, MCP servers, browser skill, Telegram, threads, skills, tools, cron/heartbeat, STT/TTS, files indexed, knowledge graph (entities/wiki), synthesis cron, BM25 index, disk space, logs.
Environment variables:
```bash
QWE_LLM_URL=http://localhost:1234/v1  # LLM server URL
QWE_LLM_MODEL=qwen/qwen3.5-9b         # Model name
QWE_LLM_KEY=lm-studio                 # API key
QWE_DB_PATH=~/.qwe-qwe/qwe_qwe.db     # Database path
QWE_DATA_DIR=~/.qwe-qwe               # Where threads / memory / uploads live
QWE_QDRANT_MODE=disk                  # memory | disk | server
QWE_PASSWORD=                         # Web UI password (shows login modal if set)
QWE_STT_DEVICE=cpu                    # STT inference device (cpu | cuda)
```

Everything else (30+ knobs: `context_budget`, `rag_chunk_size`, `synthesis_time`, `tts_api_url`, etc.) lives in Settings > Advanced > Settings and persists in SQLite.
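For reference, reading these variables with their documented defaults might look like this (a sketch; see config.py for the real logic):

```python
import os
from pathlib import Path

# Defaults taken from the table above; names prefixed QWE_ as documented.
LLM_URL = os.environ.get("QWE_LLM_URL", "http://localhost:1234/v1")
DATA_DIR = Path(os.environ.get("QWE_DATA_DIR", "~/.qwe-qwe")).expanduser()
QDRANT_MODE = os.environ.get("QWE_QDRANT_MODE", "disk")

# Fail fast on an invalid mode instead of silently falling back.
if QDRANT_MODE not in {"memory", "disk", "server"}:
    raise ValueError(f"invalid QWE_QDRANT_MODE: {QDRANT_MODE}")
```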
All user data in ~/.qwe-qwe/ (configurable via QWE_DATA_DIR):
```
qwe_qwe.db     SQLite: messages, threads, KV, settings
memory/        Qdrant vectors (disk mode)
wiki/          Synthesized markdown pages
skills/        User-created skills
uploads/       Images, documents, camera captures
  kb/          Knowledge-base files awaiting / done indexing
workspace/     Default CWD for relative paths (switches per-preset)
presets/<id>/  Installed presets (each with own workspace/, knowledge/, skills/)
logs/          qwe-qwe.log (INFO+), errors.log (WARNING+)
```
```bash
docker compose up
```

LM Studio / Ollama should be running on the host. Persistent data lives in `./data/`.
```
cli.py             Terminal interface + entry point
server.py          FastAPI web server + WebSocket
agent.py           Core agent loop + JSON repair + self-check
config.py          Settings (env-configurable)
db.py              SQLite storage (WAL mode)
memory.py          Qdrant semantic memory (hybrid search)
rag.py             RAG file indexing & search
tools.py           Tool definitions + tool_search + execution
mcp_client.py      Model Context Protocol client
providers.py       Multi-provider LLM management
soul.py            Personality system + prompt generation
tasks.py           Background task runner
scheduler.py       Cron-like scheduler
threads.py         Thread management
telegram_bot.py    Telegram bot integration
vault.py           Encrypted secrets (Fernet)
logger.py          Structured logging
skills/            Pluggable skill modules
  browser.py         Web browsing (Playwright)
  mcp_manager.py     MCP server management
  skill_creator.py   Skill generation pipeline
  soul_editor.py     Personality editing
  notes.py           Note management
  timer.py           Countdown timers
  weather.py         Weather reports
static/            Web UI (single-file HTML/CSS/JS)
```
Join our Telegram community: @qwe_qwe_ai
MIT
Built with care by DeepFounder

