
qwe-qwe

⚡ Lightweight offline AI agent for local models. No cloud, no API keys, just your GPU.


AI agent optimized for small local models

Built for Qwen 9B & Gemma 4B on a gaming laptop. No cloud required.

Quick Start • Why Small Models • Interfaces • Tool Search • Tools • Skills • MCP • Telegram • Doctor



What is qwe-qwe?

A personal AI agent designed to squeeze maximum capability out of small local models (4-9B parameters). Chat via terminal, browser, or Telegram, with tools, semantic memory, browser control, MCP integration, scheduled tasks, and a customizable personality.

Optimized for Qwen 3.5 9B and Gemma 4 E4B running on a single consumer GPU (4-8GB VRAM). Cloud providers are supported as a fallback, but the architecture, prompts, and tool system are built for the constraints of small models.

Philosophy: every token is expensive. Don't make the model smarter; make the system around it smarter. Tool search, compact prompts, retry loops, JSON repair, and self-checks compensate for what the model lacks.

Why Small Models

| | Cloud (GPT, Claude) | Local (Qwen 9B) |
|---|---|---|
| Latency | 2-10s network + inference | 1-5s local inference |
| Privacy | Data leaves your machine | Everything stays local |
| Cost | $20-200/month | Free after GPU purchase |
| Offline | No | Works without internet |
| Customization | System prompt only | Full control over everything |
| Reliability | API outages, rate limits | Always available |

qwe-qwe makes the trade-off worth it by working with the model's limitations instead of fighting them.

Quick Start

Prerequisites

  • Python 3.11+
  • LM Studio or Ollama with a loaded model
  • Recommended models:
    • Qwen 3.5 9B Q4_K_M (~5.5GB): best for tool calling and agents
    • Gemma 4 E4B-IT (~4GB): fast, good for simple tasks
  • Embeddings: FastEmbed (ONNX, local), multilingual-MiniLM (384d, 50+ languages) + SPLADE++

Install

Runs natively on Linux, macOS (Intel & Apple Silicon), and Windows 10/11. A single `pip install -e .` pulls every runtime dep (including MarkItDown, python-docx/pptx, openpyxl, pdfminer.six, pypdf, fastembed, qdrant-client, uvicorn).

🐧 Linux / 🍎 macOS (one-liner)

curl -fsSL https://raw.githubusercontent.com/deepfounder-ai/qwe-qwe/main/install.sh | bash

This clones the repo, creates a venv, installs everything, verifies critical deps, pre-downloads the embedding model, and drops qwe-qwe on your $PATH.

🪟 Windows

git clone https://github.com/deepfounder-ai/qwe-qwe.git
cd qwe-qwe
setup.bat

On Windows, shell commands are routed through Git Bash (auto-detected at install time; install Git for Windows if missing) and fall back to cmd.exe if Git Bash is not found.

Manual (any platform)

git clone https://github.com/deepfounder-ai/qwe-qwe.git
cd qwe-qwe

# Create venv
python3 -m venv .venv            # or `python -m venv .venv` on Windows
source .venv/bin/activate        # macOS/Linux
# .venv\Scripts\activate         # Windows PowerShell / cmd

# Install package + all runtime deps
pip install -e .

# Verify everything is wired
qwe-qwe --doctor

Update an existing install

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/deepfounder-ai/qwe-qwe/main/install.sh | bash

# Any platform, inside the checkout:
git pull && pip install -e . --upgrade

The update script is idempotent: re-running it detects an existing checkout and refreshes deps.

Run

qwe-qwe              # terminal chat
qwe-qwe --web        # web UI at http://localhost:7860
qwe-qwe --doctor     # check everything works

LM Studio / Ollama are auto-detected on localhost during setup. If your server is on another machine:

export QWE_LLM_URL=http://<your-ip>:1234/v1

Recommended hardware

| Component | Minimum | Recommended |
|---|---|---|
| GPU | 4GB VRAM (4B Q4) | 8GB VRAM (9B Q4_K_M) |
| RAM | 8GB | 16GB |
| Storage | 10GB | 20GB (models + memory) |

Works on: gaming laptops, desktop GPUs (RTX 3060+), Mac M1+ (via Ollama).

Architecture

                               +-- Qdrant (semantic memory, hybrid search)
CLI (terminal)  <--+           +-- RAG (file indexing & search)
Web UI (browser) <--+-- Agent -+-- SQLite (history, threads, state)
Telegram bot    <--/    Loop   +-- Tools (8 core + tool_search)
                        |      +-- Skills (7 built-in, user-creatable)
                        |      +-- Browser (Playwright/Chromium)
                        |      +-- MCP (external tool servers)
                        |      +-- Scheduler (cron tasks)
                        |      +-- Vault (encrypted secrets)
                        v
                   LLM (local or cloud)
                   7 providers supported

Small-model optimizations

  • Tool Search: only 8 core tools loaded by default (~750 tokens); the model calls tool_search("keyword") to activate more. Saves 75% of tool tokens vs loading all 46 tools
  • Compact system prompt (~1200 tokens): no redundant tool descriptions
  • JSON repair engine: fixes malformed tool calls (trailing commas, unclosed brackets, single quotes)
  • Anti-hedge nudge: if the model talks instead of acting, it gets pushed to use tools
  • Self-check validation: validates tool args before execution, with required-field checks
  • Smart compaction: summarizes old messages when the context fills up, saves them to memory
  • Stuck detection: warns the model after 5+ tool errors per turn
  • Experience learning: the agent remembers past task outcomes and adapts strategies
  • Shell via Git Bash: UNIX commands work on Windows, auto-detected
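
The JSON repair idea can be illustrated with a small sketch that handles exactly the failure modes listed above (trailing commas, unclosed brackets, single quotes). This is an illustrative sketch, not the project's actual repair engine:

```python
import json
import re

def repair_json(text: str) -> dict:
    """Best-effort repair of malformed JSON tool calls from small models."""
    s = text.strip()
    # Strip markdown code fences the model sometimes wraps around JSON
    s = re.sub(r"^```(?:json)?\s*|\s*```$", "", s)
    # Single quotes -> double quotes (crude: only when no double quotes exist)
    if '"' not in s:
        s = s.replace("'", '"')
    # Drop trailing commas before } or ]
    s = re.sub(r",\s*([}\]])", r"\1", s)
    # Balance unclosed brackets/braces by appending the missing closers
    s += "]" * (s.count("[") - s.count("]"))
    s += "}" * (s.count("{") - s.count("}"))
    return json.loads(s)
```

Each heuristic is cheap and order-dependent: fences first, quotes next, trailing commas, then bracket balancing, so a call like `{"tool": "shell", "args": {"cmd": "ls",}}` parses cleanly instead of failing the turn.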

Interfaces

Web UI

qwe-qwe --web                    # http://localhost:7860
qwe-qwe --web --ssl --port 7861  # HTTPS (needed for mic/camera)

Premium single-file SPA with zero runtime JS dependencies (no React, no CDN build). Linear / Vercel / Anthropic-Console aesthetic with a Geist + Instrument Serif + Geist Mono type stack.

Shell

  • 56-px icon rail (left) → chat / memory / scheduler / presets / settings
  • 264-px thread list with rename + delete inline actions
  • Editorial chat canvas (centered, 780 px)
  • Right-side Inspector: context-window gauge, INPUT / OUTPUT token cards, sparkbars (tokens-per-turn), recalled memories (/api/knowledge/search on last user prompt), active tools, latency bars
  • ⌘K command palette + Gmail-style Alt+letter nav shortcuts
  • Keyboard cheatsheet modal (Shift+?)

Chat fidelity

  • Streaming without flicker: in-place DOM patches, targeted updates, never a full re-render during a turn
  • Tool calls grouped by 11 categories (memory / knowledge / files / shell / browser / web / vision / voice / automation / skills / orchestration), each expandable for full JSON input + output
  • Markdown rendering (H1–H6, bold / italic / strike, inline code, blockquote, lists, links)
  • Code blocks with line-number gutter, filename + language label, copy button
  • Thinking block as collapsible <details> after the turn ends
  • Regenerate = clean restart: the server deletes the last user→assistant turn so the model has no idea it's a regeneration
  • Persistent attachments: images + files saved to message meta, survive server restart

Memory / Knowledge

  • Drag-drop upload supporting 50+ formats (see Knowledge ingest)
  • URL scraping via MarkItDown
  • Folder scan: preview + batch index
  • Interactive knowledge graph (force-directed SVG) with hover edge highlights + search filter

Mobile

  • iPhone safe-area insets on all 4 sides
  • Bottom tab bar replaces rail
  • Slide-in drawer for thread list
  • Composer textarea at 16 px (no iOS auto-zoom)
  • 100dvh viewport, honors URL bar + home indicator

Settings: 17 tabs grouped into Agent / I/O / Automation / System (Model, Soul, Tools, Memory, Voice, Camera, Telegram, MCP, Heartbeat, Inference, Network, Privacy, Appearance, Advanced, Account). Advanced sub-tabs expose all 30+ EDITABLE_SETTINGS as forms. The Abort button stops runaway turns; a login modal handles password-protected installs.

Terminal (CLI)

qwe-qwe

Rich-formatted terminal chat with 20+ slash commands: /soul, /skills, /memory, /model, /thread, /cron, /logs, /stats, /doctor and more.

Telegram Bot

Full mobile access: streaming responses, slash commands, topic-to-thread mapping, image support, formatted messages. Setup guide below.

Tool Search

qwe-qwe uses a meta-tool architecture to minimize token usage. Only 8 core tools are loaded by default:

| Core Tool | Purpose |
|---|---|
| memory_search | Search saved memories |
| memory_save | Save to long-term memory |
| read_file | Read file contents |
| write_file | Write/create files |
| shell | Run bash commands |
| http_request | HTTP requests to any API |
| spawn_task | Run tasks in background |
| tool_search | Discover & activate more tools |

When the model needs more capabilities, it calls tool_search("browser") or tool_search("notes"), which activates the relevant tools for that turn.

Keywords: browser, notes, schedule, secret, mcp, profile, rag, skill, soul, timer, model, cron

This saves ~3000 tokens per request compared to loading all 46 tools.
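
The meta-tool pattern above can be sketched in a few lines. The registry contents, keyword tags, and function signatures below are illustrative assumptions, not the project's actual code:

```python
# Hypothetical registry mapping extension tools to activation keywords.
# Tool names follow the README's tool tables; the lookup logic is a sketch.
TOOL_REGISTRY = {
    "browser_open":  {"browser", "web"},
    "browser_click": {"browser"},
    "create_note":   {"notes"},
    "list_notes":    {"notes"},
    "schedule_task": {"schedule", "cron"},
    "secret_get":    {"secret", "vault"},
}

CORE_TOOLS = ["memory_search", "memory_save", "read_file", "write_file",
              "shell", "http_request", "spawn_task", "tool_search"]

def tool_search(keyword: str) -> list[str]:
    """Return the extra tools to activate for this turn."""
    kw = keyword.lower().strip()
    return sorted(name for name, tags in TOOL_REGISTRY.items() if kw in tags)

def active_tools(activated: list[str]) -> list[str]:
    """Schema list sent to the model: always the 8 core tools + activations."""
    return CORE_TOOLS + activated
```

Only the schemas in `active_tools(...)` ever reach the model's context, which is where the token savings come from.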

Tools

46 tools total across core + extensions + skills:

| Category | Tools | Loaded |
|---|---|---|
| Memory | memory_search, memory_save, memory_delete | Core |
| Files & Shell | read_file, write_file, shell | Core |
| HTTP | http_request | Core |
| Tasks | spawn_task, schedule_task, list_cron, remove_cron | Core + Search |
| Vault | secret_save, secret_get, secret_list, secret_delete | Search |
| RAG | rag_index, rag_search, rag_status | Search |
| Browser | browser_open, browser_snapshot, browser_screenshot, browser_click, browser_fill, browser_eval, browser_close | Search |
| Notes | create_note, list_notes, read_note, edit_note, delete_note | Search |
| Model | switch_model | Search |
| Profile | user_profile_update, user_profile_get | Search |

Skills

Pluggable skill system: built-in skills, plus you can create your own from chat:

| Skill | Description |
|---|---|
| browser | Web browsing via Playwright (open, read, click, screenshot) |
| mcp_manager | Manage MCP tool servers (add, remove, restart) |
| skill_creator | Create new skills from chat (multi-step LLM pipeline) |
| soul_editor | AI-assisted personality tuning |
| notes | Note management |
| timer | Countdown timers |
| weather | Weather reports via wttr.in |

Creating skills from chat

You: create a skill for tracking my daily habits
Agent: Skill 'habit_tracker' generation started...
       plan -> tools -> code -> validate -> Created and enabled! (3 tools, 45s)

Browser

Built-in browser control via Playwright + headless Chromium:

You: open google.com and search for "qwen 3.5 benchmarks"
Agent: [tool_search("browser")] -> [browser_open] -> [browser_snapshot]
       Found results: ...

Tools: browser_open, browser_snapshot, browser_screenshot, browser_click, browser_fill, browser_eval, browser_close

Activated via tool_search("browser"). The agent can navigate pages, read content, fill forms, click buttons, and take screenshots.

MCP

Model Context Protocol: connect external tool servers to extend the agent's capabilities:

You: add MCP server for filesystem access
Agent: [tool_search("mcp")] -> [mcp_add_server] Added 'filesystem' (14 tools)

Supports stdio (subprocess) and HTTP transports. Configured via Settings > System > MCP Servers or through chat using the mcp_manager skill.

MCP tools appear as mcp__servername__toolname and are automatically available through tool_search.

Providers

The primary target is local models via LM Studio or Ollama. Cloud providers are supported as a fallback:

| Provider | Type | Notes |
|---|---|---|
| LM Studio | Local | Primary target. Auto-loads models |
| Ollama | Local | Standard Ollama API |
| OpenAI | Cloud | GPT-4o, GPT-4.1, etc. |
| OpenRouter | Cloud | Multi-model gateway |
| Groq | Cloud | Fast inference |
| Together | Cloud | Open-source models |
| DeepSeek | Cloud | DeepSeek models |

Switch on the fly via /model (CLI/Telegram) or Settings (Web UI). Auto-discovers available models.

Knowledge ingest

The knowledge base ingests 50+ formats via Microsoft MarkItDown (primary) with stdlib fallbacks, all pinned as hard deps so there is no silent degradation on fresh installs:

| Category | Formats |
|---|---|
| Documents | PDF · DOCX · PPTX · XLSX · EPUB · ODT · RTF · Jupyter notebooks (.ipynb) |
| Web | HTML · any https://… URL (MarkItDown handles fetch + markdown conversion) |
| Data | JSON · CSV · TSV · YAML · TOML · XML · INI · ENV |
| Code | Python, JS/TS, Go, Rust, Java/Kotlin/Scala, C/C++, Ruby, PHP, SQL, GraphQL, 40+ extensions total |
| Markup | Markdown · reStructuredText · AsciiDoc · TeX |
| Images | PNG · JPG · WEBP (via vision pipeline) |

Three ways to ingest

  1. Drop or pick files: Memory tab upload-zone → batch upload + index
  2. Paste URL: POST /api/knowledge/url fetches, converts to markdown, indexes under a source:url tag
  3. Scan folder: preview first (lists indexable files with size/method), then index all in one pass

Each source is stored under ~/.qwe-qwe/uploads/kb/<slug>_<name>, chunked into ~800-char pieces, embedded + dense-vector-indexed in Qdrant, and queued for the nightly synthesis job that extracts entities + wiki pages from the content.

Memory & Knowledge Graph

Three-layer knowledge system in a single Qdrant collection:

Layer 1: RAW           Layer 2: ENTITIES        Layer 3: WIKI
(saved immediately)    (night synthesis)        (night synthesis)

"FastAPI uses       -> [FastAPI] --uses-->      "FastAPI is a modern
 Pydantic for          [Pydantic]               Python framework that
 validation..."        [Python]                  uses Pydantic for
                       [Starlette]               automatic validation..."

How it works

During the day (fast, no LLM cost):

  • Agent saves facts and knowledge via memory_save
  • Long texts (>1000 chars) auto-chunked into ~800 char pieces
  • Each chunk tagged synthesis_status=pending
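
The auto-chunking described above (~800-char pieces on sentence boundaries) could look roughly like this; the splitting regex and the one-sentence overlap policy are assumptions for illustration:

```python
import re

def chunk_text(text: str, max_chars: int = 800, overlap: int = 1) -> list[str]:
    """Split long text into ~max_chars chunks on sentence boundaries,
    carrying `overlap` trailing sentences into the next chunk."""
    if len(text) <= max_chars:
        return [text]
    # Naive sentence split: break after ., !, ? followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    for sent in sentences:
        if current and sum(len(s) + 1 for s in current) + len(sent) > max_chars:
            chunks.append(" ".join(current))
            current = current[-overlap:]  # sentence overlap between chunks
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries (rather than a fixed byte offset) keeps each chunk self-contained, which matters for embedding quality; the overlap preserves context across chunk borders.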

At night (configurable cron, default 03:00):

  • Synthesis worker processes pending queue
  • LLM extracts entities + relations from chunks
  • Creates entity nodes with typed relations (uses, built_on, part_of, etc.)
  • Generates wiki summaries stored as searchable chunks
  • Writes markdown to ~/.qwe-qwe/wiki/ as human-readable backup

During search (enriched context):

  • Wiki chunks found first (synthesized = higher quality embeddings)
  • Entity relations expanded (follow links to related knowledge)
  • Raw chunks provide specifics
  • Result: synthesized + structured + raw knowledge in one query

Features

  • Hybrid search: FastEmbed dense (384d, 50+ languages) + SPLADE++ sparse, fused via RRF
  • Auto-chunking: long texts split on sentence boundaries with overlap
  • Knowledge graph: entities with typed relations, built automatically
  • Wiki pages: synthesized markdown, searchable and human-readable
  • Graph visualization: interactive force-directed graph in Web UI (Knowledge > Graph tab)
  • Thread isolation: each conversation has its own memory context
  • Smart compaction: old messages summarized and saved to memory when context fills
  • Auto-context: wiki + entities + memories injected into each turn
  • Experience learning: past task outcomes inform future strategies
  • Modes: in-memory (testing), disk (default), or remote Qdrant server
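
The RRF fusion step in the hybrid search is small enough to show directly. This is the standard Reciprocal Rank Fusion formula with the conventional k = 60; how the project actually wires it into Qdrant's dense and sparse results is not shown here:

```python
def rrf_fuse(dense: list[str], sparse: list[str], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion of two ranked ID lists.
    Each hit contributes 1/(k + rank); scores are summed across rankings."""
    scores: dict[str, float] = {}
    for ranking in (dense, sparse):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization, which is exactly why it works for fusing dense cosine scores with sparse SPLADE scores that live on different scales.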

Scheduler

Cron-like task scheduling with natural syntax:

"in 5m"        -> run once in 5 minutes
"every 2h"     -> repeat every 2 hours
"daily 09:00"  -> every day at 09:00
"14:30"        -> once today/tomorrow at 14:30

Results delivered to Telegram and Web UI. Simple reminders bypass LLM for instant delivery.

Telegram Bot Setup

  1. Create a bot via @BotFather -> copy the token
  2. Set the token: /telegram token <TOKEN> (CLI) or Settings -> Telegram (Web)
  3. Start the bot: /telegram start
  4. Generate activation code: /telegram activate
  5. Send the 6-digit code to your bot in Telegram

Security

  • One-time 6-digit codes, expire in 10 minutes
  • 3 wrong attempts -> permanent ban (by Telegram user ID)
  • Only verified owner can chat with the bot
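
The activation flow described above can be sketched as follows. The in-memory state and function names are illustrative assumptions; the real implementation presumably persists codes, attempts, and bans:

```python
import secrets
import time

CODE_TTL = 600     # one-time codes expire after 10 minutes
MAX_ATTEMPTS = 3   # 3 wrong attempts -> ban by Telegram user ID

_pending: dict[str, float] = {}   # code -> issue timestamp
_attempts: dict[int, int] = {}
_banned: set[int] = set()

def issue_code() -> str:
    """Generate a one-time 6-digit activation code."""
    code = f"{secrets.randbelow(1_000_000):06d}"
    _pending[code] = time.time()
    return code

def verify(user_id: int, code: str) -> bool:
    """Accept a code once within its TTL; ban the user after repeated failures."""
    if user_id in _banned:
        return False
    issued = _pending.get(code)
    if issued is not None and time.time() - issued < CODE_TTL:
        del _pending[code]        # one-time: consumed on success
        return True
    _attempts[user_id] = _attempts.get(user_id, 0) + 1
    if _attempts[user_id] >= MAX_ATTEMPTS:
        _banned.add(user_id)
    return False
```

Using `secrets` (not `random`) matters here: activation codes are security tokens, and consuming the code on first use prevents replay.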

Telegram Features

  • Streaming responses via editMessageText
  • Topic isolation: supergroup topics -> separate threads
  • Formatted messages: MarkdownV2 with HTML fallback
  • Image support: send images for vision analysis
  • Cron results: scheduled task output delivered to chat
  • 12 slash commands: /status, /model, /soul, /skills, /memory, /threads, /stats, /cron, /thinking, /doctor, /clear, /help

Personality (Soul)

8 adjustable traits (low / moderate / high):

| Trait | Low | High |
|---|---|---|
| humor | serious | jokes around |
| honesty | diplomatic | brutally honest |
| curiosity | answers questions | asks follow-ups |
| brevity | verbose | concise |
| formality | casual | formal |
| proactivity | waits for requests | suggests ideas |
| empathy | rational | empathetic |
| creativity | practical | unconventional |

Plus custom traits, agent name, and language selection. Edit via /soul (CLI), Settings (Web), or /soul (Telegram).

Diagnostics

qwe-qwe --doctor

Checks 20+ system components: Python, dependencies, SQLite, Qdrant, provider, LLM API, model loaded, embeddings, inference latency, agent loop v2, MCP servers, browser skill, Telegram, threads, skills, tools, cron/heartbeat, STT/TTS, files indexed, knowledge graph (entities/wiki), synthesis cron, BM25 index, disk space, logs.

Config

Environment variables:

QWE_LLM_URL=http://localhost:1234/v1   # LLM server URL
QWE_LLM_MODEL=qwen/qwen3.5-9b          # Model name
QWE_LLM_KEY=lm-studio                  # API key
QWE_DB_PATH=~/.qwe-qwe/qwe_qwe.db      # Database path
QWE_DATA_DIR=~/.qwe-qwe                # Where threads / memory / uploads live
QWE_QDRANT_MODE=disk                   # memory | disk | server
QWE_PASSWORD=                          # Web UI password (shows login modal if set)
QWE_STT_DEVICE=cpu                     # STT inference device (cpu | cuda)

Everything else (30+ knobs: context_budget, rag_chunk_size, synthesis_time, tts_api_url, etc.) lives in Settings → Advanced → Settings and persists in SQLite.
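
A sketch of how the documented QWE_* variables might be read with their README defaults; the field names mirror the variables above and are not the real config.py:

```python
import os
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Config:
    # Defaults taken from the table above
    llm_url: str = "http://localhost:1234/v1"
    llm_model: str = "qwen/qwen3.5-9b"
    llm_key: str = "lm-studio"
    data_dir: Path = Path("~/.qwe-qwe")
    qdrant_mode: str = "disk"

    @classmethod
    def from_env(cls) -> "Config":
        env = os.environ.get
        return cls(
            llm_url=env("QWE_LLM_URL", cls.llm_url),
            llm_model=env("QWE_LLM_MODEL", cls.llm_model),
            llm_key=env("QWE_LLM_KEY", cls.llm_key),
            data_dir=Path(env("QWE_DATA_DIR", str(cls.data_dir))).expanduser(),
            qdrant_mode=env("QWE_QDRANT_MODE", cls.qdrant_mode),
        )
```

Environment variables override the defaults per process, while the 30+ SQLite-backed knobs persist across restarts; both layers compose at startup.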

Data layout

All user data in ~/.qwe-qwe/ (configurable via QWE_DATA_DIR):

qwe_qwe.db        SQLite β€” messages, threads, KV, settings
memory/           Qdrant vectors (disk mode)
wiki/             Synthesized markdown pages
skills/           User-created skills
uploads/          Images, documents, camera captures
  kb/             Knowledge-base files awaiting / done indexing
workspace/        Default CWD for relative paths (switches per-preset)
presets/<id>/     Installed presets (each with own workspace/, knowledge/, skills/)
logs/             qwe-qwe.log (INFO+), errors.log (WARNING+)

Docker

docker compose up

LM Studio / Ollama should be running on the host. Persistent data in ./data/.

Project Structure

cli.py            Terminal interface + entry point
server.py         FastAPI web server + WebSocket
agent.py          Core agent loop + JSON repair + self-check
config.py         Settings (env-configurable)
db.py             SQLite storage (WAL mode)
memory.py         Qdrant semantic memory (hybrid search)
rag.py            RAG file indexing & search
tools.py          Tool definitions + tool_search + execution
mcp_client.py     Model Context Protocol client
providers.py      Multi-provider LLM management
soul.py           Personality system + prompt generation
tasks.py          Background task runner
scheduler.py      Cron-like scheduler
threads.py        Thread management
telegram_bot.py   Telegram bot integration
vault.py          Encrypted secrets (Fernet)
logger.py         Structured logging
skills/           Pluggable skill modules
  browser.py      Web browsing (Playwright)
  mcp_manager.py  MCP server management
  skill_creator.py Skill generation pipeline
  soul_editor.py  Personality editing
  notes.py        Note management
  timer.py        Countdown timers
  weather.py      Weather reports
static/           Web UI (single-file HTML/CSS/JS)

Community

Join our Telegram community: @qwe_qwe_ai

License

MIT


Built with care by DeepFounder

Release History

| Version | Date | Urgency | Highlights |
|---|---|---|---|
| v0.17.6 | 4/21/2026 | High | Soul save fix (traits no longer silently wiped) + ground-truth core tools list |
| v0.17.5 | 4/21/2026 | High | Secrets list not refreshing after save |
| v0.17.4 | 4/21/2026 | High | Preset route-ordering fix in server.py |
| v0.17.3 | 4/21/2026 | High | Advanced Settings rendering fix (values shown as [object Object]) |
| v0.17.2 | 4/21/2026 | High | MCP client audited and hardened; live-tested with @modelcontextprotocol/server-memory |
| v0.17.1 | 4/21/2026 | High | Mobile layout fixes (hamburger visibility, settings inputs) |
| v0.17.0 | 4/21/2026 | High | Premium web UI: single-file vanilla-JS shell, zero runtime JS dependencies |
| v0.16.0 | 4/20/2026 | High | Preset workspace isolation (browser-profile style) + code cleanup |
| v0.15.1 | 4/19/2026 | High | Detailed logging for the preset install pipeline |
| v0.15.0 | 4/19/2026 | High | Visible browser mode + new open_url tool |
| v0.14.3 | 4/19/2026 | High | All artificial tool-round limits removed; loop detection remains |
| v0.14.2 | 4/19/2026 | High | Detailed tool history saved to DB + unlimited rounds |
| v0.14.1 | 4/19/2026 | High | Bugfixes (missing imports, FastEmbed cache, Windows console) + camera & vision |
| v0.14.0 | 4/18/2026 | High | Agent persistence for long multi-step tasks (result clearing, structured compaction) |
| v0.13.0 | 4/18/2026 | High | 5-layer reliable tool calling for local models |
| v0.11.0 | 4/11/2026 | Medium | Web UI redesign, attachment context fix, KB upload-from-computer flow |
| v0.10.0 | 4/10/2026 | Medium | Agent Loop v2 + unified memory and RAG |
| v0.9.0 | 4/6/2026 | Medium | Knowledge graph: three-layer memory + night synthesis |
| v0.8.0 | 4/4/2026 | Medium | Tool search meta-tool, browser skill, Gemma 4 support |
| v0.6.0 | 4/3/2026 | Medium | Streaming across Web UI, CLI, and Telegram |
| v0.5.0 | 3/20/2026 | Low | Hybrid search + fully local FastEmbed embeddings |
| v0.4.0 | 3/18/2026 | Low | Security hardening, experience learning, fallback, inference wizard |
| v0.3.1 | 3/17/2026 | Low | Full changelog on GitHub |
| v0.3.0 | 3/16/2026 | Low | Full changelog on GitHub |

