prompt-os

Description

A desktop AI agent that controls your local machine: it runs commands, manages files, executes code, and browses the web autonomously. Supports Claude, GPT, Gemini, Llama, DeepSeek, and more. An .exe build is available.

README

🖥️ Prompt OS

Prompt OS is a powerful, desktop-based AI agent built in Python. Unlike a simple chatbot, it acts as a true system agent: it can execute terminal commands, manage your files, run code, search the web, and more, all from a sleek local GUI.

"What processes are using the most memory?" → it runs the command and tells you. "Rename all images in my Downloads folder" → it writes and runs the script.



✨ Why Prompt OS?

Most AI assistants just talk. Prompt OS acts. It runs iteratively — thinking, using tools, reading outputs, and refining — until your task is actually done.
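
The iterative loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `TOOLS` registry, the `scripted_model` stand-in for an LLM call, and the `TOOL <name> <arg>` / `DONE <answer>` reply format are all assumptions made up for this example.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "ECHO": lambda arg: arg,
    "UPPER": lambda arg: arg.upper(),
}


def scripted_model(history: list[str]) -> str:
    """Stand-in for an LLM call: requests one tool, then finishes."""
    tool_outputs = [line for line in history if line.startswith("TOOL_OUTPUT:")]
    if not tool_outputs:
        return "TOOL UPPER hello"
    return "DONE " + tool_outputs[-1].removeprefix("TOOL_OUTPUT:").strip()


def run_agent(task: str, model=scripted_model, max_steps: int = 5) -> str:
    """Think, call a tool, read the output, repeat until the model says DONE."""
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        reply = model(history)
        history.append(reply)
        if reply.startswith("DONE "):
            return reply[len("DONE "):]
        # Assumed reply format: "TOOL <name> <argument>"
        _, name, arg = reply.split(" ", 2)
        tool = TOOLS.get(name)
        output = tool(arg) if tool else f"unknown tool {name}"
        # Feed the tool output back so the next model call can refine its answer.
        history.append(f"TOOL_OUTPUT: {output}")
    return "stopped: step limit reached"
```

The `max_steps` cap is a design choice for the sketch: an agent that can run commands should have some bound on how long it iterates without finishing.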


🚀 Features

  • Autonomous Browser Control — Uses browser-use to navigate the web, fill forms, and extract information just like a human.
  • Vision & Screenshot Interpretation — Captures the current screen and uses advanced multimodal models (INTERPRET_SCREENSHOT) to describe windows, apps, and layouts in extreme detail.
  • System Clipboard Manager — The agent can read from and write to your system clipboard (CLIPBOARD_MANAGER) to help you transfer data between applications flawlessly.
  • Native Document Parser — Supports reading and extracting clean text from PDF, DOCX, XLSX, CSV, and HTML files autonomously (READ_FILE).
  • File Pattern Matching (GLOB) — Finds files in directories using glob pattern matching (e.g., *.txt or src/**/*.py) for efficient bulk processing.
  • Content Search (GREP) — Searches for specific patterns or text inside files autonomously, making codebase navigation faster.
  • Shell & PowerShell Support — Executes shell commands via BASH and native Windows commands via POWERSHELL (opt-in preview).
  • Smart Modes — Choose between FAST (efficiency), THINKING (deep reasoning), and PRO (advanced tasks) directly in settings.
  • Multi-Provider LLM Support — Switch between GitHubAI, Groq, OpenRouter, Unclose, Anthropic, OpenAI, Google, Ollama, and LM Studio right from the GUI settings.
  • Web Search & Scraping — The agent autonomously queries DuckDuckGo (SEARCH_WEB) and extracts clean text from webpages.
  • Improved UI Feedback — The status bar now clearly displays which tool is currently running (e.g., "Running Tool: BASH"), giving you full visibility into the agent's actions.
  • Persistent Memory — Stores long-term preferences and context in config/memory.txt, injected automatically into every session.
  • Modern Dark UI — Sleek, responsive desktop interface built with customtkinter featuring an in-app Settings menu for models and API keys.
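
To make the GLOB and GREP features above concrete, here is a small sketch of the same two ideas using only the Python standard library. It is not taken from the Prompt OS source; the function names `glob_files` and `grep_files` and the return shapes are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def glob_files(root, pattern):
    """Return files under root matching a glob pattern such as 'src/**/*.py'."""
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())


def grep_files(paths, regex):
    """Return (file name, line number, line) for every line matching regex."""
    compiled = re.compile(regex)
    hits = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                hits.append((Path(path).name, lineno, line))
    return hits


def demo():
    """Build a throwaway tree, glob the Python files, then grep them for TODO."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src" / "pkg").mkdir(parents=True)
        (root / "src" / "a.py").write_text("import os\nprint('hi')\n", encoding="utf-8")
        (root / "src" / "pkg" / "b.py").write_text("# TODO: tidy up\n", encoding="utf-8")
        (root / "notes.txt").write_text("TODO: call back\n", encoding="utf-8")
        py_files = glob_files(root, "src/**/*.py")
        return [p.name for p in py_files], grep_files(py_files, r"TODO")
```

Calling `demo()` globs only the `.py` files under `src/`, so the TODO in `notes.txt` is not part of the search.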

⚙️ Prerequisites

  • Python 3.10+
  • An active internet connection for cloud API services
  • API keys for one or more supported LLM providers
  • Optional local runtime: Ollama (http://localhost:11434/v1) or LM Studio (http://localhost:1234/v1)
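
Before selecting a local provider, it can help to check that the runtime is actually listening. The sketch below probes the two base URLs listed above. It assumes the runtime exposes the OpenAI-compatible `GET /models` route under that base URL; the helper names `models_url` and `is_reachable` are hypothetical and not part of Prompt OS.

```python
import http.client
import urllib.error
import urllib.request

# Base URLs as listed in the prerequisites.
LOCAL_RUNTIMES = {
    "Ollama": "http://localhost:11434/v1",
    "LM Studio": "http://localhost:1234/v1",
}


def models_url(base_url):
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def is_reachable(base_url, timeout=1.0):
    """Return True if the model-listing endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL, or a broken response.
        return False


if __name__ == "__main__":
    for name, base in LOCAL_RUNTIMES.items():
        print(f"{name}: {'up' if is_reachable(base) else 'not reachable'}")
```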

🛠️ Installation

1. Clone the repository:

git clone https://github.com/thomastschinkel/prompt-os.git
cd prompt-os

2. Install dependencies:

pip install -r requirements.txt

3. Configure API keys:

Launch the app (python main.py) and click the Settings gear icon (⚙️) at the top left. Select your desired provider, pick a model, and choose your preferred Mode (FAST, THINKING, or PRO).

Type or paste your API key directly into the secure input box. Click Save to update config/keys.json and config/settings.json.
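
For readers curious what a Save step like this might do on disk, here is a minimal sketch. The README names the two files but not their layout, so the JSON schema below (a provider-to-key mapping in keys.json, and `provider`/`model`/`mode` fields in settings.json) is an assumption, as are the `save_config` helper and the placeholder model and key values.

```python
import json
import tempfile
from pathlib import Path

MODES = {"FAST", "THINKING", "PRO"}  # mode names taken from the README


def save_config(config_dir, provider, model, mode, api_key):
    """Write keys.json and settings.json; the JSON layout is an assumption."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)

    # Merge the new key into any keys that were saved earlier.
    keys_path = config_dir / "keys.json"
    keys = json.loads(keys_path.read_text(encoding="utf-8")) if keys_path.exists() else {}
    keys[provider] = api_key
    keys_path.write_text(json.dumps(keys, indent=2), encoding="utf-8")

    # Settings hold only the current selection, so they are overwritten.
    settings = {"provider": provider, "model": model, "mode": mode}
    (config_dir / "settings.json").write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings


def demo():
    """Save two providers in a temporary directory and read the files back."""
    with tempfile.TemporaryDirectory() as tmp:
        config_dir = Path(tmp) / "config"
        save_config(config_dir, "Groq", "example-model", "FAST", "placeholder-key")
        save_config(config_dir, "OpenAI", "example-model", "PRO", "another-placeholder")
        keys = json.loads((config_dir / "keys.json").read_text(encoding="utf-8"))
        settings = json.loads((config_dir / "settings.json").read_text(encoding="utf-8"))
        return keys, settings
```

Note that keys stored this way sit in plain text on disk, so config/keys.json should be kept out of version control and shared folders.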

The built-in Unclose provider is completely free and needs no API key, and Google (Gemini) offers a free tier. The local providers Ollama and LM Studio do not require paid cloud APIs.


💡 Usage

python main.py

| Step | Action |
|---|---|
| 1. Pick a model | Click the ⚙️ icon to select your LLM provider and default model, enter the API key, and click Save |
| 2. Type a request | e.g. "What's eating my CPU right now?" or "Create a script to organize my Desktop" |
| 3. Watch it work | The agent thinks iteratively, uses tools, and streams updates in real time |


📁 Project Structure

prompt-os/
├── main.py               # Entry point, UI layer, and tool execution loop
├── src/                  # Core application logic
│   ├── ai.py             # LLM engine, conversation history, API integrations, and Browser-Use setup
│   └── utils.py          # Web search, audio recording, terminal output helpers
├── assets/               # Images, icons, and static UI assets
└── config/               # Configuration and memory files
    ├── keys.json         # API key configuration
    ├── settings.json     # App state (provider, model, mode)
    ├── memory.txt        # Persistent long-term memory
    └── prompt.txt        # System instructions defining agent behavior
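
One plausible way the two text files in config/ could be combined at session start is sketched below. The README only says that memory.txt is injected into every session; the `build_system_prompt` helper, the "Long-term memory:" label, and the concatenation order are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def build_system_prompt(config_dir):
    """Combine prompt.txt with memory.txt (if present) into one system prompt."""
    config_dir = Path(config_dir)

    def read(name):
        path = config_dir / name
        return path.read_text(encoding="utf-8").strip() if path.exists() else ""

    instructions = read("prompt.txt")
    memory = read("memory.txt")
    if not memory:
        return instructions
    # Assumed layout: instructions first, then remembered facts under a label.
    return f"{instructions}\n\nLong-term memory:\n{memory}"


def demo():
    """Build the prompt before and after a memory file exists."""
    with tempfile.TemporaryDirectory() as tmp:
        config_dir = Path(tmp)
        (config_dir / "prompt.txt").write_text("You are a desktop agent.\n", encoding="utf-8")
        without_memory = build_system_prompt(config_dir)
        (config_dir / "memory.txt").write_text("User prefers metric units.\n", encoding="utf-8")
        with_memory = build_system_prompt(config_dir)
        return without_memory, with_memory
```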

🤖 Supported Providers

| Provider | Model | Free? | Requires Key? |
|---|---|---|---|
| Google | Gemini 3.1 Pro/Flash | ✅ (free tier) | Yes |
| GitHubAI | GPT-4o Mini / Phi-4 / Llama-3.3 | ✅ (with GitHub account) | Yes |
| Groq | LLaMA 3.3 70B / Qwen | ✅ (free tier) | Yes |
| OpenRouter | Qwen 3.6+ / Kimi / Claude | ✅ (free tier) | Yes |
| Unclose | DeepSeek R1 14B / Qwen3-VL | ✅ Completely free | No |
| Anthropic | Claude 4.5/4.6 Family | ❌ Paid API | Yes |
| OpenAI | GPT-4o / GPT-5 | ❌ Paid API | Yes |
| Ollama | llama3.2 / qwen3 / deepseek-r1 | ✅ Local | Optional |
| LM Studio | Any locally served OpenAI-compatible model | ✅ Local | Optional |

📜 License

Distributed under the MIT License.


🙌 Contributing

Contributions, issues, and feature requests are welcome! Feel free to open an issue or submit a pull request.

⭐ Support

If you find this project useful, please consider giving it a Star on GitHub. It helps the project grow and keeps its maintainer motivated!

Release History

| Version | Changes | Urgency | Date |
|---|---|---|---|
| v1.0.0 | Latest release: v1.0.0 | High | 4/21/2026 |
| v1.0.0-beta.4 | Added an installer that installs the exe, adds it to the Desktop, can start it immediately, and more. | High | 4/20/2026 |
| v1.0.0-beta.3 | Update highlights. Improved GUI: modern design inspired by leading AI companies. Chat Context: expanded from "tasks" to full conversational context. Better Rendering: superior text and font rendering for better readability. Persistent Chats: conversations are now permanently saved. | High | 4/19/2026 |
| v1.0.0-beta.2 | New tools and shell update. Shell Transition: removed `cmd`; added `powershell` and `shell`. File System: added `glob` (pattern search) and `grep` (text search). Vision: `READ_FILE` now supports images. New Tool: `INTERPRET_SCREENSHOT`, which captures and analyzes screens directly. Intelligence and models. Optimized Prompt: clearer instructions for better agent performance. Custom Models: support for custom Model IDs and manual selection. | High | 4/17/2026 |
| v1.0.0-beta.1 | First beta; could contain some bugs 🐛 | High | 4/14/2026 |

Similar Packages

  • GENesis-AGI (v3.0a8): Autonomous AI agent with persistent memory, self-learning, and earned autonomy. Cognitive partner that remembers, learns, and evolves.
  • tsunami (main@2026-04-21): Autonomous AI agent that builds full-stack apps. Local models. No cloud. No API keys. Runs on your hardware.
  • forgegod (main@2026-04-19): Autonomous coding agent with web research (Recon), adversarial plan debate, 5-tier cognitive memory, multi-model routing (Gemini + DeepSeek + Ollama), 24/7 loops, and $0 local mode. Apache 2.0.
  • Auto-Use (V1.0): Auto-Use Computer Use. Drives your OS and browser, scours the web, writes your code. One agent, end to end.
  • zai-shell (v9.0.3): Command Line telepathy. An Autonomous AI Agent for your Terminal that turns intent into Execution (Windows/Linux/Mac).