freshcrate

Search results for "speech"

102 results found
docling📁2.90.0🏛️ Flagship58,310

SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.

faster-whisper📁1.2.1🏛️ Flagship22,327

Faster Whisper transcription with CTranslate2

elevenlabs📁2.44.0🌳 Mature2,935

No description

weasel📁1.0.0🌿 Growing93

Weasel: A small and easy workflow system

ai-powered-video-analyzer📁0.0.0🌿 Growing71

An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Oll

onyx📁v3.2.6🏛️ Flagship27,905

Open Source AI Platform - AI Chat with advanced features that works with every LLM

casibase📁v1.771.3🌳 Mature4,500

⚡️AI Cloud OS: Open-source enterprise-level AI knowledge base and MCP (model-context-protocol)/A2A (agent-to-agent) management platform with admin UI, user management and Single-Sign-On⚡️, supports Ch

minutes📁v0.13.3🌳 Mature1,116

Every meeting, every idea, every voice note — searchable by your AI. Open-source, privacy-first conversation memory layer.

lobehub📁v2.1.53-canary.9🏛️ Flagship75,446

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effo

edgecrab📁v0.8.0🌱 Seedling38

EdgeCrab 🦀 A Super Powerful Personal Assistant inspired by NousHermes and OpenClaw — Rust-native, blazing-fast terminal UI, ReAct tool loop, multi-provider LLM support, ACP protocol, gateway adapters

voicemode📁v8.6.1🌳 Mature1,103

Natural (2-way) voice conversations with Claude Code

npcpy📁v1.4.21🌳 Mature1,307

The Python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.

claude-code-plugins-plus-skills📁v4.26.0🌳 Mature1,995

423 plugins, 2,849 skills, 177 agents for Claude Code. Open-source marketplace at tonsofskills.com with the ccpi CLI package manager.

jarvis📁v1.28.0🌿 Growing300

Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects

cherry-studio📁v1.9.2🏛️ Flagship43,992

AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs

agentscope📁v1.0.19🏛️ Flagship24,189

Build and run agents you can see, understand and trust.

skales📁v10.0.4🌳 Mature831

Your local AI Desktop Agent for Windows, macOS & Linux. Agent Skills (SKILL.md), autonomous coding (Codework), multi-agent teams, desktop automation, 15+ AI providers, Desktop Buddy. No Docker, no ter

mesh-llm📁v0.64.0🌳 Mature834

Distributed AI/LLM for the people. Share compute privately or publicly to power your agents and chat.

Auto-claude-code-research-in-sleep📁v0.4.4🏛️ Flagship7,173

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works wi

OmniRoute📁v3.6.9🌳 Mature3,250

OmniRoute is an AI gateway for multi-provider LLMs: an OpenAI-compatible endpoint with smart routing, load balancing, retries, and fallbacks. Add policies, rate limits, caching, and observability for

litellm📁v1.83.7-stable🏛️ Flagship44,168

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropi

Tigrimos📁v1.3.1🌿 Growing76

A self-hosted AI workspace with chat, code execution, parallel multi-agent orchestration, and a skill marketplace. Runs on macOS and Windows. Everything executes inside a secure Ubuntu sandbox — no Do

opencode-telegram-bot📁v0.17.0🌿 Growing499

OpenCode mobile client via Telegram: run and monitor AI coding tasks from your phone while everything runs locally on your machine. Scheduled tasks support. Can be used as lightweight OpenClaw alterna

WeKnora📁v0.4.0🏛️ Flagship13,971

LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

oh-my-pi📁v14.1.2🌳 Mature3,285

⌥ AI Coding agent for the terminal — hash-anchored edits, optimized tool harness, LSP, Python, browser, subagents, and more

agent-zero📁v1.9🏛️ Flagship17,142

Agent Zero AI framework

CoWork-OS📁v0.5.35🌿 Growing240

Operating System for your personal AI Agents with Security-first approach. Multi-channel (WhatsApp, Telegram, Discord, Slack, iMessage), multi-provider (Claude, GPT, Gemini, Ollama), fully self-hosted

vllm-mlx📁v0.2.8🌳 Mature917

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac

voltagent📁@voltagent/server-elysia@2.0.7🏛️ Flagship8,380

AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework

simplechat📁v0.241.006🌿 Growing129

Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context, group spaces for shared insight. Classify docs, reuse prompts, and extend with modular features

vmlx📁v1.3.34🌿 Growing348

vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth

chak-ai📁v0.3.1🌿 Growing212

A simple, yet handy, LLM gateway.

ai-forge-mcp📁0.0.0🌿 Growing51

565 AI-callable tools across 16 MCP servers. Full-pipeline AAA game asset production. Controls Blender, Substance Suite, Maya, Houdini, and Unreal Engine 5. 50 specialized AI agents. One prompt in, ga

UGTLive📁0.0.0🌿 Growing75

An easy to use GUI-based tool that performs live translations using OCR and LLMs (Either cloud or local only)

LocalAI📁v4.1.3🏛️ Flagship45,672

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

resonant📁v2.1.1🌱 Seedling27

Open-source relational AI framework with identity persistence, memory, and MCP integration. Build relationship-aware AI agents that remember, grow, and maintain continuity. Built on Claude Agent SDK.

claude-code-ultimate-guide📁guide-export-v3.38.3🌳 Mature3,789

A tremendous feat of documentation, this guide covers Claude Code from beginner to power user, with production-ready templates for Claude Code features, guides on agentic workflows, and a lot of great

anything-llm📁v1.12.0🏛️ Flagship58,708

The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.

ten-framework📁0.11.63🏛️ Flagship10,435

Open-source framework for conversational voice AI agents

dify📁1.13.3🏛️ Flagship138,659

Production-ready platform for agentic workflow development.

txtai📁v9.7.0🏛️ Flagship12,412

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

py-gpt📁v2.7.12🌳 Mature1,738

Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, spe

Autonomous-Agents📁main@2026-04-16🌿 Growing1,232

Autonomous Agents (LLMs) research papers. Updated Daily.

VideoGraphAI📁0.0.0🌿 Growing57

🎬 AI-powered YouTube Shorts automation tool using LLMs, real-time search, and text-to-speech. Create engaging short-form videos with automated research, voiceovers, and subtitles.

cerul📁v0.0.03🌿 Growing127

The video search layer for AI agents. Search video by meaning — across speech, visuals, and on-screen text.

PythonClaw📁0.0.0🌱 Seedling23

OpenClaw reimagined in pure Python — autonomous AI agent with memory, RAG, skills, web dashboard, voice input, daemon, and multi-channel support.

cyllama📁0.2.11🌱 Seedling25

A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

miniclaw-os📁v0.1.9🌱 Seedling39

We gave AI agents a brain. Memory, planning, continuity, and self-repair — the missing cognitive architecture layer. Runs on your Mac.

orbit📁v2.6.6🌿 Growing250

One API for 20+ LLM providers, your databases, and your files — self-hosted, open-source AI gateway with RAG, voice, and guardrails.

RAPTOR📁0.0.0🌱 Seedling14

RAPTOR (Robust AI-Powered Toolkit for Operational Robots) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis

DuckyClaw📁v1.1.0🌿 Growing129

Edge-Hardware (SoC/MCU) oriented Claw🦞

Awesome-World-Models📁main@2026-04-21🌿 Growing1,542

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

awesome-opensource-ai📁main@2026-04-20🌿 Growing2,849

Curated list of the best truly open-source AI projects, models, tools, and infrastructure.

tools📁main@2026-04-20🌿 Growing1,636

Assorted useful tools, almost entirely generated using LLMs

awesome-ai-tools📁main@2026-04-19🌿 Growing390

🔴 VERY LARGE AI TOOL LIST! 🔴 Curated list of AI Tools - Updated 2026

VisionClaw-Agent-Public-Release📁v0.1.1🌱 Seedling10

Open-source multi-tenant AI agent platform — 14 specialized agents, 195+ tools, 37+ AI models. Self-hosted. Fork and deploy your own AI operations team.

agenticSeek📁main@2026-04-11🌿 Growing26,028

Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin993

MakerAi📁master@2026-04-11🌿 Growing163

The AI Operating System for Delphi. 100% native framework with RAG 2.0 for knowledge retrieval, autonomous agents with semantic memory, visual workflow orchestration, and universal LLM connector. Supp

Agent-World-Protocol📁main@2026-04-10🌱 Seedling38

The open world for autonomous AI agents on Solana. Trade. Build. Fight. Earn. Explore. Connect your AI agent to a persistent shared world. Trade real SOL, build structures, form guilds, fight for terri

kai📁v1.4.0🌱 Seedling29

Agentic AI assistant on Telegram, powered by Claude Code. Runs locally with shell access, spec-driven PR reviews, layered security, persistent memory, and scheduled jobs. Your machine, your data, your

TomoriBot📁v0.7.904🌱 Seedling34

A highly customizable personal AI assistant for Discord featuring smart agentic AI features such as memory, personas, tool usage, and more! | Complete with long-term memory, personas, and tool integration. A next-generation "autonomous AI agent" Discord bot!

Cogitator-AI📁main@2026-04-21🌱 Seedling36

🤖 Kubernetes for AI Agents. Self-hosted, production-grade runtime for orchestrating LLM swarms and autonomous agents. TypeScript-native.

mayros📁v0.3.2🌱 Seedling10

Production-ready AI agent framework — semantic memory, multi-agent mesh, MCP server, intelligent routing, governance, and 67+ platform integrations.

APT📁2.9.16.0🌳 Mature773

AI Productivity Tool - Free and open source, improve user productivity, and protect privacy and data security. Including but not limited to: built-in local exclusive ChatGPT, DeepSeek, Phi, Qwen and o

heurist-agent-framework📁0.0.0🌱 Seedling798

A flexible multi-interface AI agent framework for building agents with reasoning, tool use, memory, deep research, blockchain interaction, MCP, and agents-as-a-service.

Awesome-GPT-Image-2-API-Prompts📁0.0.0🌱 Seedling1,530

Curated GPT-Image-2 prompts for the OpenAI API — portraits, posters, UI mockups, game screenshots, character sheets, and more. Ready-to-use prompts for gpt-image-2.

Ultimate-Agent-Directory📁0.0.0🌱 Seedling51

🤖 The most comprehensive directory of AI agent frameworks, platforms, tools, and resources - hundreds of curated entries covering open-source, no-code, enterprise, and autonomous solutions. NEW Boil

kernel📁v3.97.0🌱 Seedling12

kbot — the AI agent that dreams, learns, and evolves. 764+ tools, 35 agents, 20 providers. Music production, iPhone control, financial analysis, cyber threat intel. Always-on daemon. Runs offline. npm

awesome-gpt-image-1.5📁main@2026-04-21🌱 Seedling19

🎨 100+ selected GPT Image 1.5 prompts with images, multilingual support, and instant gallery preview. Open-source prompt engineering library

radio-gateway📁v3.3.0🌱 Seedling5

Ham radio & GMRS gateway, repeater and packet radio — bridges two-way radios to Mumble, Broadcastify, and the internet. AIOC USB, RSPduo dual SDR, TH-9800/D75/KV4P CAT control, AI announcements, ADS-B

CoexistAI📁v2.6💤 Dormant470

CoexistAI is a modular, developer-friendly research assistant framework. It enables you to build, search, summarize, and automate research workflows using LLMs, web search, Reddit, YouTube, and mappi

Open-Sable📁v1.7.0🌱 Seedling19

Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int

hermes-life-os📁v1.3.0🌱 Seedling35

Personal OS agent that learns who you are, detects life patterns, and grows smarter about you every day. Memory + Cron + Atropos RL

apiclaw📁v2.0.0🌱 Seedling7

The API layer for AI agents. Dashboard + 22K APIs + 18 Direct Call providers. MCP native.

DreamServer📁v2.0.0🌿 Growing443

Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.

Somi📁Mineralization🌱 Seedling20

Local-first AI agent framework with GUI, memory, web search, personality constructs, speech i/o, tools, skills, CLI & Telegram features — fully self-hosted via Ollama.

LLM-Agent-Paper-daily📁main@2026-04-21🌱 Seedling20

Automatically update LLM-Agent papers daily using GitHub Actions (updates every 12 hours)

ben📁v1.0.0🌱 Seedling31

Ben — an autonomous digital entity that lives on Crustocean

cloneme📁0.0.0💤 Dormant38

CloneMe is an advanced AI platform that builds your digital twin—an AI that chats like you, remembers details, and supports multiple platforms. Customizable, memory-driven, and hot-reloadable, it's th

ryvos📁v0.9.0🌱 Seedling2

Open-source autonomous AI assistant with 5-tier security, 62 tools, 14 LLM providers. Written in Rust. Single binary.

second-brain📁1.0🌱 Seedling461

Second Brain is a desktop application that acts as a personal knowledge base, using retrieval-augmented generation (RAG), multimodal AI models, and a hybrid lexical/semantic search algorithm to intera

brain-thing📁v0.5.0🌱 Seedling1

This is a standalone MCP for Claude & Claude Code, which can be used alongside Obsidian MD. But I'm bad at naming things.

react-native-agentic-ai📁main@2026-04-18🌱 Seedling4

Autonomous AI Agent SDK for React Native & Expo — AI reads your live UI, acts via natural language, real-time voice agent (Gemini Live), and AI-powered testing via MCP (Model Context Protocol). One co

agenticchat📁v2.31.0🌱 Seedling2

Turn natural language into executable code — right in your browser. Lightweight AI chat powered by GPT-4o with sandboxed JavaScript execution.

openchatci📁v0.42.0🌱 Seedling1

The localhost AI Agent Runtime -- Chat UI, Tools, RAG, and MCP in one pip install

@100xbot/agent-input📁0.3.22🌱 Seedling

Reusable AI agent chat input bar with mentions, speech, file upload, and workflow review

sofia📁main@2026-04-11🌱 Seedling2

Autonomous local AI assistant in Go — 40+ tools, 20+ LLM providers, multi-agent orchestration, self-improving

AttentiveSupport📁0.0.0💤 Dormant36

LLM-based robot that intervenes only when needed

ai-video-generation-workflow📁main@2026-04-21🌱 Seedling2

Generate reliable short finance explainer videos with script, slides, voice, subtitles, and batch-ready rendering in a stable, modular workflow.

awesome-lark-bots📁main@2026-04-21🌱 Seedling2

Provide open-source AI bots for Lark to automate tasks like brainstorming, project planning, content creation, and monitoring within a secure chat interface.

JianYan📁main@2026-04-21🌱 Seedling2

🎤 Transform speech to text on Windows with fast, local AI processing. Enjoy seamless recording and automatic integration for effective communication.

agentic-news-generator📁main@2026-04-20🌱 Seedling1

Generate a custom newspaper with an AI agent based on your favorite YouTube channels.

Government-Citizen-Services-Voice-Agent📁main@2026-04-15🌱 Seedling1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

seedance-2-ai📁main@2026-04-21🌱 Seedling1

🎥 Generate AI-driven videos with Seedance 2.0, offering precise physics, lip-sync, and prompt accuracy for seamless content creation.

enton📁main@2026-04-21🌱 Seedling1

Builds an autonomous AI robot with vision, voice, and decision-making capabilities using Python, PyTorch, and CUDA technology.

fast-rlm📁main@2026-04-21🌱 Seedling1

Implement Recursive Language Models using Deno and Pyodide to enable scalable, code-driven prompt processing with modular sub-agent calls.

pyannote-metrics📁4.0.0🌱 Seedling

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems