freshcrate

Search results for "speech"

50 results found (Python)
docling 2.90.0 🏛️ Flagship ⭐58,310

SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.

faster-whisper 1.2.1 🏛️ Flagship ⭐22,327

Faster Whisper transcription with CTranslate2

elevenlabs 2.44.0 🌳 Mature ⭐2,935

No description

weasel 1.0.0 🌿 Growing ⭐93

Weasel: A small and easy workflow system

ai-powered-video-analyzer 0.0.0 🌿 Growing ⭐71

An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Oll

onyx v3.2.6 🏛️ Flagship ⭐27,905

Open Source AI Platform - AI Chat with advanced features that works with every LLM

voicemode v8.6.1 🌳 Mature ⭐1,103

Natural (2-way) voice conversations with Claude Code

npcpy v1.4.21 🌳 Mature ⭐1,307

The python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.

claude-code-plugins-plus-skills v4.26.0 🌳 Mature ⭐1,995

423 plugins, 2,849 skills, 177 agents for Claude Code. Open-source marketplace at tonsofskills.com with the ccpi CLI package manager.

jarvis v1.28.0 🌿 Growing ⭐300

Your AI assistant that never forgets and runs 100% privately on your computer. Leave it on 24/7 - it learns your preferences, helps with code, manages your health goals, searches the web, and connects

agentscope v1.0.19 🏛️ Flagship ⭐24,189

Build and run agents you can see, understand and trust.

Auto-claude-code-research-in-sleep v0.4.4 🏛️ Flagship ⭐7,173

ARIS ⚔️ (Auto-Research-In-Sleep) - Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in - works wi

litellm v1.83.7-stable 🏛️ Flagship ⭐44,168

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropi

agent-zero v1.9 🏛️ Flagship ⭐17,142

Agent Zero AI framework

vllm-mlx v0.2.8 🌳 Mature ⭐917

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac

simplechat v0.241.006 🌿 Growing ⭐129

Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context, group spaces for shared insight. Classify docs, reuse prompts, and extend with modular features

vmlx v1.3.34 🌿 Growing ⭐348

vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth

chak-ai v0.3.1 🌿 Growing ⭐212

A simple, yet handy, LLM gateway.

ten-framework 0.11.63 🏛️ Flagship ⭐10,435

Open-source framework for conversational voice AI agents

txtai v9.7.0 🏛️ Flagship ⭐12,412

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

py-gpt v2.7.12 🌳 Mature ⭐1,738

Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, spe

VideoGraphAI 0.0.0 🌿 Growing ⭐57

🎬 AI-powered YouTube Shorts automation tool using LLMs, real-time search, and text-to-speech. Create engaging short-form videos with automated research, voiceovers, and subtitles.

PythonClaw 0.0.0 🌱 Seedling ⭐23

OpenClaw reimagined in pure Python: autonomous AI agent with memory, RAG, skills, web dashboard, voice input, daemon, and multi-channel support.

cyllama 0.2.11 🌱 Seedling ⭐25

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

orbit v2.6.6 🌿 Growing ⭐250

One API for 20+ LLM providers, your databases, and your files: self-hosted, open-source AI gateway with RAG, voice, and guardrails.

RAPTOR 0.0.0 🌱 Seedling ⭐14

RAPTOR (Robust AI-Powered Toolkit for Operational Robots) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis

awesome-opensource-ai main@2026-04-20 🌿 Growing ⭐2,849

Curated list of the best truly open-source AI projects, models, tools, and infrastructure.

agenticSeek main@2026-04-11 🌿 Growing ⭐26,028

Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and code for the sole cost of electricity. 🔔 Official updates only via twitter @Martin993

kai v1.4.0 🌱 Seedling ⭐29

Agentic AI assistant on Telegram, powered by Claude Code. Runs locally with shell access, spec-driven PR reviews, layered security, persistent memory, and scheduled jobs. Your machine, your data, your

heurist-agent-framework 0.0.0 🌱 Seedling ⭐798

A flexible multi-interface AI agent framework for building agents with reasoning, tool use, memory, deep research, blockchain interaction, MCP, and agents-as-a-service.

Ultimate-Agent-Directory 0.0.0 🌱 Seedling ⭐51

🤖 The most comprehensive directory of AI agent frameworks, platforms, tools, and resources - hundreds of curated entries covering open-source, no-code, enterprise, and autonomous solutions. NEW Boil

RIGEL 0.0.0 🌱 Seedling ⭐26

A Multi-Agentic AI Assistant/Builder

radio-gateway v3.3.0 🌱 Seedling ⭐5

Ham radio & GMRS gateway, repeater and packet radio - bridges two-way radios to Mumble, Broadcastify, and the internet. AIOC USB, RSPduo dual SDR, TH-9800/D75/KV4P CAT control, AI announcements, ADS-B

Open-Sable v1.7.0 🌱 Seedling ⭐19

Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int

hermes-life-os v1.3.0 🌱 Seedling ⭐35

Personal OS agent that learns who you are, detects life patterns, and grows smarter about you every day. Memory + Cron + Atropos RL

apiclaw v2.0.0 🌱 Seedling ⭐7

The API layer for AI agents. Dashboard + 22K APIs + 18 Direct Call providers. MCP native.

Somi Mineralization 🌱 Seedling ⭐20

Local-first AI agent framework with GUI, memory, web search, personality constructs, speech i/o, tools, skills, CLI & Telegram features - fully self-hosted via Ollama.

LLM-Agent-Paper-daily main@2026-04-21 🌱 Seedling ⭐20

Automatically Update LLM-Agent Papers Daily using Github Actions (Update Every 12th hours)

cloneme 0.0.0 💀 Dormant ⭐38

CloneMe is an advanced AI platform that builds your digital twin, an AI that chats like you, remembers details, and supports multiple platforms. Customizable, memory-driven, and hot-reloadable, it's th

second-brain 1.0 🌱 Seedling ⭐461

Second Brain is a desktop application that acts as a personal knowledge base, using retrieval-augmented generation (RAG), multimodal AI models, and a hybrid lexical/semantic search algorithm to intera

openchatci v0.42.0 🌱 Seedling ⭐1

The localhost AI Agent Runtime -- Chat UI, Tools, RAG, and MCP in one pip install

AttentiveSupport 0.0.0 💀 Dormant ⭐36

llm-based robot that intervenes only when needed

awesome-lark-bots main@2026-04-21 🌱 Seedling ⭐2

Provide open-source AI bots for Lark to automate tasks like brainstorming, project planning, content creation, and monitoring within a secure chat interface.

JianYan main@2026-04-21 🌱 Seedling ⭐2

🎤 Transform speech to text on Windows with fast, local AI processing. Enjoy seamless recording and automatic integration for effective communication.

Government-Citizen-Services-Voice-Agent main@2026-04-15 🌱 Seedling ⭐1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

seedance-2-ai main@2026-04-21 🌱 Seedling ⭐1

🎥 Generate AI-driven videos with Seedance 2.0, offering precise physics, lip-sync, and prompt accuracy for seamless content creation.

enton main@2026-04-21 🌱 Seedling ⭐1

Builds an autonomous AI robot with vision, voice, and decision-making capabilities using Python, PyTorch, and CUDA technology.

pyannote-metrics 4.0.0 🌱 Seedling

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems