moss

Official Repo of Moss

Moss

Real-time semantic search for AI agents. Sub-10 ms.

Website · Docs · Discord · Blog


Moss is the search runtime that lives inside your conversational AI agent.

Index documents, query them semantically, and get results back in under 10 ms - fast enough for real-time conversation.

Quickstart

Python

pip install moss

import asyncio

from moss import MossClient, QueryOptions


async def main() -> None:
    client = MossClient("your_project_id", "your_project_key")

    # Create an index and add documents
    await client.create_index("support-docs", [
        {"id": "1", "text": "Refunds are processed within 3-5 business days."},
        {"id": "2", "text": "You can track your order on the dashboard."},
        {"id": "3", "text": "We offer 24/7 live chat support."},
    ])

    # Load and query - results in <10 ms
    await client.load_index("support-docs")
    results = await client.query("support-docs", "how long do refunds take?", QueryOptions(top_k=3))

    for doc in results.docs:
        print(f"[{doc.score:.3f}] {doc.text}")
    print(f"Returned in {results.time_taken_ms} ms")


# The SDK calls are coroutines, so a plain script needs an event loop to run them
asyncio.run(main())

TypeScript

npm install @moss-dev/moss

import { MossClient } from "@moss-dev/moss";

const client = new MossClient("your_project_id", "your_project_key");

// Create an index and add documents
await client.createIndex("support-docs", [
  { id: "1", text: "Refunds are processed within 3-5 business days." },
  { id: "2", text: "You can track your order on the dashboard." },
  { id: "3", text: "We offer 24/7 live chat support." },
]);

// Load and query - results in <10 ms
await client.loadIndex("support-docs");
const results = await client.query("support-docs", "how long do refunds take?", { topK: 3 });

results.docs.forEach((doc) => {
  console.log(`[${doc.score.toFixed(3)}] ${doc.text}`);
});
console.log(`Returned in ${results.timeTakenInMs} ms`);

Get your project credentials at moss.dev - free tier available.

Why Moss?

Vector databases were built for batch analytics. Moss was built for real-time agents.

If you're building a voice bot, a copilot, or any AI system that talks to humans, you need retrieval that keeps up with conversation. A 200-500 ms round trip to a vector database kills the experience. Moss delivers results in single-digit milliseconds - fast enough that retrieval disappears from the latency budget.

Benchmarks

End-to-end query latency (embedding + search) on 100,000 documents, 750 measured queries, top_k=5. Tested on a MacBook Pro (M4 Pro, 24 GB).

System    P50       P95       P99       Mean
Moss      3.1 ms    4.3 ms    5.4 ms    3.3 ms
Pinecone  432.6 ms  732.1 ms  934.2 ms  485.8 ms
Qdrant    597.6 ms  682.0 ms  771.4 ms  596.5 ms
ChromaDB  351.8 ms  423.5 ms  538.5 ms  358.0 ms

The Moss figures include embedding time; the competitors use an external embedding service (Modal). Pinecone and Qdrant use cloud search.
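
To sanity-check percentile columns like these on your own hardware, the summary statistics can be computed from raw per-query timings with the standard library alone. The sketch below is not the benchmark harness from this repo: `percentile`, `summarize`, and `time_calls` are hypothetical helper names, and the nearest-rank percentile definition is an assumption, since the methodology behind the table is not stated here.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Reduce per-query latencies (in ms) to the four columns used in the table."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "mean": statistics.fmean(samples_ms),
    }


def time_calls(fn: Callable[[], object], runs: int) -> List[float]:
    """Call fn `runs` times and return the wall-clock duration of each call in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples
```

Pass `time_calls` a zero-argument callable that performs one complete query (embedding plus search), then feed the returned samples to `summarize`.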

Reproduce these benchmarks →

Moss isn't a database! It's a search runtime. You don't manage clusters, tune HNSW parameters, or worry about sharding. You index documents, load them into the runtime, and query. That's it.

Features

  • Sub-10 ms semantic search - p99 of 8 ms
  • Built-in embedding models - no OpenAI key required (or bring your own)
  • Metadata filtering - filter by $eq, $and, $in, $near operators
  • Document management - add, upsert, retrieve, and delete documents
  • Python + TypeScript SDKs - async-first, type-safe
  • Framework integrations - LangChain, DSPy, Pipecat, LiveKit, LlamaIndex

Examples

This repo contains working examples you can copy straight into your project:

examples/
├── python/                  # Python SDK samples
│   ├── load_and_query_sample.py
│   ├── comprehensive_sample.py
│   ├── custom_embedding_sample.py
│   └── metadata_filtering.py
├── javascript/              # TypeScript SDK samples
│   ├── load_and_query_sample.ts
│   ├── comprehensive_sample.ts
│   └── custom_embedding_sample.ts
└── cookbook/                # Framework integrations
    ├── langchain/           # LangChain retriever
    └── dspy/                # DSPy module

apps/
├── next-js/                 # Next.js semantic search UI
├── pipecat-moss/            # Pipecat voice agent with Moss retrieval
├── livekit-moss-vercel/     # LiveKit voice agent on Vercel
└── docker/                  # Dockerized examples (ECS/K8s pattern)

Run the Python examples

cd examples/python
pip install -r requirements.txt
cp ../../.env.example .env   # Add your credentials
python load_and_query_sample.py
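
The `cp ../../.env.example .env` step suggests the samples read credentials from a `.env` file; how they load it is not shown here. If you want the same pattern in your own script without extra dependencies, a minimal sketch using only the standard library might look like this. `parse_env` and `load_env_file` are hypothetical helpers, and `MOSS_PROJECT_ID` / `MOSS_PROJECT_KEY` are placeholder variable names. Check `.env.example` for the real ones.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Load `path` if it exists; values already set in the environment win."""
    env_path = Path(path)
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


# Placeholder variable names, for illustration only.
sample = 'MOSS_PROJECT_ID=abc\n# comment\nMOSS_PROJECT_KEY="s3cret"\n'
print(parse_env(sample))
```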

Run the TypeScript examples

cd examples/javascript
npm install
cp ../../.env.example .env   # Add your credentials
npx tsx load_and_query_sample.ts

Run the Next.js app

cd apps/next-js
npm install
cp ../../.env.example .env   # Add your credentials
npm run dev                  # Open http://localhost:3000

Run the Pipecat voice agent

Sub-10 ms retrieval plugged into Pipecat's real-time voice pipeline: a customer support agent that actually keeps up with conversation.

cd apps/pipecat-moss/pipecat-quickstart
# See README for setup and Pipecat Cloud deployment

SDK Reference

Python (moss)

from moss import MossClient, DocumentInfo, QueryOptions, MutationOptions, GetDocumentsOptions

client = MossClient(project_id, project_key)

# Index management
await client.create_index(name, documents, model_id="moss-minilm")
await client.get_index(name)
await client.list_indexes()
await client.delete_index(name)

# Document operations
await client.add_docs(name, documents, MutationOptions(upsert=True))
await client.get_docs(name)
await client.get_docs(name, GetDocumentsOptions(doc_ids=["id1", "id2"]))
await client.delete_docs(name, ["id1", "id2"])

# Search
await client.load_index(name)
results = await client.query(name, "your query", QueryOptions(top_k=5))
# results.docs[0].id, .text, .score, .metadata
# results.time_taken_ms
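
Because every call in the listing above is awaited, several lookups can in principle be issued concurrently with `asyncio.gather`. The sketch below uses `fake_query`, a hypothetical stand-in coroutine, so that it runs without credentials or the `moss` package. Whether a single real client can safely serve concurrent queries is an assumption to verify against the docs.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class FakeResult:
    query: str
    top_k: int


async def fake_query(index: str, text: str, top_k: int = 5) -> FakeResult:
    """Hypothetical stand-in for an awaitable search call; NOT the Moss SDK."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return FakeResult(query=text, top_k=top_k)


async def query_many(index: str, texts: List[str], top_k: int = 5) -> List[FakeResult]:
    """Run one query per text concurrently; results come back in input order."""
    return await asyncio.gather(*(fake_query(index, t, top_k) for t in texts))


if __name__ == "__main__":
    questions = ["how long do refunds take?", "where is my order?"]
    for result in asyncio.run(query_many("support-docs", questions, top_k=3)):
        print(result)
```

To try this against the real SDK, swap `fake_query` for a coroutine that wraps `client.query` after `client.load_index` has completed.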

TypeScript (@moss-dev/moss)

import { MossClient, DocumentInfo } from "@moss-dev/moss";

const client = new MossClient(projectId, projectKey);

// Index management
await client.createIndex(name, documents, { modelId: "moss-minilm" });
await client.getIndex(name);
await client.listIndexes();
await client.deleteIndex(name);

// Document operations
await client.addDocs(name, documents, { upsert: true });
await client.getDocs(name);
await client.getDocs(name, { docIds: ["id1", "id2"] });
await client.deleteDocs(name, ["id1", "id2"]);

// Search
await client.loadIndex(name);
const results = await client.query(name, "your query", { topK: 5 });
// results.docs[0].id, .text, .score, .metadata
// results.timeTakenInMs

Integrations

Framework      Status       Example
LangChain      Available    examples/cookbook/langchain/
DSPy           Available    examples/cookbook/dspy/
Pipecat        Available    apps/pipecat-moss/
LiveKit        Available    apps/livekit-moss-vercel/
Next.js        Available    apps/next-js/
VitePress      Available    packages/vitepress-plugin-moss/
Vercel AI SDK  Available    packages/vercel-sdk/
CrewAI         Coming soon  -

Architecture

┌─────────────────────────────────────────────────┐
│                  Your Application               │
│         (Voice bot, Copilot, Chat agent)        │
└────────────────────┬────────────────────────────┘
                     │
          ┌──────────▼──────────┐
          │     Moss SDK        │
          │(Python / TypeScript)│
          └──────────┬──────────┘
                     │  HTTPS
          ┌──────────▼──────────┐
          │   Moss Runtime      │
          │  ┌───────────────┐  │
          │  │  Embedding    │  │
          │  │  Engine       │  │
          │  └───────┬───────┘  │
          │  ┌───────▼───────┐  │
          │  │  Search       │  │
          │  │  Runtime      │◄─┼── Sub-10 ms queries
          │  └───────────────┘  │
          └─────────────────────┘

The SDKs in this repo are thin clients that talk to the Moss runtime over HTTPS. The runtime handles embedding, indexing, and search, so you don't need to manage any infrastructure.

Full Python SDK source code is available at sdks/python/.

Contributing

We welcome contributions! Here's where the community can have the most impact:

  • New SDK bindings: Swift, Go, Elixir,...
  • Framework integrations: CrewAI, Haystack, AutoGen
  • Reranking support: plug in cross-encoder rerankers
  • Doc-parsing connectors: PDF, DOCX, HTML, Markdown ingestion
  • Examples and tutorials: if you build something with Moss, we'd love to feature it

See our Contributing Guide for setup instructions and our Roadmap for what's planned.

Check out issues labeled good first issue to get started.

Community

  • Discord: ask questions, share what you're building
  • GitHub Issues: bug reports and feature requests
  • Twitter: announcements and updates

License

BSD 2-Clause License: the SDKs, examples, and integrations in this repo are fully open source.


Built by the team at Moss · Backed by Y Combinator

Release History

v0.4.1 (Urgency: High, Date: 6/3/2026)

Clean re-release of the v0.4.0 fixes (the v0.4.0 tag was force-moved, which SwiftPM caches as a pin mismatch). Same xcframework as v0.4.0; synced 4-arg Swift wrapper. Consume with `from: "0.4.0"` (resolves up) or `from: "0.4.1"`.

SHA-256: 49a9c47198ffd75f50b3a91aaa7f416091ebe893a79e1517631a90ecff4b340c

v0.2.0 (Urgency: High, Date: 5/27/2026)

Moss iOS SDK v0.2.0.

SwiftPM consumption

.package(url: "https://github.com/usemoss/moss", from: "0.2.0"),

Package.swift binary target

.binaryTarget(
    name: "MossC",
    url: "https://github.com/usemoss/moss/releases/download/v0.2.0/Moss.xcframework.zip",
    checksum: "0e0c6ef37569f0570511d86878c28485b5c3c8865f30541a16242875fdde9cbc"
)

SHA-256: `0e0c6ef37569f0570511d86878c28485b5c3c8865f30541a16242875fdde9cbc`

What's in the xcframew

c-sdk-v0.9.0 (Urgency: High, Date: 4/9/2026)

Precompiled C SDK binaries for the Moss SDK.

Platforms

Archive                                            OS       Arch
`libmoss-v0.9.0-aarch64-apple-darwin.tar.gz`       macOS    ARM64 (Apple Silicon)
`libmoss-v0.9.0-x86_64-unknown-linux-gnu.tar.gz`   Linux    x86_64
`libmoss-v0.9.0-aarch64-unknown-linux-gnu.tar.gz`  Linux    ARM64
`libmoss-v0.9.0-x86_64-pc-windows-msvc.tar.gz`     Windows  x86_64

Contents

Each archive contains:

  • `include/libmoss.h` - C header
  • `lib/libmoss.{dylib,so,dll}` - shared

pipecat-moss-v0.0.3 (Urgency: High, Date: 4/9/2026)

Latest release: pipecat-moss-v0.0.3
