npcpy

The Python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.

npcpy is a flexible agent framework for building AI applications and conducting research with LLMs. It supports local and cloud providers, multi-agent teams, tool calling, image/audio/video generation, knowledge graphs, fine-tuning, and more.

pip install npcpy

Quick Examples

Create and use personas

from npcpy import NPC

simon = NPC(
    name='Simon Bolivar',
    primary_directive='Liberate South America from the Spanish Royalists.',
    model='gemma3:4b',
    provider='ollama'
)
response = simon.get_llm_response("What is the most important territory to retain in the Andes?")
print(response['response'])

Direct LLM call

from npcpy import get_llm_response

response = get_llm_response("Who was the celtic messenger god?", model='qwen3:4b', provider='ollama')
print(response['response'])
# or use ollama's cloud models

test = get_llm_response('who is john wick', model='minimax-m2.7:cloud', provider='ollama')

print(test['response'])

Agent with tools

from npcpy import Agent, ToolAgent, CodingAgent

# Agent — comes with default tools (sh, python, edit_file, web_search, etc.)
agent = Agent(name='ops', model='qwen3.5:2b', provider='ollama')
print(agent.run("Find all Python files over 500 lines in this repo and list them"))

# ToolAgent — add your own tools alongside defaults
import subprocess

def run_tests(test_path: str = "tests/") -> str:
    """Run pytest on the given path and return results."""
    result = subprocess.run(["python3", "-m", "pytest", test_path, "-v", "--tb=short"],
                            capture_output=True, text=True, timeout=120)
    return result.stdout + result.stderr

def git_diff(branch: str = "main") -> str:
    """Show the git diff against a branch."""
    result = subprocess.run(["git", "diff", branch, "--stat"], capture_output=True, text=True)
    return result.stdout

reviewer = ToolAgent(
    name='code_reviewer',
    primary_directive='You review code changes, run tests, and report issues.',
    tools=[run_tests, git_diff],
    model='qwen3.5:2b', provider='ollama'
)
print(reviewer.run("Run the tests and summarize any failures"))

# CodingAgent — auto-executes code blocks from LLM responses
coder = CodingAgent(name='coder', language='python', model='qwen3.5:2b', provider='ollama')
print(coder.run("Write a script that finds duplicate files by hash in the current directory"))
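
The custom tools above are plain functions with type hints and docstrings. As a rough illustration of why those details matter, the sketch below uses the standard library's inspect module to derive a name, description, and parameter list from such a function. It is not npcpy's internal implementation; describe_tool is a hypothetical helper introduced only for this example.

```python
import inspect

def describe_tool(func):
    """Build a simple tool description from a function's signature and docstring.

    Hypothetical helper for illustration only; not part of npcpy.
    """
    params = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty or not isinstance(annotation, type):
            type_name = "any"  # no usable type hint on this parameter
        else:
            type_name = annotation.__name__
        params[name] = {
            "type": type_name,
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }

def git_diff(branch: str = "main") -> str:
    """Show the git diff against a branch."""
    return ""  # body omitted; only the signature matters here

spec = describe_tool(git_diff)
print(spec)
```

A function without a docstring or type hints would yield an empty description and "any" types, which gives a model much less to work with when deciding how to call the tool.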

Streaming

from npcpy import get_llm_response
from npcpy.streaming import parse_stream_chunk

response = get_llm_response("Explain quantum entanglement.", model='qwen3.5:2b', provider='ollama', stream=True)
for chunk in response['response']:
    content, _, _ = parse_stream_chunk(chunk, provider='ollama')
    if content:
        print(content, end='', flush=True)

# Works the same with any provider
response = get_llm_response("Explain quantum entanglement.", model='gemini-2.5-flash', provider='gemini', stream=True)
for chunk in response['response']:
    content, _, _ = parse_stream_chunk(chunk, provider='gemini')
    if content:
        print(content, end='', flush=True)
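
If the full text is needed after streaming finishes, the chunks can be accumulated while they are consumed. The sketch below shows that pattern with a stand-in parser and fake chunks so it runs without a model; collect_stream and fake_parse are hypothetical names, and the only thing assumed about the real parser is what the loops above show: it returns a 3-tuple whose first item is the text.

```python
def collect_stream(chunks, parse):
    """Join the text pieces of a stream, skipping chunks without content."""
    parts = []
    for chunk in chunks:
        content, _, _ = parse(chunk)
        if content:
            parts.append(content)
    return "".join(parts)

def fake_parse(chunk):
    # Stand-in parser (an assumption): the first tuple item is the text.
    return chunk.get("text"), None, None

fake_chunks = [{"text": "Hello"}, {"text": None}, {"text": ", world"}]
full_text = collect_stream(fake_chunks, fake_parse)
print(full_text)
```

With npcpy, the parse argument could be a small wrapper such as lambda c: parse_stream_chunk(c, provider='ollama'), applied to response['response'].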

JSON output

Include the expected JSON structure in your prompt. With format='json', the response is auto-parsed — response['response'] is already a dict or list.

from npcpy import get_llm_response

response = get_llm_response(
    '''List 3 planets from the sun.
    Return JSON: {"planets": [{"name": "planet name", "distance_au": 0.0, "num_moons": 0}]}''',
    model='qwen3.5:2b', provider='ollama',
    format='json'
)
for planet in response['response']['planets']:
    print(f"{planet['name']}: {planet['distance_au']} AU, {planet['num_moons']} moons")

response = get_llm_response(
    '''Analyze this review: 'The battery life is amazing but the screen is too dim.'
    Return JSON: {"tone": "positive/negative/mixed", "key_phrases": ["phrase1", "phrase2"], "confidence": 0.0}''',
    model='qwen3.5:2b', provider='ollama',
    format='json'
)
result = response['response']
print(result['tone'], result['key_phrases'])
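
Small models do not always follow the requested structure, so it can be worth checking the parsed object before indexing into it. The following is a minimal sketch of such a check for the planets example; check_planets is a hypothetical helper, not an npcpy function.

```python
def check_planets(payload):
    """Return a list of problems found in a parsed planets payload (empty if none)."""
    if not isinstance(payload, dict) or not isinstance(payload.get("planets"), list):
        return ["top-level object must contain a 'planets' list"]
    expected = {"name": str, "distance_au": (int, float), "num_moons": int}
    problems = []
    for i, planet in enumerate(payload["planets"]):
        if not isinstance(planet, dict):
            problems.append(f"planets[{i}] is not an object")
            continue
        for key, typ in expected.items():
            if not isinstance(planet.get(key), typ):
                problems.append(f"planets[{i}].{key} is missing or has the wrong type")
    return problems

good = {"planets": [{"name": "Mercury", "distance_au": 0.39, "num_moons": 0}]}
bad = {"planets": [{"name": "Venus", "distance_au": "0.72"}]}
print(check_planets(good))
print(check_planets(bad))
```

An empty list means the payload matches the shape requested in the prompt; anything else is a cue to retry or fall back.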

Pydantic structured output

Pass a Pydantic model and the JSON schema is sent to the LLM directly.

from npcpy import get_llm_response
from pydantic import BaseModel
from typing import List

class Planet(BaseModel):
    name: str
    distance_au: float
    num_moons: int

class SolarSystem(BaseModel):
    planets: List[Planet]

response = get_llm_response(
    "List the first 4 planets from the sun.",
    model='qwen3.5:2b', provider='ollama',
    format=SolarSystem
)
for p in response['response']['planets']:
    print(f"{p['name']}: {p['distance_au']} AU, {p['num_moons']} moons")

Image, audio, and video generation

from npcpy.llm_funcs import gen_image, gen_video
from npcpy.gen.audio_gen import text_to_speech

# Image — OpenAI, Gemini, Ollama, or diffusers
images = gen_image("A sunset over the mountains", model='gpt-image-1', provider='openai')
images[0].save("sunset.png")

# Audio — OpenAI, Gemini, ElevenLabs, Kokoro, gTTS
audio_bytes = text_to_speech("Hello from npcpy!", engine="openai", voice="alloy")
with open("hello.wav", "wb") as f:
    f.write(audio_bytes)

# Video — Gemini Veo
result = gen_video("A cat riding a skateboard", model='veo-3.1-fast-generate-preview', provider='gemini')
print(result['output'])

Multi-agent team

from npcpy import Team

team = Team(team_path='./npc_team')
result = team.orchestrate("Analyze the latest sales data and draft a report")
print(result['output'])

Or define a team in code:

from npcpy import NPC, Team

coordinator = NPC(name='lead', primary_directive='Coordinate the team. Delegate to @analyst and @writer.')
analyst = NPC(name='analyst', primary_directive='Analyze data. Provide numbers and trends.', model='gemini-2.5-flash', provider='gemini')
writer = NPC(name='writer', primary_directive='Write clear reports from analysis.', model='qwen3:8b', provider='ollama')

team = Team(npcs=[coordinator, analyst, writer], forenpc='lead')
result = team.orchestrate("What are the trends in renewable energy adoption?")
print(result['output'])

Team from files — .npc, .jinx, team.ctx

team.ctx:

context: |
  Research team for analyzing scientific literature.
  The lead delegates to specialists as needed.
forenpc: lead
model: qwen3.5:2b
provider: ollama
output_format: markdown
max_search_results: 5
mcp_servers:
  - path: ~/.npcsh/mcp_server.py

lead.npc:

#!/usr/bin/env npc
name: lead
primary_directive: |
  You lead the research team. Delegate literature searches to @searcher,
  data analysis to @analyst. Synthesize their findings into a coherent summary.
jinxes:
  - {{ Jinx('sh') }}
  - {{ Jinx('python') }}
  - {{ Jinx('delegate') }}
  - {{ Jinx('web_search') }}

searcher.npc:

#!/usr/bin/env npc
name: searcher
primary_directive: |
  You search for scientific papers and extract key findings.
  Use web_search and load_file to find and read papers.
model: gemini-2.5-flash
provider: gemini
jinxes:
  - {{ Jinx('web_search') }}
  - {{ Jinx('load_file') }}
  - {{ Jinx('sh') }}

Jinxes can reference a specific NPC to always run under that persona, and access ctx variables from team.ctx:

jinxes/search_and_summarize.jinx:

#!/usr/bin/env npc
jinx_name: search_and_summarize
description: Search for papers and summarize findings using the searcher NPC.
npc: {{ NPC('searcher') }}
inputs:
  - query
steps:
  - name: search
    engine: natural
    code: |
      Search for papers about {{ query }}.
      Return up to {{ ctx.max_search_results }} results.
  - name: summarize
    engine: natural
    code: |
      Summarize the findings in {{ ctx.output_format }} format:
      {{ output }}

The npc: field binds the jinx to a specific NPC — when this jinx runs, it always uses the searcher persona regardless of which NPC invoked it. Any custom keys in team.ctx (like output_format, max_search_results) are available as {{ ctx.key }} in Jinja templates and as context['key'] in Python steps.
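
To make the placeholder resolution concrete, here is a deliberately tiny stand-in for Jinja rendering that fills {{ ctx.key }} and {{ input }} placeholders from dictionaries. It is only a sketch of the idea; npcpy uses Jinja templates, and render_ctx is a hypothetical helper that supports none of Jinja's other features.

```python
import re

def render_ctx(template, ctx, **inputs):
    """Replace {{ ctx.key }} and {{ name }} placeholders with values.

    Simplified stand-in for Jinja rendering, for illustration only.
    """
    def lookup(match):
        expr = match.group(1)
        if expr.startswith("ctx."):
            return str(ctx[expr[len("ctx."):]])
        return str(inputs[expr])
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)

# Values mirror the custom keys from the team.ctx example.
ctx = {"output_format": "markdown", "max_search_results": 5}
step = "Search for papers about {{ query }}.\nReturn up to {{ ctx.max_search_results }} results."
rendered = render_ctx(step, ctx, query="protein folding")
print(rendered)
```

In a Python step, the same values would be read as context['max_search_results'] and context['output_format'], as described above.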

my_project/
├── npc_team/
│   ├── team.ctx
│   ├── lead.npc
│   ├── searcher.npc
│   ├── analyst.npc
│   ├── jinxes/
│   │   └── skills/
│   └── models/
├── agents.md             # Optional: define agents in markdown
└── agents/               # Optional: one .md file per agent
    └── translator.md

.npc and .jinx files are directly executable:

./npc_team/lead.npc "summarize the latest arxiv papers on transformers"
./npc_team/jinxes/lib/sh.jinx bash_command="echo hello"

MCP server integration

Add MCP servers to your team for external tool access:

team.ctx:

forenpc: assistant
mcp_servers:
  - path: ./tools/db_server.py
  - path: ./tools/api_server.py

db_server.py:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Database Tools")

@mcp.tool()
def query_orders(customer_id: str, limit: int = 10) -> str:
    """Query recent orders for a customer."""
    # Your database logic here
    return f"Found {limit} orders for customer {customer_id}"

@mcp.tool()
def search_products(query: str) -> str:
    """Search the product catalog."""
    return f"Products matching: {query}"

if __name__ == "__main__":
    mcp.run()

The team's NPCs automatically get access to MCP tools alongside their jinxes.
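
The "Your database logic here" placeholder in query_orders could be filled in many ways. One possible sketch, using the standard library's sqlite3 with an in-memory database, is shown below. The orders table, its columns, and make_demo_db are assumptions made for this example, and the function takes an explicit connection so it can run on its own; in db_server.py the connection would more likely live at module level, with the function registered through @mcp.tool().

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [("c1", 19.99), ("c1", 5.00), ("c2", 42.50)],
    )
    return conn

def query_orders(conn, customer_id: str, limit: int = 10) -> str:
    """Query recent orders for a customer, newest first."""
    # Parameterized query: user input never gets spliced into the SQL string.
    rows = conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id DESC LIMIT ?",
        (customer_id, limit),
    ).fetchall()
    if not rows:
        return f"No orders found for customer {customer_id}"
    lines = [f"order {order_id}: {total:.2f}" for order_id, total in rows]
    return f"Found {len(rows)} orders for customer {customer_id}\n" + "\n".join(lines)

conn = make_demo_db()
result = query_orders(conn, "c1")
print(result)
```

Returning a short, readable string keeps the tool output easy for a model to quote or summarize.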

Agent definitions in markdown & Skills

agents.md — multiple agents in one file:

## summarizer
You summarize long documents into concise bullet points.
Focus on key findings, methodology, and conclusions.

## fact_checker
You verify claims against reliable sources and flag inaccuracies.
Always cite your sources.
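
The layout above maps naturally onto a dictionary: each "## name" heading starts an agent, and the text beneath it is that agent's directive. The sketch below shows one way such a file could be split; parse_agents_markdown is a hypothetical helper, not npcpy's own loader.

```python
def parse_agents_markdown(text):
    """Split markdown into {agent_name: directive} using '## name' headings."""
    collected = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            collected[current] = []
        elif current is not None:
            collected[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in collected.items()}

sample = """## summarizer
You summarize long documents into concise bullet points.
Focus on key findings, methodology, and conclusions.

## fact_checker
You verify claims against reliable sources and flag inaccuracies.
Always cite your sources.
"""
agents = parse_agents_markdown(sample)
for name, directive in agents.items():
    print(name, "->", directive.splitlines()[0])
```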

agents/translator.md — one file per agent with optional frontmatter:

---
model: gemini-2.5-flash
provider: gemini
---
You translate content between languages while preserving tone and idiom.
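
For readers unfamiliar with frontmatter, the block between the two --- lines holds settings and the rest of the file is the agent's directive. The sketch below separates the two using only the standard library. It handles flat key: value pairs only, not full YAML, and split_frontmatter is a hypothetical helper rather than npcpy's loader.

```python
def split_frontmatter(text):
    """Return (metadata, body) for text that may start with a '---' block."""
    if not text.startswith("---\n"):
        return {}, text
    header, sep, body = text[4:].partition("\n---\n")
    if not sep:  # no closing delimiter: treat everything as body
        return {}, text
    meta = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return meta, body.strip()

sample = (
    "---\n"
    "model: gemini-2.5-flash\n"
    "provider: gemini\n"
    "---\n"
    "You translate content between languages while preserving tone and idiom.\n"
)
meta, directive = split_frontmatter(sample)
print(meta)
print(directive)
```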

Skills are knowledge-content jinxes that provide instructional sections to agents on demand.

npc_team/jinxes/skills/code-review/SKILL.md:

---
name: code-review
description: Use when reviewing code for quality, security, and best practices.
---
# Code Review Skill

## checklist
- Check for security vulnerabilities (SQL injection, XSS, etc.)
- Verify error handling and edge cases
- Review naming conventions and code clarity

## security
Focus on OWASP top 10 vulnerabilities...

Reference in your NPC:

jinxes:
  - {{ Jinx('skills/code-review') }}

CLI tools

# The NPC shell — the recommended way to use NPC teams
npcsh                        # Interactive shell with agents, tools, and jinxes

# Scaffold a new team
npc-init

# Launch AI coding tools as an NPC from your team
npc-claude --npc corca       # Claude Code
npc-codex --npc analyst      # Codex
npc-gemini                   # Gemini CLI (interactive picker)
# Also available: npc-opencode, npc-aider, npc-amp

# Register MCP server + hooks for deeper integration
npc-plugin claude

NPCArray — parallel jinx across multiple NPCs

Run any jinx in parallel across a list of NPC instances and collect results as an array:

from npcpy import NPC
from npcpy.npc_array import NPCArray

# Three NPCs with different models/providers
npcs = [
    NPC(name='drafter', primary_directive='Draft concise commit messages.', model='qwen3:4b', provider='ollama'),
    NPC(name='reviewer', primary_directive='Review and improve commit messages for clarity.', model='gemini-2.5-flash', provider='gemini'),
    NPC(name='enforcer', primary_directive='Check commit messages follow Conventional Commits spec.', model='gemini-2.5-flash', provider='gemini'),
]

arr = NPCArray.from_npcs(npcs)

# Run the same jinx on all three in parallel, collect results
results = arr.jinx('summarize', inputs={'topic': 'fix auth middleware to propagate clerkUserId through GraphQL resolvers'}).collect()
for npc, result in zip(npcs, results.data):
    print(f"[{npc.name}] {result}")

You can also pass a list directly to jinx.execute():

from npcpy.npc_compiler import load_jinx_from_file

jinx = load_jinx_from_file('npc_team/jinxes/analyze.jinx')
results = jinx.execute({'topic': 'rate limiting'}, npc=npcs)  # list → parallel NPCArray run
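
The fan-out-and-collect pattern described here can be pictured with the standard library alone. The sketch below runs the same task on several workers concurrently and returns results in input order. It illustrates the pattern only and says nothing about how NPCArray is implemented; run_on_all and the three stand-in workers are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_all(workers, task):
    """Call worker(task) for every worker concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=len(workers) or 1) as pool:
        return list(pool.map(lambda worker: worker(task), workers))

# Stand-in "NPCs": plain callables instead of model-backed agents (an assumption).
def drafter(task):
    return f"draft: {task}"

def reviewer(task):
    return f"review: {task}"

def enforcer(task):
    return f"check: {task}"

results = run_on_all([drafter, reviewer, enforcer], "fix auth middleware")
for line in results:
    print(line)
```

Threads suit this kind of work because each call mostly waits on a model or network response rather than using the CPU.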

Knowledge graphs

Build, evolve, and search knowledge graphs from text. The KG grows through waking (assimilation), sleeping (consolidation), and dreaming (speculative synthesis).

from npcpy.memory.knowledge_graph import (
    kg_initial, kg_evolve_incremental, kg_sleep_process,
    kg_dream_process, kg_hybrid_search,
)
from npcpy.data.load_file import load_file_contents

# Seed the KG from a design doc PDF and a migration script
design_doc = load_file_contents("docs/auth_migration_plan.pdf")
migration_sql = load_file_contents("migrations/003_clerk_auth.sql")

kg = kg_initial(
    content=design_doc + "\n\n" + migration_sql,
    model="qwen3:4b", provider="ollama",
)

# Assimilate follow-up commits and PR descriptions
kg, _ = kg_evolve_incremental(
    kg,
    new_content_text=(
        "PR #412: Replaced Stripe customer-session lookup with Clerk JWT verification. "
        "Removed /api/stripe/webhook endpoint. Added ClerkMiddleware to all protected routes. "
        "CSP headers updated to allow clerk.accounts.dev origin."
    ),
    model="qwen3:4b", provider="ollama", get_concepts=True,
)

# Consolidate — merge redundant nodes, strengthen high-frequency edges
kg, sleep_report = kg_sleep_process(kg, model="qwen3:4b", provider="ollama")

# Dream — generate speculative connections between loosely related concepts
kg, dream_report = kg_dream_process(kg, model="qwen3:4b", provider="ollama")

# Search across facts, concepts, and speculative edges
results = kg_hybrid_search(kg, "How does auth propagate through GraphQL resolvers?",
                           model="qwen3:4b", provider="ollama")
for r in results:
    print(r['score'], r['text'])
print(f"{len(kg['facts'])} facts, {len(kg['concepts'])} concepts")
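
As an intuition for the consolidation step, the toy sketch below merges fact statements that are identical after normalization and counts repeats as a weight. It is a simplified stand-in, not what kg_sleep_process does internally: the real function takes a model and provider, and the list-of-strings input and consolidate helper here are assumptions for illustration.

```python
from collections import Counter

def consolidate(facts):
    """Merge facts whose statements match after normalization; count repeats as weight."""
    def normalize(statement):
        # Lowercase, collapse whitespace, and drop a trailing period.
        return " ".join(statement.lower().split()).rstrip(".")

    weights = Counter()
    first_seen = {}
    for statement in facts:
        key = normalize(statement)
        weights[key] += 1
        first_seen.setdefault(key, statement)
    return [{"statement": first_seen[key], "weight": count} for key, count in weights.items()]

facts = [
    "Clerk verifies JWTs.",
    "clerk  verifies JWTs",
    "Redis session cache was removed.",
]
merged = consolidate(facts)
for fact in merged:
    print(fact["weight"], fact["statement"])
```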

Extract structured memories from conversations:

from npcpy.llm_funcs import get_facts

conversation = """
User: We're ripping out Stripe entirely and moving auth to Clerk. The JWT verification
      will happen in ClerkMiddleware instead of the custom verify_stripe_session helper.
Assistant: Got it. I'll update the middleware chain. What about the existing session store?
User: Kill the Redis session cache — Clerk handles session state on their end.
      Also, the CSP headers need clerk.accounts.dev and clerk.enpisi.com added to connect-src.
"""

facts = get_facts(conversation, model="qwen3:4b", provider="ollama")
for f in facts:
    print(f"[{f.get('category', 'general')}] {f['statement']}")
# [architecture] Auth provider migrated from Stripe to Clerk with JWT verification via ClerkMiddleware
# [infrastructure] Redis session cache removed — Clerk manages session state
# [security] CSP connect-src updated to include clerk.accounts.dev and clerk.enpisi.com

Sememolution — population-based KG evolution

Maintain a population of KG variants that evolve independently. Each individual has Poisson-sampled search parameters, producing different traversals each query. Selection pressure from response ranking drives convergence toward useful graph structures.

from pathlib import Path
from npcpy.memory.kg_population import SememolutionPopulation
from npcpy.data.load_file import load_file_contents

pop = SememolutionPopulation(population_size=100, sample_size=10)
pop.initialize()

# Ingest a heterogeneous corpus — PDFs, DOCX, source code, meeting transcripts
corpus_dirs = [Path("docs/architecture"), Path("docs/meeting_notes"), Path("src/auth")]
for d in corpus_dirs:
    for f in sorted(d.glob("*")):
        if f.suffix in (".pdf", ".docx", ".md", ".py", ".ts", ".txt"):
            text = load_file_contents(str(f))
            pop.assimilate_text(text)

# Sleep/dream cycle — each individual consolidates according to its genome
pop.sleep_cycle()

# Query: sample 10 individuals, generate competing responses, rank them
rankings = pop.query_and_rank("How does the auth middleware chain interact with the GraphQL context?")
for rank, entry in enumerate(rankings[:3], 1):
    print(f"#{rank} (individual {entry['id']}, score {entry['score']:.3f}): {entry['response'][:120]}...")

# Selection + reproduction — top performers breed, bottom are replaced
pop.evolve_generation()

stats = pop.get_stats()
print(f"Generation {stats['generation']} | avg fitness {stats['avg_fitness']:.3f} | "
      f"best fitness {stats['best_fitness']:.3f} | diversity {stats['diversity']:.3f}")
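
The two ideas named in this section, Poisson-sampled parameters and selection from ranked results, can be sketched in a few lines of standard-library Python. This is a toy model under stated assumptions, not SememolutionPopulation's implementation: the depth and score fields, the evolve function, and the keep-the-top-half rule are all hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def evolve(population, fitness, rng, mean_depth=3.0):
    """Keep the fitter half; refill with copies whose search depth is re-sampled."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]
    children = []
    while len(survivors) + len(children) < len(population):
        parent = rng.choice(survivors)
        # Children inherit the parent's fields but get a fresh Poisson-sampled depth.
        children.append({**parent, "depth": sample_poisson(mean_depth, rng)})
    return survivors + children

rng = random.Random(0)  # fixed seed so runs are repeatable
population = [
    {"id": i, "depth": sample_poisson(3.0, rng), "score": i / 10} for i in range(10)
]
next_gen = evolve(population, fitness=lambda ind: ind["score"], rng=rng)
print([ind["id"] for ind in next_gen])
```

Because each individual's depth is drawn at random, two individuals with the same graph can still traverse it differently, which is the source of variation that selection then acts on.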

Fine-tuning (SFT, RL, MLX)

from npcpy.ft.sft import run_sft

# Train a model to extract structured decisions from meeting notes
# LoRA fine-tuning — auto-uses MLX on Apple Silicon
X_train = [
    "Meeting: Auth Migration Sync (2025-01-15)\nAttendees: Sarah, Mike, Priya\n"
    "Discussion: Evaluated Clerk vs Auth0 for replacing Stripe auth. Clerk chosen "
    "for lower latency and native Next.js support. Migration starts sprint 12. "
    "Redis session store will be removed once Clerk JWT verification is stable.",

    "Meeting: API Rate Limiting Review (2025-01-22)\nAttendees: Mike, Jordan\n"
    "Discussion: Current per-session token bucket is incompatible with Clerk's "
    "stateless JWTs. Agreed to switch to per-IP sliding window with 100 req/min "
    "default. Premium tier gets 500 req/min. Jordan to implement by Friday.",

    "Meeting: GraphQL Schema Freeze (2025-02-01)\nAttendees: Sarah, Priya, Jordan\n"
    "Discussion: Schema v2 locked for release. Nested auth context propagation "
    "through dataloaders confirmed working. New 'viewer' pattern adopted for "
    "all authenticated queries. Breaking changes documented in CHANGELOG.",

    "Meeting: Deployment Postmortem (2025-02-10)\nAttendees: full team\n"
    "Discussion: Production outage caused by missing CSP header for clerk.accounts.dev. "
    "Root cause: deploy script didn't pick up new env vars. Fix: added CSP validation "
    "to CI pipeline. New rule: all external origins must be in csp_allowlist.json.",
]
y_train = [
    '{"decisions": [{"what": "Adopt Clerk for auth", "why": "Lower latency, native Next.js support", "owner": "team", "deadline": "sprint 12"}, {"what": "Remove Redis session store", "why": "Clerk handles session state", "owner": "team", "deadline": "after JWT verification stable"}]}',
    '{"decisions": [{"what": "Switch to per-IP sliding window rate limiter", "why": "Token bucket incompatible with stateless JWTs", "owner": "Jordan", "deadline": "Friday"}, {"what": "Set rate limits to 100/min default, 500/min premium", "why": "Tiered access control", "owner": "Jordan", "deadline": "Friday"}]}',
    '{"decisions": [{"what": "Freeze GraphQL schema v2", "why": "Release readiness", "owner": "Sarah", "deadline": "immediate"}, {"what": "Adopt viewer pattern for authenticated queries", "why": "Consistent auth context in nested resolvers", "owner": "Priya", "deadline": "immediate"}]}',
    '{"decisions": [{"what": "Add CSP validation to CI pipeline", "why": "Prevent missing CSP headers in deploys", "owner": "team", "deadline": "immediate"}, {"what": "Require external origins in csp_allowlist.json", "why": "Enforce explicit approval of external domains", "owner": "team", "deadline": "immediate"}]}',
]

model_path = run_sft(X_train=X_train, y_train=y_train)
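
Since X_train and y_train are parallel lists and every target is a JSON string, a quick sanity check before training can catch mismatched lengths or malformed targets. The sketch below is one way to do that; check_sft_pairs is a hypothetical helper and not part of npcpy.

```python
import json

def check_sft_pairs(X_train, y_train, required_key="decisions"):
    """Return a list of problems with parallel input/target lists (empty if none)."""
    problems = []
    if len(X_train) != len(y_train):
        problems.append(f"length mismatch: {len(X_train)} inputs vs {len(y_train)} targets")
    for i, target in enumerate(y_train):
        try:
            parsed = json.loads(target)
        except json.JSONDecodeError as exc:
            problems.append(f"y_train[{i}] is not valid JSON: {exc.msg}")
            continue
        if not isinstance(parsed, dict) or required_key not in parsed:
            problems.append(f"y_train[{i}] lacks the '{required_key}' key")
    return problems

demo_X = ["Meeting notes A", "Meeting notes B"]
demo_y = ['{"decisions": []}', '{"decisions": [']  # second target is truncated on purpose
issues = check_sft_pairs(demo_X, demo_y)
print(issues)
```

Running the check on the real X_train and y_train before calling run_sft costs almost nothing and avoids training on targets the model could never reproduce as valid JSON.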

Features

Agents (NPCs) — Agents with personas, directives, and tool calling. Subclasses: Agent (default tools), ToolAgent (custom tools + MCP), CodingAgent (auto-execute code blocks)
Multi-Agent Teams — Team orchestration with a coordinator (forenpc)
Jinx Workflows — Jinja Execution templates for multi-step prompt pipelines
Skills — Knowledge-content jinxes that serve instructional sections to agents on demand
NPCArray — NumPy-like vectorized operations over model populations
Image, Audio & Video — Generation via Ollama, diffusers, OpenAI, Gemini, ElevenLabs
Knowledge Graphs — Build and evolve knowledge graphs from text with sleep/dream lifecycle
Sememolution — Population-based KG evolution with genetic selection and Poisson-sampled search
Memory Pipeline — Extract, approve, and backfill memories with self-improving quality feedback
Fine-Tuning & Evolution — SFT, USFT, RL/DPO, diffusion, genetic algorithms, MLX on Apple Silicon
Serving — Flask server for deploying teams via REST API
ML Functions — Scikit-learn grid search, ensemble prediction, PyTorch training
Streaming & JSON — Streaming responses, structured JSON output, message history

Providers

Works with all major LLM providers through LiteLLM: ollama, openai, anthropic, gemini, deepseek, airllm, openai-like, and more.

Installation

pip install npcpy                # base
pip install "npcpy[lite]"        # + API provider libraries
pip install "npcpy[local]"       # + ollama, diffusers, transformers, airllm
pip install "npcpy[yap]"         # + TTS/STT
pip install "npcpy[all]"         # everything

System dependencies

Linux:

sudo apt-get install espeak portaudio19-dev python3-pyaudio ffmpeg libcairo2-dev libgirepository1.0-dev
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3.5:2b

macOS:

brew install portaudio ffmpeg pygobject3 ollama
brew services start ollama
ollama pull qwen3.5:2b

Windows: Install Ollama and ffmpeg, then run ollama pull qwen3.5:2b.

API keys go in a .env file:

export OPENAI_API_KEY="your_key"
export ANTHROPIC_API_KEY="your_key"
export GEMINI_API_KEY="your_key"
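
The lines above use shell-style export syntax. How npcpy itself reads the file is not shown here; as a rough sketch of what parsing this format involves, the hypothetical parse_env_lines helper below handles the export prefix, surrounding quotes, blank lines, and comments using only the standard library.

```python
def parse_env_lines(lines):
    """Parse KEY=value lines, tolerating an 'export ' prefix, quotes, and comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values

sample = ['export OPENAI_API_KEY="your_key"', "# comment", "", "GEMINI_API_KEY=abc"]
env = parse_env_lines(sample)
print(env)
```

The parsed values could then be copied into os.environ with setdefault so that variables already set in the shell take precedence. A dedicated library such as python-dotenv covers more edge cases than this sketch.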

Read the Docs

Full documentation, guides, and API reference at npcpy.readthedocs.io.

Research

A Quantum Semantic Framework for natural language processing: arxiv, accepted at QNLP 2025
Simulating hormonal cycles for AI: arxiv
TinyTim: A Family of Language Models for Divergent Generation: arxiv
The production of meaning in the processing of natural language: arxiv
ALARA for Agents: Least-Privilege Context Engineering Through Portable Composable Multi-Agent Teams: arxiv

Has your research benefited from npcpy? Let us know!

Support

Monthly donation | Merch | Consulting: info@npcworldwi.de

Contributing

Contributions welcome! Submit issues and pull requests on the GitHub repository.

License

MIT License.

Release History

Version	Changes	Urgency	Date
v2.1.9	-openrouter routing fix	High	7/22/2026
v2.1.8	- image model set return in serve - mlx gen and tool calling - ollama api url and key fix	High	7/13/2026
v2.1.2	-team / npc db decoupling -permissions on jinxes directly.	High	6/30/2026
v1.4.34	- epoch-based mlx sft	High	6/20/2026
v1.4.32	-skill jinxes for knowledge and memory manipulation -command history removed -routes removed from serve related to incognide -various npcsh references removed	High	6/11/2026
v1.4.30	-fixes for not thinking in serve, -fixes for not getting random json outputs when just wanting to chat by passing empty jinx list to system prompt.	High	6/7/2026
v1.4.28	-temp /top p issue for claude models	High	5/24/2026
v1.4.25	-text prediction endpoint fix in serve	High	5/19/2026
v1.4.24	## What's new - Defensive error handling in `/api/models` — isolates model-loading failures and returns partial results instead of 500s - Content-Type validation in `register_studio_window` - Global Flask error handlers (404, 500, unhandled exceptions) returning JSON instead of HTML error pages	High	5/12/2026
v1.4.23	## What's Changed * Update README.md by @cagostino in https://github.com/NPC-Worldwide/npcpy/pull/232 * Caug/readme by @cagostino in https://github.com/NPC-Worldwide/npcpy/pull/233 * Update README.md by @cagostino in https://github.com/NPC-Worldwide/npcpy/pull/234 * Add CLI providers as first-class providers in get_llm_response by @svax974 in https://github.com/NPC-Worldwide/npcpy/pull/235 ## New Contributors * @svax974 made their first contribution in https://github.com/NPC-Worldwide/np	High	5/10/2026
v1.4.22	- Agent.run(): full tool-calling loop with max_iterations, verbose mode, and permission prompts ([y]es/[n]o/[a]ll) with allow_tools bypass list - kg_facts.memory_id: formal FK to memory_lifecycle(id) so facts retain provenance when derived from approved memories - /api/kg/facts: returns memory_id and memory_status for linking back to source memories without extra round-trips - ollama tool calling: fixes message sanitization and proper tool_calls handling in assistant messages - gen_image/gen	High	5/3/2026
v1.4.21	- unified generate_music(): local (PyTorch MusicGen) / replicate / elevenlabs with auto-fallback when a provider lacks creds or fails - /api/generate_music endpoint - openai image gen: duck-type b64_json/url so gpt-image-1.5 and future models work without whitelisting - serve.py: unwrap non-PIL OpenAI image responses before saving - serve + npc_compiler: team-routed NPC lookup, source_path/source_ext on NPC dicts, save_npc accepts explicit team - generative fill: PIL mask composite for OpenAI an	High	4/21/2026
v1.4.20	kg_facts.memory_id FK, team-level .ctx jinxes, markdown agent loading, slinky team, jinx<->skill / npc<->agents converters (#224)	High	4/20/2026
v1.4.19	- omlx provider routing (openai-compatible, 127.0.0.1:8000/v1) - rename mlx → omlx label in local model discovery - suppress litellm debug output - remove redundant API endpoints from serve.py	High	4/17/2026
v1.4.18	- Security: desktop.py shell command injection fix (shell=True → shlex.split) - Security: torch.load weights_only=True for diff/image_gen checkpoints; explicit weights_only=False with comment for DIAMOND models - Security: Agent gains safe_tools=True param — excludes sh and python execution tools from default set - Security: SandboxedEnvironment for user-controlled Jinja2 rendering in serve.py and mcp_server.py - Jinx.execute() accepts a list or NPCArray as npc — runs jinx in parallel across all	High	4/15/2026
v1.4.17	- Default search to startpage, cascade to searxng then ddgs - Activity logging tables (activity_log, autocomplete_suggestions, autocomplete_training) - API endpoints for activity/autocomplete logging and training data export - Fix memory scope query to not require all filters	High	4/8/2026
v1.4.16	fix: flush SSE events immediately for real-time streaming, prompt chat before stop in agentic loop	Medium	4/6/2026
v1.4.15	- Fix command injection in _tool_web_search and _tool_file_search (PR #208, thanks @spidershield-contrib)	Medium	4/4/2026
v1.4.14	- Add Sememolution population-based KG evolution module (kg_population.py) with GeneticEvolver integration, Poisson-sampled search traversal, per-individual graph state, crossover with graph merging, and LLM-judged response ranking - Add memory extraction pipeline and sememolution documentation to knowledge-graphs guide - Add MLX Apple Silicon documentation to fine-tuning guide - Add KG, memory, sememolution, and fine-tuning examples to README - Remove dead shell=True subprocess fallback in _too	Medium	4/4/2026
v1.4.13	- Pass user generation params (temperature, top_p, top_k, max_tokens) to LLM in both chat and tool_agent modes - Make serve.py robust for Windows — optional imports for redis/flask_sse/mcp, better error handling, NPCSH_BASE env var - Fix MCP server engine step rendering — action/args from _raw_steps now template-rendered with tool call arguments - Add Windows CI tests for serve.py imports, settings round-trip, and server startup	Medium	4/3/2026
v1.4.12	- device routing in ft modules (sft, rl, usft): device='mlx'|'cpu'|'cuda' - MLX LoRA training via mlx-lm Python API on Apple Silicon - HF model name → mlx-community resolution - backwards compatible, default device='cpu'	Medium	4/2/2026
v1.4.11	stream_events plumbing from jinx execution through generator chain	Medium	3/28/2026
v1.4.10	- Inline generator in check_llm_command, no separate function - create_jinx_stream takes (npc, command) directly, no StreamConfig - Sub-delegation events via shared_context['sub_events']	Medium	3/28/2026
v1.4.9	- Generator-based streaming for check_llm_command (stream=True yields events) - No threads/queues — clean generator protocol - Chat streams token by token, tools emit tool_start/tool_result events	Medium	3/28/2026
v1.4.8	- Threaded check_llm_command in create_jinx_stream with keepalive SSE events - Prevents SSE timeout during long delegation - Event queue for jinxes to push real-time progress	Medium	3/28/2026
v1.4.7	- Fix: don't pass tool_choice when no tools specified (fixes #206 - OpenAI NPC quickstart error)	Medium	3/28/2026
v1.4.6	- Fix create_jinx_stream: stream=True, consume stream wrapper, skip chat/stop tool events - Resolve api_url and api_key from NPC/Team in resolve_model_provider	Medium	3/27/2026
v1.4.5	- Bump litellm version to 1.81.13	Medium	3/27/2026
v1.4.4	- Resolve api_url and api_key from NPC/Team in resolve_model_provider - Route all responses through jinx system in create_jinx_stream	Medium	3/26/2026
v1.4.3	- file-based storage for all CommandHistory tables — CSV and Parquet backends alongside SQLite/Postgres (c3eb2f2) - `append_row_csv` / `append_row_parquet` for any of the 10 tables (conversation_history, command_history, jinx_executions, npc_executions, memory_lifecycle, npc_memories, knowledge_graphs, labels, message_attachments, compiled_npcs) - partitioned directory structure: `table/path/year/month/day/group_id.{csv,parquet}` - `scan_all(base_dir, table, ext)` loads entire table tree i	Medium	3/25/2026
v1.4.1	- pinned litellm dependency to version 1.76.0 in setup.py (6b34b71) for assuring users not affected by security vulnerabilities - updated research section in README with new papers (TinyTim, ALARA for Agents, production of meaning) and revised QNLP 2025 paper title (1eaf753, 5ee0c00)	Medium	3/24/2026
v1.3.37	- rewrite create_jinx_stream as true agentic loop using check_llm_command - remove weak followup classifier LLM call between iterations - agent loops autonomously: stream → execute jinxes → feed results back → repeat until stop - max_followups default bumped from 3 to 10	Low	3/22/2026
v1.3.36	-streaming consolidation and module -agent tool fix -readme example updates	Low	3/20/2026
v1.3.35	- MCPClientNPC supports command+args, url (SSE), and path connection modes - resolve_mcp_server_path falls back to `python -m npcpy.mcp_server --team <path>` - /api/mcp/tools no longer scans jinx directories — tools come only from MCP server + NPC config - npc_compiler: fix stdout prints corrupting MCP stdio transport (redirect to stderr) - serve.py: add _is_command_string() utility, update MCPServerManager for command strings	Low	3/12/2026
v1.3.34	- Centralized stream setup/cleanup into helper functions - Jinx discovery refactored to use load_jinxes_from_directory instead of manual file walking - save_jinx uses Jinx.save() instead of manual yaml dumping - save_npc uses NPC.save() instead of manual yaml string formatting - Team/package endpoints use Team class for NPC loading	Low	3/10/2026
v1.3.33	- jinxs renamed to jinxes across all code, paths, API routes, and variable names (401d727, 0f238f5) - resource directories (images, models, videos, attachments, jobs, triggers, logs) moved under npc_team/ with auto-migration on startup (f7af1c3) - path helpers centralized in npc_sysenv.py and used by OCR, video generation, and other modules (f7af1c3) - MCP server startup supports command-style invocations (npx, uvx, node, python) and custom environment variables (f7af1c3) - Jinja	Low	3/8/2026
v1.3.32	## What's Changed * Add UTF-8 encoding to file operations by @madalfad in https://github.com/NPC-Worldwide/npcpy/pull/203 ## New Contributors * @madalfad made their first contribution in https://github.com/NPC-Worldwide/npcpy/pull/203 Full Changelog: https://github.com/NPC-Worldwide/npcpy/compare/v1.3.31...v1.3.32	Low	3/5/2026
v1.3.31	- get_model_context_window() function in gen/response.py — queries litellm/ollama for model context sizes - breathe() now returns summary key with raw structured data alongside formatted output - get_last_message_id() on CommandHistory for conversation linking - KG API: /api/kg/ingest and /api/kg/query endpoints for ingesting text and querying KG with natural language - cron/scheduling API: /api/cron/jobs, /api/cron/schedule, /api/cron/unschedule, /api/cron/crontab, /ap	Low	3/1/2026
v1.3.30	- fixes for action sequences - fixes for jinx exiting	Low	2/25/2026
v1.3.29	-image gen model fixes in serve and image gen for deprecated models.	Low	2/24/2026
v1.3.28	- knowledge graph tracks conversation and message history as weights - simplified llm_funcs helpers, fixed tool calling issues - thinking parameter fix - team sync functions moved to npc_sysenv (resolve_team_dir, git-based sync ops) - INCOGNIDE_HOME env var support in get_data_dir() - serve.py endpoints thinned out to call npc_sysenv	Low	2/23/2026
v1.3.27	-couple agent bug fixes in serve.	Low	2/18/2026
v1.3.26	- knowledge graph fixes, removed kuzu, additional embedding search, kg crud endpoints in serve - git-based syncing for teams - window-aware sse in serve - npcsql fixes for snowflake - `top_k` parameter was missing from ollama provider calls	Low	2/16/2026
v1.3.25	- minor fixes in serve for streaming	Low	2/10/2026
v1.3.24	- Fix MCP agent crashes from orphaned tool_calls without matching tool_results - Fix Ollama streaming showing only first token then full response at end - Preserve {{var}} template variables in jinx steps for runtime rendering	Low	2/10/2026
v1.3.23	- Ollama image generation support - Agent server improvements - Documentation overhaul - Multi-directory NPC loading for incognide and npcsh teams for server	Low	2/4/2026
v1.3.22	-audio fixes for yap, qwen 3 tts	Low	2/1/2026
v1.3.21	-sys prompt improvements for tool and jinx calling	Low	2/1/2026
v1.3.20	-separation of npc and team jinxs in jinx consolidation	Low	2/1/2026
v1.3.19	- New /api/generate_video endpoint for text-to-video generation - Optional reference image input for image-to-video generation - Auto-saves generated videos to generated_videos/ directory with timestamped filenames - Returns video as base64-encoded data for immediate preview	Low	1/25/2026
v1.3.18	Security - Jinja2 sandbox: all template environments now use SandboxedEnvironment to prevent injection attacks - Removed pickle: model serialization now uses joblib/safetensors (no more pickle.loads vulnerabilities)	Low	1/24/2026
v1.3.17	-lora accommodations in serve, response, model gathering	Low	1/18/2026
v1.3.16	-import shuffling in rl -paths for npcsh stuff in serve	Low	1/17/2026
v1.3.15	-tokenizer kw fix for rl	Low	1/15/2026
v1.3.14	-DPO training now supports 4-bit and 8-bit quantization via use_4bit/use_8bit in RLConfig -configurable LoRA parameters (lora_r, lora_alpha, lora_dropout, lora_target_modules) -fp16/bf16 precision options for training -max_pairs limit for preference dataset size control -paged_adamw_8bit optimizer auto-selected for quan	Low	1/15/2026
v1.3.13	-server jinx processing fixes -txt file handling processing	Low	1/13/2026
v1.3.12	-tts fixes in server -sql fix for tool call retrieval	Low	1/8/2026
v1.3.11	-adjustments with conversation branching / parent message relations	Low	1/4/2026
v1.3.10	-npc team initialization adjustments in serve	Low	12/27/2025
v1.3.9	-llama cpp and lm studio endpoint fixes -npc object api url check -initialize npc team folder creation	Low	12/27/2025
v1.3.8	-jinx fix in npc method for check_llm_command -bug fix in react fallback loop -gem3 flash and pro name cleaning	Low	12/23/2025
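
The v1.3.35 entry above notes that stray stdout prints were corrupting the MCP stdio transport and are now redirected to stderr. The general technique can be sketched with the standard library alone. This is an illustration under assumptions, not npcpy's actual code: `noisy_step` and `run_quietly` are hypothetical names invented for the example.

```python
import contextlib
import sys


def noisy_step() -> int:
    """Hypothetical helper that prints a diagnostic and returns a value."""
    # On a stdio transport, this line would land in the protocol stream.
    print("loading team...")
    return 42


def run_quietly(func, *args, **kwargs):
    """Call func, sending anything it prints to stdout to stderr instead."""
    with contextlib.redirect_stdout(sys.stderr):
        return func(*args, **kwargs)


if __name__ == "__main__":
    result = run_quietly(noisy_step)
    # Only the intended payload is written to stdout.
    print(result)
```

Wrapping the call rather than editing each `print` keeps the protocol channel clean even when third-party code writes to stdout. Output written directly to the underlying file descriptor (for example by a subprocess) is not caught by this approach.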
