
OpenDQV

Open-source, contract-driven data quality validation. Shift-left enforcement at the point of write — before data enters your pipeline.


README

OpenDQV — Open Data Quality Validation


Quickstart · Rules · Contracts · MCP · API · Security · FAQ

"Trust is easier to build than to repair." That is why OpenDQV exists. A 422 at the point of write is cheaper than a data incident three weeks later.

Beta (v2.x). Public API surface (REST, contract YAML, MCP tools, Python SDK) is stable. Breaking changes follow a one-release deprecation cycle. Security fixes backported to the latest 2.x line. See API Stability for commitments.

OpenDQV is a write-time data validation service. Source systems call it before writing data. Bad records return a 422 with per-field errors. Good records pass through. No payload is stored.

OpenDQV demo — define a contract, send a bad record (get a 422), fix it (get a 200)

flowchart LR
    subgraph Callers
        direction TB
        SF[Salesforce]
        SAP[SAP]
        DYN[Dynamics]
        ORA[Oracle]
        WEB[Web forms]
        ETL1[ETL pipelines]

        DJ[Django clean]
        PY[Python scripts]
        PD[Pandas / ETL]

        CD[Claude Desktop]
        CUR[Cursor]
        LLM[LLM agents]
    end

    subgraph OpenDQV
        direction TB
        API[Validation API\nREST / batch]
        SDK[LocalValidator\nin-process SDK]
        MCP[MCP Server\nAI-native]
        API & SDK & MCP --> CON[Contracts · YAML\nGovernance · RBAC\nAudit trail]
        API & SDK & MCP --> GEN[Code Generator\nApex · JS · SQL]
    end

    subgraph Results
        direction TB
        R1[valid: true / false]
        R2[per-field errors]
        R3[severity levels]
        R4[webhooks on events]
    end

    SF & SAP & DYN & ORA & WEB & ETL1 --> API
    DJ & PY & PD --> SDK
    CD & CUR & LLM --> MCP

    API & SDK & MCP --> R1

    subgraph Importers
        IMP[dbt schema · GX suites\nSoda checks · ODCS · CSV]
    end
    IMP --> CON

    style API fill:#0d3b5e,stroke:#092a44,color:#fff
    style SDK fill:#0d3b5e,stroke:#092a44,color:#fff
    style MCP fill:#0d3b5e,stroke:#092a44,color:#fff
    style CON fill:#1a8aad,stroke:#14708d,color:#fff
    style GEN fill:#1a8aad,stroke:#14708d,color:#fff
    style R1 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e
    style R2 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e
    style R3 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e
    style R4 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e
    style IMP fill:#1a8aad,stroke:#14708d,color:#fff

A 422 at the point of write closes the feedback loop — producers see failures immediately and fix them at source. Rejection rates drop over time because the tool changes the incentive, not just the outcome.

For post-landing monitoring use Great Expectations, Soda, or dbt tests — they're complementary, not competing. OpenDQV owns layer one (write-time enforcement); those tools own layer three (post-ingestion observability).


AI Agents — first-class via MCP

OpenDQV ships a built-in Model Context Protocol server, so Claude Desktop, Cursor, and any other MCP-compatible agent can discover contracts, validate records, and explain failures through tool calls the agent explicitly declares — no hallucinated compliance, no invented rules.

OpenDQV_Marmot_MCP_Demo.mp4

4-minute demo: Claude Desktop uses two MCP servers — OpenDQV for validation, Marmot for catalog lineage — to check a menu item against ppds_menu_item for Natasha's Law allergen compliance, stating which tool calls it makes and why. (Backup: download the MP4 from the repo)

For tool reference, write guardrails, remote/enterprise mode, and the Marmot composition pattern, see docs/mcp.md.
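Registering the server in an MCP client is a short JSON stanza. A hedged sketch for Claude Desktop's `claude_desktop_config.json` — the `command`/`args` entry point and the `OPENDQV_URL` variable shown here are hypothetical; docs/mcp.md has the actual launch command:

```json
{
  "mcpServers": {
    "opendqv": {
      "command": "python",
      "args": ["-m", "opendqv.mcp"],
      "env": { "OPENDQV_URL": "http://localhost:8000" }
    }
  }
}
```

Restart the client after editing the config so it re-reads the `mcpServers` block.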


Install

| I have... | Command |
|---|---|
| Python 3.11+ | `git clone https://github.com/OpenDQV/OpenDQV.git && cd OpenDQV && bash install.sh` |
| Docker | `git clone https://github.com/OpenDQV/OpenDQV.git && cd OpenDQV && cp .env.example .env && docker compose up -d` |
| Just the SDK/CLI | `pip install opendqv`, then `opendqv init` to bootstrap contracts |
| None of the above | Beginner setup guide → |

`install.sh` creates a virtual environment, installs dependencies, and launches the onboarding wizard. Docker pulls `ghcr.io/opendqv/opendqv:latest` — no build step required.

โš ๏ธ AUTH_MODE=open (the default) has no authentication. Set AUTH_MODE=token and a strong SECRET_KEY in .env before any non-local deployment. See SECURITY.md.


Your First Validation

1. Write a contract — drop a YAML file in your contracts directory (run `opendqv init --all` to copy the 43 bundled contracts, or `opendqv init` for a single starter):

```yaml
contract:
  name: order
  version: "1.0"
  owner: "Data Governance"
  status: active
  rules:
    - name: valid_email
      type: regex
      field: email
      pattern: "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$"
      severity: error
      error_message: "Invalid email format"
    - name: amount_positive
      type: min
      field: amount
      min: 0.01
      severity: error
      error_message: "Order amount must be positive"
    - name: status_valid
      type: allowed_values
      field: status
      allowed_values: [pending, confirmed, shipped, cancelled]
      severity: error
      error_message: "Invalid order status"
```

2. Reload contracts:

```bash
curl -X POST http://localhost:8000/api/v1/contracts/reload
```

3. Send a bad record — OpenDQV rejects it:

```bash
curl -s -X POST http://localhost:8000/api/v1/validate \
  -H "Content-Type: application/json" \
  -d '{"contract": "order", "record": {"email": "not-an-email", "amount": -5, "status": "unknown"}}'
```
```json
{
  "valid": false,
  "errors": [
    {"field": "email",  "rule": "valid_email",     "message": "Invalid email format",          "severity": "error"},
    {"field": "amount", "rule": "amount_positive", "message": "Order amount must be positive", "severity": "error"},
    {"field": "status", "rule": "status_valid",    "message": "Invalid order status",          "severity": "error"}
  ],
  "contract": "order",
  "version": "1.0"
}
```

4. Fix the record — it passes:

```bash
curl -s -X POST http://localhost:8000/api/v1/validate \
  -H "Content-Type: application/json" \
  -d '{"contract": "order", "record": {"email": "alice@example.com", "amount": 49.99, "status": "pending"}}'
```
```json
{"valid": true, "errors": [], "warnings": [], "contract": "order", "version": "1.0"}
```
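Conceptually, the engine's job for this contract fits in a few lines of plain Python. The sketch below is an illustration of what the three rules check — not OpenDQV's actual engine code:

```python
import re

# Conceptual re-implementation of the three "order" rules above.
# Each entry: (field, rule name, predicate, error message).
RULES = [
    ("email", "valid_email",
     lambda v: bool(re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", str(v))),
     "Invalid email format"),
    ("amount", "amount_positive",
     lambda v: isinstance(v, (int, float)) and v >= 0.01,
     "Order amount must be positive"),
    ("status", "status_valid",
     lambda v: v in {"pending", "confirmed", "shipped", "cancelled"},
     "Invalid order status"),
]

def validate(record: dict) -> dict:
    """Collect per-field errors; a record is valid only if none fire."""
    errors = [
        {"field": f, "rule": r, "message": msg, "severity": "error"}
        for f, r, ok, msg in RULES
        if not ok(record.get(f))
    ]
    return {"valid": not errors, "errors": errors}

bad = {"email": "not-an-email", "amount": -5, "status": "unknown"}
good = {"email": "alice@example.com", "amount": 49.99, "status": "pending"}
```

Running `validate(bad)` reproduces the three-error shape shown in step 3; `validate(good)` comes back clean.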

The customer contract ships pre-seeded if you want to skip step 1. The quickstart guide walks through authoring, lifecycle, and batch validation.


Rules

| Type | What it checks |
|---|---|
| `not_empty` | Field is present and non-empty |
| `regex` | Field matches (or does not match) a pattern. Built-ins: `builtin:email`, `builtin:uuid`, `builtin:ipv4`, `builtin:url` |
| `min` / `max` / `range` | Numeric bounds |
| `min_length` / `max_length` | String length |
| `date_format` | Parseable date/datetime. Falls back through common formats if no explicit format is set |
| `allowed_values` | Value must be in a fixed list |
| `lookup` | Value must appear in a local file or HTTP endpoint (with TTL cache) |
| `compare` | Cross-field: field op compare_to — supports gt, lt, gte, lte, eq, neq, and today/now sentinels |
| `required_if` / `forbidden_if` | Conditional: required or forbidden when another field equals a value |
| `checksum` | Check-digit integrity: IBAN, GTIN/GS1, NHS, ISIN, LEI, VIN, CPF, ISRC |
| `unique` | No duplicates within a batch (batch mode only) |
| `cross_field_range` | Value must be between two other fields in the same record |
| `field_sum` | Sum of named fields must equal a target (within optional tolerance) |
| `geospatial_bounds` | Lat/lon pair within a bounding box |
| `date_diff` | Difference between two date fields within a range |
| `age_match` | Declared age consistent with date-of-birth field |
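To make the `checksum` family concrete, here is the IBAN variant: a mod-97 test over the rearranged account string. This is a sketch of the standard ISO 13616 algorithm, not OpenDQV's implementation:

```python
def iban_checksum_ok(iban: str) -> bool:
    """ISO 13616 mod-97 check: move the first four characters to the end,
    map letters A-Z to 10-35, and the resulting integer must equal 1 mod 97."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not s.isalnum():
        return False
    rearranged = s[4:] + s[:4]
    # int(c, 36) maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35
    digits = "".join(str(int(c, 36)) for c in rearranged)
    return int(digits) % 97 == 1
```

The well-known example IBAN `GB82 WEST 1234 5698 7654 32` passes; flip any digit and the mod-97 residue shifts, so the check fails.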

Rules have severity: error (blocks the record) or severity: warning (flags but allows). Any rule can include a condition block to apply it only when another field equals a given value.
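For example, a warning-severity rule gated on another field might look like the sketch below. The `condition` key names are illustrative assumptions — check docs/rules/ for the exact schema:

```yaml
- name: po_number_expected
  type: not_empty
  field: po_number
  severity: warning          # flags the record but does not block it
  condition:                 # illustrative shape -- see docs/rules/ for the real keys
    field: channel
    equals: b2b
  error_message: "B2B orders usually carry a PO number"
```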

Full reference: docs/rules/


How it compares

A mature data governance programme operates across three layers, each with a distinct job:

| Layer | Purpose | Tools |
|---|---|---|
| 1. Write-time enforcement | Prevent bad data from entering any system | OpenDQV |
| 2. Catalog / governance / stewardship | Ownership, glossary, lineage, policy, stewardship workflows | Alation, Atlan, Collibra, Purview, DataHub, Marmot |
| 3. Pipeline testing / observability | Detect drift, freshness issues, residual quality after ingestion | Great Expectations, Soda Core, dbt tests, Monte Carlo |

OpenDQV Core owns layer one. Your catalog handles layer two, your pipeline tools handle layer three.

| | Great Expectations / Soda / dbt | OpenDQV |
|---|---|---|
| When | After data lands (in warehouse/lake) | Before data is written (at the door) |
| Where | Data pipelines, batch jobs | Source system integration points |
| Model | Scan data at rest | Validate data in flight |
| Latency | Minutes to hours (batch) | Milliseconds (API call) |
| Who calls it | Data engineers | Data engineers, developers, CRM admins |

They're complementary. Use Great Expectations to monitor your warehouse. Use OpenDQV to stop bad data from getting there in the first place.


Contracts

43 production-ready contracts ship inside the `opendqv` package, covering GDPR, HIPAA, SOX, MiFID II, UK Building Safety Act, Martyn's Law, Natasha's Law, Ofcom Online Safety Act, EU DORA, and 20+ other regulatory frameworks across the UK, EU, and US. `pip install opendqv` gives you all of them — `opendqv list` works with zero configuration.

See docs/compliance-contracts.md for the full list with regulatory context, or browse opendqv/contracts/ directly. 17 minimal starter templates are in examples/starter_contracts/.


Performance

EC2 c6i.large, 2 workers, 12-rule contract, mixed 50/50 workload: ~482 req/s, p99 ~182 ms. Sizing rule: WEB_CONCURRENCY = number of vCPUs.

See docs/benchmark_throughput.md for full platform comparison, methodology, and monthly volume extrapolation.
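As a rough capacity check — assuming the benchmarked ~482 req/s were sustained around the clock, which real workloads rarely are:

```python
req_per_s = 482                      # benchmarked throughput (c6i.large, 2 workers)
per_day = req_per_s * 86_400         # seconds per day
per_month = per_day * 30
print(f"{per_day:,} validations/day, ~{per_month / 1e9:.2f}B/month")
```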


Documentation

- **Quickstart** — Build your first contract in 15 minutes
- **Rules Reference** — All rule types with parameters and examples
- **Compliance Contracts** — 44 contracts with regulatory context
- **API Reference** — REST endpoints, SDK, GraphQL, webhooks
- **Security** — Deployment checklist, threat model, RBAC
- **Production Deployment** — Token auth, TLS, Docker Compose, hardening
- **Integrations** — Salesforce, Kafka, Snowflake, dbt, Databricks, MCP, and more
- **All docs →** 76 documentation files

API Stability

OpenDQV is in Beta as of 2.0.0. The following stability commitments apply to the v2.x series:

- **REST API endpoints** — paths, request bodies, and response shapes are stable within v2.x. Backwards-incompatible changes require a major version bump and follow a deprecation cycle (one minor release of warnings before removal).
- **YAML contract format** — the contract schema (rules, fields, types) is stable within v2.x. New rule types may be added; existing rules will not change semantics without a deprecation cycle.
- **Python SDK** — `OpenDQVClient`, `AsyncOpenDQVClient`, and `LocalValidator` public method signatures are stable within v2.x. Internal helpers (prefixed `_`) are not covered.
- **MCP tools** — tool names and parameters are stable within v2.x.
- **Security fixes** — backported to the latest 2.x line on a best-effort basis.

Known limitations in v2.2.x

- **Rule null handling is inconsistent.** Most format rules fail when the target field is missing; a few (`max_length`, `allowed_values`) pass silently; `field_sum` and `ratio_check` coerce missing operands to 0. Single-record and batch paths disagree in a few cases. See docs/rules/core_rules.md for the full matrix and the safe pattern to use today. v2.3.0 will make this consistent (loud-by-default with an `optional: true` opt-out).
- **Unknown rule types pass silently at runtime.** A typo in `type:` (e.g. `min_lenght`) is caught by `opendqv lint` but not by the engine — a typo'd rule is a disabled rule. Always lint before deploy. v2.3.0 will reject unknown types at contract load.
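Until v2.3.0 lands, a defensive contract pattern is to make presence explicit rather than relying on a format rule to notice a missing field. A sketch, reusing the quickstart's email rules — see docs/rules/core_rules.md for the documented pattern:

```yaml
- name: email_present        # explicit presence check: loud when the field is missing
  type: not_empty
  field: email
  severity: error
  error_message: "email is required"
- name: email_format         # format check is only meaningful once presence is guaranteed
  type: regex
  field: email
  pattern: "^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$"
  severity: error
  error_message: "Invalid email format"
```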

Contributing

See CONTRIBUTING.md for setup instructions, coding guidelines, and how to submit changes.

License

MIT — see LICENSE.

Acknowledgements

Led by Sunny Sharma, BGMS Consultants Ltd. The vision, the architecture, every contract, and every design decision in this repository are directed by a human who believes data quality is a write-time responsibility.

OpenDQV is built with a hybrid team. Sunny leads — carbon and silicon. Three AI collaborators execute: Claude Sonnet 4.6 (primary developer), Claude Opus 4.6 (strategic auditor), and Grok (market intelligence). All answer to the same ethos: trust is easier to build than to repair.

Release History

VersionChangesUrgencyDate
v2.2.5### Added - **\`opendqv fork <src> <dst>\`** โ€” copy a contract to a new name as a clean DRAFT. Rewrites \`name:\`, \`version: \"1.0\"\`, \`status: draft\`, and \`asset_id:\` in place while preserving all comments, descriptions, and rules from the source. Replaces the \`cp + edit + reset\` workflow with one command. - **Linter rule \`FILENAME_NAME_MISMATCH\`** โ€” \`opendqv lint\` now errors when the filename stem differs from the YAML's internal \`name:\` field. Catches the footgun where \`cp medHigh4/17/2026
v2.2.4### Shipped - **43 bundled contracts now ship inside the Python package.** `pip install opendqv` followed by `opendqv list` works with zero configuration. Before v2.2.4, pip-install users saw *"No contracts found"* because the `contracts/` directory lived at repo root and never entered the wheel. They now live at `opendqv/contracts/` inside the package. - **`opendqv init --all`** โ€” new flag copies every bundled contract (43+ regulated domains) plus reference lookup files into the target directorHigh4/17/2026
v2.2.3### Fixed - **4 broken `max_length` rules** in banking_transaction, fmcg_product, retail_product, and media_content contracts. Rules used `max:` instead of `max_length:` in YAML โ€” Pydantic alias mapped to wrong field, so rules silently never fired. Found via MCP-driven sample record audit. - **16 sample record files** aligned with v1.1 contracts. 11 full rewrites (v1.0โ†’v1.1 field name changes), 5 minor fixes. 142/142 sample records now validate correctly. - **proof_of_play samples** โ€” panel_id High4/15/2026
v2.2.2### Fixed - MCP server version was hardcoded as "1.8.4" โ€” now reads from `config.ENGINE_VERSION` dynamically Medium4/12/2026
v2.2.1## Highlights - **Security:** Removed yaml.full_load() fallback โ€” eliminated RCE vector from YAML loading path - **Performance:** O(n squared) to O(n) grouped uniqueness โ€” 954x faster at 2,000 records (131s to 0.14s) - **Maintainability:** _check_rule() dispatch table โ€” 417-line god function to 23 handlers + dict lookup - **DX:** 62 broken import paths fixed across 27 docs files for pip users ### Full changelog 15 code quality improvements shipped via PICK methodology (Possible/Implement/ChalMedium4/11/2026
v2.2.0## Highlights - **Security:** Removed yaml.full_load() fallback โ€” eliminated RCE vector from YAML loading path - **Performance:** O(n squared) to O(n) grouped uniqueness โ€” **954x faster** at 2,000 records (131s to 0.14s) - **Maintainability:** _check_rule() dispatch table โ€” 417-line god function to 23 handlers + dict lookup - **DX:** 62 broken import paths fixed across 27 docs files for pip users ### Full changelog 15 code quality improvements shipped via PICK methodology (Possible/Implement/Medium4/11/2026
v2.1.0## What's Changed **Critical distribution fix** โ€” `pip install opendqv` now works correctly. `import opendqv` succeeds, all modules live under the `opendqv/` namespace, and no PyPI package collisions. ### Changes - **Namespace restructure**: All modules moved under `opendqv/` package โ€” eliminates collisions with `sdk`, `security`, `core` top-level PyPI packages - **SEC-001 hardened**: `regex` library is now a required dependency โ€” ReDoS timeout protection guaranteed on every install - **`opendMedium4/11/2026
v2.0.0OpenDQV Core graduates from Alpha to **Beta**. No breaking changes from 1.9.8 โ€” this release is a status milestone, not an API break. Existing 1.9.x deployments upgrade in place. ## What Beta means - **Public API surface is stable.** REST endpoints, contract YAML schema, MCP tool names, and Python SDK signatures will not change without a deprecation cycle (one minor release of warnings before removal). - **Security fixes are backported** to the latest 2.x line. - **Coverage 93%, 3,398 tests** Medium4/7/2026
v1.9.8## Performance **4ร— regex throughput improvement** โ€” `_safe_match()` now calls `compiled_pattern.match(str_val, timeout=...)` directly on the pre-compiled `regex.Pattern` object. Valid-record mean latency: 0.161 ms โ†’ 0.040 ms. Invalid-record: 0.234 ms โ†’ 0.052 ms. ## Bug Fix **Latent ReDoS timeout masking** โ€” `except _regex_lib.TimeoutError:` would have raised `AttributeError` if a regex timeout actually fired, masking the SEC-001 control. Fixed to `except TimeoutError:` (the builtin that `regMedium4/3/2026
v1.9.7## Coverage sprint: 90.87% โ†’ 93.0% **3398 tests** (up from 3314 / +84 tests). `fail_under` raised from 90 to 93. ### What was covered - **JSON decode exception handlers** โ€” `rule_heatmap`, `rule_failure_velocity`, `observation_fields` in `core/quality_analytics.py` - **Auth function edge paths** โ€” open-mode invalid Bearer fallback, non-Bearer 401, `get_current_role` validator fallback in `security/auth.py` - **Batch validation edge cases** โ€” `compare_to="now"` sentinel, date-parse string fallMedium4/2/2026
v1.9.6## v1.9.6 โ€” Coverage sprint continuation (89.8% โ†’ 90.9%) ### Summary This release addresses the three open items from CRT152: 1. **Dead code removed** โ€” `api/routes_contracts.py`: the `except UnknownContextError` block in `generate_code_endpoint` was unreachable. `get_rules_with_context()` logs and falls back to base rules for unknown contexts; it never raises this exception. The dead try/except has been deleted. 2. **`core/onboarding.py` coverage 80.8% โ†’ 91.9%** โ€” New tests in `test_onboardMedium4/2/2026
v1.9.5## Coverage Sprint: 80.4% โ†’ 89.8% Aimed for 100%, landed at 89.8%. Coverage threshold raised from 80% to 89%. **3,251 tests** (up from 2,933 / +318 new tests) ### New Test Files - `tests/test_cli_extended.py` (69 tests) โ€” in-process `cmd_*` function coverage - `tests/test_explainer.py` (63 tests) โ€” all rule type handlers. `core/explainer.py` โ†’ **100%** - `tests/test_linter_extended.py` (31 tests) โ€” required_if, allowed_values, date_diff, age bounds - `tests/test_storage_extended.py` (15 testsMedium4/2/2026
v1.9.4## What's in v1.9.4 ### Coverage raised to 80% - 101 new tests across 7 new/extended test files - Rule types: `field_sum`, `forbidden_if`, `conditional_value`, `date_diff`, 8 checksum algorithms (`mod10_gs1`, `iban_mod97`, `isin_mod11`, `isrc_luhn`, `lei_mod97`, `nhs_mod11`, `cpf_mod11`, `vin_mod11`), `compare` edge cases - Import API `save=True` branches: dbt, soda, csv, CSVW, OTel, NDC, ODCS - Analytics endpoints: rejection-summary, rule-velocity, observation/summary/trend/fields - Worker heaMedium4/2/2026
v1.9.3## CRT150 โ€” Professional Quality Baseline **Beta polish sprint โ€” the trust signals that make a project credible before you read the code.** ### Added - **py.typed markers** (PEP 561) โ€” sdk/, core/, api/, security/ now declare type information. IDEs and type checkers will provide proper autocomplete and type safety for downstream users. - **Coverage threshold** โ€” 77% enforced in CI. Measured baseline is 77.5%; threshold prevents silent regression across sprints. - **SDK unit tests** (`tests/tesMedium4/2/2026
v1.9.2## What's changed ### Security - **N3**: `GET /tokens` now requires `admin` role โ€” token metadata was visible to any authenticated user - **N1**: `SECURITY.md` updated to reflect M1 DNS rebinding fix and L2 token revocation fix ### Bug fixes - **N4**: `encoding="utf-8"` added to 12 `open()` calls in H1-refactored router files (Windows portability) - **N8**: Blocking DNS resolution in webhook `_send()` wrapped in `asyncio.to_thread()` - **N5**: `revoke_system_tokens` open-mode guard harmonised Medium3/31/2026
v1.9.1## Changes ### Refactoring - **H1**: `api/routes.py` (2,764L) split into 8 domain sub-routers + `api/deps.py`. No URL or behaviour changes. Contributor onboarding significantly improved. ### Security - **H2**: Removed dead `require_role()` from `security/auth.py` โ€” unused factory creating false sense of centralised RBAC - **M1**: Webhook SSRF hardened โ€” hostname resolved and IP-checked at *send time*, not just registration. Mitigates DNS rebinding. - **L1**: `init_db()` removed from module impMedium3/31/2026
v1.9.0## Security fix (RT148 C2) **High โ†’ fixed:** Contract state machine now enforces valid lifecycle transitions. `set_status()` previously accepted any transition including `archived โ†’ active`, allowing an approver to bypass the maker-checker review workflow entirely. ### Transition map (enforced at `core/contracts.py`) | From | Allowed to | |------|-----------| | `draft` | `active`, `archived` | | `review` | `active`, `draft`, `archived` | | `active` | `archived`, `draft` | | `archived` | `draMedium3/31/2026
v1.8.9## Security fix (RT148 C1) **Critical:** `POST /tokens/generate` now requires `admin` role in `AUTH_MODE=token`. Previously any authenticated user (even `validator` role) could call this endpoint and generate tokens with any role including `admin`, completely bypassing the RBAC model. ### Changes - `api/routes.py`: `caller_role: str = Depends(get_current_role)` + 403 guard for non-admin callers - `config.py`: `read_text(encoding="utf-8")` โ€” Windows portability fix (M2) - `tests/test_rbac.py`:Medium3/30/2026
v1.8.8## What's new in v1.8.8 ### Observation mode โ€” analytics and workbench Observation-only mode (introduced in v1.8.7) is now fully instrumented with analytics and a dedicated dashboard. **New API endpoints:** - `GET /api/v1/observation/summary?days=7&contract=X` โ€” would_have_failed count, enforcement_readiness_pct, by_contract breakdown - `GET /api/v1/observation/trend?contract=X&days=7` โ€” daily time-series of observation violations - `GET /api/v1/observation/fields?contract=X&days=7` โ€” top faiMedium3/28/2026
v1.8.7## What's new in v1.8.7 ### Observation-Only Mode Run validation without blocking โ€” the pilot entry-point feature. **CLI:** ```bash opendqv validate-file my_contract data.csv --observe-only ``` Exits 0 regardless of violations. Output labelled `OBSERVATION RUN`. `--output-failures` still works to export what would have been rejected. **API:** ```json POST /api/v1/validate {"contract": "my_contract", "record": {...}, "observe_only": true} ``` Returns HTTP 200 with `"mode": "observation_only"`Medium3/27/2026
v1.8.6## What's in this release ### New features - **Typed error codes** โ€” every validation failure carries `error_code: OPENDQV_{RULE_TYPE}_001`. Stable across contract versions. Safe for Kafka DLQ routing, PagerDuty rules, ServiceNow auto-tickets. See [docs/error_codes.md](docs/error_codes.md). - **`opendqv validate-file <contract> <path>`** โ€” validate CSV/TSV/Parquet without starting the API server. Optional `--output-failures failed.csv` flag. - **Benchmark suite** โ€” five standard workloads coverMedium3/27/2026
v1.8.5## Bug fix **`ENGINE_VERSION` was hardcoded as `"1.0.0"` since the project began.** Every audit trail entry produced since v1.1.0 has been stamped with the wrong engine version โ€” a credibility issue for any regulated customer reviewing the hash-chained audit chain. ### Fix `ENGINE_VERSION` now reads from `pyproject.toml` at runtime (source installs) with `importlib.metadata` as fallback (pip installs). The version in audit entries will always match the actual running version. A CI assertion Medium3/26/2026
v1.8.4## What's new - New MCP tool: `get_quality_trend` โ€” daily pass-rate with improving/declining/stable summary - Per-contract latency in `get_quality_metrics` (previously all contracts returned identical global figures) - `agent_id` filter on `get_quality_metrics` for single-source attribution - MCP server icon โ€” OpenDQV logo now displayed in Claude Desktop and MCP-compatible clients - 2,635 tests passing (9 MCP tools total)Medium3/26/2026
v1.8.3## What's new - `agent_id` as first-class analytics dimension โ€” filter quality metrics by source agent - Rule failure velocity โ€” track which rules are failing fastest over time - SQLite fallback for analytics when DuckDB is unavailable - MCP `get_quality_metrics` updated with `agent_id` support - 2,622 tests passingMedium3/26/2026
v1.8.2## What's new - 10 customer demo scripts covering: HR, GDPR DSAR, Healthcare, MiFID II, DORA, SOX, Companies House, Martyn's Law, Building Safety, OOH proof_of_play - `context="demo"` persistence โ€” demo context survives across validation calls - `teardown_demo.py` โ€” one-shot cleanup including Marmot catalog entries - `DELETE /api/v1/quality/stats?context=` endpoint for selective stats reset - Bug fix: `date_diff` rule no longer fires when field is absent - 2,601 tests passingMedium3/26/2026
v1.8.1## What's new - PPDS demo script for food allergen / Natasha's Law validation walkthrough - Contract metadata improvements (`catalog_visible` flag for allereasy_dish) - CHANGELOG updated Patch release โ€” no new API surface; demo tooling and contract metadata only.Medium3/26/2026
v1.8.0## What's new ### DuckDB OLAP analytics layer Completes the OLTP/OLAP split introduced in v1.7.1: - **`core/quality_analytics.py`** โ€” new `QualityAnalytics` class. DuckDB attaches the SQLite `quality_stats` table directly via its built-in SQLite extension โ€” zero data duplication from the OLTP write path. - **`GET /api/v1/analytics/summary?days=N`** โ€” cross-contract pass rate summary, sorted worst-first (most useful for triage). - **`GET /api/v1/analytics/rule-heatmap?days=N`** โ€” top-50 failinMedium3/25/2026
v1.7.1## Bug fixes **Single-record validations now persist to SQLite** โ€” `/validate` (single-record) previously wrote only to in-memory stats, so all per-contract pass rates were lost on API restart. Now also writes to the `quality_stats` SQLite table โ€” same path batch validation already used. **`push_quality_lineage.py` reads from SQLite, not in-memory stats** โ€” replaced `GET /api/v1/stats` (resets on restart) with per-contract `GET /api/v1/contracts/{name}/quality-trend?days=30` (SQLite-backed). PMedium3/25/2026
v1.7.0## What's new **MCP constraint field exposure** โ€” `get_contract` now returns `allowed_values`, `pattern`, `min_value`, `max_value`, `min_length`, `max_length` on every rule. AI agents no longer need to trigger validation failures to discover valid values. **Real `window_hours` filtering** โ€” `get_quality_metrics(window_hours=N)` now actually scopes stats to the last N hours. `ValidationStats` gains a timestamped `_events` deque (`maxlen=10,000`). `GET /api/v1/stats?window_hours=N` added for RESMedium3/25/2026
v1.6.0## What's new ### Marmot downstream consumers Contracts now support `downstream_consumers` โ€” a list of Marmot MRNs for assets that consume the validated dataset (dashboards, dbt models, etc.). `push_quality_lineage.py` stitches direct `downstream` edges in Marmot automatically, completing the full lineage graph: ``` [source] โ†’ [opendqv:validate:X] โ†’ [opendqv:X] โ†’ [tableau/sales_dashboard] ``` ### `catalog_visible` flag Set `catalog_visible: false` on a contract to exclude it from Marmot catalMedium3/25/2026
v1.5.1## What's in this release **Maintenance patch** โ€” no new features, no breaking changes. ### DRY refactoring (PR #30) - 18 copy-paste violations eliminated across `api/routes.py`, `config.py`, `security/auth.py`, `main.py`, `cli.py` - 5 new route helper functions: `_get_contract_or_404`, `_get_contract_versioned_or_404`, `_get_contract_hash`, `_check_validate_in_states`, `_assert_contract_mutable` - `VALID_ROLES` centralised in `auth.py` โ€” single source of truth for both API and CLI - `IS_OPEN_Medium3/24/2026
v1.5.0## What's new ### Workbench UX Overhaul The Streamlit governance workbench has been significantly redesigned: - **Grouped sidebar navigation** โ€” sections now organised under CORE, INTEGRATIONS, and CONTRACT TOOLS headers - **Validate** โ€” "Validate Record" and "Validate Batch" merged into one section with a mode toggle; sample JSON generation is now explicit opt-in (no more auto-reset when switching contracts) - **Audit Trail** โ€” "Version History" renamed to "Audit Trail" - **Catalogs & AI** โ€”Medium3/24/2026
v1.3.3## What changed Two compliance gaps in the `qsr_menu_item` contract closed. This contract enforces all 14 Natasha's Law allergens on Pre-Packed for Direct Sale (PPDS) food items. ### Fixes **`sulphites_ppm` is now required when sulphites are declared** Previously, if `contains_sulphites = "true"` but `sulphites_ppm` was omitted, the `min: 10` threshold rule silently never fired โ€” the record passed with sulphites declared but no concentration recorded. Added `required_if: {field: contains_sulMedium3/23/2026
v1.3.2 ### Windows Compatibility (RT96 โ€” Python 3.13.12, real hardware benchmark) - **Windows test runner** โ€” `scripts/windows_test.bat`: 3-run benchmark (matching RT72 Pi 400 methodology), pre-flight disk space + Python 3.11+ checks, UTF-8 mode, summary block with per-run timing, full cleanup. Verified: 2387 passed, 6 skipped, ~4:48 per run - **UTF-8 encoding** โ€” explicit `encoding="utf-8"` on all `read_text()` / `write_text()` calls touching YAML files across `core/contracts.py`, `core/onboarding.pMedium3/22/2026
v1.3.1## Developer Experience **Postman collection** โ€” explore all 50 API endpoints in one click. Import `postman/OpenDQV.postman_collection.json` + `postman/OpenDQV.postman_environment.json` into Postman. 10 folders, auto-auth wiring, ready to run against `AUTH_MODE=open`. See [docs/postman.md](docs/postman.md). **Demo Docker environment** โ€” pre-seeded, zero configuration. ```bash cp .env.example .env docker compose -f docker-compose.demo.yml up -d ``` ~740 validation events across 7 contracts. Low3/22/2026
v1.3.0## What's new in v1.3.0 17 contracts upgraded from thin/weak presence checklists to production-grade validation with deep domain-specific rules and regulatory commentary. **Contract portfolio:** 0 thin/weak contracts remaining (was 14). ~71% production-grade (was 45%). **Tests:** 2,383 passing (was 2,261). ### Highlights - ISO 3779 VIN validation in `automotive_vehicle` - ClinicalTrials.gov NCT + ICH-GCP rules in `pharma_clinical_trial` - ISIN + LEI + Incoterms 2020 across financial and logLow3/22/2026
v1.2.3 ### Features - **`allowed_values` rule type** โ€” validate that a field value is one of an inline list without needing a separate lookup file. Supports single-record and DuckDB batch validation. ```yaml - name: status_valid field: status type: allowed_values allowed_values: [active, inactive, pending] severity: error error_message: "status must be one of: active, inactive, pending" ``` - **Lifecycle webhooks** โ€” three new webhook events fire on contract lifecycle Low3/22/2026
v1.2.2 ### Fixes - **Code generator โ€” silent gap eliminated:** Rule types not implemented by a generator target previously emitted nothing (silent drop). Now emit an explicit `// NOTE: requires API validation` comment for known API-only types (`required_if`, `lookup`, `compare`, `date_diff`, `checksum`, etc.) and a `// TODO` comment for any unknown future types. Users deploying generated code can now see exactly which rules are enforced and which require the live API. - **Salesforce generatLow3/22/2026
v1.2.1 ### UI - **Governance Audit Trail** โ€” "Version History" tab renamed "Contract Audit & Lifecycle". Now shows hash chain integrity banner (โœ… intact / โŒ broken), timeline view with proposed-by / approved-by / rejected-by / rejection-reason per entry, and raw history table in collapsible expander. All governance fields were already stored in the DB; this release surfaces them. ### Documentation - **`docs/faq.md`** โ€” new FAQ covering: LLM/Claude scripts vs OpenDQV, GE/Soda/dbt comparisoLow3/22/2026
## v1.2.0 (3/21/2026)

### Contracts

- **`dora_ict_incident`** — EU DORA (Digital Operational Resilience Act), Articles 17-19. ICT incident reporting for EU financial entities (in force 17 January 2025). Enforces incident classification, 24h early-warning and 72h notification windows via the `date_diff` rule, root-cause documentation for major/significant incidents, and remediation tracking. 30 rules. 3 new reference files.
- **`hipaa_disclosure_accounting`** — US HIPAA 45 CFR 164.528. Accounting of disclosures…
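The 72h notification window enforced via `date_diff` reduces to a timestamp-difference check. A generic Python sketch of that underlying logic (not OpenDQV's implementation; the function and parameter names are illustrative):

```python
from datetime import datetime

def within_window(detected_at: str, notified_at: str, max_hours: int) -> bool:
    """Generic date-diff window check over ISO 8601 timestamps, the kind
    of comparison a 24h/72h DORA notification rule performs."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    delta = datetime.strptime(notified_at, fmt) - datetime.strptime(detected_at, fmt)
    return delta.total_seconds() <= max_hours * 3600
```

Enforcing this at the point of write means a late notification record is rejected with a 422 rather than silently filed out of window.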
## v1.1.0 (3/21/2026)

### Contracts

- **`gdpr_processing_record`** — UK GDPR Article 30 Record of Processing Activities (ROPA). Enforces lawful-basis declaration (all 6 Article 6 bases), consent-specific fields (mechanism, timestamp, withdrawal) via `required_if`, Legitimate Interests Assessment gating, special-category data basis (Article 9), international transfer safeguards, and a DPO audit trail. 29 rules. 7 new reference files.
- **`gdpr_dsar_request`** — UK GDPR Article 15 Data Subject Access Request ha…
## v1.0.7 (3/21/2026)

### Fixes

**PyPI publishing had been broken since v1.0.1.** All six releases since the initial v1.0.0 failed to publish with `400 Bad Request`: `pyproject.toml` was stuck at `1.0.0`, so every release attempted to re-upload a version that already existed on PyPI. This release fixes it permanently:

- `publish.yml`: added a `poetry version ${GITHUB_REF_NAME#v}` step — the package version is now derived from the git tag automatically on every future release
- `publish.yml`: pinned …
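The version-from-tag step quoted above would look roughly like this in a GitHub Actions workflow (a sketch built around the command given in the notes; the step name and surrounding workflow are assumptions):

```yaml
# Sketch of the publish.yml step: derive the package version from the
# pushed tag so pyproject.toml can never drift out of sync again.
- name: Set package version from git tag
  run: poetry version "${GITHUB_REF_NAME#v}"   # e.g. tag v1.0.7 -> 1.0.7
```

`GITHUB_REF_NAME` is the built-in Actions variable holding the tag name, and `#v` is plain shell prefix stripping, so no extra tooling is needed.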
## v1.0.6 (3/21/2026)

### Contract #35 — `martyns_law_event`

Martyn's Law (Terrorism (Protection of Premises) Act 2025) qualifying-events contract — the follow-on to `martyns_law_venue` for temporary and one-off events where 200 or more persons are expected to attend.

**Key distinctions from `martyns_law_venue`:**

| | `martyns_law_venue` | `martyns_law_event` |
|---|---|---|
| Responsible party | Accountable Person (AP) | Event organiser |
| SIA obligation | Reg… | |
## v1.0.5 (3/21/2026)

### Contracts — two UK corporate and building safety compliance laws

- **`building_safety_golden_thread`** — Building Safety Act 2022. Enforces the Act's own obligation — *"accurate and up-to-date information throughout the building lifecycle"* — for higher-risk buildings (18m+ or 7+ storeys). Mandatory fields: named Accountable Person, Building Safety Manager, BSR registration number, Safety Case documentation, fire and emergency file, residents' engagement strategy, and gold…
## v1.0.4 (3/21/2026)

### Contracts — two named-victim UK compliance laws

- **`qsr_menu_item`** — Natasha's Law (Food Information (Amendment) (England) Regulations 2019, in force 1 October 2021). Allergen-compliance contract for Pre-Packed for Direct Sale (PPDS) food. All 14 major allergens are mandatory fields — omission triggers a 422 before the record enters the system. Named after Natasha Ednan-Laperouse (2001–2016). 49 rules.
- **`martyns_law_venue`** — Terrorism (Protection of Premises) Act…
## v1.0.3 (3/21/2026)

### Fixes

- **Three additional unprotected context endpoints** — `POST /generate`, `GET /export/gx/{name}`, and `GET /export/odcs/{name}` were missing the `UnknownContextError` guard, so an unknown `context` parameter produced an unhandled exception. They now return 422 consistently, matching all six context-accepting endpoints.
- **Regex rule with no `pattern` now fails records** — previously a misconfigured `regex` rule (no `pattern` field) silently passed every value. It now retu…
## v1.0.2 (3/21/2026)

### Security

- **`python-jose` → `PyJWT`** — `python-jose` carried `ecdsa` as a transitive dependency (CVE-2024-23342, the Minerva timing attack on P-256 ECDSA). OpenDQV uses `HS256` exclusively — ECDSA operations are never called. Migrated to `PyJWT>=2.10.0`, which has zero extra dependencies; the packages `ecdsa`, `pyasn1`, and `rsa` are removed from the dependency tree entirely. No API changes — `jwt.encode`/`jwt.decode` signatures are identical.
- **Starlette CVEs dismissed…**
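For context on why the ECDSA stack was droppable: HS256 signing is a single HMAC-SHA256 over the encoded header and payload, nothing more. A stdlib sketch of what PyJWT's `jwt.encode` produces for an HS256 token:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def hs256_jwt(claims: dict, secret: bytes) -> str:
    """What HS256 signing is underneath: one HMAC-SHA256, no ECDSA anywhere."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"},
                               separators=(",", ":")).encode())
    payload = b64url(json.dumps(claims, separators=(",", ":")).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"
```

In production you would call `jwt.encode(claims, secret, algorithm="HS256")` and `jwt.decode(token, secret, algorithms=["HS256"])` from PyJWT rather than hand-rolling this; the sketch only shows that symmetric HMAC is the entire cryptographic surface.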
## v1.0.1 (3/21/2026)

Five issues identified through community stress-testing after the v1.0.0 launch, all fixed and fully tested.

### Bug fixes

- **`date_format` ignores `rule.format`** — custom date/datetime formats specified in the contract (e.g. `'%Y-%m-%d %H:%M:%S'` for SQL Server-style timestamps) were silently ignored. The validator now uses `rule.format` first, then falls back to common formats.
- **`/explain` returns 401 in `AUTH_MODE=open`** — the auth-check order was inverted. Calling…
## v1.0.0 (3/14/2026) — Initial Public Release

*Trust is cheaper to build than to repair.*

OpenDQV is an open-source, contract-driven data quality validation platform. Validate records against YAML data contracts at the point of write — before data enters the pipeline.

---

### What's included

- **24 rule types** — regex, min/max, range, not_empty, date_format, compare, lookup, checksum, min_age, max_age, required_if, unique, cross_field_range, field_sum, date_diff, ratio_check, and …
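A minimal rule list, following the rule shape shown in the v1.2.3 `allowed_values` example, might look like this. The specific rule names and the email pattern are illustrative, and any field not shown in that example is not guaranteed to match the real schema:

```yaml
# Illustrative rules only; consult the contract docs for the full schema.
- name: email_format
  field: email
  type: regex
  pattern: '^[^@\s]+@[^@\s]+\.[^@\s]+$'
  severity: error
  error_message: "email must be a valid address"
- name: country_not_empty
  field: country
  type: not_empty
  severity: error
  error_message: "country is required"
```

A record failing either rule comes back as a 422 with per-field errors, which is the whole shift-left premise: the caller fixes the record before it is ever written.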
