Why Kirimana

The problem

Data contracts shouldn’t be click-through. But for most teams, they are: a YAML file nobody reads, an SLA nobody enforces, a classification flag nobody honors. The catalog tool says “owner: unknown”. The audit ran in March but nobody can find the file. The migration to the new platform drags every legacy decision with it because no one has a way to say “drop this one”.

We borrowed kirimana, Māori for contract, literally “where your mana shows”, because it captures a meaning English has no word for. Mana is honor staked, not claimed. We’re trying to make data agreements feel that weighty again.

The thesis

Make the contract operational. Make ownership mandatory. Make the authoring path fail-closed where it matters. Make every action auditable from the start. Make migration a workflow rather than a slide deck. Do all of it in open source so any team can adopt it.

What we do that others don’t

1. Mandatory ownership, enforced structurally

Every contract names a real human. Every schema property names a classification. The platform refuses any contract that doesn’t. No more owner: unknown. The CLI rejects the apply, not as policy guidance but as a hard gate.
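A minimal sketch of what such a hard gate could look like. This is not Kirimana's code: the Contract and Property shapes, the placeholder list and the check_ownership function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical placeholder values the gate treats as "no owner".
PLACEHOLDER_OWNERS = {"", "unknown", "tbd", "n/a"}


@dataclass
class Property:
    name: str
    classification: Optional[str] = None


@dataclass
class Contract:
    name: str
    owner: Optional[str]
    properties: List[Property] = field(default_factory=list)


class GateError(Exception):
    """Raised when a contract must not be applied."""


def check_ownership(contract: Contract) -> None:
    """Raise GateError listing every violation; return None only if clean."""
    problems = []
    owner = (contract.owner or "").strip().lower()
    if owner in PLACEHOLDER_OWNERS:
        problems.append(f"{contract.name}: owner must name a real person")
    for prop in contract.properties:
        if not prop.classification:
            problems.append(f"{contract.name}.{prop.name}: missing classification")
    if problems:
        # A hard gate: the caller cannot continue, there is no warning path.
        raise GateError("; ".join(problems))
```

The point of raising rather than returning a list of warnings is that the calling command has no way to proceed by ignoring the result.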

2. Fail-closed authoring guardrails

Inferred PII (names, e-mails, free-text identifiers, classified attributes) cannot be authored as public without an audited override and a written reason. kiri project init cannot leave the project in an invalid state. Default is refuse, not warn. Authoring time, PR time, apply time — all gated.
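The refuse-by-default rule can be sketched in a few lines. Everything here is an assumption made for illustration: the name-based regex stands in for whatever inference Kirimana actually performs, and Override and authorize_classification are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical name-based heuristic; real inference would likely inspect values too.
PII_NAME_PATTERN = re.compile(r"(name|e-?mail|phone|address|ssn)", re.IGNORECASE)


@dataclass(frozen=True)
class Override:
    approver: str
    reason: str


def looks_like_pii(column_name: str) -> bool:
    return bool(PII_NAME_PATTERN.search(column_name))


def authorize_classification(
    column: str, classification: str, override: Optional[Override] = None
) -> str:
    """Return the classification if it may be authored, else raise PermissionError."""
    if classification != "public" or not looks_like_pii(column):
        return classification
    # Inferred PII marked public: refuse unless an override with a written reason exists.
    if override is None or not override.reason.strip():
        raise PermissionError(
            f"{column}: inferred PII cannot be public without an override and a written reason"
        )
    return classification
```

Calling authorize_classification("customer_email", "public") raises, while passing an Override with a non-empty reason lets the same call through. In a real system that override would itself be written to the audit trail.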

3. Mandatory one-line Databricks audit

Every Kiri → Databricks action — CREATE, INSERT, SHOW, job submit, classification override — emits a single audit line. CLI calls and MCP calls are audited identically with the same correlation id. tail -f databricks-audit.log is the entire demo. The regulator can read it without us being in the room.
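To make the "one line per action" idea concrete, here is a sketch that serializes each action as a single JSON line. The field names (ts, channel, action, target, correlation_id) and both function names are assumptions. The actual format of databricks-audit.log is not described here.

```python
import json
from datetime import datetime, timezone


def audit_line(action: str, channel: str, target: str, correlation_id: str) -> str:
    """Serialize one action as a single JSON line (hypothetical field names)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "correlation_id": correlation_id,
        "channel": channel,  # "cli" or "mcp": both go through the same function
        "action": action,
        "target": target,
    }
    # json.dumps escapes embedded newlines, so the record always stays on one line.
    return json.dumps(record, sort_keys=True)


def emit(path: str, line: str) -> None:
    """Append one audit line to the log file."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

Because the CLI path and the MCP path would call the same function with the same correlation id, a reader tailing the file can follow one request across both channels.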

4. Migration as a first-class workflow

AI translates legacy business-vault objects into ODCS contracts. Every output lands on a review queue with KEEP / DROP / MERGE / REDESIGN as first-class SME decisions, not just accept / reject. A parity harness row-hash-checks AI-translated output against a deterministic baseline; the input snapshot is pinned, the hash algorithm is closed-menu, mismatches block promotion. Cutover is shadow → diff → promote → rollback as first-class verbs. Quarantine reprocess is a typed state machine with a closed menu of transitions. No silent rotting.
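The parity check described above can be illustrated with a small row-hash comparison. This is a sketch under stated assumptions, not the harness itself: the entries of the algorithm menu, the canonicalization scheme and the function names are all hypothetical.

```python
import hashlib

# Closed menu: only these algorithms are accepted (the entries are an assumption).
ALLOWED_ALGORITHMS = ("sha256", "sha512")


def row_hash(row: dict, algorithm: str) -> str:
    """Hash one row in a column-order-independent way."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"hash algorithm {algorithm!r} is not on the closed menu")
    # Sort keys so column order does not matter; repr keeps 1 and "1" distinct.
    canonical = "\x1f".join(f"{k}={row[k]!r}" for k in sorted(row))
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()


def parity_mismatches(baseline, candidate, key, algorithm="sha256"):
    """Return sorted keys whose rows differ or exist on only one side."""
    base = {r[key]: row_hash(r, algorithm) for r in baseline}
    cand = {r[key]: row_hash(r, algorithm) for r in candidate}
    return sorted(k for k in base.keys() | cand.keys() if base.get(k) != cand.get(k))


def can_promote(baseline, candidate, key, algorithm="sha256") -> bool:
    """Promotion is allowed only when there are no mismatches at all."""
    return not parity_mismatches(baseline, candidate, key, algorithm)
```

Here baseline stands for rows produced deterministically from the pinned input snapshot and candidate for the AI-translated output. Any difference, including a row present on only one side, blocks promotion.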

5. Start from the source you already have

Guided source-schema introspection samples a live source, classifies columns, and proposes populated bronze contracts — editable scaffolding, not auto-applied. Today the supported source surface is Databricks. Additional source-system targets are on the roadmap.

6. PR-time governance gates

kiri contract lint, kiri contract validate, and kiri contract diff run in CI before merge. They check that classification is present, the owner is valid, lineage resolves, and docs and code have not drifted apart. Governance moves left, into the developer’s flow. Problems are caught before data moves, not after it lands.

7. Federation without a central catalog vendor

Cross-project catalog snapshots over three transports (in-process, HTTP+ETag, fs-static). health() returns OK / STALE / UNAVAILABLE; browse serves stale-with-marker; lint, apply and AI-policy fail closed when a federated source isn’t reachable. Multi-team governance across domains and repositories without depending on a central paid catalog.
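The split between "serve stale with a marker" and "fail closed" can be sketched as a small decision function. The three health states come from the text above; the function name, the return values and the handling of the two cases the text leaves open (STALE for fail-closed operations, UNAVAILABLE for browse) are assumptions.

```python
from enum import Enum


class Health(Enum):
    OK = "OK"
    STALE = "STALE"
    UNAVAILABLE = "UNAVAILABLE"


# Operations that refuse to run against a source that is not healthy.
FAIL_CLOSED_OPERATIONS = {"lint", "apply", "ai-policy"}


class FederationError(Exception):
    """Raised when an operation must not proceed against a federated source."""


def decide(operation: str, health: Health) -> str:
    """Return "proceed" or "proceed-stale", or raise for fail-closed operations."""
    if health is Health.OK:
        return "proceed"
    if operation in FAIL_CLOSED_OPERATIONS:
        # Stated above: fail closed when unreachable.
        # Assumption in this sketch: STALE is refused as well.
        raise FederationError(f"{operation} refused: federated source is {health.value}")
    # Read-only browsing continues from the last snapshot, flagged as stale.
    # Assumption in this sketch: the same applies when the source is UNAVAILABLE.
    return "proceed-stale"
```

With this shape, decide("browse", Health.STALE) returns "proceed-stale" so a UI can show the marker, while decide("apply", Health.UNAVAILABLE) raises before anything is written.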

8. Technique-neutral build path

The Semantic Intent Model maps business outcomes to columns regardless of which silver technique sits underneath — Flat, Data Vault 2.0, or conformed Kimball silver all feed the same gold contract. Gold compiles to a Kimball star schema with computed columns as a declarative property; no SQL strings hand-authored.

9. Production Data Vault silver

Raw vault + business vault + PIT bridges as a production technique with an optional DataVault4dbt backend wired through the contract. Pick the engine; the contract stays canonical.

10. Apache-2.0, no BSL planned

The product ships under Apache-2.0. There is no “community edition” feature-gating in the codebase today. We don’t plan a BSL relicense; if that ever changes, we’d say so before doing it.

How we compare

Capability	Kirimana	Atlan / Collibra / Alation	dbt Cloud	Gable.ai / Datatera	Unity Catalog / Purview
Open source	✓ Apache-2.0 today	✗	Engine OSS, Cloud paid	✗	✗
ODCS v3 canonical artefact	✓ on disk in git	✗	partial (model.yml)	partial	✗
Start from existing source	✓ guided source-schema introspection (Databricks today)	✗	✗	✗	partial
Fail-closed authoring guardrails	✓ PII inferred → public refused with audited override	✗	✗	partial	✗
Mandatory action-level audit	✓ one line per Kiri → Databricks action	configurable	limited	limited	configurable
Migration as a verb set	✓ translate · parity · cutover · KEEP/DROP/MERGE/REDESIGN	✗	✗	✗	✗
Federation with fail-closed health	✓ three transports · health states	✗	✗	✗	✗
Compliance evidence generators (DORA / EU AI Act / GDPR)	roadmap	add-on	✗	✗	partial

Where the contract becomes operational

Where Atlan stops at the catalog and dbt stops at transformation, Kirimana operationalises the contract. From draft → apply → audit → cutover, all in one Apache-2.0 codebase. Migration is a verb set, not a project plan.

Try it

Talk to Kiri: ask the AI assistant about any of the above
See what’s shipped today — capability-by-capability
See the solutions — Databricks today, OSS edition on the roadmap
Compare in detail — vendor-by-vendor tables