Why Kirimana
The problem
Data contracts shouldn’t be click-through. But for most teams, they are: a YAML file nobody reads, an SLA nobody enforces, a classification flag nobody honors. The catalog tool says “owner: unknown”. The audit ran in March but nobody can find the file. The migration to the new platform drags every legacy decision with it because no one had a way to say “drop this one”.
We borrowed kirimana, Māori for contract, literally “where your mana shows”, because it captures a meaning English has no word for. Mana is honor staked, not claimed. We’re trying to make data agreements feel that weighty again.
The thesis
Make the contract operational. Make ownership mandatory. Make the authoring path fail-closed where it matters. Make every action auditable from the start. Make migration a workflow rather than a slide deck. Do all of it in open source so any team can adopt it.
What we do that others don’t
1. Mandatory ownership, enforced structurally
Every contract names a real human. Every schema property names a
classification. The platform refuses what doesn’t. No more
owner: unknown — the CLI rejects the apply, not as policy
guidance, as a hard gate.
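A structural gate of this kind can be sketched in a few lines of Python. This is an illustration only: `Contract`, `Property`, `ApplyRejected`, `check_ownership` and the placeholder-owner list are hypothetical names, not Kirimana's actual model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical placeholder values that do not count as a real owner.
PLACEHOLDER_OWNERS = {"", "unknown", "tbd", "n/a"}


@dataclass
class Property:
    name: str
    classification: Optional[str] = None


@dataclass
class Contract:
    name: str
    owner: Optional[str]
    properties: List[Property] = field(default_factory=list)


class ApplyRejected(Exception):
    """Raised when a contract fails the structural gate."""


def check_ownership(contract: Contract) -> None:
    """Raise ApplyRejected unless the owner and every classification are set."""
    errors = []
    owner = (contract.owner or "").strip().lower()
    if owner in PLACEHOLDER_OWNERS:
        errors.append(f"{contract.name}: owner must name a real person")
    for prop in contract.properties:
        if not prop.classification:
            errors.append(
                f"{contract.name}.{prop.name}: classification is required"
            )
    if errors:
        # A hard gate: nothing is applied when any check fails.
        raise ApplyRejected("; ".join(errors))
```

The point of the sketch is that the check raises rather than warns, so a caller cannot proceed with an unowned or unclassified contract by ignoring a message.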
2. Fail-closed authoring guardrails
Inferred PII (names, e-mails, free-text identifiers, classified
attributes) cannot be authored as public without an audited
override and a written reason. kiri project init cannot leave
the project in an invalid state. Default is refuse, not warn.
Authoring time, PR time, apply time — all gated.
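The refuse-by-default rule for inferred PII might look roughly like the following. The names `Override`, `AuthoringRefused` and `authorize_classification` are assumptions made for this sketch, and how Kirimana infers PII or records the override is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Override:
    approver: str
    reason: str


class AuthoringRefused(Exception):
    """Raised when a classification request is refused."""


def authorize_classification(
    column: str,
    inferred_pii: bool,
    requested: str,
    override: Optional[Override] = None,
) -> str:
    """Return the classification to author, or raise. Refuses by default."""
    if not (inferred_pii and requested == "public"):
        return requested
    # Inferred PII requested as public: only an override with a written
    # reason lets it through. A blank reason counts as no reason.
    if override is None or not override.reason.strip():
        raise AuthoringRefused(
            f"{column}: inferred PII cannot be authored as public "
            "without an override and a written reason"
        )
    # A real implementation would also write the override to the audit log.
    return requested
```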
3. Mandatory one-line Databricks audit
Every Kiri → Databricks action — CREATE, INSERT, SHOW, job
submit, classification override — emits a single audit line. CLI
calls and MCP calls are audited identically with the same
correlation id. tail -f databricks-audit.log is the entire demo.
The regulator can read it without us being in the room.
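As an illustration of the one-line idea, an audit record can be serialized as a single JSON line. The field names (`ts`, `action`, `target`, `channel`, `correlation_id`) are assumptions for this sketch and not the actual format of databricks-audit.log.

```python
import json
from datetime import datetime, timezone


def audit_line(action: str, target: str, channel: str, correlation_id: str) -> str:
    """Serialize one action as exactly one line of JSON."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "action": action,          # e.g. CREATE, INSERT, SHOW
        "target": target,          # hypothetical: the object acted upon
        "channel": channel,        # hypothetical: "cli" or "mcp"
        "correlation_id": correlation_id,
    }
    # json.dumps escapes embedded newlines, so the record stays on one line.
    return json.dumps(record, separators=(",", ":"))
```

Because the CLI and MCP paths would call the same function, the two channels produce records with the same shape, which is what makes a plain `tail -f` readable.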
4. Migration as a first-class workflow
AI translates legacy business-vault objects into ODCS contracts.
Every output lands on a review queue with KEEP / DROP /
MERGE / REDESIGN as first-class SME decisions — not just
accept / reject. A parity harness row-hash-checks AI-translated
output against a deterministic baseline; the input snapshot is
pinned, the hash algorithm is closed-menu, mismatches block
promotion. Cutover is shadow → diff → promote → rollback
as first-class verbs. Quarantine reprocess is a typed state
machine with a closed menu of transitions. No silent rotting.
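A row-hash parity check of the kind described above can be sketched as follows. This is a simplified illustration under stated assumptions: the allowed algorithm list, the canonical row encoding and the function names are hypothetical, and rows are plain dictionaries keyed by a single business key.

```python
import hashlib
from typing import Any, Dict, List

# Assumed closed menu of hash algorithms; anything else is refused.
ALLOWED_ALGORITHMS = ("sha256", "sha512")


def row_hash(row: Dict[str, Any], algorithm: str = "sha256") -> str:
    """Hash a row in a column-order-independent way."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"hash algorithm not allowed: {algorithm}")
    # Sort columns and join with a unit separator so field boundaries
    # cannot be confused with field contents.
    canonical = "\x1f".join(f"{k}={row[k]!r}" for k in sorted(row))
    return hashlib.new(algorithm, canonical.encode("utf-8")).hexdigest()


def parity_mismatches(
    baseline: List[Dict[str, Any]],
    candidate: List[Dict[str, Any]],
    key: str,
    algorithm: str = "sha256",
) -> List[Any]:
    """Return keys whose row hash differs, or that exist on one side only."""
    base = {r[key]: row_hash(r, algorithm) for r in baseline}
    cand = {r[key]: row_hash(r, algorithm) for r in candidate}
    return sorted(k for k in base.keys() | cand.keys() if base.get(k) != cand.get(k))


def can_promote(baseline, candidate, key: str) -> bool:
    """Any mismatch blocks promotion."""
    return not parity_mismatches(baseline, candidate, key)
```

Missing and extra rows count as mismatches here, so a translation that silently drops a row blocks promotion just as a changed value does.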
5. Start from the source you already have
Guided source-schema introspection samples a live source, classifies columns, and proposes populated bronze contracts — editable scaffolding, not auto-applied. Today the supported source surface is Databricks. Additional source-system targets are on the roadmap.
6. PR-time governance gates
kiri contract lint, kiri contract validate, and kiri contract diff
run in CI before merge. They check that classification is present,
the owner is valid, lineage resolves, and docs and code have not
drifted apart. Governance moves left, into the developer’s flow.
Problems are caught before data moves, not after it lands.
7. Federation without a central catalog vendor
Cross-project catalog snapshots over three transports (in-process,
HTTP+ETag, fs-static). health() returns OK / STALE / UNAVAILABLE;
browse serves stale-with-marker; lint, apply and AI-policy fail
closed when a federated source isn’t reachable. Multi-team
governance across domains and repositories without depending on a
central paid catalog.
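The split between "browse tolerates staleness" and "lint, apply and AI-policy fail closed" can be sketched as a small gate function. The names `Health`, `FederationUnavailable` and `gate` are hypothetical, and treating STALE the same as UNAVAILABLE for the fail-closed operations is an assumption of this sketch; the text above only specifies the unreachable case.

```python
from enum import Enum


class Health(Enum):
    OK = "OK"
    STALE = "STALE"
    UNAVAILABLE = "UNAVAILABLE"


# Operations that must not run against an unhealthy federated source.
FAIL_CLOSED_OPERATIONS = frozenset({"lint", "apply", "ai-policy"})


class FederationUnavailable(Exception):
    """Raised when a fail-closed operation meets an unhealthy source."""


def gate(operation: str, health: Health) -> dict:
    """Decide whether an operation may proceed given the source health."""
    if operation in FAIL_CLOSED_OPERATIONS:
        # Assumption: anything other than OK refuses the operation.
        if health is not Health.OK:
            raise FederationUnavailable(
                f"{operation} refused: federated source is {health.value}"
            )
        return {"allowed": True, "stale": False}
    if operation == "browse":
        # Browse still serves, but marks the result as stale.
        return {"allowed": True, "stale": health is not Health.OK}
    # Unknown operations are refused rather than waved through.
    raise ValueError(f"unknown operation: {operation}")
```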
8. Technique-neutral build path
The Semantic Intent Model maps business outcomes to columns regardless of which silver technique sits underneath — Flat, Data Vault 2.0, or conformed Kimball silver all feed the same gold contract. Gold compiles to a Kimball star schema with computed columns as a declarative property; no SQL strings hand-authored.
9. Production Data Vault silver
Raw vault + business vault + PIT bridges as a production technique with an optional DataVault4dbt backend wired through the contract. Pick the engine; the contract stays canonical.
10. Apache-2.0, no BSL planned
The product ships under Apache-2.0. There is no “community edition” feature-gating in the codebase today. We don’t plan a BSL relicense; if that ever changes, we’d say so before doing it.
How we compare
| | Kirimana | Atlan / Collibra / Alation | dbt Cloud | Gable.ai / Datatera | Unity Catalog / Purview |
|---|---|---|---|---|---|
| Open source | ✓ Apache-2.0 today | ✗ | Engine OSS, Cloud paid | ✗ | ✗ |
| ODCS v3 canonical artefact | ✓ on disk in git | ✗ | partial (model.yml) | partial | ✗ |
| Start from existing source | ✓ guided source-schema introspection (Databricks today) | ✗ | ✗ | ✗ | partial |
| Fail-closed authoring guardrails | ✓ PII inferred → public refused with audited override | ✗ | ✗ | partial | ✗ |
| Mandatory action-level audit | ✓ one line per Kiri → Databricks action | configurable | limited | limited | configurable |
| Migration as a verb set | ✓ translate · parity · cutover · KEEP/DROP/MERGE/REDESIGN | ✗ | ✗ | ✗ | ✗ |
| Federation with fail-closed health | ✓ three transports · health states | ✗ | ✗ | ✗ | ✗ |
| Compliance evidence generators (DORA / EU AI Act / GDPR) | roadmap | add-on | ✗ | ✗ | partial |
Where the contract becomes operational
Where Atlan stops at the catalog and dbt stops at transformation, Kirimana operationalises the contract. From draft → apply → audit → cutover, all in one Apache-2.0 codebase. Migration is a verb set, not a project plan.
Try it
- Talk to Kiri — ask the AI assistant about any of the above
- See what’s shipped today — capability-by-capability
- See the solutions — Databricks today, OSS edition on the roadmap
- Compare in detail — vendor-by-vendor tables