Architecture

The data contract — doctrine, naming grammar, worked field-shapes

references/contract.md, rendered directly from the skill doctrine.

Shape-truth moved to the schema (Plan 15). The machine-checkable JSON Schema at contract/schema/*.schema.json, run by contract/validate.mjs (shape + semantic tier + the MR-5 doc-lint), is now the single source of truth for every field’s shape. This document is the augmenting literature: the doctrine, the why, the naming grammar, and worked field-shape examples that sit on top of the schema — the same inversion as tokens.ts → tokens.css. Where any shape statement here disagrees with the schema, the schema wins. Plan 15 also made the core target-only (zero back-compat): several shapes this file used to narrate were retired and are corrected throughout —

axisWeights / weightingPresets → weights / presets (drift ①)

forkPositions / facetForks / liveFillerForks → the generalized options[].prune:{set,keep} typed-set narrowing (MR-1/MR-3; forkPositions was a phantom — never built)

mission.json + criteria.json → folded into requirements.json (drift ④); the standalone files are not read by the core

kind enum → render-only string (no enforced enum; drift ③)

knowledge facet build → effort (MR-2)

every field carries a description, machine-enforced (MR-5)

This is the always-open doctrine lookup from Phase 2 onward. It explains how to think about the JSON files the kernel reads and the cockpit renders — paired with the schema, which defines their exact shape — plus the canonical naming grammar that settles every key-vs-id, German-vs-ascii, file-naming tension, and the thin scoring discipline that keeps the data honest. One contract spans the whole engagement — each file is one phase’s residue, and the schema is the common thread that keeps requirements, research, and presentation aligned.

This file owns architectures. It does not re-teach the model: the four axes, the single-placement rule, the funnel, the fork’s two coordinates, coverage, and the three registers all live in model.md — cited here, never restated. The cost method (teardown → questionnaire → bands → 4-bucket rollup) lives in pricing-tco.md; this file carries only the pricing field shapes.

The weight tiers

Every scored criterion carries a tier that sets how much it counts. The four tiers (orthogonal to the axis it sits on — see model.md on Gate · Fit · Cost · Risk):

Tier	Type	Weight	Purpose
`hard`	Binary (pass/fail)	n/a	Eliminate
`critical`	Scored 1–5	×5	Differentiate decisively
`standard`	Scored 1–5	×3	Differentiate meaningfully
`nice-to-have`	Scored 1–5	×1	Tiebreakers

A solution’s Fit score is the weighted sum over its scored criteria; the kernel computes it. Hard filters are not scored — a single failure eliminates, full stop. hard ≡ the Gate axis; critical/standard/nice-to-have are the scored tiers used by both Fit and Risk (which answer different questions — see model.md). There is no cost tier: cost is priced into the 4-bucket TCO, never scored as a weighted criterion (see pricing-tco.md).
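The tier arithmetic can be sketched as follows. This is a minimal illustration, not the kernel's code: `fitScore`, `FitResult`, and `TIER_WEIGHTS` are hypothetical names, the sum is assumed to be a plain weighted sum without normalisation, and the axis split between Fit and Risk is ignored for brevity.

```typescript
// Sketch under stated assumptions; not the kernel's implementation.
type Tier = "hard" | "critical" | "standard" | "nice-to-have";

interface Criterion {
  id: string;
  tier: Tier;
}

interface FitResult {
  eliminated: boolean; // true as soon as one hard filter scores 0
  fit: number;         // weighted sum over the scored criteria
}

// Default multipliers from the tier table; hard filters carry no weight.
const TIER_WEIGHTS: Record<Exclude<Tier, "hard">, number> = {
  critical: 5,
  standard: 3,
  "nice-to-have": 1,
};

// `scores` is keyed by criterion id; values are strings, as in the matrix files.
function fitScore(criteria: Criterion[], scores: Record<string, string>): FitResult {
  let eliminated = false;
  let fit = 0;
  for (const c of criteria) {
    const raw = scores[c.id];
    if (raw === undefined) continue; // not yet evaluated
    if (c.tier === "hard") {
      if (Number(raw) === 0) eliminated = true;
      continue; // gates eliminate, they never add to the sum
    }
    fit += Number(raw) * TIER_WEIGHTS[c.tier];
  }
  return { eliminated, fit };
}

const exampleCriteria: Criterion[] = [
  { id: "SH-H1", tier: "hard" },
  { id: "SH-C1", tier: "critical" },
  { id: "SH-P1", tier: "standard" },
];
const result = fitScore(exampleCriteria, { "SH-H1": "1", "SH-C1": "4", "SH-P1": "3" });
console.log(result); // { eliminated: false, fit: 29 }
```

The eliminated flag is returned next to the sum instead of short-circuiting, so an eliminated candidate still has a record to keep in the data.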

Schema-bound scoring discipline

These are the rules the data must obey — the doctrine behind them (single placement, cost-out-of-Fit, the consultant-authored verdict) is in model.md.

Filters first. A failed Gate is elimination — don’t score the rest. Record it as a hard-filter 0, never a row deletion (see “Eliminations are recorded” below).
Eliminations are recorded, not deleted. Keep every eliminated candidate in the data (hard-filter 0, or noted in the loop log) with the reason it fell. Criteria shift and understanding grows; a struck-through tool is the first place a later round looks, so never drop it from the matrix.
Rescore visibly. When the red team or a broadening round surfaces new evidence, update the score and note the change in the reasoning. Track it; don’t silently overwrite.
Back-score new criteria across everyone. When a criterion emerges mid-engagement, score every existing candidate on it before trusting the ranking. A discovered criterion left as a footnote that never touches the totals is the classic way a “rescore” changes nothing it should have.
Cross-check quant vs. qual. If the scores rank A first but the verdict prose names B, something is wrong in one of them — investigate the underlying assumption, don’t reconcile silently. The ranking is advisory; the human makes the final call (the verdict is authored, not computed — model.md).
Coverage ↔ score ↔ narrative must agree (the consistency invariant). For a use case, a solution’s coverage code, its Fit scores[critId], and its reasoning/qualitative prose are three views of one truth and must not contradict. An in (core delivers it natively) that scores Fit 2, a build claimed at Fit 5 with no work package, or prose praising a capability the score marks weak is a defect to investigate, not to silently reconcile. Coverage and Fit stay separate fields — neither is computed from the other (model.md § Coverage) — so the rule is that they be consistent, and the mismatch stays visible until resolved. contract/validate.mjs’s semantic tier flags the mechanical cases (a scored cell with no reasoning, a coverage code with no matching criterion).
Compress dynamic range deliberately. If every candidate scores 4–5 on a criterion, sharpen the rubric or remove it — it isn’t differentiating.
Cost stays a separate axis in the data. No criterion carries axis:"cost"; the weighted score never contains money. Pricing fills the pricing / costModel fields and is presented as a band against the Fit score (pricing-tco.md).
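Two of these rules are mechanical enough to sketch as checks. The functions below are illustrative only: `missingReasoning` and `nonDifferentiating` are hypothetical names, not the checks shipped in contract/validate.mjs, and the "4 or higher for every candidate" threshold is an assumption taken from the wording of the dynamic-range rule.

```typescript
// Sketch only; hypothetical helpers, not the validator's real semantic tier.
interface Tool {
  name: string;
  scores: Record<string, string>;
  reasoning?: Record<string, string>;
}

// One message per scored cell that carries no reasoning note.
function missingReasoning(tools: Tool[]): string[] {
  const problems: string[] = [];
  for (const tool of tools) {
    for (const critId of Object.keys(tool.scores)) {
      const note = tool.reasoning?.[critId];
      if (!note || note.trim() === "") {
        problems.push(`${tool.name}: ${critId} is scored but has no reasoning`);
      }
    }
  }
  return problems;
}

// Criterion ids on which every scored candidate lands at 4 or above.
function nonDifferentiating(tools: Tool[], critIds: string[]): string[] {
  return critIds.filter((id) => {
    const values = tools
      .map((t) => t.scores[id])
      .filter((v): v is string => v !== undefined)
      .map(Number);
    return values.length > 1 && values.every((v) => v >= 4);
  });
}
```

Both functions only report; deciding whether to sharpen a rubric or add the missing note stays with the analyst.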

Writing a hard filter — gate phrasing

A hard filter is a loose, literal minimum bar — not a quality judgment. Phrase it so it can be judged true/false against its own text, escape hatches included (e.g. “EU/EEA or SCC + TIA”). A candidate passes if it clears the written bar; how well it clears it is not the gate’s business. Where degree matters, pair the gate with a scored criterion that judges quality — e.g. the loose F-DSGVO-01 gate (“basic GDPR compliance is reachable — DPA available, EU/EEA or SCC + TIA”) paired with the scored N-GDPR-DEPTH-01 (“how thorough: default EU residency, sub-processor transparency, certs, audit log”) — and keep the quality judgment out of the gate. When writing any hard filter, ask: am I smuggling a “how good” judgment into a yes/no? If so, split it into a loose gate plus a scored depth criterion. Over-tight or compound gates are how good tools get silently and wrongly eliminated.

The same caution applies to fragile or optional capabilities — an AI layer, a third-party API, anything that’s the first thing to break is never itself a hard requirement. If it matters, gate the independence from it instead — require every use case to work without it (graceful degradation) — and let the enhancement’s quality score on Fit/Risk. Gating the fragile thing makes your whole shortlist hostage to it.

File schema

The cockpit (and your ingestion) expects these files in the engagement folder.

`config.json` — cockpit configuration

{
  "title": "Acme Tool Evaluation",
  "subtitle": "Choosing a booking + CRM stack",
  "tierWeights": { "critical": 5, "standard": 3, "nice-to-have": 1 },
  "categories": [
    { "key": "booking", "label": "Booking Software", "matrixFile": "matrix-booking.json" },
    { "key": "crm",     "label": "CRM",              "matrixFile": "matrix-crm.json" }
  ]
}

For a single-category engagement, use one category. tierWeights are adjustable live in the UI, but these are the defaults.

Solution-layer additions — axis weighting. Beyond title / tierWeights / categories, the solution layer adds Fit/Cost/Risk weighting via weights and presets (drift ①: the canonical keys every instance ships; the former axisWeights / weightingPresets are retired):

{
  "weights": { "fit": 1, "cost": 1, "risk": 1 },
  "presets": [
    { "key": "ausgewogen", "label": "Ausgewogen", "weights": { "fit": 1, "cost": 1, "risk": 1 } },
    { "key": "sicherheit", "label": "Sicherheit & geringes Risiko", "weights": { "fit": 1, "cost": 0.5, "risk": 2.5 } },
    { "key": "empfehlung", "label": "Meine Empfehlung", "recommended": true,
      "weights": { "fit": 1.5, "cost": 1, "risk": 2 },
      "reasons": "Kleine Organisation, Kinderdaten, harte Deadline — darum Risiko ×2." }
  ]
}

weights are the live Fit/Cost/Risk weights; presets are named settings the analyst and client switch between (these are the Phase-5 value-fork positions in data form). One preset may carry recommended:true and a reasons string — that’s the consultant’s weighting, not “the answer”; the verdict is authored separately (model.md § the verdict).
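Switching between presets amounts to a lookup with a fallback to the live weights. A minimal sketch, assuming the shapes shown above; `weightsFor` and `recommendedPreset` are hypothetical helper names, and how the cockpit actually combines the three axes is not covered here.

```typescript
// Sketch only; helper names are illustrative, not the cockpit's API.
interface AxisWeights {
  fit: number;
  cost: number;
  risk: number;
}

interface Preset {
  key: string;
  label: string;
  weights: AxisWeights;
  recommended?: boolean;
  reasons?: string;
}

interface WeightingConfig {
  weights: AxisWeights;
  presets: Preset[];
}

// Returns the weights of the named preset, or the live weights when no preset matches.
function weightsFor(config: WeightingConfig, presetKey?: string): AxisWeights {
  const preset = config.presets.find((p) => p.key === presetKey);
  return preset ? preset.weights : config.weights;
}

// The consultant's weighting, if one preset is flagged as recommended.
function recommendedPreset(config: WeightingConfig): Preset | undefined {
  return config.presets.find((p) => p.recommended === true);
}

const weighting: WeightingConfig = {
  weights: { fit: 1, cost: 1, risk: 1 },
  presets: [
    { key: "ausgewogen", label: "Ausgewogen", weights: { fit: 1, cost: 1, risk: 1 } },
    { key: "sicherheit", label: "Sicherheit & geringes Risiko", weights: { fit: 1, cost: 0.5, risk: 2.5 } },
    { key: "empfehlung", label: "Meine Empfehlung", recommended: true, weights: { fit: 1.5, cost: 1, risk: 2 } },
  ],
};
console.log(weightsFor(weighting, "sicherheit")); // { fit: 1, cost: 0.5, risk: 2.5 }
```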

`requirements.json` — the consolidated authored input (Phase 1 + 2)

The single authored input a solution cockpit reads: it folds the former mission.json (Phase 1) and criteria.json (Phase 2) into one file, alongside the authored narrative that used to live only in requirements.md. requirements.md is generated from it (the criteria tables are regenerated from criteria[], so they can never drift from the data the cockpit scores on — cockpit.md build step; generator template/scripts/generate-requirements-md.mjs). The core reads only ctx.data.requirements (drift ④, zero back-compat) — there is no mission / criteria fallback. The standalone mission.json / criteria.json files are retired in the core; the field-shape sections below document the shapes that now live under requirements.json (purpose/useCases/… top-level, criteria[] under criteria).

{
  "title": "Anforderungen & Constraints — <Kunde>",
  "statusBanner": "One-line status (markdown) shown as a blockquote in the generated .md.",
  "purpose": "…", "useCases": [ … ], "stakeholders": [ … ], "stressTest": "…",
  "values": [ { "value": "A stated value, in the client's terms.", "evidence": "What it rests on." } ],
  "narrative": {
    "leitsatz": "The one sentence carrying the whole thrust (markdown).",
    "sections": [ { "h": "Section heading", "md": "Authored markdown, emitted verbatim." } ]
  },
  "criteria": [ … ],
  "costNote": "…", "notRequirements": [ "…" ],
  "constraints": { "Category": "…" }, "openQuestions": [ "…" ]
}

purpose / useCases / stakeholders / stressTest — identical to the mission.json fields below (now top-level here). useCases[] may carry id / currentState / improvementLevel / enables exactly as in mission.json.
criteria — shape-identical to criteria.json’s criteria[] (same id / tier / axis / riskDim / useCase / area fields). The read path is ctx.data.requirements.criteria. This is the single source the kernel scores on — never duplicate it into the generated .md.
values (CR-A-4) — the client’s stated values as { value, evidence }. Harvested from the engagement record; not fabricated as verbatim meeting quotes unless a transcript supplies them.
narrative.leitsatz + narrative.sections[] — authored prose (register-2), emitted verbatim by the generator; this is where the bespoke layer model / sources / etc. live.
costNote / notRequirements[] / constraints{} / openQuestions[] — the remaining generated-doc sections, each optional (the generator drops a heading whose field is absent).
statusBanner / title — drive the generated .md header. Register-2 (analyst-facing).

The Mission fields — `purpose` / `useCases` / `stakeholders` / `stressTest` (Phase 1, under `requirements.json`)

Station 1 of the cockpit (the Mission page) reads these — now top-level fields of requirements.json (the former standalone mission.json is retired in the core, drift ④). They are the Phase-1 deliverable in machine form: the purpose, the tool-neutral use cases, the stakeholders, and the one stress test. Call them use cases, never “jobs” — see § Naming grammar (D13). The station renders useCases[] directly.

{
  "purpose": "One-sentence mission: what the system is for, stated tool-neutrally.",
  "useCases": [
    { "id": "followup", "title": "Short label",
      "what": "What the system must DO — never naming a product category.",
      "currentState": "manual", "improvementLevel": "digitize", "enables": [] }
  ],
  "stakeholders": [
    { "role": "Decides & pays", "who": "Who signs off the purchase." },
    { "role": "Uses daily",     "who": "The lead users; their comfort is a requirement, not a nice-to-have." },
    { "role": "The sceptic",    "who": "Who is most likely against it, and why." }
  ],
  "stressTest": "The one concrete, near-future high-stakes scenario every option must pass."
}

useCases: tool-neutral — the test is can you state it without naming a product category? “Track follow-ups with reminders” is a use case; “needs a CRM module” is not. Use cases are the spine of the functional requirements (the spine semantics — what descends from a use case and what is free-standing — live in model.md).
id: stable key. The Fit criteria that serve a use case reference it via useCase — the spine link (see criteria.json and § Naming grammar D7).
currentState × improvementLevel: the transition the use case asks for (Ist → Soll) — what tells you how far it must move, and therefore how demanding its Fit criteria are. currentState: "none" (no process yet) · "manual" (done by hand) · "digitized" (already in some tool, with data). improvementLevel: "enable" (stand up a process that didn’t exist) · "digitize" (lift a manual process into a tool) · "improve" (sharpen an existing digital one) · "ai-assist" (dock an AI layer on top).
enables: ids of use cases this one is a precondition for. A supporting / enabling use case discovered mid-engagement — e.g. a knowledge base an ai-assist process needs to work — is added here with an enables edge to the use cases it unblocks, so a discovered enabler visibly feeds the others instead of floating free.
stakeholders: { role, who } — at least decides / uses-daily / sceptic.
stressTest: the Phase-1 stress case (EFI’s was an 18-city tour) — one string.
Optional for a pure single-tool pick; required for a solution cockpit built from template/. contract/validate.mjs checks requirements.json carries title / purpose / criteria (shape) and that the Mission fields are present (semantic warnings).
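Because `enables` holds ids of other use cases, a dangling edge is easy to detect. The sketch below shows one way to do it; `danglingEnables` is a hypothetical helper, and whether contract/validate.mjs performs this particular check is not stated in this document.

```typescript
// Sketch only; a hypothetical integrity check for the enables edges.
interface UseCase {
  id: string;
  title: string;
  what: string;
  currentState?: "none" | "manual" | "digitized";
  improvementLevel?: "enable" | "digitize" | "improve" | "ai-assist";
  enables?: string[];
}

// Returns one message per enables edge that points at an unknown use-case id.
function danglingEnables(useCases: UseCase[]): string[] {
  const ids = new Set(useCases.map((u) => u.id));
  const problems: string[] = [];
  for (const u of useCases) {
    for (const target of u.enables ?? []) {
      if (!ids.has(target)) {
        problems.push(`${u.id} enables unknown use case "${target}"`);
      }
    }
  }
  return problems;
}
```

An enabler such as a knowledge base would list the ids it unblocks; any id that does not resolve shows up in the returned list.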

The `criteria[]` array — the requirements, as data (Phase 2, under `requirements.json`)

An array of criteria objects under requirements.json’s criteria key (the former standalone criteria.json is retired in the core, drift ④). Produced in Phase 2 — it is the requirements in machine form, and the human signs it off before research (the elicitation instrument is requirements-interview.md).

[
  { "id": "SH-H1", "name": "GDPR compliance",   "category": "shared",  "tier": "hard",        "weight": "0",
    "description": "DPA available, EU hosting or adequacy, ISO 27001 / SOC 2. Binary." },
  { "id": "SH-C1", "name": "Approachability",   "category": "shared",  "tier": "critical",    "weight": "5",
    "description": "Non-technical owner productive in <1 week. 3 = needs training; 5 = self-evident." },
  { "id": "SH-P1", "name": "Vendor stability",  "category": "shared",  "tier": "standard",    "weight": "3",
    "description": "No acquisition / sunset risk; healthy funding and roadmap." },
  { "id": "BK-C1", "name": "Online self-booking", "category": "booking", "tier": "critical", "weight": "5",
    "description": "Clients book 24/7 with no phone call. 5 = branded, books in <3 taps." },
  { "id": "SH-N1", "name": "API quality",       "category": "shared",  "tier": "nice-to-have","weight": "1",
    "appliesTo": ["crm", "booking"], "description": "REST depth, webhooks, sane rate limits." }
]

category: "shared" (applies to every category) or a category key from config.
tier: "hard" | "critical" | "standard" | "nice-to-have" (see § The weight tiers).
axis (optional): "gate" | "fit" | "risk", the bucket the criterion measures (the axis doctrine and single-placement rule are in model.md). Omit it and the criterion falls back to the tool-level default (Gate if hard, else Fit), so existing tool-pick engagements need no change. There is no "cost" axis on a criterion (see § Schema-bound scoring discipline). The enum is lowercase in data; Gate/Fit/Cost/Risk are the prose forms (§ Naming grammar D8).
riskDim (only when axis:"risk"): the risk dimension this criterion feeds — e.g. "datensicherheit" | "resilienz" | "abhaengigkeit" | "terminSicherheit". A solution’s score on a dimension is the average of its risk criteria with that riskDim (some dimensions are instead hand-scored per solution — see solutions.json riskScores). Field keys are lowercase-ascii; the German display label (Datensicherheit, Termin-Sicherheit) is a presentation concern (§ Naming grammar D8).
description: a one-line definition, and for preferences what separates a low score from a high one. This is what the human confirms in the Phase-2 sign-off, and the cockpit shows it on hover — so a vague or compound criterion gets caught before research, not after. Strongly recommended on every criterion.
weight: string; "0" for hard filters (they gate, not score). The UI’s tier weights override per-tier, so the per-criterion weight is mostly documentation.
appliesTo (optional): restrict a shared criterion to specific categories.
useCase (optional, Fit criteria): the requirements.json useCases[] use-case id this criterion serves — the spine link. Absent = a free-standing criterion: a constraint or non-functional bar not descended from one use case (the spine semantics are in model.md).
area (optional): a cluster label for grouping criteria in the composition views (e.g. "Wissen", "Offerte", "Technik"). The views group by the distinct area values present — generalizes a hardcoded epic list, so the clustering is engagement-defined, never baked in.
ID convention: SH- for shared, a 2-letter prefix per category (CR-, BK-, EV-…), then H#/C#/P#/N# for hard/critical/standard/nice-to-have. IDs are keys — keep them stable once research references them (the id vs key rule is § Naming grammar D7).
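The axis fallback and the per-dimension risk average described above can be sketched like this. `effectiveAxis` and `riskDimAverage` are hypothetical names, and the average is assumed to be unweighted, since the text only says "the average".

```typescript
// Sketch only; illustrative helpers, not the kernel's functions.
type Tier = "hard" | "critical" | "standard" | "nice-to-have";
type Axis = "gate" | "fit" | "risk";

interface Criterion {
  id: string;
  tier: Tier;
  axis?: Axis;
  riskDim?: string;
}

// Explicit axis wins; otherwise Gate for hard filters and Fit for everything else.
function effectiveAxis(c: Criterion): Axis {
  return c.axis ?? (c.tier === "hard" ? "gate" : "fit");
}

// Unweighted mean of the scored risk criteria feeding one dimension.
// Returns undefined when nothing has been scored for that dimension yet.
function riskDimAverage(
  criteria: Criterion[],
  scores: Record<string, string>,
  dim: string,
): number | undefined {
  const values = criteria
    .filter((c) => effectiveAxis(c) === "risk" && c.riskDim === dim)
    .map((c) => scores[c.id])
    .filter((v): v is string => v !== undefined)
    .map(Number);
  if (values.length === 0) return undefined;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}
```

Returning undefined instead of 0 keeps "not yet evaluated" distinguishable from a genuinely poor score.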

`matrix-<category>.json` — the filled form

An array of tool objects, one file per category. This is what deep research fills in.

[
  {
    "name": "ToolName",
    "scores": { "SH-H1": "1", "SH-C1": "4", "BK-C1": "3.5" },
    "reasoning": {
      "SH-H1": "EU-hosted, ISO 27001, DPA available. 2–4 sentences citing evidence.",
      "BK-C1": "Self-booking works but no group bookings; walked through the scenario..."
    },
    "qualitative": {
      "verdict": "One-sentence final read on this tool.",
      "strengths": ["Notable strength", "Another"],
      "weaknesses": ["Weakness", "Another"],
      "risks": "Key risks / red-team findings.",
      "fit": "Why it does or doesn't fit THIS customer specifically."
    },
    "pricing": {
      "model": "per-seat + per-resolution",
      "drivers": ["seats", "resolutions"],
      "oneTime": 0,
      "estimate": { "optimistic": 3730, "expected": 6920, "pessimistic": 16300, "basis": "year1-seasonal" },
      "verified": true,
      "pricingUrl": "https://...",
      "notes": "EEA add-on; annual-commit discount in optimistic; unverified AI rate high in pessimistic"
    }
  }
]

scores: keyed by criterion ID — no positional alignment. Hard filters are "0"/"1"; preferences "1"–"5" (floats like "3.5" allowed). Omit a key entirely for “not yet evaluated” (the cockpit shows it as ?).
reasoning: 2–4 sentence note per scored criterion, with specific data points. Shows as a tooltip on hover in the matrix.
qualitative: every tool, including eliminated ones, gets a verdict + strengths/weaknesses/risks/fit. This powers the detail view.
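Reading a cell under these rules means distinguishing three cases: not yet evaluated, a gate verdict, and a graded score. A minimal sketch, assuming the string encoding shown above; `readCell` and the `Cell` type are hypothetical, and throwing on out-of-range values is a design choice of this example.

```typescript
// Sketch only; a hypothetical reader for one matrix cell.
type Cell =
  | { kind: "unscored" }              // key omitted: shown as "?"
  | { kind: "gate"; pass: boolean }   // hard filter: "0" or "1"
  | { kind: "score"; value: number }; // preference: "1" to "5", floats allowed

function readCell(scores: Record<string, string>, critId: string, isHard: boolean): Cell {
  const raw = scores[critId];
  if (raw === undefined) return { kind: "unscored" };
  if (isHard) {
    if (raw !== "0" && raw !== "1") {
      throw new RangeError(`${critId}: hard filter must be "0" or "1", got "${raw}"`);
    }
    return { kind: "gate", pass: raw === "1" };
  }
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 1 || value > 5) {
    throw new RangeError(`${critId}: score must be within 1 to 5, got "${raw}"`);
  }
  return { kind: "score", value };
}
```

Lookup is by criterion id only, which mirrors the "no positional alignment" rule.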

The `pricing` field — cost teardown + 3-point TCO (per tool)

The cost teardown and 3-point band for a tool, produced by the pricing pass. The cost method that fills these — the teardown, the parameter-union questionnaire, the optimistic/expected/pessimistic computation, band-width-as-cost-risk — lives in pricing-tco.md; the field shapes are here:

model: the charging model in words (e.g. "per-seat + per-resolution", "tiered flat + one-time", "per-seat OR self-host").
drivers — the 2–3 parameters that actually move the cost (["seats","resolutions"]).
oneTime — one-time onboarding/integration cost (number; 0 if none).
estimate — the 3-point band: optimistic / expected / pessimistic (numbers) plus a basis string (e.g. "year1-seasonal", "steady-state-annual"). The cockpit renders this band and its width (= cost risk).
verified — true once a human confirmed the rates against the vendor’s pricing page.
pricingUrl, notes — source link and the key billing-mapping assumptions / excluded costs (ops labour, a mandatory second product, etc. — annotate these or the figure lies; see pricing-tco.md).
Back-compat: a simple "annual": <number> (with optional tier/perUser) still renders as a single list-price row when no estimate is present. Keep numbers as plain numbers/strings.
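How a renderer might choose between the 3-point band and the back-compat list price can be sketched as below. `costRow` is a hypothetical helper, and taking the band width as pessimistic minus optimistic is an assumption; pricing-tco.md owns the actual method.

```typescript
// Sketch only; illustrative selection logic, not the cockpit's renderer.
interface Estimate {
  optimistic: number;
  expected: number;
  pessimistic: number;
  basis: string;
}

interface Pricing {
  model?: string;
  oneTime?: number;
  estimate?: Estimate;
  annual?: number | string; // back-compat single list price
}

type CostRow =
  | { kind: "band"; optimistic: number; expected: number; pessimistic: number; width: number; basis: string }
  | { kind: "list-price"; annual: number }
  | { kind: "none" };

function costRow(p: Pricing): CostRow {
  if (p.estimate) {
    const { optimistic, expected, pessimistic, basis } = p.estimate;
    // Assumption: width = pessimistic - optimistic, read as the cost-risk signal.
    return { kind: "band", optimistic, expected, pessimistic, width: pessimistic - optimistic, basis };
  }
  if (p.annual !== undefined) {
    return { kind: "list-price", annual: Number(p.annual) };
  }
  return { kind: "none" };
}
```

With the estimate from the matrix example (3730 / 6920 / 16300), this sketch yields a width of 12570.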

`landscape.json` — the raw candidate survey (Phase 4, optional)

Backs the cockpit’s Tools view (renderLandscape). The raw, pre-score output of the landscape survey — a flat list of candidate tools with gate verdicts and capability attributes, before any scoring. (The scored surface is solutions.json → Cockpit/Matrix, or the bundled explorer.html for a pure tool pick — see cockpit.md.) Optional: omit the file and the Tools view shows an empty note.

{
  "tools": [
    {
      "name": "Acme Core", "vendor": "Acme GmbH", "category": "Core platform",
      "url": "https://…", "summary": "One-line what-it-is.", "capabilities": "Free-text capability note.",
      "gates": { "G-1": "pass" },
      "tags": ["Cloud", "Open API"],
      "attrs": { "Hosting": "EU cloud", "Pricing": "Sub €/mo" }
    }
  ]
}

gates: an object keyed by the Gate-axis criterion IDs from requirements.json criteria[] (G-1 above) → "pass" | "fail" | "unknown". The view renders one badge per Gate-axis criterion and reads its verdict here — so gates are engagement-defined, never hardcoded.
category: a free grouping tag; the view’s category buttons auto-derive from the distinct values present.
tags: free capability flags; the filter chips auto-derive from their union.
attrs: free label → value rows shown in the click-detail. summary/capabilities are the detail prose; url is the outbound link.
This is engagement substance — re-derive it from this job’s survey; never carry another engagement’s landscape.json across (transferring-between-engagements.md).
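The auto-derived filters described above reduce to collecting distinct values. A minimal sketch with hypothetical helper names (`distinctCategories`, `tagUnion`, `gateBadges`); treating a missing gate entry as "unknown" is an assumption of this example, not a documented default.

```typescript
// Sketch only; not the renderLandscape implementation.
type GateVerdict = "pass" | "fail" | "unknown";

interface LandscapeTool {
  name: string;
  category?: string;
  tags?: string[];
  gates?: Record<string, GateVerdict>;
}

// Distinct category values in first-seen order (drives the category buttons).
function distinctCategories(tools: LandscapeTool[]): string[] {
  const seen = new Set<string>();
  for (const t of tools) {
    if (t.category !== undefined) seen.add(t.category);
  }
  return Array.from(seen);
}

// Union of all tags in first-seen order (drives the filter chips).
function tagUnion(tools: LandscapeTool[]): string[] {
  const seen = new Set<string>();
  for (const t of tools) {
    for (const tag of t.tags ?? []) seen.add(tag);
  }
  return Array.from(seen);
}

// One badge per Gate-axis criterion id; a missing entry reads as "unknown" (assumption).
function gateBadges(tool: LandscapeTool, gateIds: string[]): { id: string; verdict: GateVerdict }[] {
  return gateIds.map((id) => ({ id, verdict: tool.gates?.[id] ?? "unknown" }));
}
```

The gate ids are passed in from the criteria instead of being listed in the helper, which keeps the badges engagement-defined.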

`research-index.json` — the research narrative + append-only log (Phase 4, optional)

Backs the cockpit’s Report view (renderReport). The readable synthesis of the survey: one proposal sketch per architecture (a position), the cross-cutting assist bricks, the open meeting questions, links to the raw research files — and the canonical append-only round log.

{
  "meta": { "note": "One-line state of the research (e.g. Round 1 done, deep-dive open)." },
  "rounds": [
    { "when": "2026-06", "focus": "Landscape survey across all three categories.",
      "found": "5 live candidates; two anchors confirmed unviable.",
      "shifted": "Killed the fragmented best-of-breed architecture; F-2 now leans 'fertige Basis'.",
      "findings": [
        { "id": "H1", "revisedIn": "R2", "note": "Andock-Paradox: licensable ⇒ closed; open ⇒ pool-gated." }
      ] }
  ],
  "positions": [
    { "pos": 1, "posLabel": "Finished platform", "name": "Acme Core", "role": "recommended",
      "confidence": "How solid the finding is.", "proposition": "The core pitch in one line.",
      "optimizes": "What it optimizes for.", "tradeoff": "What you give up.", "briefRef": "Source brief.",
      "standouts": [{ "name": "Acme Core", "tag": "Favourite" }],
      "openQuestions": ["What must still be clarified."], "expectedFailure": "Where it likely breaks." }
  ],
  "assists": [
    { "label": "Cross-cutting brick", "proposition": "What it does across positions.",
      "found": "What the research found.", "open": ["Open question."] }
  ],
  "meetingQuestions": ["Open question for the meeting."],
  "researchFiles": [{ "label": "Landscape findings", "path": "…", "what": "Readable synthesis." }]
}

rounds[] (optional but canonical): the append-only research log — one entry per research round, in order. when · focus (what the round investigated) · found (what it surfaced) · shifted (how it moved the picture — eliminations, rescores, a fork that tilted). This is what makes the Report screen a log of how the engagement learned: a regression appends a round here while the live judgments (scores, shortlist, fork positions) are updated in place — so history lives in the Report and current truth lives in the stations (the append-not-rewind regression rule is owned by process.md). Never rewrite an old round; add a new one. (rounds[] vs loop-log.md — see § Naming grammar D12: distinct roles, keep both.)
- findings[] (optional) — the round’s named structural findings, each a stable { id, revisedIn, note }. The id (e.g. "H1") is what an architecture’s verdict.basis[] points at; revisedIn is the round the finding last materially changed (it appears, an elimination/gate-flip lands, a score crosses a tier). This is the bottom of the judgement hierarchy (model.md): a finding’s revisedIn, read against a parent verdict’s verdict.asOf, is what the kernel’s staleness() compares to decide whether a standing architecture/comparison verdict needs re-examination. Bump revisedIn only on a real change — cosmetic edits must not trip the badge.
positions[]: one per architecture; role is "recommended" or "anchor". A standouts[] chip in the view jumps to that tool in the Tools view.
assists[], meetingQuestions[], researchFiles[]: all optional sections — omit and the view drops the heading.
Engagement substance, like landscape.json — re-derived per job, never ported.
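The staleness comparison between a finding's `revisedIn` and a verdict's `verdict.asOf` can be illustrated as follows. This is not the kernel's `staleness()`: `staleBasis` and `roundNumber` are hypothetical, and the sketch assumes both fields hold round ids of the form "R<n>" (as in the "R2" example) and that the verdict object has the shape `{ asOf, basis }`.

```typescript
// Sketch only; assumes round ids look like "R1", "R2", ... in both fields.
interface Finding {
  id: string;
  revisedIn: string;
  note: string;
}

interface Verdict {
  asOf: string;    // round the verdict was last examined in (assumed format "R<n>")
  basis: string[]; // finding ids the verdict rests on
}

function roundNumber(roundId: string): number {
  const match = /^R(\d+)$/.exec(roundId);
  if (!match) throw new Error(`unrecognised round id: ${roundId}`);
  return Number(match[1]);
}

// Ids of basis findings that changed after the verdict was last examined.
function staleBasis(verdict: Verdict, findings: Finding[]): string[] {
  const asOf = roundNumber(verdict.asOf);
  return findings
    .filter((f) => verdict.basis.includes(f.id) && roundNumber(f.revisedIn) > asOf)
    .map((f) => f.id);
}
```

A non-empty result would mean the verdict needs re-examination; findings outside `basis` are ignored, so unrelated changes do not trip the badge.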

The solution-level layer (Scenario + solutions)

Everything above is the tool-level layer — criteria + a tool matrix — and it is what the bundled explorer renders. A solution-design engagement adds a solution-level layer on top: the shared Scenario and the candidate solutions composed from those tools (the model is in model.md § Composition). These files are consumed by the solution cockpit (built by copying template/ onto the pinned shell + kernel — see cockpit.md), not by the bundled explorer. Tool scores still live in matrix-<category>.json; solutions reference tools by name and add build + integration + cost + hand-scored risk.

In the cockpit these ship as cockpit.json (the shared Scenario + the shared workPackages library, hoisted out because shared setup is identical across paths) and solutions-<variant>.json (the scored compositions) — not a standalone scenario.json. The field schemas below are unchanged; only the file packaging differs (§ File-name reconciliation).

`scenario` (shipped in `cockpit.json`) — the shared demand model

A flat object of demand knobs. The exact keys are engagement-specific; what matters is that every solution’s cost and load-sensitive scores read from this one object (the Scenario as forcing function — model.md).

{
  "seats": 4,
  "activeMonths": 6,
  "emailsSeason": 8500,
  "chatYr": 5000,
  "deflectPct": 60,
  "peakMult": 10,
  "peakWeeks": 2,
  "dayRate": 500,
  "annual": false
}

`solutions-<variant>.json` — the candidate compositions

An array of solutions. Each references tools (by matrix-<category>.json name), adds its build/integration work packages, a cost model, and hand-scored risk dims.

[
  {
    "key": "composed-chatwoot",
    "name": "Chatwoot + Sidecar",
    "kind": "composed",                       // render-only label (no enforced enum; drift ③)
    "tools": ["Chatwoot"],                     // catalog tools in this composition (→ Fit, Cost ③④)
    "coverage": {                              // per use-case: what the core does vs. how a gap is filled
      "F-TRIAGE-01": "in", "F-DRAFTS-01": "aug:llm", "F-LOOKUP-01": "part:knowledge", "F-MIGRATE-01": "proj" },
    "architecture": "miete",                          // architecture key (→ architectures[].key); the Phase-4 board groups by this
    "recommended": true,                       // optional — the AUTHORED verdict (register-1/2): this is the favoured solution
    "verdictNote": "Offen, andockbar, EU-self-host — der pragmatische Mittelpunkt.",  // one-line why, shown as the lead
    "knowledge": {                             // per-facet CONFIDENCE (not value); absent facet → "offen". MR-2: effort, not build.
      "spezif": "bekannt", "kosten": "geschätzt", "effort": "offen", "risiko": "offen", "fit": "bekannt" },
    "status": "live",                          // live | offen | neu | raus — derived when absent
    "riskScores": { "abhaengigkeit": 3, "terminSicherheit": 3 },  // hand-scored risk dims (1–5)
    "costInputs": {                             // generic cost inputs (setup/recurringYr/…); skischule's typed costModel is an EXTENSION
      "setup": 0, "recurringYr": 0 },
    "workPackages": [                           // → Cost bucket ① effort; bands = Termin-risk
      { "name": "Sidecar build + mailbox glue", "covers": ["Triage","Drafts"],
        "effort": { "opt": 4, "exp": 7, "pess": 12 } }
    ],
    "maintHrsMo": 2,                            // → Cost bucket ② maintenance
    "risk": { "longevity": "...", "peak": "...", "dataControl": "...", "lockIn": "...", "busFactor": "..." }
  }
]

kind: a render-only string that labels where the solution sits on the buy↔build spectrum (drift ③: no enforced enum). Each instance maps its own vocabulary via its KIND_LABEL (buy|rent|build, saas|chatwoot|eigenbau, composed, …); the kernel branches on none of it, so the field drives nothing mechanically.
tools — the catalog tools in the composition; their matrix scores feed the solution’s Fit, their cost lines feed license/usage. A build-kind solution may have an empty tools array — which is exactly why build needs the solution layer to appear at all (model.md § the failure it prevents).
costInputs — the generic cost inputs the base schema carries (setup / recurringYr / …), priced under the Scenario. An instance with a richer cost engine ships it as an extension: skischule’s typed costModel.components (seat × seats × months, usage × a demandVar, onetime, flatRange, aiSeat) lives in contract/schema/ext/skischule/solutions.schema.json, not the base. workPackages + maintHrsMo add the build/ops layer.
workPackages[].effort — optimistic/expected/pessimistic person-days. Count each package once per solution; never sum across solutions (shared setup is identical across paths). Bands, not points — the spread is the Termin-Sicherheit signal (the method is pricing-tco.md).
riskScores — hand-scored risk dimensions (1–5) that aren’t derivable from criteria (typically abhaengigkeit, terminSicherheit). The criteria-derived dims (datensicherheit, resilienz) come from the matrix via riskDim. Keys are lowercase-ascii (§ Naming grammar D8).
coverage (optional) — the composition, keyed by Fit-criterion id (i.e. by use case): how this solution meets each use case. The vocabulary — "in" / "part" / "part:<fillerId>" / "aug:<fillerId>" / "build" / "proj" — and the coverage→Fit bridge (coverage is the qualitative shape, the score is the graded quality; render side by side, derive neither from the other) are defined in model.md § Coverage. The <fillerId> keys into fillers.json.
(retired — forkPositions) The solution no longer carries a fork-positions map. Narrowing now lives entirely in decisions.json: a fork’s options[].prune:{set,keep} declares which typed set (architectures | solutions | fillers) the option keeps live, and the kernel intersects keep across chosen options (MR-1; liveArchitectures/livePrune, engine/README.md). forkPositions was a phantom — never built (Phase 0 ②) — so it is gone from the core, not migrated.
architecture (optional): the architecture key (→ architectures[].key) this solution fills. The Phase-4 long list is grouped per architecture off this field (board column = an architecture, cards = the solutions whose architecture matches). There is no schema nesting; the grouping is derived. Absent → the solution floats ungrouped.
knowledge (optional) — the per-facet knowledge-state: { spezif, kosten, effort, risiko, fit }, each "offen" | "geschätzt" | "bekannt" (MR-2: the facet is effort, never the old build; the base schema’s knowledge $def is closed, so a legacy build key is a hard error). This is confidence, not value — distinct from the score (the score is the answer, this is how far it is trusted), the doctrine in model.md § The knowledge grid. A missing map, or a missing facet, defaults to offen — so a fresh candidate carries no knowledge and reads as all-open (the designer’s “init empty”). N Lücken = facets not bekannt (derived, not stored). Drives the Arbeitsbrett dots, the Schärfe grid, and infoGain (cockpit.md).
status (optional) — "live" | "offen" | "neu" | "raus" for the board pill. Derived when absent: gate-fail → raus; all-offen → neu; else live. But status:"raus" is also legal with no gate-0: a solution ruled out by a made decision (an unchosen fork option — model.md § Two elimination causes) is raus though it passes every gate. The two raus causes are distinct and must stay so — a gate-raus names the failed gate and is irreversible; a decision-raus names the causing decision and is reversible (re-open that decision and the solution returns to live). Never invent a gate-0 to force a decision-elimination to read as raus (the never-fabricate-a-gate rule, model.md). A raus solution stays in the data either way (the hard-filter-0 / never-deleted rule); the board renders it eliminated with its cause named — gate or decision.
(retired — facetForks / liveFillerForks) The “a knowledge-gap that hides a value-fork” link is no longer a solution-side map. It is subsumed by MR-1’s generalized forks (MR-3): a gap that hides a decision is authored as a decisions.json fork whose options[].prune targets the relevant solutions/fillers. facetForks and liveFillerForks are gone from the core (the latter was read by zero code — dead R4 output, Phase 0).
recommended (optional, bool) + verdictNote (optional, string) — the authored verdict: which solution the consultant favours, and the one-line why. Register-1/2 data, not client copy — the cockpit renders it as the lead and the kernel ranking below it as the turnable instrument; the two may legitimately diverge (model.md § Three registers, § the verdict is authored; cockpit.md Station 4). Never encode the favourite in a shell copy-map — changing the recommendation must never require a code edit.
revisedIn (optional, string round marker, e.g. "R2") — the round in which this solution last materially changed (a gate flip, an elimination, a score crossing a tier boundary). It is the solution layer of the judgement hierarchy: an architecture/comparison verdict.basis[] that names this solution’s key goes stale (kernel staleness()) when revisedIn is newer than the verdict’s asOf. Bump it only on a real change; missing → never trips a parent badge.
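
The knowledge defaults and the status derivation described in the list above can be sketched in a few lines. This is a minimal illustration, not the kernel's code: the function names are hypothetical, and `gate_failed` is assumed to be computed elsewhere (from the matrix) and passed in.

```python
# The five knowledge facets named by the contract (MR-2: "effort", never "build").
FACETS = ("spezif", "kosten", "effort", "risiko", "fit")


def knowledge_state(solution):
    """Return the full facet map; a missing map or facet defaults to 'offen'."""
    stored = solution.get("knowledge") or {}
    return {facet: stored.get(facet, "offen") for facet in FACETS}


def gap_count(solution):
    """N Lücken: the number of facets not yet 'bekannt' (derived, never stored)."""
    states = knowledge_state(solution).values()
    return sum(1 for state in states if state != "bekannt")


def derive_status(solution, gate_failed):
    """An authored status wins; otherwise gate-fail -> raus, all-offen -> neu, else live.

    gate_failed is an assumed input: whether any hard filter scored 0.
    """
    if "status" in solution:
        return solution["status"]
    if gate_failed:
        return "raus"
    if all(state == "offen" for state in knowledge_state(solution).values()):
        return "neu"
    return "live"
```

Because an authored status takes precedence, a decision-raus (`"status": "raus"` on a solution that passes every gate) needs no fabricated gate. Rejecting a legacy `build` facet is left to the schema; this sketch simply ignores unknown keys.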

The 4-bucket TCO (computed, not stored)

A solution’s cost is computed under the Scenario into four buckets: ① project/build + integration effort (Σ work-package days × dayRate), ② maintenance (maintHrsMo → €/yr), ③ license/rent (Σ over the tools’ costModels), ④ usage. Year-1 = ①+②+③+④; ongoing = ②+③+④. Each bucket is evaluated at three points (opt/exp/pess); the width of the resulting band is the cost risk. The full method, including how buy/compose/build all reduce to the same four buckets, is in pricing-tco.md. Nothing here is stored: the kernel computes the buckets from the fields above.
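
As a rough sketch of the rollup arithmetic, assuming a scalar maintHrsMo, an hourly rate for the maintenance conversion (`hour_rate` is a hypothetical name, as is the function itself) and license/usage figures already reduced to a per-year band by the instance cost model:

```python
POINTS = ("opt", "exp", "pess")


def tco_buckets(work_packages, maint_hrs_mo, license_yr, usage_yr, day_rate, hour_rate):
    """Compute the four buckets plus Year-1 and ongoing cost at each band point.

    license_yr and usage_yr are assumed to be {opt, exp, pess} bands in EUR/yr;
    hour_rate is an assumed Scenario value for converting maintHrsMo to EUR/yr.
    """
    result = {}
    for point in POINTS:
        # Bucket 1: sum of work-package days at this point, times the day rate.
        build = sum(wp["effort"][point] for wp in work_packages) * day_rate
        # Bucket 2: monthly maintenance hours, annualized.
        maintenance = maint_hrs_mo * 12 * hour_rate
        # Buckets 2 + 3 + 4 recur every year.
        ongoing = maintenance + license_yr[point] + usage_yr[point]
        result[point] = {
            "build": build,
            "maintenance": maintenance,
            "license": license_yr[point],
            "usage": usage_yr[point],
            "year1": build + ongoing,
            "ongoing": ongoing,
        }
    return result
```

The cost-risk band is then the spread between the pessimistic and optimistic results, for example `buckets["pess"]["year1"] - buckets["opt"]["year1"]`.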

`fillers.json` — the gap-filler library (Phase 4, optional)

When a solution’s coverage says a use case is aug:<id> or part:<id>, that <id> names a filler — a reusable way to cover what the core doesn’t (a bought sidecar, an API, a separate tool, or a custom-built brick). Fillers are factored out of the solutions into one library so the same gap-filler can be referenced by every solution that needs it, and so the Composition view can auto-collect “what extra pieces does this solution need” without hardcoding.

{
  "fillers": {
    "knowledge": {
      "name": "Knowledge / RAG layer",
      "fillsAssistFor": ["F-LOOKUP-01"],
      "options": [
        { "label": "Buy: muffinGPT", "kind": "buy", "effort": { "opt": 1, "exp": 2, "pess": 4 }, "costRef": "filler-knowledge-buy" },
        { "label": "Build: EU-LLM + vector DB", "kind": "build", "effort": { "opt": 15, "exp": 25, "pess": 35 } }
      ],
      "eliminated": [                  // optional — same { key, label, why } as a fork; ways ruled out, kept with the reason
        { "key": "buy-muffin", "label": "muffinGPT (buy)", "why": "Legal-RAG halluziniert → Haftung; nur intern, menschgeprüft" }
      ]
    }
  },
  "glue": { "name": "Integration layer", "appendWhenAnyFiller": true }
}

fillers.<id> — the filler keyed by the <id> used in coverage (aug:knowledge → fillers.knowledge).
name — the human label shown on the filler card.
fillsAssistFor (optional) — Fit-criterion ids this filler provides as an optional assist (the ✨ “optional AI-assist” row), as opposed to being the primary cover for a part:/aug: gap. Solution-aware: fillsAssistFor only makes the assist available — the ✨ row renders for a given solution only when that solution’s coverage actually references this filler (via aug:<id>/part:<id>). A solution that can’t carry the filler (e.g. a closed spine with no open API) shows no assist row for that criterion. (Cockpit half: cockpit.md § Composition.)
options[] — the buy-vs-build ways to provide it. kind: "buy" | "build". effort reuses the work-package band shape { opt, exp, pess } — so a chosen filler’s effort folds into the solution’s workPackages (Cost bucket ①) and the band feeds Termin-Sicherheit. costRef (optional) keys into the instance cost model for buckets ③/④.
glue — the integration middleware that wires fillers to the core. appendWhenAnyFiller:true means the Composition view adds it automatically whenever a solution needs ≥1 filler — the fragmented-vs-single-core cost made visible.
eliminated (optional) — filler ways ruled out, each { key, label, why } (the same shape as a fork’s eliminated), kept with the reason and rendered struck under the Baustein card. This is the below-solution home of the never-deleted rule.
Engagement substance — the vocabulary (in/part/aug/build/proj) and the mechanism travel to a new engagement; the actual fillers and options are re-derived from this job’s survey, never ported (transferring-between-engagements.md).
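
The auto-collection the Composition view performs can be illustrated as follows. This is a sketch under assumptions: `fillers_needed` is a hypothetical helper name, and it treats a coverage reference to an unknown filler id as an error rather than guessing what the cockpit does.

```python
def fillers_needed(coverage, library):
    """List the extra pieces a solution needs, derived from its coverage map.

    coverage: { criterionId: "in" | "part" | "part:<id>" | "aug:<id>" | ... }
    library:  the parsed fillers.json object ({ "fillers": {...}, "glue": {...} }).
    """
    needed = []
    for value in coverage.values():
        prefix, sep, filler_id = value.partition(":")
        # Only "part:<id>" and "aug:<id>" name a filler; plain "part" does not.
        if sep and prefix in ("part", "aug") and filler_id not in needed:
            needed.append(filler_id)

    # A KeyError here means coverage references a filler the library lacks.
    pieces = [library["fillers"][fid]["name"] for fid in needed]

    # The glue layer is appended once, and only when at least one filler is needed.
    glue = library.get("glue")
    if needed and glue and glue.get("appendWhenAnyFiller"):
        pieces.append(glue["name"])
    return pieces
```

A solution whose coverage is entirely "in" therefore collects nothing, not even the glue, which is what makes the fragmented-vs-single-core difference visible.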

The spine as data — the fork and the architecture record

The files above aren’t a flat schema — they are the phases as data (the funnel in model.md). This section names the two objects the earlier schema left implicit — the fork and the architecture — and is additive: every field below already appears in a live engagement, and nothing here is required of a plain tool pick.

Phase	Object	File(s)	Status
1 Mission	the use cases (tool-neutral)	`requirements.json` `useCases[]`	above
2 Requirements	criterion	`requirements.json` `criteria[]`	above
3 Architecture	fork + architecture	`decisions.json`, `architectures*.json` (unfilled)	below
4 Solutions	solution (a filled architecture)	`solutions.json` (filled) + `matrix-<category>.json`	above
5 Decisions	value-fork positions + presets	`config.json` `presets`	above

The fork object — `decisions.json` `forks[]`

A fork is a decision whose options map to architectures (the fork’s two coordinates and two axes are in model.md § the fork). The schema (verbatim from a live engagement):

{
  "id": "F-1",
  "title": "Eigentum vs. Convenience",
  "question": "Portablen Kern besitzen — oder schlüsselfertige Miete?",
  "decides": "Eigene dünne Spine besitzen oder Plattform mieten.",
  "modes": ["wert"],                 // resolution coordinate → 🎯wert | 📊rating | 🔍fakt
  "scope": "architecture-hinge",            // leverage: architecture-hinge | within-architecture | cross-cutting | tool-level
  "dependsOn": { "fork": "F-2", "option": "ja" },   // optional inter-fork constraint
  "options": [
    { "key": "eigen", "label": "Eigene dünne Spine besitzen",
      "prune": { "set": "architectures", "keep": ["s4-eigene-spine-plus-satelliten", "s5-weitgehend-eigenbau"] } },
    { "key": "miete", "label": "Plattform mieten",
      "prune": { "set": "architectures", "keep": ["s1-schlanke-plattform", "s2-plattform-plus-ki-wissensschicht", "s3-best-of-breed"] } }
  ],
  "eliminated": [                    // optional — options ruled out, kept with the reason (never deleted)
    { "key": "kaufen", "label": "muffinGPT (kaufen)", "why": "R2/H2: pool-/makler-geformt, kein Einzel-Vertreter-Fit" }
  ]
}

The two fork coordinates from model.md, mapped onto the fields:

Resolution coordinate → modes (array; an entry is "wert"/"rating"/"fakt", the 🎯/📊/🔍 router that names which step closes the fork — value at 5, rating at 4, fact at 1–2). A companion resolutionModes dict in the same file defines each mode’s icon/label/who/when. An optional modeDetail string carries a mixed-mode note.
Leverage axis → scope (architecture-hinge = high-leverage, narrows the architecture set directly → surface first; within-architecture / cross-cutting / tool-level = lower leverage).
The narrowing map → options[].prune:{set,keep} (MR-1) — set names the typed reference set (architectures | solutions | fillers) and keep the keys an option keeps live. The kernel intersects keep across chosen positions (livePrune / liveArchitectures / architectureFromForks, engine/README.md). An option with no prune doesn’t touch the space. (The former architecture-only options[].architectures shorthand is retired — zero back-compat in the core.)
Discovery coordinate → optional discoveredAt (the phase a fork surfaced). Absent on existing forks, which is fine — discovery defaults to “Phase 3, when you first go looking.”
Plus id · title · question · decides · status (free), and dependsOn { fork, option } for a fork that’s only live under another’s position.
eliminated (optional) — options ruled out during the loop, each { key, label, why } with the literal gate/finding reason. This is the below-solution-level home for the “eliminations are recorded, never deleted” rule (process.md § Eliminations): a fork keeps its discarded options visible (struck through in the cockpit) instead of silently dropping them. The same { key, label, why } shape lives on a filler (fillers.json, above).
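
The narrowing rule (intersect keep across the chosen options) can be sketched like this. `live_set` is a hypothetical stand-in for illustration, not the kernel's livePrune; it assumes positions is a `{ forkId: optionKey }` map and ignores dependsOn.

```python
def live_set(forks, positions, set_name, universe):
    """Return the members of one typed set that stay live under the chosen positions.

    forks:     the forks[] array from decisions.json
    positions: { forkId: optionKey } for the forks that have been decided
    set_name:  "architectures" | "solutions" | "fillers"
    universe:  every key in that set before any narrowing
    """
    live = set(universe)
    for fork in forks:
        chosen = positions.get(fork["id"])
        if chosen is None:
            continue  # an undecided fork does not narrow anything
        for option in fork["options"]:
            if option["key"] != chosen:
                continue
            prune = option.get("prune")
            # An option with no prune, or a prune on another set, leaves this set alone.
            if prune and prune["set"] == set_name:
                live &= set(prune["keep"])
    return live
```

With the F-1 fork above and `positions = {"F-1": "eigen"}`, only the two architectures listed in that option's keep remain live; each further decided fork can only shrink the result, never grow it.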

reviewAgenda (in decisions.json) groups the forks by who closes them (clientDecides vs weDecide), which is just the resolution coordinate read across all forks.

Architecture record vs. solution record — one file family, two maturities

The record holds one evolving form per architecture at two maturities — this is the architecture-vs-solution split as data (the conceptual split is model.md § Architecture is not Solution). The file name tracks the dominant layer: architectures live in architectures*.json; once Phase 4 fills and scores them the file is renamed to solutions*.json (the architectures*.json → solutions*.json maturity/naming rule is in model.md):

As an architecture (step 2): tool-agnostic { key, name, summary, optimizes, tradeoff, longName, anchor, status, verdict } (architectures.schema.json). Which fork narrows to which architecture lives on the fork side now — decisions.json options[].prune.keep — not as a map on the architecture record.
As a solution (step 3, after the loop fills it): the same record grows scores{}, costInputs (or the instance’s costModel extension), workPackages, maintHrsMo, riskScores, coverage, knowledge — the scored composition documented under “The solution-level layer” above. (No forkPositions — retired.)
An architecture may also carry recommended/verdictNote (the authored verdict at the architecture level, same meaning as on a solution) and status: "deferred" for an architecture that is named but not yet surveyed or filled. A deferred architecture with no solutions stays a labelled empty column on the board (cockpit.md § Arbeitsbrett), never silently dropped, so the architecture set reads consistently across the Reise and the board.
An architecture (and the overall-comparison object) may also carry a verdict sub-object — provenance + a timestamp, never the value itself: { "asOf": "<round>", "basis": ["H1", "<solKey>", "<gateRef>"] }. asOf is the round the standing judgement was last authored; basis[] lists the children it rests on — finding ids (from rounds[].findings[]), solution keys, and/or tool gate refs. The value of the verdict stays recommended/verdictNote/verdictTone (authored — P1); the verdict.* block only lets the kernel’s pure staleness() detect when a basis[] child has a revisedIn newer than asOf and surface a “neu zu prüfen” badge. All optional → missing renders exactly as today (no badge). This is the architecture layer of the judgement hierarchy (model.md § The judgement hierarchy); the operational stamp-and-flag protocol is in process.md. ⚠ A basis[] ref that names no known finding id / solution key / gate is an orphan — it silently stops matching, so the verdict looks fresh forever. contract/validate.mjs warns on it (the semantic tier’s verdict-basis orphan check); keep basis[] in sync when you rename a key.
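
A minimal sketch of the staleness check for the solution children of a verdict follows. It is not the kernel's staleness(): the function names are hypothetical, round markers are assumed to have the form "R" plus an integer, and finding ids and gate refs in basis[] are skipped because their revision fields are not described here.

```python
def round_number(marker):
    """Parse a round marker such as 'R2' into 2 (assumed format: 'R' + integer)."""
    return int(marker.lstrip("R"))


def stale_basis(verdict, solutions):
    """Return the basis[] solution keys whose revisedIn is newer than the verdict's asOf.

    A missing verdict, a missing asOf, or a solution without revisedIn never
    trips the badge, matching the "all optional" rule.
    """
    if not verdict or "asOf" not in verdict:
        return []
    as_of = round_number(verdict["asOf"])
    by_key = {solution["key"]: solution for solution in solutions}
    stale = []
    for ref in verdict.get("basis", []):
        solution = by_key.get(ref)  # refs that name no solution are skipped here
        if solution and "revisedIn" in solution:
            if round_number(solution["revisedIn"]) > as_of:
                stale.append(ref)
    return stale
```

A non-empty result is what would surface the “neu zu prüfen” badge. Note that a renamed solution key simply stops matching, which is the orphan case the validator warns about.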

So an unfilled architectures.json (architectures only) and a filled solutions-<variant>.json (scored) are the same object before and after step 3 — name which fields are architecture (form) and which are solution (the fill), never call an empty architecture a scored solution, and rename the file at the Phase-4 roll-up.

The naming grammar — every key/id/label tension, settled once

This is the single home for the conventions that keep field names consistent across files, the kernel, and the cockpit. When two pieces of the contract seem to disagree on a name, the answer is here.

D6 — narrowing lives on the fork (`options[].prune`), not on the solution

Plan 15 MR-1/MR-3 collapsed what used to be a two-record relationship (forkPositions on a solution ↔ forks{} on an architecture) into one home: the fork’s options. There is no solution-side or architecture-side fork map any more.

decisions.json forks[].options[].prune:{set,keep} is the single source of narrowing. set is the typed reference set (architectures | solutions | fillers); keep is the keys the option keeps live. Choosing options sets positions = { forkId → optionKey }; the kernel intersects each chosen option’s keep (livePrune, specialized as liveArchitectures/architectureFromForks).
A solution carries no forkPositions, and an architecture carries no forks{} — both were retired (forkPositions was a phantom, never built). The live cockpit derives dimming and sibling grouping from the fork positions the user sets, not from a stored map on the records.

So when you want “this option narrows the space to these architectures,” author it as the fork option’s prune, never as a map hung off the solution or the architecture.

D7 — `id` vs `key`: two key namespaces, documented (not mass-renamed)

The contract uses two key fields, by what references them:

id — on criteria (SH-C1) and use cases (followup). These are the things research references (a matrix scores map and a coverage map are keyed by criterion id; a Fit criterion’s useCase points at a use-case id). id marks “a stable handle other data points at — do not churn it once research starts.”
key — on compositions/solutions (composed-chatwoot), categories (booking), fork options (eigen), and weighting presets (ausgewogen). These are selectors the config/cockpit switches between, not things research cites.

The rule is descriptive of the live data, not a refactor mandate: document the convention and follow it for new fields; do not mass-rename existing live data to chase perfect consistency (churning a referenced id is exactly the breakage the rule guards against). New criteria/use cases get id; new solutions/categories/options/presets get key.

D8 — casing: lowercase-ascii data keys, German display labels, lowercase axis enums

Three casing rules that keep data machine-clean while letting prose read German:

Field keys and dimension ids are lowercase-ascii — datensicherheit, terminsicherheit, abhaengigkeit (ASCII fold of ä → ae). These are object keys the kernel reads; they never carry umlauts or capitals.
TitleCase German is a display concern only — Datensicherheit, Termin-Sicherheit, Abhängigkeit appear as labels in the cockpit (and in this skill’s prose), produced by a label map, never used as a data key.
Axis enums are lowercase in data ("gate" / "fit" / "risk" on axis; fit/cost/risk in weights), Gate / Fit / Cost / Risk TitleCase in prose — the four axes are written capitalized in every reference, lowercase in every JSON value.

When you see terminSicherheit (camelCase) in older live data, that is a legacy spelling of the terminsicherheit key — same dimension; per D7, don’t mass-rename it, just write new keys lowercase-ascii.
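
For new keys, the casing rule can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, the contract states just the ä → ae fold (ö, ü and ß are handled by analogy, which is an assumption), and dropping hyphens is inferred from the Termin-Sicherheit → terminsicherheit pair.

```python
# ASCII folds; only "ä" -> "ae" is stated in the contract, the rest are assumed.
FOLD = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def to_data_key(label):
    """Fold a German display label into a lowercase-ascii data key."""
    lowered = label.lower()
    folded = "".join(FOLD.get(ch, ch) for ch in lowered)
    # Keep ASCII letters and digits only; hyphens and spaces are dropped.
    return "".join(ch for ch in folded if ch.isascii() and ch.isalnum())


def is_data_key(key):
    """True when a key already follows the lowercase-ascii rule."""
    return key == to_data_key(key)
```

Such a check is meant for linting newly authored dimension keys. It should not be used to rewrite legacy keys such as terminSicherheit in live data, since the convention is explicitly not a mass-rename mandate.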

D12 — `rounds[]` vs `loop-log.md`: distinct roles, keep both

Two records track the engagement’s history at different granularities — both are kept:

research-index.json rounds[] — the canonical, append-only DATA history. One structured entry per research round (when/focus/found/shifted), which drives the Report view. This is the machine-read log of how the picture moved; a regression appends a round here (see research-index.json above; the append-not-rewind rule is in process.md).
loop-log.md — an optional per-iteration PROSE nav aid: a human-readable scratchpad of what each loop asked and found, finer-grained than a round, useful while you’re mid-loop. Not read by the cockpit.

They are not redundant: rounds[] is the structured history the client-facing Report renders; loop-log.md is the working prose at a finer granularity. Keep both — don’t collapse the prose log into the data rounds or vice-versa.

Three registers in the data

Every field above is one of two registers; the third never appears in these files (the doctrine is model.md § Three registers):

Model (the kernel reads): ids/keys, tier, axis, riskDim, modes, scope, options[].prune, scores, costInputs/costModel, weights/presets.
Rationale (analyst prose, on the model, internal voice): description, summary, question, decides, expectedFailure, tradeoff, gewinnt/gibtAuf, reasoning, qualitative.
Client copy is not in these files — it’s the shell’s register-3 override layer (keyed by id, falling back to register 2 when unmapped; see cockpit.md). The data stays blunt and internal; the shell polishes.

File-name reconciliation (same schema, two packagings)

Two harmless variations exist across engagements — same field schemas, different packaging:

config.json categories[].matrixFile (a tool-level matrix) vs. categories[].solutionsFile (a solution-level set). Either points the cockpit at that category’s records; pick by which layer the category is at.
The shared Scenario ships either in cockpit.json (alongside the workPackages library) or as a standalone scenario.json. Identical object (the demand knobs above); the cockpit reads whichever is present.

Validating the data

A truncated or inconsistent file breaks the cockpit silently, so don’t eyeball it — run the system validator (the graduated successor to the old scripts/validate.py) and fix any errors before you report back or move on. It runs three tiers: the MR-5 schema-lint, the JSON-Schema shape pass (Ajv 2020), and the semantic cross-reference/consistency tier (contract/validate.mjs, contract/semantic.mjs, contract/schema-lint.mjs).

node contract/validate.mjs <engagement>/data [--instance <name>]   # full validation
node contract/validate.mjs --lint-only                             # MR-5 doc-lint of the schemas

An instance with a per-project extension (skischule’s cost engine, ehimare’s integration facet) ships it under contract/schema/ext/<instance>/; the validator composes base + extension automatically when --instance matches (or when the data lives at <instance>/data/).

It verifies every JSON file parses, criteria have valid tiers, scores reference real criterion IDs, hard filters are 0/1, preferences are 1–5, and flags scored cells with no reasoning note. Non-zero exit = fix it first. (The full ingest/audit write-procedure for a returned research report — discard the report ranking, re-check the literal filters, then write and validate — is owned by research-briefs.md; this file owns only the schema the validator enforces.)
