Process Design (build a new agent-runnable process)

Purpose

Turn a fuzzy 'we do X somehow' into a published, versioned, agent-runnable process that has been independently reviewed AND proven by an AI-as-human dry-run end-to-end — with no human in the approval path. Run by Forge (build), Verdict + panel (independent review), Pulse (provision/publish), and Proxy (AI-as-human acceptance pilot).

Sponsor Intake: Design Brief, Objectives Rubric & Continuous-Input Channel

AI+Human Agent: Forge

Stand up the design engagement from the sponsor's brief: what process they want, what it must accomplish (the target outcome), any provided information/data, and typed constraints. Crystallize it into the ONE design source-of-truth:

(1) a one-page Process Charter (name, single testable purpose, named terminal deliverable, scope in/out, trigger, owner, mode=New)
(2) a FROZEN, versioned, hashed PROCESS-DESIGN OBJECTIVES RUBRIC (the meta-acceptance criteria every later review scores bricks against by line id)
(3) an Assumption Register (each row tagged source=sponsor|assumption|process-default)
(4) an Open-Questions-to-Sponsor log

The sponsor PROVIDES information throughout but NEVER gates. Intake COMPLETES, never waits.

CLARIFY-AND-SIZE (Sprint 117): Forge ASKS clarifying questions until the brief is properly scoped and captures a 1-10 SIZING score that drives the functional depth/breadth, user/access model, and controls.

Deliverables

[AI+Human · Forge] Process Charter (one page): name; one-line testable purpose; triggering event; terminal deliverable; scope in/out; named owner; mode=New.

Acceptance: Purpose is a single testable sentence; terminal deliverable is named; scope in/out is explicit; owner named; mode recorded. Charter carries a version + content hash.

[AI · Forge] Process-Design Objectives Rubric (objectives_rubric.vN.json): the binary meta-acceptance criteria a finished design must satisfy (e.g. every brick has one objective + a testable acceptance criterion; no orphan handoffs; 0 unintended human gates; AI-opportunity grounded; guardrails encode no-commit/no-fabricate/escalate; runnable by the assigned agents).

Acceptance: File exists with version + hash; every line is a yes/no check with a stated pass condition (zero prose-only criteria); it is the single rubric every downstream review cites by hash.
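The freeze-and-hash requirement can be illustrated with a short sketch. It assumes the rubric is a JSON object with a `lines` array whose entries carry `id`, `check`, and `pass_condition`; those field names, the canonical-JSON encoding, and the choice of SHA-256 are assumptions for illustration, not the platform's confirmed format.

```python
import hashlib
import json


def content_hash(doc: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def prose_only_lines(rubric: dict) -> list[str]:
    """Return ids of rubric lines missing a yes/no check or a stated pass condition."""
    bad = []
    for line in rubric["lines"]:
        has_check = bool(line.get("check", "").strip())
        has_pass = bool(line.get("pass_condition", "").strip())
        if not (has_check and has_pass):
            bad.append(line["id"])
    return bad


# Hypothetical two-line rubric, for illustration only.
rubric = {
    "version": 1,
    "lines": [
        {"id": "R1", "check": "Does every brick have exactly one objective?",
         "pass_condition": "objective count per brick == 1"},
        {"id": "R2", "check": "Are there zero unintended human gates?",
         "pass_condition": "human gate count == 0"},
    ],
}
rubric_hash = content_hash(rubric)  # the value downstream reviews would cite
```

Because the hash is taken over a key-sorted encoding, two files with the same content but different key order produce the same hash, while any edit (including a version bump) produces a new one.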

[AI · Forge] Assumption Register + Open-Questions log (assumptions.json, sponsor_questions.json): provided data inventoried by reference/classification/access-path only.

Acceptance: Every row tagged source=sponsor|assumption|process-default; nothing labeled sponsor-stated unless it came from the sponsor; no secrets/credentials written to any tracked file.

[AI+Human · Forge] Scope & Sizing decision: the sponsor's 1-10 sizing score with its explicit interpretation (users/access · security/controls · functional depth & breadth · the IN/OUT list at that level), derived from the clarifying-question dialogue.

Acceptance: A sizing score (1-10) is recorded with a concrete IN/OUT scope it maps to; the scope was reached by asking clarifying questions (not assumed); the Objectives Rubric reflects this level.

Questions the agent asks (10)

What process do you want to design, and what does it accomplish in one line?
What event triggers it, and what concrete deliverable does 'done' produce?
Where does it start and stop (scope in / scope out), and who owns it?
What information, data, examples, or existing docs can you provide (by reference)?
What hard constraints apply (compliance, systems, timing, who must stay human)?
On a scale of 1-10, how ambitious should this be? (1 = bare MVP / proof-of-concept; 5 = a solid, genuinely useful product for a team; 10 = best-in-class, enterprise-grade). I will scope functionality, the user/access model, security, controls, and depth/breadth to that level — and I'll tell you what each level includes before we proceed.
Who will use this and roughly how many (one person, a team, a whole org, multiple tenants)?
What is the single most important job this must do well to be genuinely useful (not a toy)?
Any hard constraints I should scope to now — access/permissions, security/compliance, integrations, scale?
What would 'best-in-class' look like for this, and how much of that is in scope at your sizing score?

Do (6)

Pin the purpose to ONE testable sentence and name the terminal deliverable.
Freeze + hash the Objectives Rubric here — it is the law every later review scores against.
Inventory provided data by reference + classification only.
Tag every register row with its source; flag unknowns as assumptions rather than inventing facts.
ASK clarifying questions and iterate until the product/process is properly scoped — never proceed on a vague brief; if the sponsor is unsure, propose options and a recommended default.
Capture a single SIZING score (1-10) and translate it into concrete scope: who/how-many users, the access/permission model, authentication, security controls, auditability, and the depth/breadth of features — then state plainly what is IN and OUT at that level.

Don't (4)

Don't start designing steps — frame first.
Don't invent a purpose/deliverable/owner the sponsor didn't state.
Don't write any secret or credential into a tracked artifact.
Don't insert a human approval gate — the sponsor provides input, never approves.

Guardrails (5)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
DATA-PROVENANCE: nothing presented as sponsor-stated unless it actually came from the sponsor.
NO-SECRETS: provided data inventoried by reference/classification/access-path only.
RUBRIC-IS-LAW: the Objectives Rubric is frozen + hashed here and is immutable-referenceable; any edit creates a new version preserving the audit trail.

Map the Ideal Flow (SIPOC + step map)

AI Agent: Forge

Capture the ideal end-to-end flow as a SIPOC-grounded step map — an ordered sequence of steps each with actor, inputs, outputs, and handoff, plus the natural loops (rework/approval bounce-backs) and branches (it-depends-on-X) that become real control flow later. AUGMENT: Forge elicits and structures; the sponsor/SME supply facts and recognize it as real.

Deliverables

[AI · Forge] End-to-end step map: ordered steps with actor/inputs/outputs/handoff, named loops and branches.

Acceptance: Flow runs start-to-finish with no orphan handoffs (every output consumed, every input sourced); loops and branches noted; sponsor recognizes it as real.
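The no-orphan-handoffs check reduces to set arithmetic over the step map. The sketch below is illustrative: the step dictionary shape, the function name `orphan_handoffs`, and the sample steps are assumptions, not part of the platform.

```python
def orphan_handoffs(steps, supplier_inputs, customer_outputs):
    """Return (unsourced_inputs, unconsumed_outputs) for a step map.

    supplier_inputs: items that enter from boundary suppliers (SIPOC "S/I").
    customer_outputs: items delivered to boundary customers (SIPOC "O/C").
    Step order is deliberately ignored so that loops (an input produced
    by a later step) are not reported as orphans.
    """
    all_inputs = {i for s in steps for i in s["inputs"]}
    all_outputs = {o for s in steps for o in s["outputs"]}
    unsourced = sorted(all_inputs - all_outputs - set(supplier_inputs))
    unconsumed = sorted(all_outputs - all_inputs - set(customer_outputs))
    return unsourced, unconsumed


# Hypothetical two-step flow: "review_notes" is produced but nobody consumes it.
steps = [
    {"name": "draft", "actor": "Forge",
     "inputs": ["brief"], "outputs": ["draft_doc"]},
    {"name": "review", "actor": "Verdict",
     "inputs": ["draft_doc"], "outputs": ["review_notes", "final_doc"]},
]
result = orphan_handoffs(steps, supplier_inputs={"brief"}, customer_outputs={"final_doc"})
# result == ([], ["review_notes"])
```

An empty pair means every output is consumed and every input is sourced; anything else would be a gap to resolve before the map is accepted.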

[AI · Forge] SIPOC summary (Suppliers→Inputs→Process→Outputs→Customers).

Acceptance: Each column populated and consistent with the step map; boundary suppliers/customers named.

Questions the agent asks (4)

Walk it start to finish — what happens first, then what?
Who does each part; what does each step need (inputs) and produce (outputs)?
Where does work hand off between people or systems?
Where does it naturally loop back, and where does it branch on conditions?

Do (3)

Capture the work as a sequence of steps with actor/inputs/outputs/handoff.
Note loops and branches — they become real control flow.
Keep every output consumed and every input sourced (no orphan handoffs).

Don't (3)

Don't force a straight line if the work loops or branches.
Don't skip a handoff because it seems minor.
Don't diagnose AI opportunities yet — that's the next brick.

Guardrails (3)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
Map facts; do not assert how a step works without the sponsor/SME confirming it.
STRICT-STEP: do not classify opportunities or author contracts here.

Diagnose the AI Opportunity (Decision Tree)

AI Agent: Forge

For each step decide whether and how AI should do it, grounded in the AI Opportunity Decision Tree (docs/methodology/chatura-method-internal.md) rather than enthusiasm: classify every step AUTOMATE / AUGMENT / ELIMINATE / CREATE / RESTRUCTURE, decide AI-vs-human-vs-hybrid, and confirm the sponsor WANTS each AI step.

Deliverables

[AI · Forge] Per-step opportunity classification with supporting data; ELIMINATE calls cite the <5% rejection / no-value basis; RESTRUCTURE precedes automation of broken flows.

Acceptance: Every step has exactly one classification with its supporting data; rationale references the decision tree, not enthusiasm; eliminated steps listed.
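A validator for this acceptance check might look like the sketch below. It does not reproduce the decision tree itself (that lives in the methodology document); it only checks that each step carries one of the five classifications, has supporting data, and that ELIMINATE calls cite a rejection rate under 5% or a no-value basis. The field names and sample steps are hypothetical.

```python
CLASSES = {"AUTOMATE", "AUGMENT", "ELIMINATE", "CREATE", "RESTRUCTURE"}
ELIMINATE_THRESHOLD = 0.05  # the "<5% rejection" basis from the text


def classification_gaps(steps):
    """Return (step name, problem) pairs for steps that fail the acceptance check."""
    gaps = []
    for s in steps:
        cls = s.get("classification")
        if cls not in CLASSES:
            gaps.append((s["name"], "missing or unknown classification"))
            continue
        if not s.get("supporting_data"):
            gaps.append((s["name"], "no supporting data"))
        if cls == "ELIMINATE":
            rate = s.get("rejection_rate")  # None means unknown; never guessed
            low_rejection = rate is not None and rate < ELIMINATE_THRESHOLD
            if not (low_rejection or s.get("no_value_basis")):
                gaps.append((s["name"], "ELIMINATE lacks <5% rejection or no-value basis"))
    return gaps


# Made-up example data, for illustration only.
steps = [
    {"name": "manager sign-off", "classification": "ELIMINATE",
     "supporting_data": "sponsor-provided rejection log", "rejection_rate": 0.02},
    {"name": "invoice matching", "classification": "AUTOMATE",
     "supporting_data": "sponsor-provided rule table"},
    {"name": "second sign-off", "classification": "ELIMINATE",
     "supporting_data": "sponsor interview", "rejection_rate": None},
]
gaps = classification_gaps(steps)
```

Here only "second sign-off" is flagged: its rejection rate is unknown, so the ELIMINATE call has no cited basis and stays an open item rather than a fabricated number.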

[AI · Forge] AI-vs-human-vs-hybrid decision per retained step, sponsor desire confirmed.

Acceptance: The split is decided for every retained step; the sponsor has explicitly confirmed (not merely not-objected) the desire to pursue each AI step.

Questions the agent asks (5)

Is this step rule-following (AUTOMATE), judgment-heavy (AUGMENT), or low-value control (ELIMINATE)?
What is the rejection/exception rate on this control step?
Is there something you can't do today for lack of capacity that AI could CREATE?
Are handoffs broken or is there no end-to-end owner (RESTRUCTURE)?
Do you WANT AI to take this step on, or should a human keep it?

Do (4)

Classify each step into exactly one of the five types before solutioning.
Eliminate rubber-stamps / no-value controls (rejection <5%).
Fix the flow first (RESTRUCTURE) before automating a broken process.
Validate desire explicitly per AI step.

Don't (4)

Don't automate a broken process.
Don't keep no-value steps.
Don't push an opportunity the sponsor doesn't want.
Don't solution before classifying.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
Classification grounded in the AI Opportunity Decision Tree (docs/methodology/chatura-method-internal.md), never in enthusiasm or invention.
Do not fabricate rejection rates or benefits; use the sponsor's facts or flag as unknown.
STRICT-STEP: classify and decide only.

BUILD: Author the Brick Contracts

AI Agent: Forge

Turn each retained step into an agent-runnable contract per the step-contract schema (_SCHEMA.md) — the part that makes the process executable and stops agents improvising. CREATE: Forge generates a structured, testable contract per step that did not exist before.

Deliverables

[AI · Forge] A step CONTRACT per retained step: one concrete objective; inputs + the exact questions_to_ask that gather them; dos/donts; deliverable(s) WITH testable acceptance criteria (Definition-of-Done); guardrails (no-commit, no-fabricate, escalate pricing/scope/legal).

Acceptance: Each retained step has exactly ONE objective; >=1 deliverable each with a TESTABLE acceptance criterion; explicit dos/donts; questions that gather every declared input (no orphan inputs); guardrails present. No step left as a vague label.
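The acceptance check above lends itself to a mechanical lint. The following is a minimal sketch under assumed field names (`objective`, `inputs`, `questions_to_ask` with a `gathers` key, `deliverables`, `dos`, `donts`, `guardrails`); the real shape is defined by `_SCHEMA.md`, which is not reproduced here. It checks presence only; whether an acceptance criterion is truly testable remains a reviewer judgment.

```python
REQUIRED_GUARDRAILS = {"no-commit", "no-fabricate", "escalate"}


def lint_contract(c):
    """Return a list of human-readable errors; an empty list means the lint passes."""
    errors = []
    if not isinstance(c.get("objective"), str) or not c["objective"].strip():
        errors.append("objective must be one non-empty string")
    deliverables = c.get("deliverables", [])
    if not deliverables:
        errors.append("needs >=1 deliverable")
    for d in deliverables:
        if not d.get("acceptance", "").strip():
            errors.append(f"deliverable {d.get('name')!r} has no acceptance criterion")
    gathered = {q["gathers"] for q in c.get("questions_to_ask", [])}
    for name in c.get("inputs", []):
        if name not in gathered:
            errors.append(f"orphan input: {name}")
    if not c.get("dos") or not c.get("donts"):
        errors.append("dos and donts must both be present")
    missing = REQUIRED_GUARDRAILS - set(c.get("guardrails", []))
    if missing:
        errors.append(f"missing guardrails: {sorted(missing)}")
    return errors


# Hypothetical contract with one input that no question gathers.
contract = {
    "objective": "Produce a reconciled invoice list for the period.",
    "inputs": ["period", "ledger_export"],
    "questions_to_ask": [{"text": "Which period?", "gathers": "period"}],
    "deliverables": [{"name": "reconciled_list",
                      "acceptance": "every invoice id appears exactly once"}],
    "dos": ["cite the ledger row for each match"],
    "donts": ["do not estimate missing amounts"],
    "guardrails": ["no-commit", "no-fabricate", "escalate"],
}
errors = lint_contract(contract)
```

For this example the lint returns a single error, `orphan input: ledger_export`, because no question gathers that input.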

Questions the agent asks (4)

For this step, what is the single concrete objective?
What exact artifact must it complete, and how do we know it's done (acceptance criteria)?
What inputs does it need, and what questions gather them?
What must the agent do, and never do, on this step?

Do (3)

Give every step ONE objective, a testable deliverable, and acceptance criteria.
Write the exact questions that gather every input (no orphan inputs).
Make guardrails explicit; treat each contract as the agent's script: specific enough to execute, bounded enough not to run ahead.

Don't (4)

Don't leave a step as a vague label.
Don't let a deliverable be unmeasurable.
Don't let scope bleed across steps.
Don't leave orphan inputs.

Guardrails (5)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
Conform to the step-contract schema (_SCHEMA.md); do not improvise the shape.
Every contract encodes no-commitments, no-fabrication, and Human escalation for pricing/scope/legal.
Do not invent specifics the sponsor/SME haven't confirmed; mark undefined rather than fabricate.
STRICT-STEP: author contracts only; review and staffing are later bricks.

Staff: Assign Skills & Match Agents (RACI)

AI Agent: Forge

Put the right worker on each brick — derive each brick's required skill from its contract, match a single accountable agent, and flag any skill gap as a new-agent spec for Pulse to create. AUGMENT: Forge proposes the RACI staffing.

Deliverables

[AI · Forge] Staffed process: each brick matched to its required skill and a single accountable assigned_agent, with supports/approvers named and a list of new-skill-to-train flags.

Acceptance: Every brick has exactly one accountable assigned_agent matched to its skill OR an explicitly logged skill gap; no agent self-approves a commitment; supports/approvers named where applicable.
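One way to check this acceptance rule mechanically is sketched below. The roster shape (agent name mapped to a set of skills), the brick fields, and the skill labels are all assumptions for illustration; self-approval is modeled simply as the accountable agent also appearing in the approvers list.

```python
def staffing_problems(bricks, roster):
    """Return (brick name, problem) pairs; roster maps agent name -> set of skills."""
    problems = []
    for b in bricks:
        if b.get("skill_gap"):
            continue  # an explicitly logged gap is an accepted outcome
        accountable = b.get("accountable", [])
        if len(accountable) != 1:
            problems.append((b["name"], f"{len(accountable)} accountable agents, expected 1"))
            continue
        agent = accountable[0]
        if b["required_skill"] not in roster.get(agent, set()):
            problems.append((b["name"], f"{agent} lacks skill {b['required_skill']}"))
        if agent in b.get("approvers", []):
            problems.append((b["name"], f"{agent} would self-approve"))
    return problems


# Hypothetical roster and bricks.
roster = {"Forge": {"process-design"}, "Verdict": {"review"}}
bricks = [
    {"name": "author contracts", "required_skill": "process-design",
     "accountable": ["Forge"], "approvers": ["Verdict"]},
    {"name": "tax review", "required_skill": "tax-law",
     "accountable": [], "skill_gap": "new-skill-to-train: tax-law"},
    {"name": "integration review", "required_skill": "review",
     "accountable": ["Verdict", "Forge"]},
]
problems = staffing_problems(bricks, roster)
```

The logged skill gap on "tax review" passes, while "integration review" is flagged for having two accountable drivers.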

Questions the agent asks (4)

What skill does this brick's contract require?
Which roster agent best matches that skill?
Is there any brick whose skill no existing agent covers (a gap to flag)?
Who supports and who approves this brick (RACI)?

Do (3)

Derive each brick's skill from its contract, then match an agent by skill.
Name exactly one accountable driver per brick.
Flag a new skill to train where no agent fits, rather than forcing a poor match.

Don't (4)

Don't assign by convenience.
Don't let an agent self-approve a commitment.
Don't hide a missing skill.
Don't leave a brick with two accountable drivers.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
Match strictly by skill; flag gaps as new-skill-to-train.
If a brick needs a NEW specialist agent, Pulse creates it (create_agent) during provisioning; Forge specifies what it must be.
STRICT-STEP: staff only.

Independent Review & Iterate: EACH Brick

AI Agent: Verdict

Replace per-step human sign-off with an AI-owned review-and-iterate loop applied to EVERY brick contract. A non-author panel — Verdict (objectives-coverage + independence + adjudication), Proof (acceptance-criteria testability), and Cipher (guardrail/abuse-case completeness AND the adversarial red-team pass) — critiques each contract line-by-line against the FROZEN Objectives Rubric. A domain specialist is added ONLY when the brick's subject demands it and ONLY if staff_and_match_agents actually assigned one (never a phantom panelist); when present it produces its own verdict-deliverable and is named in the convergence record. Every gap becomes a GapLog entry; Forge fixes; the loop re-reviews (delta + regression). Each brick converges only on receipt-backed SOLID with open sev>=major gaps == 0.

Deliverables

[AI · Verdict] Frozen per-brick Review Rubric (review_rubric.vN.json) derived from the Objectives Rubric: binary checks for objective-singularity, acceptance-testability, input/question coverage (no orphan inputs), guardrail completeness, agent-runnability, and decision-tree grounding.

Acceptance: File exists with version + hash; >=1 binary check per named dimension; every line a yes/no with a pass condition; referenced by hash in every verdict.

[AI · Proof] Acceptance-Criteria testability ledger per brick: each acceptance criterion assessed observable / has-measurable-threshold / has-named-oracle / no-subjective-terms.

Acceptance: Ledger lists every brick's acceptance criteria with pass/fail per check + cited oracle; any failing AC auto-logged to GapLog at sev>=major; Proof's SOLID invalid while any AC fails.

[AI · Cipher] Red-team report per brick: missing inputs, contradictory dos/donts, guardrail gaps (commitment without escalation, fabrication path), untestable acceptance.

Acceptance: Every finding mirrored as a first-class GapLog entry with red-teamer-set severity; closing requires a fix (verified) or a recorded rationale (wont-fix) — zero silent dismissals.

[AI · Verdict] GapLog (gaplog.json) across all bricks — the loop's auditable state machine (id, source, raising-agent, severity, status, history, rationale).

Acceptance: Every gap present with all fields; reviewer-owned severity with recorded changes; at convergence zero entries open with sev>=major.
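The GapLog's role as a state machine can be sketched as follows. The field list mirrors the one named above; the severity ordering (`minor` < `major` < `critical`), the status labels, and the method names are assumptions. The sketch records severity changes and who made them but does not by itself detect convergence-by-downgrade, which would still need reviewer attention.

```python
from dataclasses import dataclass, field

SEVERITIES = ("minor", "major", "critical")  # assumed ordering, lowest first


@dataclass
class Gap:
    id: str
    source: str
    raising_agent: str
    severity: str
    status: str = "open"
    rationale: str = ""
    history: list = field(default_factory=list)

    def set_severity(self, new, by):
        # Only the raising reviewer or Verdict may change severity.
        if by not in (self.raising_agent, "Verdict"):
            raise PermissionError(f"{by} may not change severity of {self.id}")
        self.history.append(("severity", self.severity, new, by))
        self.severity = new

    def close(self, status, rationale, by):
        # Closing requires a verified fix or a recorded rationale: no silent dismissals.
        if status not in ("fixed-verified", "wont-fix"):
            raise ValueError("close status must be fixed-verified or wont-fix")
        if status == "wont-fix" and not rationale.strip():
            raise ValueError("wont-fix requires a recorded rationale")
        self.history.append(("status", self.status, status, by))
        self.status, self.rationale = status, rationale


def blocking(gaplog):
    """Ids of gaps that block convergence: open with severity >= major."""
    floor = SEVERITIES.index("major")
    return [g.id for g in gaplog
            if g.status == "open" and SEVERITIES.index(g.severity) >= floor]


gap = Gap(id="G1", source="red-team", raising_agent="Cipher", severity="major")
```

Convergence for a brick would then require `blocking(gaplog)` to be empty, alongside the verdict checks.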

[AI · Forge] Revised brick contracts iterated to close gaps; each revision bumps version+hash; Forge's authorship recorded for the COI check.

Acceptance: Latest version closes every gap assigned to the author; each revision bumps hash; Forge is recorded as author and NOT on the review panel.

[AI · panel] Receipt-backed per-reviewer verdicts (verdicts.json), one per panel reviewer + the red-teamer, each bound to the FINAL brick version+hash.

Acceptance: A verdict counts SOLID only if it cites the brick version+hash, rubric line ids with per-line pass, and the brick fields inspected; bare 'SOLID' rejected; every verdict's hash matches the final brick hash.
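The rule that a bare 'SOLID' is rejected can be made concrete with a small predicate. The verdict field names below are hypothetical, the hash strings are placeholders, and requiring every rubric line to be cited is an assumption of this sketch.

```python
def verdict_is_solid(verdict, final_hash, rubric_line_ids):
    """True only for a receipt-backed SOLID bound to the final brick hash."""
    if verdict.get("verdict") != "SOLID":
        return False
    if verdict.get("brick_hash") != final_hash:
        return False  # missing or stale hash
    lines = verdict.get("rubric_lines", {})
    if set(lines) != set(rubric_line_ids):
        return False  # assumption: every rubric line must be cited
    if not all(passed is True for passed in lines.values()):
        return False
    if not verdict.get("fields_inspected"):
        return False
    return True


# Placeholder values, for illustration only.
rubric_ids = ["RR1", "RR2"]
final_hash = "abc123"
good = {
    "reviewer": "Proof", "verdict": "SOLID", "brick_hash": "abc123",
    "rubric_lines": {"RR1": True, "RR2": True},
    "fields_inspected": ["deliverables[0].acceptance"],
}
bare = {"reviewer": "Cipher", "verdict": "SOLID"}
stale = {**good, "brick_hash": "old999"}
```

With these inputs, `good` counts as SOLID while `bare` and `stale` do not.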

[AI · Verdict] Convergence Record (convergence_record.json) for each brick, the signed exit receipt: frozen rubric hash, GapLog state, the verdicts of exactly these non-author reviewers (Verdict + Proof + Cipher (red-team)) bound to the FINAL hash, and the independence/COI attestation.

Acceptance: Record is signed by Verdict and asserts each as a checkable true: frozen rubric exists (hash); GapLog open sev>=major == 0; EACH of the named reviewers (Verdict + Proof + Cipher (red-team)), and NO phantom reviewer the brick did not produce, has a receipt-backed verdict bound to the brick's FINAL hash; independence attestation confirms no reviewer authored/co-authored the artifact and the red-teamer (where present) != the adjudicator. DONE only when every box is true; if still open after the 3-iteration cap, the loop TERMINATES in a logged DESIGN-BLOCKED escalation to Atlas (not published), never a softened verdict.

Questions the agent asks (3)

For each gap tagged needs-human-info: what is the sponsor's answer? (logged + flagged assumption until answered — never blocks)
Which brick's acceptance criterion is not yet observable/measurable/oracle-backed?
Has the sponsor provided new information that re-opens a closed assumption?

Do (5)

Freeze + hash the rubric before any verdict.
Require receipt-backed verdicts; reject bare 'SOLID'.
Make convergence a mechanical function of GapLog state + verdicts vs ONE final hash, per brick.
Promote every red-team finding and failing AC into the GapLog.
Enforce non-authorship: Forge (author) may not sit on the panel.

Don't (5)

Don't insert a human approval step.
Don't let Forge review its own brick.
Don't accept a stale-hash verdict.
Don't downgrade severity to force convergence.
Don't acknowledge-and-ignore a red-team finding.

Guardrails (9)

INDEPENDENCE INVARIANT: no agent reviews its own work; a reviewer may not have authored or co-authored the artifact under review or a parent it derives from; the red-teamer is distinct from the convergence adjudicator (Verdict). Attested in the convergence record.
Frozen rubric: derive the review rubric from the Objectives Rubric, hash it, and record the hash BEFORE any verdict is cast; every line is a yes/no check with a stated pass condition.
Receipt-backed verdicts: a SOLID verdict must cite the brick version+hash, the rubric line ids checked with per-line pass, and the specific section/ids inspected; bare 'SOLID' is rejected; every verdict's hash matches the single final brick hash.
GapLog governance: every gap (incl. every red-team finding) is a first-class entry with reviewer-owned severity and recorded change history; only the raising reviewer or Verdict may change severity; convergence-by-downgrade is prohibited; no silent dismissals (close = fix or recorded rationale-for-no-action).
Delta + regression re-review: on each new version re-review changed sections AND re-confirm prior SOLID verdicts still hold against the new hash.
NEVER-STALL INVARIANT: a gap needing human-only info produces BOTH an open question to the sponsor AND an explicit flagged assumption the team proceeds on; the loop converges on the resolvable remainder and never stalls into a de-facto human gate.
Mechanical exit: the loop exits ONLY when the Convergence Record's binary checklist is fully true (rubric frozen; open sev>=major gaps == 0; EACH named non-author reviewer the brick actually produces a verdict-deliverable for is SOLID vs the FINAL hash; independence/COI attested). No phantom reviewer: the asserted reviewer set equals the deliverables present.
LOOP-CONTROL INVARIANT: a hard cap of 3 re-review iterations. If open sev>=major gaps remain after the 3rd iteration, the loop does NOT soften verdicts or wave anything through — it TERMINATES in a logged DESIGN-BLOCKED outcome escalated to Atlas with the open GapLog attached (the design is rejected and NOT published). DESIGN-BLOCKED is a terminal state, not a human approval gate; the run ends without a publish.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
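The LOOP-CONTROL INVARIANT can be sketched as a bounded loop. This is a simplification: it treats "converged" as "no open sev>=major gaps", whereas the real exit also requires the full Convergence Record checklist. The `review` and `fix` callables are hypothetical stand-ins for the panel and for Forge, and the cap is read here as one initial review plus at most three re-reviews, which is an assumption.

```python
MAX_ITERATIONS = 3  # hard cap on re-review iterations


def run_review_loop(review, fix, contract):
    """review(contract) -> open sev>=major gap ids; fix(contract, gaps) -> revised contract.

    Returns (outcome, re_review_iterations_used, remaining_gaps).
    """
    gaps = review(contract)  # initial review
    iterations = 0
    while gaps and iterations < MAX_ITERATIONS:
        contract = fix(contract, gaps)
        gaps = review(contract)  # delta + regression re-review
        iterations += 1
    # No softening: anything still open after the cap is DESIGN-BLOCKED.
    outcome = "CONVERGED" if not gaps else "DESIGN-BLOCKED"
    return outcome, iterations, gaps


# Toy stand-ins: the "contract" just lists its open gaps, and each fix closes one.
def toy_review(contract):
    return list(contract["open"])


def toy_fix(contract, gaps):
    return {"open": contract["open"][1:]}


quick = run_review_loop(toy_review, toy_fix, {"open": ["G1", "G2"]})
stuck = run_review_loop(toy_review, toy_fix, {"open": ["G1", "G2", "G3", "G4", "G5"]})
# quick == ("CONVERGED", 2, [])
# stuck == ("DESIGN-BLOCKED", 3, ["G4", "G5"])
```

In the blocked case the remaining gap ids are returned so the open GapLog can be attached to the escalation.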

Independent Review & Iterate: WHOLE Design (integration)

AI Agent: Verdict

Review the ASSEMBLED process for integration, not just brick-by-brick: no orphan handoffs (every output consumed, every input sourced), loop/branch/exception paths defined end-to-end, schema-lint clean, 0 unintended human gates, and alignment with the platform moat (contract-per-step, receipt honesty) and the 3-pillar vision. Atlas reviews against the moat/vision; Verdict runs convergence. Iterate to SOLID.

Deliverables

[AI · Verdict] Integration review rubric + flow-integrity matrix: every brick output mapped to a consumer and every input to a source; loops/branches/exceptions enumerated with defined paths.

Acceptance: Matrix shows zero orphan handoffs and zero undefined exception paths; any orphan/undefined path is a GapLog entry at sev>=major; convergence blocked while any remains.

[AI · Atlas] Moat & vision review note: design honors contract-per-step + receipt honesty and the 3-pillar BPO vision; no over-claim, no human gate smuggled in.

Acceptance: Note cites specific bricks; any moat/vision violation is a GapLog entry; signed by Atlas as non-author.

[AI · Verdict] Schema-lint result over the assembled template (one objective/brick, >=1 testable acceptance/brick, valid owners/agents, no parallel-schema registers).

Acceptance: Lint passes with zero errors; any error is a GapLog entry blocking convergence.

[AI · Verdict] Convergence Record (convergence_record.json) for the whole design — the signed exit receipt: frozen rubric hash, GapLog state, the verdicts of exactly these non-author reviewers (Verdict + Atlas) bound to the FINAL hash, and the independence/COI attestation.

Acceptance: Record is signed by Verdict and asserts each as a checkable true: frozen rubric exists (hash); GapLog open sev>=major == 0; EACH of the named reviewers (Verdict + Atlas), and NO phantom reviewer the design did not produce, has a receipt-backed verdict bound to the FINAL whole-design hash; independence attestation confirms no reviewer authored/co-authored the artifact and the red-teamer (where present) != the adjudicator. DONE only when every box is true; if still open after the 3-iteration cap, the loop TERMINATES in a logged DESIGN-BLOCKED escalation to Atlas (not published), never a softened verdict.

Questions the agent asks (3)

Is any brick output consumed by no one, or any input produced by no one?
Does every loop/branch/exception have a defined path (failure, rejection, go-back-a-step)?
Does any brick smuggle in a human approval gate the design shouldn't have?

Do (3)

Trace every handoff end-to-end before declaring convergence.
Run schema-lint and treat every error as a blocking gap.
Have Atlas check the moat + vision as a non-author reviewer.

Don't (3)

Don't treat unresolved loops/branches as 'good enough'.
Don't pass with a known orphan handoff.
Don't let the design re-introduce a human gate.

Guardrails (9)

INDEPENDENCE INVARIANT: no agent reviews its own work; a reviewer may not have authored or co-authored the artifact under review or a parent it derives from; the red-teamer is distinct from the convergence adjudicator (Verdict). Attested in the convergence record.
Frozen rubric: derive the review rubric from the Objectives Rubric, hash it, and record the hash BEFORE any verdict is cast; every line is a yes/no check with a stated pass condition.
Receipt-backed verdicts: a SOLID verdict must cite the assembled design version+hash, the rubric line ids checked with per-line pass, and the specific section/ids inspected; bare 'SOLID' is rejected; every verdict's hash matches the single final assembled design hash.
GapLog governance: every gap (incl. every red-team finding) is a first-class entry with reviewer-owned severity and recorded change history; only the raising reviewer or Verdict may change severity; convergence-by-downgrade is prohibited; no silent dismissals (close = fix or recorded rationale-for-no-action).
Delta + regression re-review: on each new version re-review changed sections AND re-confirm prior SOLID verdicts still hold against the new hash.
NEVER-STALL INVARIANT: a gap needing human-only info produces BOTH an open question to the sponsor AND an explicit flagged assumption the team proceeds on; the loop converges on the resolvable remainder and never stalls into a de-facto human gate.
Mechanical exit: the loop exits ONLY when the Convergence Record's binary checklist is fully true (rubric frozen; open sev>=major gaps == 0; EACH named non-author reviewer the design actually produces a verdict-deliverable for is SOLID vs the FINAL hash; independence/COI attested). No phantom reviewer: the asserted reviewer set equals the deliverables present.
LOOP-CONTROL INVARIANT: a hard cap of 3 re-review iterations. If open sev>=major gaps remain after the 3rd iteration, the loop does NOT soften verdicts or wave anything through — it TERMINATES in a logged DESIGN-BLOCKED outcome escalated to Atlas with the open GapLog attached (the design is rejected and NOT published). DESIGN-BLOCKED is a terminal state, not a human approval gate; the run ends without a publish.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.

Finalize: Version, Changelog & Dry-Run Acceptance Criteria

AI Agent: Forge

Freeze the converged design for the dry-run: stamp a semantic version, write a changelog, record the Source-of-Truth manifest (artifact hashes, register ids+versions+row counts, rubric hash), and write the binary DRY-RUN ACCEPTANCE CRITERIA a clean run-through must satisfy.

Deliverables

[AI · Forge] Finalized design package: semantic version (1.0 first), changelog, Source-of-Truth manifest with hashes.

Acceptance: Version + changelog present; manifest lists every artifact with its hash and every register id+version+row count; the manifest's whole-design hash EQUALS the hash the whole-design Convergence Record signed (any drift re-opens the whole-design loop); matches the converged review records.
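The drift check between the manifest and the signed Convergence Record can be sketched as below. How the platform actually derives the whole-design hash is not stated; combining the per-artifact hashes in sorted-name order with SHA-256 is an assumption, as are the `artifacts` and `final_hash` field names and the placeholder hash strings.

```python
import hashlib


def whole_design_hash(artifact_hashes: dict[str, str]) -> str:
    """Combine per-artifact hashes into one digest, independent of dict order."""
    h = hashlib.sha256()
    for name in sorted(artifact_hashes):
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(artifact_hashes[name].encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def manifest_drifted(manifest: dict, convergence_record: dict) -> bool:
    """True if the manifest no longer matches the hash the review signed."""
    return whole_design_hash(manifest["artifacts"]) != convergence_record["final_hash"]


# Placeholder hashes, for illustration only.
artifacts = {"charter.json": "aaa111", "objectives_rubric.v1.json": "bbb222"}
record = {"final_hash": whole_design_hash(artifacts)}
manifest = {"version": "1.0.0", "artifacts": dict(artifacts)}

clean = manifest_drifted(manifest, record)  # False: nothing changed
manifest["artifacts"]["charter.json"] = "ccc333"  # a post-convergence edit
drifted = manifest_drifted(manifest, record)  # True: whole-design loop re-opens
```

A `True` result corresponds to the "any drift re-opens the whole-design loop" rule above.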

[AI · Verdict] Dry-Run Acceptance Criteria (dryrun_criteria.json): binary checklist — every brick completes with an acceptance-passing deliverable; zero stalls; zero improvisation (no claim without a receipt); zero unbounded loops; the AI-as-human supplied only sponsor-appropriate input.

Acceptance: Every criterion is a yes/no with a measurable check; the set covers completion, acceptance, stalls, improvisation, and loop-bounding.
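To show what "binary and measurable" could mean in practice, the sketch below evaluates one yes/no result per criterion from a run telemetry record. The telemetry shape (per-brick `completed`, `acceptance_passed`, `stalls`, `claims_without_receipt`, `loop_iterations`, `loop_cap`, plus a list of `proxy_actions`) is entirely hypothetical.

```python
def evaluate_dry_run(telemetry):
    """Map each dry-run criterion to a boolean computed from telemetry."""
    bricks = telemetry["bricks"]
    return {
        "all_bricks_completed": all(b["completed"] for b in bricks),
        "all_deliverables_pass_acceptance": all(b["acceptance_passed"] for b in bricks),
        "zero_stalls": sum(b["stalls"] for b in bricks) == 0,
        "zero_improvisation": all(b["claims_without_receipt"] == 0 for b in bricks),
        "zero_unbounded_loops": all(b["loop_iterations"] <= b["loop_cap"] for b in bricks),
        "proxy_input_sponsor_appropriate": all(
            a["role"] == "sponsor" for a in telemetry["proxy_actions"]
        ),
    }


def dry_run_passed(telemetry):
    return all(evaluate_dry_run(telemetry).values())


# Made-up telemetry for a two-brick run with one stall.
telemetry = {
    "bricks": [
        {"id": "intake", "completed": True, "acceptance_passed": True,
         "stalls": 0, "claims_without_receipt": 0, "loop_iterations": 0, "loop_cap": 3},
        {"id": "review", "completed": True, "acceptance_passed": True,
         "stalls": 1, "claims_without_receipt": 0, "loop_iterations": 2, "loop_cap": 3},
    ],
    "proxy_actions": [{"role": "sponsor", "action": "answered open question"}],
}
results = evaluate_dry_run(telemetry)
```

Here every criterion is true except `zero_stalls`, so `dry_run_passed(telemetry)` is false and the failing criterion is identifiable by name.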

Questions the agent asks (3)

Is this a first publish (1.0) or a later version?
What goes in the changelog?
What must a clean dry-run demonstrate, criterion by criterion?

Do (3)

Stamp a semantic version and write a changelog.
Record the Source-of-Truth manifest with hashes.
Write binary dry-run acceptance criteria — the test the run must pass.

Don't (3)

Don't finalize a design whose review loops haven't converged.
Don't leave the dry-run criteria as prose.
Don't skip the manifest.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
Finalize only after both review loops recorded receipt-backed convergence.
Dry-run criteria must be binary and measurable, not prose.

Provision the Design into the Platform (staging)

AI Agent: Pulse

Pulse instantiates the finalized design in the platform via the receipt-backed authoring capabilities — create_process, add_step, set_step_contract, assign_agent_to_step, and create_agent for any new specialist — in a STAGING posture: attachable to a flow but isolated from production, so it can be run end-to-end as a dry-run. Every authoring action writes a ledger receipt.

Deliverables

[AI · Pulse] Provisioned process in staging: created via authoring capabilities (never hand-edited DB), all bricks + contracts + agent assignments present, any new specialist agent created and onboarded, each action's ledger receipt recorded.

Acceptance: The staged process matches the finalized design brick-for-brick (verified by hash/diff); every authoring action has a ledger receipt; any newly created agent has a DB user + key; the process is runnable but not yet published to production.

Questions the agent asks (2)

Does any brick require a NEW specialist agent that must be created + onboarded first?
Is the staged process isolated from any production Flows?

Do (4)

Use create_process/add_step/set_step_contract/assign_agent_to_step (never hand-edit the DB).
Create + onboard any new specialist agent Forge specified.
Verify the staged process matches the finalized design by hash/diff.
Record a ledger receipt for every authoring action.

Don't (3)

Don't hand-edit the database.
Don't publish to production here — this is staging.
Don't claim a brick/agent exists without its authoring receipt.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
Authoring only through receipt-backed capabilities; never hand-edit the DB.
Staging isolation: the dry-run must not touch a production Flow.

AI-as-Human: Initiate the Run as the Sponsor

AI Agent: Proxy

A NON-author AI — the Proxy (Sponsor Proxy) — plays the human: it STARTS a real process run on the staged design, supplies the brief + intake exactly as a human sponsor would, and thereafter ONLY observes and performs human-assigned actions when a brick requests human-only input. This is the literal 'AI acting as human to initiate the run and take the actions assigned to humans, then observe.' Proxy must not be an author of the design (independence).

Deliverables

[AI · Proxy] Initiated dry-run: a started ProcessInstance on the staged design with the sponsor brief + intake supplied, and a Sponsor-Proxy log recording every human-role action taken (inputs supplied, questions answered) vs observed.

Acceptance: A run exists in_progress on the staged design; the brief+intake were supplied by Proxy acting as sponsor; the proxy log distinguishes human-assigned actions taken from passive observation; Proxy is attested non-author of the design.

Questions the agent asks (2)

(as sponsor) What is the brief + intake this run starts from?
(when a brick asks) What is the sponsor's answer to this open question?

Do (3)

Start a real run on the staged design and supply the brief as a human would.
Answer only the questions a sponsor would answer; take only human-assigned actions.
Log every human-role action vs observation.

Don't (3)

Don't do the agents' work — only the human's.
Don't approve or gate anything — the sponsor provides input, never approves.
Don't let an author of the design play the sponsor (COI).

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
INDEPENDENCE INVARIANT: no agent reviews its own work; a reviewer may not have authored or co-authored the artifact under review or a parent it derives from; the red-teamer is distinct from the convergence adjudicator (Verdict). Attested in the convergence record.
Proxy acts strictly in the human-sponsor capacity: provides input + takes human-assigned actions, never performs an agent's brick or approves a gate.
Proxy must be a non-author of the design under test.
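
One way to keep the Sponsor-Proxy log honest is to make the two allowed entry kinds explicit and to reject anything else at write time. The sketch below assumes a hypothetical in-memory log. The class names, the kind values "action" and "observation", and the design_authors field are illustrative, not part of the platform.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProxyLogEntry:
    brick: str
    kind: str    # "action" = human-assigned action taken; "observation" = passive
    detail: str


@dataclass
class SponsorProxyLog:
    proxy: str
    design_authors: frozenset
    entries: list = field(default_factory=list)

    def __post_init__(self):
        # Independence: the proxy may not be an author of the design under test.
        if self.proxy in self.design_authors:
            raise ValueError(f"{self.proxy} authored the design and cannot play the sponsor")

    def record(self, brick: str, kind: str, detail: str) -> None:
        # Only two kinds exist; there is deliberately no "approval" kind.
        if kind not in ("action", "observation"):
            raise ValueError(f"unsupported log kind: {kind!r}")
        self.entries.append(ProxyLogEntry(brick, kind, detail))

    def actions(self) -> list:
        return [e for e in self.entries if e.kind == "action"]

    def observations(self) -> list:
        return [e for e in self.entries if e.kind == "observation"]


log = SponsorProxyLog(proxy="Proxy", design_authors=frozenset({"Forge"}))
log.record("Intake", "action", "supplied brief + intake")
log.record("Intake", "action", "answered open question about scope")
log.record("Draft", "observation", "brick completed without sponsor input")
print(len(log.actions()), "actions,", len(log.observations()), "observations")
```

Because an approval is not a representable entry kind, a gate cannot slip into the log unnoticed; the attempt fails when it is recorded.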

Full Run-Through: Agents Execute the Whole Process End-to-End

AI Agent: Proxy

The process's own assigned agents execute the ENTIRE run end-to-end autonomously (the executor auto-advances; the design's own review-loop bricks gate, not a human; Proxy observes + supplies sponsor input). Every brick must produce a deliverable that passes its own acceptance criterion. This is the AI running through the whole process, start to finish.

Deliverables

[AI · run agents] Completed dry-run: every brick executed by its assigned agent, producing the brick's deliverable; a run telemetry record (per-brick completion, acceptance pass/fail, any stall/improvisation/loop-cap event).

Acceptance: The run reached its terminal brick; per-brick telemetry exists for every brick; each brick's deliverable is recorded so the dry-run review can score it against acceptance.

Questions the agent asks (1)

(Proxy, as sponsor) Did any brick raise an open question during the run?

Do (3)

Let the assigned agents run every brick end-to-end; do not shortcut the run.
Capture per-brick telemetry: completion, acceptance pass/fail, stalls, improvisation, loop caps.
Let the design's review-loop bricks gate — not a human.

Don't (3)

Don't hand-complete bricks to make the run pass.
Don't suppress a stall or failure — it's the signal the dry-run exists to surface.
Don't publish off an incomplete run.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
The run-through is real: agents produce real deliverables; a claimed completion without a deliverable+receipt is itself a defect.
Telemetry must record stalls/improvisation/loop-caps honestly — these are the defects to fix.
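
The per-brick telemetry described above could be captured in a record like the one sketched here. The field names and the defect wording are assumptions made for illustration. The document only fixes what must be recorded (completion, acceptance pass/fail, stalls, improvisation, loop caps), not how.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrickTelemetry:
    brick: str
    completed: bool
    acceptance_passed: bool
    deliverable_hash: Optional[str] = None  # receipt for the brick's deliverable
    stalled: bool = False
    improvised: bool = False
    loop_cap_hit: bool = False


def defects(record: BrickTelemetry) -> list:
    """Translate one telemetry record into the defects it reveals."""
    found = []
    if not record.completed:
        found.append("did not complete")
    elif not record.deliverable_hash:
        # A claimed completion without a deliverable + receipt is itself a defect.
        found.append("claimed completion without a deliverable receipt")
    if not record.acceptance_passed:
        found.append("failed acceptance")
    if record.stalled:
        found.append("stalled")
    if record.improvised:
        found.append("improvised")
    if record.loop_cap_hit:
        found.append("hit loop cap")
    return found


def run_report(expected_bricks: list, telemetry: list) -> dict:
    """Map every expected brick to its defects; missing telemetry is a defect too."""
    by_brick = {t.brick: t for t in telemetry}
    report = {}
    for brick in expected_bricks:
        if brick in by_brick:
            report[brick] = defects(by_brick[brick])
        else:
            report[brick] = ["no telemetry recorded"]
    return report


telemetry = [
    BrickTelemetry("Intake", True, True, deliverable_hash="sha256:aaa"),
    BrickTelemetry("Draft", True, True),  # completion claimed, no receipt
    BrickTelemetry("Review", False, False, stalled=True),
]
report = run_report(["Intake", "Draft", "Review", "Publish"], telemetry)
for brick, found in report.items():
    print(brick, "->", found or "clean")
```

Driving the report from the list of expected bricks, rather than from whatever telemetry happens to exist, is what surfaces a brick that never ran at all.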

Independent Review & Iterate: the Dry-Run (acceptance test)

AI Agent: Verdict

Verdict independently reviews the dry-run against the finalized Dry-Run Acceptance Criteria: did every brick complete with an acceptance-passing deliverable? Where did an agent stall, improvise (claim without a receipt), loop unbounded, or fail acceptance? Each defect is a GapLog entry tied to the offending BRICK; Forge/Pulse fix the brick CONTRACT (not the run), then re-provision and RE-RUN until the run completes cleanly and Verdict signs a receipt-backed SOLID, or until the 3-iteration cap ends the loop as DESIGN-BLOCKED. The dry-run is the design's acceptance test.

Deliverables

[AI · Verdict] Dry-run scorecard vs the acceptance criteria: per-brick completion + acceptance pass/fail, stall/improvisation/loop-cap findings, each failing item a GapLog entry tied to a brick.

Acceptance: Every acceptance criterion scored yes/no with cited run evidence; every failure is a GapLog entry naming the brick to fix.

[AI · Forge/Pulse] Brick-contract fixes + re-provision + re-run record for each defect.

Acceptance: Each dry-run defect maps to a brick-contract change (not a run hack); the design was re-provisioned and re-run; the re-run telemetry is attached.

[AI · Verdict] Convergence Record (convergence_record.json) for the dry-run — the signed exit receipt: frozen rubric hash, GapLog state, the verdicts of exactly these non-author reviewers (Verdict) bound to the FINAL hash, and the independence/COI attestation.

Acceptance: Record is signed by Verdict and asserts each of the following as a checkable true/false item: frozen rubric exists (hash); GapLog open sev>=major == 0; EACH of the named reviewers (Verdict), and NO phantom reviewer the brick did not produce, has a receipt-backed verdict bound to the FINAL dry-run hash; independence attestation confirms no reviewer authored/co-authored the artifact and the red-teamer (where present) != the adjudicator. DONE only when every item is true; if gaps are still open after the 3-iteration cap, the loop TERMINATES in a logged DESIGN-BLOCKED escalation to Atlas (not published), never a softened verdict.
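
The Convergence Record's binary checklist lends itself to a mechanical check. The sketch below is one possible shape, under stated assumptions: the JSON layout, the field names, and the severity scale minor < major < critical are all hypothetical, since the document names convergence_record.json but does not specify its schema.

```python
# Assumed severity ordering; the document only states the threshold "sev>=major".
SEVERITY_RANK = {"minor": 1, "major": 2, "critical": 3}


def convergence_checks(record: dict) -> dict:
    """Evaluate each checklist item independently so a failure names its cause."""
    reviewers = set(record["named_reviewers"])
    verdicts = record["verdicts"]
    final_hash = record["final_hash"]
    open_major = [
        g for g in record["gaplog"]
        if g["status"] == "open"
        and SEVERITY_RANK[g["severity"]] >= SEVERITY_RANK["major"]
    ]
    return {
        "rubric_frozen": bool(record.get("rubric_hash")),
        "no_open_major_gaps": not open_major,
        # No phantom reviewer: the asserted set equals the verdicts present.
        "reviewer_set_matches_verdicts": bool(reviewers) and reviewers == set(verdicts),
        "all_solid_on_final_hash": bool(verdicts) and all(
            v["verdict"] == "SOLID"
            and v["artifact_hash"] == final_hash
            and bool(v["rubric_line_ids"])
            for v in verdicts.values()
        ),
        "independence_attested": (
            reviewers.isdisjoint(record["artifact_authors"])
            and record.get("red_teamer") != record["adjudicator"]
        ),
    }


record = {
    "rubric_hash": "sha256:rubric",
    "final_hash": "sha256:final",
    "named_reviewers": ["Verdict"],
    "artifact_authors": ["Forge", "Pulse"],
    "adjudicator": "Verdict",
    "red_teamer": None,  # no red-teamer in this brick
    "gaplog": [
        {"id": "G-1", "severity": "minor", "status": "open"},
        {"id": "G-2", "severity": "major", "status": "closed"},
    ],
    "verdicts": {
        "Verdict": {
            "verdict": "SOLID",
            "artifact_hash": "sha256:final",
            "rubric_line_ids": ["R1", "R2"],
        }
    },
}

checks = convergence_checks(record)
print(checks)
print("DONE" if all(checks.values()) else "NOT DONE")
```

Returning the individual checks instead of a single boolean means a failed convergence can be reported with the exact item that blocked it.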

Questions the agent asks (3)

Which brick failed its acceptance, and what contract change fixes it?
Did any brick stall on a human instead of logging an open question + flagged assumption?
Did any agent claim completion without a receipt (improvisation)?

Do (3)

Score the run mechanically against the dry-run acceptance criteria with cited evidence.
Tie every defect to a brick and fix the CONTRACT, then re-provision + re-run.
Converge only when a clean run-through passes every criterion.

Don't (3)

Don't fix the run by hand to make it pass — fix the design.
Don't publish on a dry-run with open acceptance failures.
Don't soften a criterion to terminate.

Guardrails (9)

INDEPENDENCE INVARIANT: no agent reviews its own work; a reviewer may not have authored or co-authored the artifact under review or a parent it derives from; the red-teamer is distinct from the convergence adjudicator (Verdict). Attested in the convergence record.
Frozen rubric: derive the review rubric from the Objectives Rubric, hash it, and record the hash BEFORE any verdict is cast; every line is a yes/no check with a stated pass condition.
Receipt-backed verdicts: a SOLID verdict must cite the dry-run version+hash, the rubric line ids checked with per-line pass, and the specific section/ids inspected; bare 'SOLID' is rejected; every verdict's hash matches the single final dry-run hash.
GapLog governance: every gap (incl. every red-team finding) is a first-class entry with reviewer-owned severity and recorded change history; only the raising reviewer or Verdict may change severity; convergence-by-downgrade is prohibited; no silent dismissals (close = fix or recorded rationale-for-no-action).
Delta + regression re-review: on each new version re-review changed sections AND re-confirm prior SOLID verdicts still hold against the new hash.
NEVER-STALL INVARIANT: a gap needing human-only info produces BOTH an open question to the sponsor AND an explicit flagged assumption the team proceeds on; the loop converges on the resolvable remainder and never stalls into a de-facto human gate.
Mechanical exit: the loop exits ONLY when the Convergence Record's binary checklist is fully true (rubric frozen; open sev>=major gaps == 0; EACH named non-author reviewer for whom the brick actually produces a verdict deliverable is SOLID vs the FINAL hash; independence/COI attested). No phantom reviewer: the asserted reviewer set equals the set of verdict deliverables present.
LOOP-CONTROL INVARIANT: a hard cap of 3 re-review iterations. If open sev>=major gaps remain after the 3rd iteration, the loop does NOT soften verdicts or wave anything through — it TERMINATES in a logged DESIGN-BLOCKED outcome escalated to Atlas with the open GapLog attached (the design is rejected and NOT published). DESIGN-BLOCKED is a terminal state, not a human approval gate; the run ends without a publish.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
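
The loop-control invariant above can be sketched as a bounded loop with exactly two terminal outcomes. This is a simplified illustration: the review_once callback and the result dictionary are hypothetical, the exit test is reduced to the open sev>=major gap count, and it is an assumption here that the first review pass counts as iteration 1.

```python
MAX_ITERATIONS = 3  # the hard cap stated in the loop-control invariant


def run_review_loop(review_once, max_iterations: int = MAX_ITERATIONS) -> dict:
    """Run review iterations until no major gaps remain or the cap is reached.

    review_once(iteration) is a hypothetical callback that performs one
    fix + re-provision + re-run + re-review pass and returns the number of
    open sev>=major gaps that remain afterwards.
    """
    history = []
    for iteration in range(1, max_iterations + 1):
        open_major = review_once(iteration)
        history.append(open_major)
        if open_major == 0:
            return {"outcome": "CONVERGED", "iterations": iteration, "history": history}
    # Cap reached with gaps still open: terminate, never soften a verdict.
    return {
        "outcome": "DESIGN-BLOCKED",
        "escalate_to": "Atlas",
        "published": False,
        "iterations": max_iterations,
        "history": history,
    }


# A design that converges on the second pass, and one that never does.
converging = run_review_loop(lambda i: {1: 2, 2: 0}.get(i, 0))
blocked = run_review_loop(lambda i: 1)
print(converging)
print(blocked)
```

Note that DESIGN-BLOCKED is returned as data rather than raised for someone to approve: the loop ends without a publish, which matches the document's point that it is a terminal state and not a gate.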

Publish & Version the Dry-Run-Proven Process

AI Agent: Pulse

Pulse publishes the dry-run-proven process under semantic versioning with a changelog and receipts; prior versions are preserved (nothing a running Flow depends on is overwritten); status flips to active. Publication is gated by the dry-run convergence receipt, not a human approval.

Deliverables

[AI · Pulse] Published, versioned Process: semantic version + changelog, status active, prior versions preserved, publish action's ledger receipt, and a reference to the dry-run convergence record.

Acceptance: Version + changelog correct; status active; prior versions preserved; publish receipt recorded; the dry-run convergence record is referenced as the go-live evidence; process attachable to Flows.

Questions the agent asks (3)

Is this a first publish (1.0), additive (minor), or a flow change (major)?
What goes in the changelog?
Are any Flows running on a prior version to preserve?

Do (4)

Stamp the semantic version + changelog and set status active.
Preserve prior versions; later improvements are new versions.
Reference the dry-run convergence record as the go-live evidence.
Use publish_process (never hand-edit the DB).

Don't (4)

Don't publish without a converged dry-run.
Don't overwrite a version Flows are running on.
Don't skip the changelog.
Don't claim published without the publish receipt.

Guardrails (4)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.
Go-live is gated by the dry-run convergence receipt, not a human approval.
Never overwrite/silently mutate a version a Flow depends on; preserve prior versions.
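
The versioning rule (first publish, additive change, flow change) and the no-overwrite guardrail can be sketched as below. The helpers next_version and publish and the registry dictionary are hypothetical and are not the platform's publish_process capability. Writing the first publish as "1.0.0" is also an assumption, taken from the three-part MAJOR.MINOR.PATCH form of semantic versioning, where the document writes "1.0".

```python
from typing import Optional


def next_version(current: Optional[str], change: str) -> str:
    """Pick the next semantic version.

    current is None for a first publish; change is "minor" for an additive
    change or "major" for a flow change.
    """
    if current is None:
        return "1.0.0"
    major, minor, _patch = (int(part) for part in current.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    raise ValueError(f"unknown change type: {change!r}")


def publish(registry: dict, version: str, definition: dict, changelog: str) -> None:
    """Add a new version; existing entries are never overwritten or mutated."""
    if version in registry:
        raise ValueError(f"version {version} already exists and must be preserved")
    if not changelog.strip():
        raise ValueError("a changelog entry is required")
    registry[version] = {"definition": definition, "changelog": changelog, "status": "active"}


registry = {}
v1 = next_version(None, "major")
publish(registry, v1, {"bricks": 8}, "First publish.")
v2 = next_version(v1, "minor")
publish(registry, v2, {"bricks": 9}, "Added a handover brick.")
v3 = next_version(v2, "major")
print(v1, v2, v3, sorted(registry))
```

Treating publication as append-only is the simplest way to guarantee that a Flow pinned to an earlier version keeps running against exactly what it started with.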

Handover: Register Live + Telemetry to Watch

AI Agent: Forge

Register the published process as live, link the dry-run evidence, and log residual follow-ups + the telemetry to watch in production (real-run completion rates, stalls, acceptance failures) so the next improvement is data-driven via process-redesign.

Deliverables

[AI · Forge] Handover note: live registration, link to the dry-run convergence record + publish receipt, residual follow-ups, and the production telemetry to watch.

Acceptance: Note links the go-live evidence; lists residual follow-ups (if any) with owners; names the telemetry signals that would trigger a process-redesign.

[AI · Forge] Rich HTML PROCESS WALKTHROUGH (process_walkthrough.html): a branded, self-contained HTML presentation that walks the user BRICK-BY-BRICK through the process that was built — phase-grouped; each brick shows its label, objective, owner/agent, deliverables + acceptance, and key guardrails — surfaced as a tab/link in the handover for the sponsor to review.

Acceptance: A single self-contained HTML file renders every brick in order, grouped by phase, each with objective + owner/agent + deliverables; it is linked from the handover and opens standalone in a browser (no external assets required).
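
A self-contained walkthrough of this kind can be produced with nothing but string templating and inline CSS. The sketch below is illustrative only: the brick dictionary keys (phase, label, objective, agent, deliverables) and the function name render_walkthrough are assumptions, and the styling stands in for whatever branding the real artifact carries.

```python
from html import escape
from itertools import groupby


def render_walkthrough(title: str, bricks: list) -> str:
    """Render bricks in order, grouped by phase, as one standalone HTML page."""
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en"><head><meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        # Inline styles only, so the file needs no external assets.
        "<style>body{font-family:sans-serif;margin:2rem}"
        "article{border:1px solid #ccc;padding:1rem;margin:1rem 0}</style>",
        "</head><body>",
        f"<h1>{escape(title)}</h1>",
    ]
    # groupby merges consecutive bricks with the same phase, preserving run order.
    for phase, group in groupby(bricks, key=lambda b: b["phase"]):
        parts.append(f"<section><h2>{escape(phase)}</h2>")
        for brick in group:
            items = "".join(f"<li>{escape(d)}</li>" for d in brick["deliverables"])
            parts.append(
                "<article>"
                f"<h3>{escape(brick['label'])}</h3>"
                f"<p><strong>Objective:</strong> {escape(brick['objective'])}</p>"
                f"<p><strong>Owner/agent:</strong> {escape(brick['agent'])}</p>"
                f"<ul>{items}</ul>"
                "</article>"
            )
        parts.append("</section>")
    parts.append("</body></html>")
    return "\n".join(parts)


bricks = [
    {"phase": "Design", "label": "Sponsor Intake", "objective": "Freeze the brief",
     "agent": "Forge", "deliverables": ["Process Charter"]},
    {"phase": "Go-live", "label": "Publish & Version", "objective": "Publish the process",
     "agent": "Pulse", "deliverables": ["Published, versioned Process"]},
]
page = render_walkthrough("Process Walkthrough", bricks)
print(page)
```

Escaping every interpolated value keeps a brick label that contains markup characters from breaking the page, which matters when the content comes from free-text sponsor input.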

Questions the agent asks (2)

What residual follow-ups remain, and who owns them?
What production signals would tell us this process needs redesign?

Do (3)

Link the go-live evidence.
List residual follow-ups with owners.
Name the telemetry that would trigger a redesign.

Don't (2)

Don't close out without linking the dry-run evidence.
Don't leave follow-ups unowned.

Guardrails (2)

GATE-FREE INVARIANT: this brick contains no human approval/sign-off mechanism; the sponsor provides input and answers questions but never sits in an approval gate.
HONESTY-RECEIPT INVARIANT: any 'done'/'converged'/'SOLID' claim is valid only with a receipt (artifact hash, rubric hash, GapLog state, per-reviewer verdicts citing line ids); reviewers re-derive receipts rather than accept a summary. No claim without a receipt.

Generated from the live Flowtely process library · self-contained · Process Design (build a new agent-runnable process)