Bridge-Card Sign-Off Procedure — Plan v4

Date: 2026-05-12
Status: ready for pilot
Target schema version on adoption: v0d.7
Pilot scope: the 5 proposed bridge cards in wiki/.audit/weave-pass3-run3-2026-05-08.md
Supersedes: plan v3 (wiki/.audit/bridge-card-signoff-plan-2026-05-12.md)

Pilot scope clarification (front-loaded from §15). This plan specifies a sign-off procedure to test in pilot form. Sign-off does not authorize apply-mode writes; the apply-mode-discipline / bridge-card-specific-calibration gate remains a separate authorization. v0d.7 codification proceeds only after pilot success, and apply-mode authorization is a v0d.7 question, not a pilot-output question. See §15 for the full carve-out.

v4 changes from v3: integrates Reviewer Z's patch list (wiki/.audit/bridge-card-signoff-plan-v4-patch-list-2026-05-12.md), Claude's counter-pushback on Z (recorded in the patch list's Adoption record and v4 changelog below), and two further patches the maintainer added on adoption — confidence-tracking on reviewer hypotheses (§7), and loosened research-scope discipline reflecting that philosophy has no ground-truth oracle (§6). Adoption decisions are documented in the v4 changelog at the bottom.


1. Background

The wiki's bridge-card layer (introduced in schema v0d.5, 2026-05-08) stages typed cross-page relations as "proposed" audit artifacts. Each card has 10 fields plus an Approved by maintainer: line that must carry a date before apply-mode writes are authorized. Schema v0d.5 is silent on (a) who the maintainer is, and (b) what discipline sign-off requires.

This document specifies a procedure for Claude-as-maintainer sign-off that is structurally non-rubber-stamping. The core load-bearing insight: the procedure must protect against bias at three seams. First, the per-card reviewer level (math-olympiad's peer-output isolation). Second, the adjudication seam, where three adversarial reviewers independently identified main-thread adjudicator collapse as the highest-severity defect in plan v2. Third, the artifact seam, where sign-off decisions must be written onto the artifact, not into an out-of-band note that apply-mode agents might miss.

2. Architecture

2.1 Three reviewer roles + one pre-step interpretation-check + one adjudicator subagent

| Stage | Role | Stance | Method |
|---|---|---|---|
| Pre-step | I: interpretation-checker (1 subagent for the batch) | reading-pluralist | per card, generates 2-3 alternative readings (different Primary home, Bridge type, relation specification); marks trivial-vs-non-trivial; output read by A/B/C |
| Per-card | A: gate-applier | supportive structural | applies the 7-test schema gate (§3) |
| Per-card | B: adversarial-attacker | hostile / math-olympiad-style | probes against the 16-pattern failure-mode taxonomy (§4) with MP-scholarship-context briefing |
| Per-card | C: evidence-tracer | empirical | verifies anchors against extraction notes, source pages, and raw, including the General Rule 18 local-context check |
| Adjudication | D: adjudicator subagent (1 subagent for the batch) | mechanical-decisional | reads the 3 reports per card + card text + taxonomy + dependency map; produces a per-card recommendation under the quorum rule |

For a 5-card batch: 1 interpretation-checker + 2 agents per role × 3 reviewer roles + 1 adjudicator = 8 agent invocations. Each per-card reviewer agent handles 2-3 cards. Each card receives 1 interpretation map + 3 reviews (one A, one B, one C) + 1 adjudicator recommendation.
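The batch arithmetic above can be checked with a short sketch. The function and parameter names are illustrative only and are not part of the plan; the defaults simply restate the pilot figures.

```python
def count_invocations(reviewer_roles: int = 3, agents_per_role: int = 2,
                      interpretation_checkers: int = 1, adjudicators: int = 1) -> int:
    """Total agent invocations for one batch (hypothetical helper)."""
    return interpretation_checkers + reviewer_roles * agents_per_role + adjudicators


def count_reviews(cards: int = 5, reviewer_roles: int = 3) -> int:
    """Per-card reviewer reports: one A, one B, and one C review per card."""
    return cards * reviewer_roles


print(count_invocations())  # 1 + 3 * 2 + 1 = 8
print(count_reviews())      # 5 * 3 = 15
```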

All agents: Opus 4.7. Tools: Read, Grep, Glob, Write. Per-subagent write isolation is harness-enforced via a PreToolUse hook that denies Write/Edit operations outside wiki/.audit/bridge-card-signoff/run-N/<role>/<agent-K>/. Hook configuration lives in .claude/settings.json under a bridge-card-signoff profile, active for the pilot run only. Pilot prep checklist must include a smoke test confirming the hook fires — a deliberate cross-directory write attempt by a test invocation must return blocked. If hook verification fails at pilot-prep time, fall back to behavioral enforcement (prompt-scoped Write) and log the gap in adjudication.md as a known pilot weakness; do not delay pilot to wait on hook implementation. (Per v4 §2.1 patch.)
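The path rule the hook is meant to enforce can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the hook itself: it assumes repo-relative POSIX paths and the role-X/agent-K directory naming shown in §2.3, and the names AUDIT_ROOT, allowed_dir, and is_write_allowed are hypothetical. How the harness passes the target path to the hook is not modeled here.

```python
from pathlib import PurePosixPath

# Assumed root, taken from the directory layout described in this plan.
AUDIT_ROOT = PurePosixPath("wiki/.audit/bridge-card-signoff")


def allowed_dir(run: int, role: str, agent: int) -> PurePosixPath:
    """Directory a given reviewer agent may write into (hypothetical helper)."""
    return AUDIT_ROOT / f"run-{run}" / f"role-{role}" / f"agent-{agent}"


def is_write_allowed(target: str, run: int, role: str, agent: int) -> bool:
    """Return True only if target sits inside the agent's own directory."""
    path = PurePosixPath(target)
    # Reject any path containing '..' rather than trying to normalize it.
    if ".." in path.parts:
        return False
    return allowed_dir(run, role, agent) in path.parents
```

A smoke test in the spirit of the pilot prep checklist would call is_write_allowed with a peer agent's directory and expect False.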

Note on pre-step interpretation check. Adapted from math-olympiad §1 ("Interpretation check, catches 50/63 of one class of errors"). The interpretation map is adversarial context, not pro-card reasoning — it expands the question (what are the alternative readings) without answering it. A/B/C reviewers see the map alongside their card as a starter set of alternatives (see §18 for the starter-set prompt framing); their job is then to evaluate the card's stated reading against the alternatives the interpretation-checker named, AND any alternatives they themselves identify that the map missed.

2.2 Reviewer isolation — what is and is not isolated

Each subagent receives only:

  • The bridge card's text (with strip-pass applied for Roles B and C; see §3.5)
  • The interpretation map (output of the pre-step)
  • Permission to read cited anchors (silent-keys report, extraction notes, source pages, raw — with discipline; see §6)

No subagent receives:

  • The framing prose of the Pass 3 report that motivated the card
  • The silent-keys Phase 2 verdict prose
  • Other reviewers' outputs (in this run or prior runs)
  • This plan document (bridge-card-signoff-plan-v4-2026-05-12.md) or its review reports

Subagent prompt language (replacing v2's "you are the only reviewer"): "You are reviewing this bridge card independently. You have no information about whether or how this card has been reviewed elsewhere. Do not speculate about other reviews; reason from the card and the cited anchors." This is operational honesty about isolation rather than a contestable claim of singularity.

Scope of isolation — explicit. (Reframed in v4 per §2.2 patch, with Claude's pushback applied: cross-card view is taxonomy-informed methodological choice, not apologetic weakening.) Each reviewer agent handles 2-3 cards within a single session. Information about Card N is available to the reviewer when evaluating Card N+1 in the same session. This is a deliberate methodological trade: the taxonomy's pattern #12 (motifs-weight tipping) explicitly requires cross-card view, since the pattern asks whether the combined typed-connection additions across cards would tip a motif's weight class. A reviewer evaluating cards one-at-a-time in pure isolation could not fire #12; the load-bearing detector for the run3 batch's HUB-page touches (§12c) would be invisible.

The cost — weaker per-card independence within a single reviewer's session — is accepted because (a) the taxonomy's bite into the motifs-weight-tipping case depends on cross-card view, and (b) the load-bearing isolation property is reviewer-from-peer-reviewer-output, which the file-path isolation in §2.3 plus the PreToolUse hook in §2.1 enforce structurally.

The math-olympiad analogy's "dual isolation" maps onto reviewer-from-peer-reviewer-output, not onto reviewer-from-other-cards-in-the-same-session. Calling it "dual isolation" in v3 was a usage stretch; v4 calls it "peer-output isolation" and treats cross-card session view as a taxonomy-informed choice.

Subagent definitions are self-contained: the schema gate, the pattern taxonomy, and the verdict format are embedded in the prompt, NOT loaded from weave/SKILL.md. See §17 for the skill-loading discipline.

2.3 File-path isolation, not worktrees

Each agent writes to a dedicated subdirectory. No agent reads peer directories. Enforced by the PreToolUse hook described in §2.1.

wiki/.audit/bridge-card-signoff/run-N/
  interpretation-map.md                       # single output from the pre-step
  role-A/agent-1/card-1.hypothesis-pre.md     # frozen at report write
  role-A/agent-1/card-1.hypothesis-post.md    # optional, post-verdict revisions
  role-A/agent-1/card-1.report.md
  ...
  role-C/agent-2/card-5.report.md
  adjudication/per-card-recommendations.md    # output of adjudicator subagent
  adjudication.md                             # main-thread's ratification log

The two-file hypothesis-note structure (hypothesis-pre + hypothesis-post) is new in v4 — see §7 for rationale.

3. The schema gate (Role A — 7 tests)

  1. Anchor traceability. Every evidence anchor traces to silent-keys report / extraction note / source page / verified raw passage.
  2. Counterpressure has a counter-test. A specific empirical or textual check that, if executed, could falsify the bridge.
  3. Apply-as specificity. Each apply-as item names target page, target section, target relation type, and the attestations cited.
  4. Single-source promotion blocker present where applicable.
  5. Primary home selection justified, especially where the silent-keys report left it ambiguous.
  6. Counterpressure field present and specific. The card's Counterpressure field states at least one objection that names a check that could be performed (rather than gesturing at "could be wrong").
  7. Apply-as wikilinks resolve. Every wikilink in the Apply-as list resolves to an existing wiki page (per Glob over wiki/{sources,concepts,entities,questions}). Wikilinks in the Note-on-Primary-home paragraphs are checked separately and flagged if they would seed apply-mode confusion.
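Test 7 lends itself to a mechanical check. The sketch below assumes wikilinks use the [[page]] form, optionally with a |alias or #section suffix (the suffix handling is an assumption, not something this plan specifies), and that the set of existing page names has already been gathered, for example from a Glob over the four wiki directories. The function name is hypothetical.

```python
import re

# Captures the page name; an optional |alias or #section suffix is ignored.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def unresolved_wikilinks(apply_as_text: str, existing_pages: set[str]) -> list[str]:
    """Return wikilink targets in the Apply-as text that match no existing page."""
    targets = [m.group(1).strip() for m in WIKILINK.finditer(apply_as_text)]
    return sorted({t for t in targets if t not in existing_pages})
```

An empty return value means the test passes; any entries are the dangling links to report.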

3.5 Strip-bridge-prose pre-pass + anchor-string neutralization (Roles B and C only)

Per the math-olympiad "strip thinking before verifying" lever: Roles B and C receive a stripped version of the card. The strip removes reasoning trace and keeps structural content.

Strip what:

  • "What it makes visible" field (the polished claim-prose)
  • "Note on Primary home selection" paragraphs (the drafter's argument for primary-home choice)
  • Relation glosses (the prose after each relation bullet's [[page-A]] -> [[page-B]] type label)

Keep:

  • Status, Bridge type, Primary home, Secondary homes
  • Relation bullets (page → page → type label, but not the gloss)
  • Evidence anchors (the bare list — with anchor-string neutralization applied; see below)
  • Evidence status
  • Counterpressure (kept; it's adversarial content, not pro-card reasoning)
  • Counter-test
  • Apply-as items
  • Single-source promotion blocker (kept verbatim)

Role A receives the unstripped card — Role A's gate includes test #6 (Counterpressure specificity) and is checking the prose.

The strip is performed by main-thread before subagent spawn, written to a separate stripped-card file each subagent reads. The strip is mechanical (regex or section-header-based), not a judgment call.
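A minimal sketch of the field-level part of the strip, assuming the card has already been parsed into a mapping from field name to field text. The parsing step and the on-disk card syntax are not specified in this plan, so they are left out. Trimming relation glosses is also omitted, because the separator between a type label and its gloss is not defined here. STRIP_FIELDS and strip_card are hypothetical names.

```python
# Fields removed for Roles B and C, per the "Strip what" list above.
STRIP_FIELDS = {"What it makes visible", "Note on Primary home selection"}


def strip_card(fields: dict[str, str]) -> dict[str, str]:
    """Return a copy of the card without the reasoning-trace fields."""
    return {name: text for name, text in fields.items() if name not in STRIP_FIELDS}
```

Because the function only filters by field name, it makes no judgment about content, which matches the requirement that the strip be mechanical.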

Anchor-string neutralization

(New in v4 per §3.5/§5 patch, with Claude's pushback on tier-N over-strip risk applied.)

Evidence-anchor strings can encode upstream verdicts in their labels — silent-keys-2026-05-07.md §Source 2 tier-1 PASS row 3 verdict cues Roles B and C toward the upstream conclusion before they verify anything. The strip-pass neutralizes verdict tokens:

Tokens stripped from anchor strings: PASS, FAIL, verdict

Tokens kept: file path, section heading, row/line numbers, and structural labels including tier-1 / tier-2 / tier-3 / work-test. These are section-organizational labels in silent-keys reports — stripping them would risk Role C navigating to nonexistent sections. Their verdict-cue value is significantly weaker than the explicit verdict tokens; the over-strip risk on structural labels outweighs the leakage risk.

Example: silent-keys-2026-05-07.md §Source 2 tier-1 PASS row 3 verdict → silent-keys-2026-05-07.md §Source 2 tier-1 row 3

Role A receives anchor strings unstripped — consistent with §3.5's existing treatment of Role A as evaluator of the card's whole prose-and-anchors. Neutralization is mechanical (regex), not a judgment call.
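One way the neutralization regex could look, as a sketch rather than the pilot's actual implementation. It removes the three verdict tokens as whole words, matching case exactly as they are listed above (whether other casings such as "Verdict" should also be stripped is not specified and is not handled), and leaves structural labels such as tier-1 and work-test untouched. The names are hypothetical.

```python
import re

# Whole-word match on the three verdict tokens named in this section.
VERDICT_TOKENS = re.compile(r"\b(?:PASS|FAIL|verdict)\b")


def neutralize_anchor(anchor: str) -> str:
    """Strip verdict tokens from an anchor string and tidy the whitespace."""
    without_tokens = VERDICT_TOKENS.sub("", anchor)
    # Collapse the gaps left behind by removed tokens.
    return " ".join(without_tokens.split())


print(neutralize_anchor("silent-keys-2026-05-07.md §Source 2 tier-1 PASS row 3 verdict"))
# silent-keys-2026-05-07.md §Source 2 tier-1 row 3
```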

Why it matters: the "What it makes visible" field opens with polished claims; the anchor strings can carry upstream verdicts. A reviewer reading those is reading the conclusion before checking the premises. Math-olympiad's strip-thinking discipline applies to both prose and anchor labels.

4. The adversarial taxonomy (Role B — 16 patterns)

Numbered for stable reference. The 11 original patterns from v2, with pattern #1 split into #1a + #1b, plus 5 patterns added per v3 reviewer findings: 16 numbered patterns, or 17 distinct checks once the #1 split is counted.

v2-original patterns (refined)

  • #1a Lexical laundering. Secondary author's term presented as the primary author's term. Check: does any anchor cite the primary author's own text using this terminology, or only secondary commentary?
  • #1b Scaffolding promotion. Secondary author's organizing scaffolding (a term they coined to organize their reading of the primary author) presented as a stable primary-author-corpus register. Check: is the term's function in the cited secondary literature scaffolding (organizing the secondary author's reading) or attested (a stable corpus register)? Cards 1, 2, 4 in run3 are at risk.
  • #2 Single-source promotion creep. Card claims single-source with blocker but apply-as items upgrade to claim/motif moves. Check: does any apply-as item, executed as written, write a claims.md entry or a motifs.md HUB/STRUCTURAL upgrade?
  • #3 False genealogy. Typed connection asserts "is a middle term between" / "is a reformulation of" without chain attested. Check: does evidence attest the chain, or only the endpoints?
  • #4 Anchor-counterpressure inversion. Counter-test unfalsifiable, or bridge already fails it. Check: would the bridge survive its own counter-test if executed?
  • #5 Primary home arbitrariness. Home-selection rationale could equally support another home. Check: does the justification appeal to features unique to the selected home?
  • #6 Apply-as scope creep. Apply-as exceeds evidence. Check: does each apply-as item have a directly-warranting evidence anchor?
  • #7 Bridge-type conflation. Mechanism / modal-temporal / mechanism-attitudinal mismatched with evidence. Check: do anchors attest the claimed type or a different one?
  • #8 Hovering register. Vague typed-connection language papers over absent specifics. Check: is each typed connection specific enough that a maintainer could write the corresponding concept-page subsection?
  • #9 Fabricated-citation pattern (per General Rule 16 / Faul incident). Check: does every cited source / page / line exist in the wiki's source inventory and raw corpus?
  • #10 "Would settle longstanding controversy" (math-olympiad pattern #4 translated). Check: does the bridge's general form encroach on a contested interpretive question? Applied with the MP-scholarship-context briefing (§4.5).
  • #11 Too-clean-bridge / extract-the-general-lemma (math-olympiad pattern #40 translated). Check: extract the general claim; look for a counterexample in MP corpus or adjacent. If general form falsifiable, what makes THIS instance special?

Patterns added in v3

  • #12 Motifs-weight tipping. Check: would the bridge's typed-connection additions, combined with existing typed connections on the same Primary/Secondary-home pages, tip an existing motifs.md THEME entry to STRUCTURAL or STRUCTURAL to HUB? Construct the post-apply attestation graph for the relevant motif; compare to current weight class. If it tips, the bridge requires motifs-delta authorization, not bridge-card sign-off.
  • #14 MP-period miscoding. Check: If the bridge crosses MP-period boundaries (PoP 1945 / 1953-55 transition / V&I-era 1959-61 / institution-lectures 1954-55 / late notes), does the bridge name the periodization explicitly, and does the relation language acknowledge the chronology? A bridge that asserts enacts across periods without naming the periodization is mishandling the corpus. Cards 2 and 4 in run3 are at risk.
  • #15 Claim-status laundering via subsection-write carve-out. Check: does any apply-as item write content whose epistemic force matches a live or supported claim, without the claim-promotion gate being run? If yes, either downgrade the apply-as item to Open Questions framing, or require the claim-promotion gate before sign-off. Cards 1, 3, 4 in run3 are at risk.
  • #18 Anchor underdetermination. Check: For each cited anchor, construct the most parsimonious reading of what the anchor attests. Does the bridge's claimed relation require interpretive moves beyond the parsimonious reading? If yes, the bridge is anchor-underdetermined and either needs a different anchor that does the work, or is over-claiming.
  • #19 Saturation-bridge / over-glossed-phrase. Check: is the bridge's primary phrase heavily glossed in secondary literature (≥3 distinct interpretive schools deploy it as a heading-level or chapter-organizing term)? If yes, the bridge is in saturation territory and must either (a) take an explicit position in the dispute with extra counterpressure work, or (b) downgrade from "mechanism" or "structural" to "vocabulary-track" — a different and lighter bridge type. Distinct from #10: #10 catches the bridge encroaching on a dispute by taking a position; #19 catches the bridge declining to take a position when the phrase requires one.

Deferred to post-pilot

Patterns #13 (philological inversion), #16 (wiki-internal terminology collision), #17 (false-friend evasion via bridge-card route). #13 fires only on bilingual cards; #16 is mostly structural (fits Role A's gate better); #17 overlaps with #15. Add post-pilot if pilot reveals their need.

4.5 Role B prompt addition — MP-scholarship-context briefing

Role B's prompt includes a briefing paragraph naming major MP-scholarship schools and famously contested points, for use where Role B applies #10 and #19.

Briefing content (to embed in Role B subagent prompt):

MP-scholarship context. Major scholarly schools and famously contested points to factor when applying patterns #10 (longstanding controversy) and #19 (saturation-bridge):

  • Status of "flesh" / chair / Fleisch: element (Dillon) vs. ontology (Barbaras) vs. metaphor vs. structural register (Saint Aubert). Bridges that pair [[chair]] / [[flesh]] with structural-mechanism claims encroach.
  • Continuity vs. rupture between Phenomenology of Perception (1945) and The Visible and the Invisible (1959-61): continuity reading (Carbone, Vallier in some moods) vs. rupture reading (Barbaras, Toadvine). Bridges spanning PoP and V&I take an implicit position.
  • Husserl ↔ MP relationship: where MP extends vs. departs from Husserl (esp. on Lebenswelt, Stiftung, passive synthesis). Bridges that route through Husserlian terms (Sichten, Wesensschau, Stiftung) without explicit position-taking encroach.
  • Hegel ↔ MP via Hyppolite: the 1953-55 transition zone is known to involve Hegel reception. Bridges in this period have a periodization tax.
  • Late-MP and Marxism: the Adventures of the Dialectic / institution-of-the-proletariat status is contested between political-philosophical reading (Coole) and ontological reading (Caraus, Larison in M-C 2026). The "haunting / obsessive presence" register sits inside this dispute.
  • "Hyper-dialectic" as positive vs. critical move: is it MP's positive method (Lawlor) or a critique-of-dialectic (Saint Aubert)? Bridges to [[hyper-dialectic]] should name which reading they're presuming.
  • The 1953-55 transition zone is positionally load-bearing (CLAUDE.md: "is a middle term between [Earlier] and [Later]"). Bridges that ignore the transition zone in cross-period claims mishandle MP-corpus chronology.

Non-exhaustive. (Added in v4.) This briefing names disputes considered prominent at briefing time. It is not a complete map of MP-scholarship contested zones. If you identify a contested interpretive question the card encroaches on that is not in this list, fire #10 or #19 with the school named explicitly, and flag the gap in your hypothesis note's "Confidence trace" or "Convergence" fields (§7) for v0d.7 briefing-update consideration.

When Role B finds the bridge encroaches on or declines a position in any of these (or any contested zone outside the briefing) without explicit acknowledgment, fire #10 or #19 with the relevant school named.

5. Evidence-tracer protocol (Role C)

  1. Anchor labels are locators, not authority. (New in v4 per §5 patch.) Cited anchors may reference silent-keys reports, motifs.md aggregations, or audit findings that have their own verdict labels. Verdict tokens (PASS, FAIL, verdict) have been stripped before delivery to you (§3.5), but do not infer evidential weight from any surviving labels including tier-N or work-test. Verify the cited row/passage's content against extraction note, source page, and (where needed) raw. Treat anchors as pointers to text, not as endorsements.

  2. List every evidence anchor in the card.

  3. For each anchor, locate it (extraction note line, source page passage, raw file).

  4. Verify the anchor says what the card claims it says. Include a 1-paragraph philological gloss: "the passage says X; the card claims it says Y; X and Y match because Z."

  5. For multi-source cards, check whether the cited convergence is genuine or post-hoc.

  6. For single-source cards, check whether the cited single source actually carries the weight the card assigns it (i.e., the cited passage is load-bearing, not incidental).

  7. If a cited anchor is unreachable or doesn't say what's claimed: DEFECT FOUND with location.

  8. Rule 18 local-context check. For any anchor that originates in an extraction note ingested before the bridge's primary motivating source, perform a General Rule 18 local-context check: read the cited passage in raw, with paragraph context, and verify that the passage carries the cross-source weight the bridge assigns. If the older extraction note extracted a phrase that perfectly matches the bridge's claim while the surrounding context is not about the bridge's relation, the anchor is stale and the card should be flagged with Stale-Anchor concern.

6. Research scope discipline

(Substantially rewritten in v4 per the maintainer's insight: philosophy has no ground-truth oracle, so artificially capping research scope produces sign-off based on stale or under-verified anchors. The v3 hard caps are replaced with a calibrated-budget approach. The reviewer self-allocates research effort to the card's needs and self-flags when the card exceeds the depth at which sign-off review can operate.)

Reviewers may read /raw/, extraction notes, source pages, audit reports, and motifs.md / claims.md as needed to discharge their role.

Scoping discipline. The reviewer's job is sign-off review, not re-ingest. Within that job:

  1. Default expected effort: ≤ 5 raw reads per card. Most cards' anchors are short and locatable; verification fits within this envelope.
  2. Calibrated extension: if Role C's Rule 18 local-context check (§5 item 8) or Role B's #11 / #19 counter-example probes require more reads, the reviewer extends the budget with rationale logged in the hypothesis note for each additional read.
  3. Self-flag threshold: at 10+ raw reads with no convergence on a verdict, the reviewer self-flags the card with Scope-Mismatch — "this card's evidence required reading depth that approaches an ingest pass; sign-off review may be the wrong frame for this card." The self-flag is informational and does not terminate the reviewer's work; the reviewer continues to a verdict, and the flag becomes a signal for §10 batch-level diagnosis (specifically diagnostic option (d): card cohort mis-scoping).

Required logging: Every raw read is recorded in the hypothesis note: file, location, purpose, finding. Reviewers who finish under 5 reads note this explicitly ("budget not binding") rather than leaving silence — silence about the budget is silence about whether it was checked.

Disallowed: unbounded exploratory reading not tethered to a specific anchor verification, pattern probe, or Rule 18 check. The discipline is "calibrated to the card's needs," not "free-form."
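The three budget bands can be summarized in a small classifier. This is a sketch of the thresholds stated above, with hypothetical names and return labels; it does not capture the logging requirement, only which band a read count falls into.

```python
def budget_status(raw_reads: int, converged: bool) -> str:
    """Classify a reviewer's raw-read count against the calibrated budget."""
    if raw_reads <= 5:
        # Default expected effort; the note should still say "budget not binding".
        return "within-default"
    if raw_reads >= 10 and not converged:
        # Informational flag only: the reviewer still continues to a verdict.
        return "scope-mismatch-self-flag"
    # Beyond the default: each additional read needs a logged rationale.
    return "calibrated-extension"
```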

Why v4 loosened v3's hard caps. v3's 5-read cap (raised to 8 for Role C Rule 18) implicitly assumed a hypothesis-test framing in which reviewers prove or fail to prove within a fixed budget. In a domain without ground-truth oracles, artificial budget caps produce sign-off based on under-verified anchors when a card genuinely needs deeper reading. The calibrated-budget approach lets reviewer effort track card complexity while preserving terminate-or-flag discipline via the Scope-Mismatch self-flag.

7. Hypothesis-note discipline

(Updated in v4 with two patches: confidence tracking + two-file structure for trace integrity.)

Two-file structure

Each reviewer writes two hypothesis-note files per card:

  • card-N.hypothesis-pre.md — created at the start of reviewer work; frozen (by hook) when the matching card-N.report.md is written.
  • card-N.hypothesis-post.md — optional, created only if the reviewer notices, while drafting the report, that the pre-report hypothesis was incomplete or that a hypothesis-revision is warranted.

The PreToolUse hook (§2.1) enforces: once card-N.report.md exists, the matching card-N.hypothesis-pre.md is read-only for the rest of the session. The hypothesis-post.md file is freely writable. This preserves trace integrity for pre-verdict reasoning (the hypothesis-pre file is a frozen artifact of what the reviewer believed when forming the verdict) while permitting post-verdict revision (the hypothesis-post file captures revisions explicitly, without retroactively editing pre-verdict reasoning).

Hypothesis-pre file structure

# Card N — Role X — Agent K — hypothesis-pre

## Initial reading
[first impression after reading card only, before checking anchors]

## Initial confidence
[high | medium | low | uncertain] — explicit field, set BEFORE anchor checks
Why: [one-sentence rationale]

## Anchor checks
[per cited anchor: what was checked, what was found]

## Confidence trace
[per anchor check OR raw read, log a one-line update:]
- After anchor [X]: confidence now [high/medium/low/uncertain]; delta because [...]
- After raw read [Y]: confidence now [...]; delta because [...]

Confidence is allowed to move freely. A reviewer whose confidence does not move
across any anchor check is signaling that anchor checks did no work — that
itself is calibration data and surfaces at §10 batch diagnosis.

## Raw reads (if any)
[per raw read: file, location, purpose, finding. Note explicitly whether the
budget felt unbinding ("under 5 reads, no pressure") or binding ("approached
self-flag threshold; chose not to escalate because [...]"). If self-flag fired,
record under §6 Scope-Mismatch.]

## Hypothesis revisions
[as evidence accumulates: revise hypothesis, note what changed]

## What changed between initial reading and convergence
[explicit field. If nothing changed, explain why the anchor checks did not move
the impression. A reviewer consistently writing "nothing changed" across cards
is a calibration signal.]

## Final confidence
[high | medium | low | uncertain] — set BEFORE writing report
Why: [explicit reasoning, including reference to the confidence trace above]

## Convergence
[final reasoning leading to verdict]

## Verdict (preview)
[APPROVED / DEFECT FOUND / UNCLEAR-{subtype} — to be formalized in the report]

Hypothesis-post file structure (optional)

# Card N — Role X — Agent K — hypothesis-post

## Trigger
[what, during report-writing, prompted this post-verdict revision]

## Revision content
[the substance of the revision]

## Effect on verdict
[did the revision change the verdict? if so, the report should be rewritten
 within the session before final submission. if not, log the revision for §10
 batch-level diagnosis and leave the verdict standing.]

Two-tool-call enforcement and trace integrity

Two-tool-call enforcement (Stage 1: write hypothesis-pre; Stage 2: write report) is harness-enforced in v4 via the PreToolUse hook. The hook enforces the structural rule "hypothesis-pre is frozen once report exists" by blocking Edit/Write to *hypothesis-pre.md files when the matching *.report.md exists in the same directory. This replaces v3's mtime-check approach, which would have conflicted with v3 §7's allowed post-verdict revision pathway (mtime alone cannot distinguish "appended revision after verdict" from "rewrote earlier pre-verdict sections").
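The freeze rule reduces to a filename check plus an existence check. The sketch below assumes the card-N.hypothesis-pre.md / card-N.report.md naming from §2.3; the function name is hypothetical, and wiring it into the hook is outside the sketch.

```python
from pathlib import Path

PRE_SUFFIX = ".hypothesis-pre.md"


def is_pre_frozen(target: Path) -> bool:
    """True if target is a hypothesis-pre file whose matching report exists."""
    if not target.name.endswith(PRE_SUFFIX):
        # hypothesis-post and report files are never frozen by this rule.
        return False
    stem = target.name[: -len(PRE_SUFFIX)]  # e.g. "card-1"
    return (target.parent / f"{stem}.report.md").exists()
```

A hook built on this would deny the write when is_pre_frozen returns True and allow it otherwise, so hypothesis-post.md stays freely writable.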

For pilot calibration: main-thread spot-checks at least one reviewer per role per run to confirm that the hypothesis-post.md file (if present) does not contain revisions that should have been made pre-report. Operationally, main-thread reads any hypothesis-post.md and asks "could this content have been written before the report?" If the answer is yes for ≥1 reviewer per role across the batch, the pilot has revealed a procedural gap (reviewers under-using pre-report depth) that v0d.7 codification should address with stronger pre-report discipline.

8. Verdict format

# Card N — Role X — Agent K — report

**Card slug:** [card title from Pass 3 report]
**Reviewer role:** A (gate-applier) | B (adversarial-attacker) | C (evidence-tracer)
**Verdict:** APPROVED | DEFECT FOUND | UNCLEAR-{UNDER-SPECIFIED | CANNOT-DETERMINE | OUT-OF-ROLE-SCOPE}
**Confidence:** high | medium | low

## Tests applied
[Role-specific. For Role A: per-test (1-7) result. For Role B: per-pattern (1a, 1b, 2-12, 14, 15, 18, 19) probed/skipped/fired. For Role C: per anchor verified/unverified/contested.]

## If DEFECT FOUND
- Location: [card field / line]
- Pattern or test: [#N from taxonomy, or test K from gate, or "other"]
- Why it breaks: [specific reasoning]
- Fixable: yes-with-X | no-fundamental

## If UNCLEAR
- Subtype: UNDER-SPECIFIED | CANNOT-DETERMINE | OUT-OF-ROLE-SCOPE
- Reason: [why cannot determine]
- What would resolve: [e.g., re-reading anchor, raw check, re-roled reviewer, card rewrite by drafter]

## If APPROVED with high confidence
[Required sentence: explain *why* confidence is high — what evidence specifically warrants high vs. medium. Without this, "high confidence" is uncalibrated.]

## Notes
[anything else; reference hypothesis-pre and hypothesis-post (if present) notes for trace]

UNCLEAR subtype semantics:

  • UNDER-SPECIFIED: the card itself is too vague to evaluate; needs rewrite by drafter (not by reviewer or adjudicator).
  • CANNOT-DETERMINE: the evidence is ambiguous; a specific check (named in the report) would resolve.
  • OUT-OF-ROLE-SCOPE: this card's defect is for a different role's lens.

9. Quorum rule (asymmetric, 3 reviewers)

| Reviewer verdicts (A / B / C) | Decision |
|---|---|
| APPROVED / APPROVED / APPROVED | adjudicator subagent confirms sign-off |
| 2 APPROVED + 1 DEFECT (DEFECT Confidence: high) | BLOCK |
| 2 APPROVED + 1 DEFECT (DEFECT medium or low confidence) | adjudicator subagent recommendation; sign-off requires explicit rebuttal of the DEFECT's named pattern in the recommendation |
| 1 APPROVED + 2 DEFECT | BLOCK: revise or retire |
| 3 DEFECT | RETIRE (or major rewrite) |
| Any UNDER-SPECIFIED | return to drafter; do not adjudicate. Card status remains proposed with rewrite-required annotation. |
| 2-of-3 OUT-OF-ROLE-SCOPE on the same card | role assignment was wrong; re-spawn fresh reviewers with appropriate roles |
| Any CANNOT-DETERMINE without sufficient APPROVED to clear | adjudicator subagent commissions targeted check (re-read anchor, raw probe) or recommends BLOCK pending resolution |

Asymmetric because: DEFECT FOUND is a strong signal cheap to investigate; rubber-stamping is expensive to undo once apply-mode writes have landed.
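The quorum rule can be sketched as a lookup over the three verdicts. Two things here are assumptions rather than anything the rule states: the order in which rows are checked (UNCLEAR rows first), and the treatment of any CANNOT-DETERMINE as needing a targeted check, since the rule does not define how many APPROVED verdicts are "sufficient to clear". Combinations the rule does not list return an explicit not-covered label. All names and return labels are hypothetical.

```python
from collections import Counter


def quorum_decision(reviews: list[tuple[str, str]]) -> str:
    """reviews: three (verdict, confidence) pairs, one each from Roles A, B, C."""
    verdicts = Counter(v for v, _ in reviews)
    if verdicts["UNCLEAR-UNDER-SPECIFIED"]:
        return "RETURN-TO-DRAFTER"
    if verdicts["UNCLEAR-OUT-OF-ROLE-SCOPE"] >= 2:
        return "RE-SPAWN-REVIEWERS"
    approved, defects = verdicts["APPROVED"], verdicts["DEFECT FOUND"]
    if approved == 3:
        return "CONFIRM-SIGN-OFF"
    if defects == 3:
        return "RETIRE"
    if defects == 2 and approved == 1:
        return "BLOCK"
    if defects == 1 and approved == 2:
        # The asymmetry: one high-confidence DEFECT outweighs two approvals.
        defect_conf = next(c for v, c in reviews if v == "DEFECT FOUND")
        return "BLOCK" if defect_conf == "high" else "ADJUDICATE-WITH-REBUTTAL"
    if verdicts["UNCLEAR-CANNOT-DETERMINE"]:
        return "TARGETED-CHECK-OR-BLOCK"
    return "NOT-COVERED-BY-TABLE"
```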

10. Batch-level "same-gap-twice" check

(Expanded in v4 with two new triggers: Scope-Mismatch repeats and confidence-trajectory repeats. Diagnostic options expanded to four.)

After all 18 reports (15 reviewer + 3 interpretation/adjudication) are written, main-thread Claude reads them in aggregate and checks three triggers:

  • Same-pattern repeats: did 3+ cards fail the same pattern?
  • Same-scope-mismatch repeats: did 3+ cards earn Scope-Mismatch self-flags from §6?
  • Same-confidence-trajectory repeats: did 3+ cards show a reviewer whose final confidence is more than 2 levels lower than their initial confidence (e.g., started high, ended uncertain)? (New in v4; tracks calibration drift across the batch.)

Any of the three triggers requires a diagnostic pause. The diagnostic options are now four:

  • (a) The schema has a systematic bug.
  • (b) The upstream artifact (silent-keys report, motifs.md aggregation) has a systematic bug.
  • (c) The taxonomy is mis-tuned for this card cohort — pattern over-firing on cards that are actually well-formed.
  • (d) The card cohort itself is mis-scoped — multiple cards triggering Scope-Mismatch flags suggests this batch's cards exceed the depth at which sign-off review can operate; consider re-ingest or partial re-ingest of the underlying sources. (New in v4.)

Main-thread writes a diagnostic note in adjudication.md naming which of (a)/(b)/(c)/(d) the pause reflects, with rationale. Without explicit choice, do not re-run; doing so risks rewriting upstream reports that aren't broken, mis-tuning the taxonomy, or papering over a cohort mis-scoping that needs upstream attention.
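
The three triggers above are mechanical enough to sketch as a batch check. This is an illustrative sketch under stated assumptions: the per-card dictionary shape and the function name are hypothetical, and the four-level confidence scale is inferred from the "started high, ended uncertain" example rather than defined by this plan.

```python
from collections import Counter

# Assumed ordinal scale; the plan's example implies at least these four levels.
LEVELS = {"uncertain": 0, "low": 1, "medium": 2, "high": 3}


def batch_triggers(cards, threshold=3):
    """Report which batch-level triggers fired.

    cards: list of dicts with hypothetical keys
      "patterns"       -> iterable of pattern ids fired on the card
      "scope_mismatch" -> bool, any Scope-Mismatch self-flag on the card
      "confidence"     -> list of (initial, final) pairs, one per reviewer
    """
    pattern_counts = Counter()
    for card in cards:
        pattern_counts.update(set(card["patterns"]))  # count once per card
    repeated = sorted(p for p, n in pattern_counts.items() if n >= threshold)

    scope_cards = sum(1 for card in cards if card["scope_mismatch"])

    # A card counts if any reviewer dropped by more than 2 levels.
    drift_cards = sum(
        1
        for card in cards
        if any(LEVELS[initial] - LEVELS[final] > 2 for initial, final in card["confidence"])
    )

    scope_fired = scope_cards >= threshold
    drift_fired = drift_cards >= threshold
    return {
        "same_pattern": repeated,
        "same_scope_mismatch": scope_fired,
        "same_confidence_trajectory": drift_fired,
        "pause_required": bool(repeated) or scope_fired or drift_fired,
    }
```

The check only says whether a diagnostic pause is required; choosing among (a)/(b)/(c)/(d) remains a written judgment in adjudication.md.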

11. Adjudicator subagent + main-thread ratification

After all per-card reviewers return, main-thread spawns a single adjudicator subagent (Role D) with:

  • The 3 reports per card (A, B, C)
  • The card text (unstripped)
  • The interpretation map from the pre-step
  • The schema gate (§3), the taxonomy (§4), the quorum rule (§9)
  • The cross-card dependency map (§11.5)

The adjudicator subagent does NOT receive:

  • This plan document
  • The reviewer hypothesis-pre or hypothesis-post notes
  • The Pass 3 motivating prose
  • Session memory of who drafted the Pass 3 cards
  • The adversarial-plan-reviewer reports (X, Y, Z) or Claude's pushback on Z

Adjudicator's job per card:

  1. Apply the quorum rule (§9) mechanically.
  2. For 2-APPROVED + 1-DEFECT (non-high-confidence) cases, write an explicit rebuttal of the DEFECT's named pattern.
  3. Recommend: SIGN-OFF / SIGN-OFF-WITH-NARROWING / BLOCK / ESCALATE-TO-USER.
  4. Write recommendation to adjudication/per-card-recommendations.md.

Main-thread ratification — clauses

(Updated in v4 per §11 signed-apply-scope and §11↔§16 contradiction patches.)

  • SIGN-OFF → main-thread writes onto the card:

    Approved by maintainer: YYYY-MM-DD
    Signed apply scope: all items
    
  • SIGN-OFF-WITH-NARROWING → main-thread writes onto the card:

    Approved by maintainer: YYYY-MM-DD
    Signed apply scope: items [N], [M] only; see `bridge-card-signoff/run-N/adjudication.md#card-X`
    

    Apply-mode agents MUST honor the Signed apply scope: line and ignore Apply-as items not enumerated there. The line's presence (in either form) is a precondition for apply-mode; absence blocks apply.

  • BLOCK → main-thread writes block rationale; escalates to user for confirmation before retire-or-revise decision.

  • ESCALATE-TO-USER → main-thread surfaces the case to user without acting.

Main-thread authority — procedural only

Main-thread does NOT escalate on substantive disagreement with the adjudicator's verdict. Main-thread escalates (to user) only when ratification detects:

  • Invalid quorum application — adjudicator's recommendation does not follow §9 from the three reports
  • Missing Signed apply scope: line on a SIGN-OFF or SIGN-OFF-WITH-NARROWING recommendation
  • Cross-card dependency conflict surfaced by §11.5 that adjudicator did not address
  • Procedural inconsistency between adjudicator recommendation and per-card reports (e.g., adjudicator cites a pattern no Role-B report fired)

Substantive disagreement is not a procedural defect; the fresh-context adjudicator call stands. If main-thread's reading of a defect report differs from the adjudicator's, that is exactly the cognitive seam the adjudicator was introduced to close. Main-thread is forbidden from injecting its own reasoning into the adjudicator subagent's recommendation; its role is mechanical ratification of that recommendation, not adjudication of the adjudicator.

Main-thread writes adjudication.md summarizing all 5 card outcomes, with adjudicator-recommendation-plus-ratification per card.

11.5 Cross-card dependency reconciliation

After all per-card adjudications but BEFORE writing any Approved by maintainer: line, main-thread re-reads the apply-as items of every SIGN-OFF / SIGN-OFF-WITH-NARROWING card. For each apply-as item that cross-references another card:

  • If the referenced card was BLOCKED or RETIRED: narrow the apply-as item to remove the cross-reference, OR downgrade the entire approved card to held-pending-other-card-revision.
  • If the referenced card was also signed off: proceed.

Documented in adjudication.md under "Cross-card reconciliation."

Map-construction rule

(New in v4 per §11.5 patch, refined to wikilinks-only per Claude's pushback on mechanism-name regex.)

The cross-card dependency map is auto-extracted by main-thread before adjudicator spawn, via regex over each card's Apply-as items, capturing wikilinks to slugs matching other Pass-3 cards' Primary or Secondary home. Non-wikilink mechanism references (e.g., bare French verb forms like déposer) are NOT extracted by the regex; the rule is "wikilinks only" so the extraction stays mechanical. Pass-3 drafters are advised that cross-card mechanism references should be wikilinked at draft time to be captured by this rule.

The map is passed to the adjudicator subagent as part of its input. The adjudicator decides how to weight dependencies; main-thread's §11.5 reconciliation applies the adjudicator's decisions mechanically. (Removes the v3 "main-thread interprets dependencies" reading.)

Concrete example for run3: Card 5's apply-as items 2 and 3 cross-reference Card 1's déposer via wikilinks to [[institution]]. If Card 1 is BLOCKED, Card 5's items 2 and 3 are struck (Card 5 still applies item 1 on [[transtemporality]] and item 4's promotion deferral, which don't cross-reference Card 1).
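
A minimal sketch of the wikilinks-only extraction and of the narrowing option in reconciliation, assuming a hypothetical in-memory card shape (`homes` for Primary/Secondary home slugs, `apply_as` keyed by item number). The regex, the function names, and the data layout are illustrative assumptions, not the plan's implementation. Only the "narrow" branch is shown; downgrading a whole card to held-pending is not modeled.

```python
import re

# Captures the slug of [[slug]], [[slug|label]] and [[slug#anchor]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def build_dependency_map(cards):
    """Map (card_id, item_no) -> set of other card ids whose home is wikilinked.

    cards: {card_id: {"homes": set of slugs, "apply_as": {item_no: text}}}
    """
    home_owner = {}
    for card_id, card in cards.items():
        for slug in card["homes"]:
            home_owner.setdefault(slug, set()).add(card_id)

    deps = {}
    for card_id, card in cards.items():
        for item_no, text in card["apply_as"].items():
            for slug in WIKILINK.findall(text):
                # A card linking to its own home is not a cross-card dependency.
                others = home_owner.get(slug.strip(), set()) - {card_id}
                if others:
                    deps.setdefault((card_id, item_no), set()).update(others)
    return deps


def surviving_items(card_id, cards, deps, blocked):
    """Apply-as items left after striking those that depend on blocked cards."""
    return [
        item_no
        for item_no in sorted(cards[card_id]["apply_as"])
        if not (deps.get((card_id, item_no), set()) & blocked)
    ]
```

With data shaped like the run3 example, blocking Card 1 strikes Card 5's items 2 and 3 and leaves items 1 and 4. Bare mechanism names without a wikilink produce no entry in the map, which is the intended behavior of the rule.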

11a Revise-and-resubmit pathway

A BLOCKED card whose DEFECT report carries Fixable: yes-with-X may be revised in-place by main-thread once per run, then re-submitted to a fresh-context A/B/C round (not the same agents — the prior verdicts are stale).

  • Revision must address the specific DEFECT (named pattern, named location).
  • Revision is documented in adjudication.md under "Revise-and-resubmit log."
  • Re-submission spawns fresh A/B/C agents with the revised card.
  • Cap: 1 revise-and-resubmit per card per run. If the second round also BLOCKS, card retires until the next Pass 3 run.

Main-thread may NOT revise a card whose DEFECT carries Fixable: no-fundamental — that's a retire.

12. Carve-out — what sign-off authorizes

Sign-off authorizes apply-mode writes for:

  • Subsection writes on concept pages under Open Questions (always), under What the Concept Does or Stakes (only if §12a / §12c constraints are met)
  • Open-question additions (always)
  • Typed-connection additions (subject to §12d motifs-weight check)
  • Bridge-card-anchored cross-references

Sign-off does NOT authorize:

  • Promotion of bridge to a claims.md entry at supported status (the 5-test claim-promotion gate applies; supported promotions halt for human review per existing CLAUDE.md rule)
  • Promotion to motif HUB/STRUCTURAL weight
  • Deletion or retirement of existing wiki content

Authoritative scope source. (New in v4 per §12 patch.) Apply-mode agents read the card's Signed apply scope: line (§11), not the card's original Apply-as list, to determine which items are authorized. Items in the original Apply-as list but not enumerated in Signed apply scope: are not authorized. Presence of Signed apply scope: is a precondition for apply-mode; absence blocks apply.
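
The precondition can be sketched as a parser that an apply-mode agent might run before writing anything. This is a hedged illustration: the function name is hypothetical, and it assumes the narrowed form lists item numbers as digits before the word "only", which the plan's `[N], [M]` placeholders suggest but do not guarantee.

```python
import re

# The approval line must carry a real date, not the YYYY-MM-DD placeholder.
APPROVED = re.compile(r"^\s*Approved by maintainer:\s*\d{4}-\d{2}-\d{2}\s*$", re.MULTILINE)
SCOPE = re.compile(r"^\s*Signed apply scope:\s*(.+)$", re.MULTILINE)


def authorized_items(card_text, apply_as_count):
    """Return the item numbers apply-mode may write, or None if apply is blocked."""
    if not APPROVED.search(card_text):
        return None  # no dated approval: blocked
    scope = SCOPE.search(card_text)
    if not scope:
        return None  # missing scope line: blocked
    value = scope.group(1).strip()
    if value.lower().startswith("all items"):
        return list(range(1, apply_as_count + 1))
    # Narrowed form: keep only the part before "only" so that digits in the
    # trailing "see ..." pointer are not mistaken for item numbers.
    listed = value.split(" only", 1)[0]
    numbers = sorted({int(n) for n in re.findall(r"\d+", listed)})
    return [n for n in numbers if 1 <= n <= apply_as_count]
```

Returning `None` (blocked) separately from an empty list (approved, nothing authorized) keeps a missing scope line distinct from a fully narrowed one.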

12a Epistemic-force constraint on subsection writes

Single-source-blocker cards (or cards whose adjudicator-recommendation narrows them to single-source-equivalent scope) may write subsections under Open Questions only, not under What the Concept Does or Stakes. The subsection prose must use provisional framing:

  • Acceptable (provisional): "Larison reads X as Y; the claim is not yet corroborated outside M-C 2026 and awaits a second-source anchor."
  • Unacceptable (declarative): "X is Y."

Multi-source cards may write under What the Concept Does or Stakes, but the subsection prose must annotate when the cross-source convergence is secondary-author-mediated (per #1b pattern's domain).

12b Apply-as-flag vs. apply-as-write distinction

Apply-as items that flag future promotion (Possible motifs.md candidate, flag for second-source confirmation, candidate claims.md slug X) authorize the flag itself, not the future promotion. Apply-mode writes implementing such items write only the flag (e.g., a one-line entry in the relevant Open-Questions section), not the promotion.

Apply-mode agents must treat language like "Possible motifs.md candidate" as "do not write a motifs.md entry," not as "write a motifs.md entry at BRIDGE weight."

12c HUB-page apply-as escalation

Cards whose Primary home or Secondary home is a HUB-weight concept page (per motifs.md weight class) require:

  • 3/3 APPROVED for apply-as items targeting What the Concept Does or Stakes on the HUB page (no adjudication path to APPROVE these specific items)
  • Normal quorum (per §9) for apply-as items targeting Open Questions on the HUB page

HUB pages have downstream impact disproportionate to the card's local stakes. In run3: Cards 1, 2, 3, 4, 5 all touch HUB pages; the escalation applies to apply-as items targeting non-Open-Questions sections on those pages.
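
Taken together, §12a and §12c gate which page section an apply-as item may target. The following is a sketch of that gate under assumptions: the function and parameter names are hypothetical, and "quorum cleared" stands for whatever positive outcome §9 and the adjudicator produced for the card.

```python
GUARDED_SECTIONS = {"What the Concept Does", "Stakes"}


def section_write_allowed(section, *, approvals, quorum_cleared,
                          single_source_equivalent, targets_hub_page):
    """Decide whether an apply-as item may write under the given section.

    approvals: number of APPROVED verdicts among the three reviewers.
    """
    if section == "Open Questions":
        # Normal quorum applies, on HUB and non-HUB pages alike.
        return quorum_cleared
    if section not in GUARDED_SECTIONS:
        raise ValueError(f"unknown section: {section!r}")
    if single_source_equivalent:
        # Section 12a: single-source cards write under Open Questions only.
        return False
    if targets_hub_page:
        # Section 12c: unanimous approval, with no adjudication path.
        return approvals == 3
    return quorum_cleared
```

For example, a multi-source card with 2 APPROVED and a rebutted low-confidence DEFECT could still write under Open Questions on a HUB page, but not under Stakes.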

12d Motifs-weight tipping check

Per pattern #12: if Role B fires #12 on a card, the card's apply-as items affecting motifs are deferred to a motifs-delta sweep, which is its own audit phase. Bridge-card sign-off authorizes only the non-motifs-tipping apply-as items in this case.

13. Pilot success criteria

(Substantially revised in v4: planted-defect control only (no seeded-clean), criterion 7 split into 7a/7b with Claude's sharper boundary.)

Pilot SUCCESS = all of the following:

  1. Planted-defect control fires DEFECT. (Per v4 §13.1 patch, with seeded-clean control removed per Claude's pushback — wikiwip does not have ground-truth oracles for "clean card" calibration.) The pilot batch includes 1 seeded card constructed with a known pattern-#1a defect (secondary author's term presented as primary author's). If the pipeline does not fire DEFECT on this card, taxonomy is under-firing and pilot fails. The seeded card is flagged in the dependency map as seeded but the adjudicator is blinded to seeding; main-thread reveals seeding only after adjudicator recommendation is written, then compares: did A/B/C fire #1a? Did the adjudicator recommend BLOCK or DEFECT-FOUND? If both yes → pilot calibration confirms taxonomy bite. If either no → taxonomy under-fires; pilot fails this criterion.

    Note on asymmetry: only a planted-defect control is run, not a seeded-clean control. A seeded-clean control would require ground-truth that the card is in fact clean — which is exactly the philosophical-defensibility judgment the pilot was designed to make. Importing the calibration-against-oracle pattern from oracle-having domains would be domain-confusion.

  2. Every DEFECT FOUND fires a specific pattern from the taxonomy (not "other"). If reviewers keep firing "other," the taxonomy is under-specified.

  3. At least one Role C raw-budget self-flag fires OR is explicitly noted as adequate in the hypothesis-pre notes. Silence about the budget means the budget wasn't actually checked.

  4. §10 same-gap-twice check either does not fire, or fires with a non-taxonomy diagnosis (schema bug, upstream bug, or cohort mis-scoping). Taxonomy mis-tuning diagnosis triggers pilot REVISE, not pilot success.

  5. No 1-of-3 DEFECT-high-confidence adjudication produced a silent APPROVE. Every such case routes to BLOCK per §9 quorum rule; if any was approved, §9 is being bypassed.

  6. ≥3 of 5 cards reach a verdict (SIGN-OFF, BLOCK, or RETIRE) without escalation-to-user during adjudication.

  7. Procedural-only failure check (7a/7b split). (Per v4 §13.2 patch with Claude's sharpened boundary: routed through the adjudicator's input set.)

    • 7a — procedural override = pilot failure. Human user does not override any per-card decision on the basis of inputs the adjudicator had. The test: could the adjudicator have decided correctly given its inputs? If yes and user overrides → procedure failed by reaching the wrong answer with adequate input; pilot fails this criterion.

    • 7b — context override = pilot working as designed. Where user overrides on the basis of context the adjudicator structurally could not have had (Pass 3 motivating prose, cross-session memory, recent ingest, the plan document, prior wikiwip state, the maintainer's reading of multi-source convergence not captured in any single report): the adjudicator could not have decided correctly given its inputs. Override is logged in adjudication.md under "user-context overrides" and does NOT count as pilot failure. 7b cases inform escalate-to-user thresholds in v0d.7 codification.

Pilot FAILURE = any of the above fails. Pilot AMBIGUOUS = a mixed outcome. In ambiguous cases, do NOT codify; revise the relevant component (taxonomy / gate / quorum / cohort scoping) and re-pilot.

14. Cards-that-reference-each-other

Handled by §11.5 (cross-card dependency reconciliation). Per-card adjudication remains independent; reconciliation runs after all adjudications complete.

15. What this plan does NOT do

  • Does not authorize apply-mode writes for the 5 Pass 3 cards. Sign-off and apply-mode are separate steps; apply-mode follow-up branch is separately authorized.
  • Does not modify CLAUDE.md or weave/SKILL.md or schema-changelog.md. Those updates land at the v0d.7 codification step, after pilot succeeds.
  • Does not retire or modify the existing v0d.5 bridge-card layer. This extends it.

v0d.7 codification will additionally add (anticipated, not specified here):

  • wiki/.audit/bridge-cards-applied.md provenance register
  • Lint check flagging concept-page subsections anchored to bridge cards not in the register
  • 11th bridge-card field: confidence (medium / low / speculative)
  • 12th bridge-card subfield: counter-test: broken out from inline-in-Counterpressure
  • Cohort-fingerprint discipline for tracking drift between pilot and live use
  • Schema-changelog entry acknowledging the math-olympiad analogy's transfer limits: "bridge-card sign-off is a calibrated probability of philosophical defensibility, not a proof of correctness"
  • Retrospective audit baseline (after 3-6 months of bridge-card-applied writes, audit a sample for "did the apply-mode writes hold up under subsequent ingest and audit?" to calibrate confidence labels)

16. Agent-teams escape hatch (documented for v0d.7, not for pilot)

(Reframed in v4 with verified isolation properties per §16-17 patch.)

When main-thread procedural escalation (per §11) results in BLOCK and the user wants to attempt resolution before retire/revise, the agent-teams debate is available as a post-isolation adversarial debate mode. It is not a clean-room continuation of sign-off review; it operates with weaker isolation guarantees than the main pilot path.

Verified isolation properties (Claude Code docs, fetched 2026-05-12):

  • Teammates honor the subagent definition's tools allowlist and model field.
  • Teammates do not inherit skills or mcpServers frontmatter from the definition.
  • Teammates load skills, MCP servers, and CLAUDE.md from project/user settings — the same as a normal session.
  • Team coordination tools (SendMessage, task management tools) are always available to teammates regardless of any tools: restriction. Whatever isolation the subagent-mode tools: allowlist enforced is invalidated for any teammate-mode invocation — by construction, not by bug.

This means the agent-teams debate is a fundamentally different mode where peer messaging is enabled. v4 keeps §16 documented (rather than dropping) to preserve post-pilot option value, but with weaker-guarantees explicitly named so any v0d.7 use of the mode is deliberate. Not implemented for pilot. Pilot operates on subagent isolation only, with main-thread escalating procedural deadlocks to user.

17. Subagent vs teammate skill-loading discipline

Subagent definitions for the four reviewer roles (I, A, B, C) and the adjudicator (D) MUST use a tools: allowlist excluding the Skill tool. Reason: prevents auto-loading of weave/SKILL.md, which would leak procedure metadata (existence of co-reviewers, asymmetric quorum, batch-level check) and break peer-output isolation.

Allowed tools per subagent: Read, Grep, Glob, Write (Write scoped via PreToolUse hook to per-agent subdirectory; §2.1).

If we later use the agent-teams debate escape hatch (§16), the teammate prompt must explicitly say "Do not invoke any skills regardless of activation suggestions" because teammates load skills, MCP servers, and CLAUDE.md from project/user settings, ignoring subagent tools: restrictions (verified §16).

18. Pre-step interpretation-checker (one subagent for the batch)

Before the per-card reviewer agents launch, a single interpretation-checker subagent runs against all 5 cards (6 in pilot, counting the planted-defect control card). Its job per card:

  1. Generate 2-3 alternative readings: different Primary home choice, different Bridge type classification, different relation specification.
  2. Mark the card's stated reading as one of the alternatives.
  3. For each alternative: is it TRIVIAL (any reading any reviewer would assent to) or NON-TRIVIAL (substantive claim)?
  4. If the card's stated reading is the trivial one and another reading is non-trivial: flag as INTERPRETATION-AT-RISK. The card may be mis-stated.
  5. Output: interpretation-map.md with per-card alternative readings + trivial/non-trivial assessment + risk flag if applicable.

Prompt framing for A/B/C consumers — starter-set, not closed-set

(Per v4 §18 patch — Claude's option (b), with explicit starter-set framing.)

The interpretation map is delivered to A/B/C with this explicit framing in their prompt:

The interpretation map is a starter set of alternative readings identified by an upstream subagent. It is not exhaustive. If you identify alternative readings the map missed, evaluate the card against those as well, and note the missed-alternative in your hypothesis-pre note's "Confidence trace" or "Convergence" fields for §10 calibration.

This framing makes the single-point-of-failure risk of relying on one interpretation-checker subagent qualitatively milder: if Role I misses an alternative, the downstream effect is "fewer probes for A/B/C," not "A/B/C biased toward the stated reading because the map looked authoritative." A/B/C remain free to identify alternatives that Role I missed.

Tools: Read, Grep, Glob, Write. PreToolUse hook scope: wiki/.audit/bridge-card-signoff/run-N/interpretation-map.md. No Skill tool.

The subagent prompt embeds the same isolation language as A/B/C: "You are the only interpretation-checker reviewing this batch. You have no information about other reviews."

19. Retry policy

If any subagent (I, A, B, C, D) fails or times out:

  • One retry with the same prompt.
  • If retry fails too: drop the card from this run, mark in adjudication.md as "subagent unavailable — manual review required," and proceed with other cards.
  • A card that drops from the run is not signed-off and not retired; it stays at proposed for the next run.
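
The retry policy amounts to a bounded retry wrapper. A minimal sketch, assuming a hypothetical `spawn` callable that raises on failure or timeout; how subagents are actually launched is outside this plan, so the interface here is illustrative only.

```python
def run_with_retry(spawn, prompt, retries=1):
    """Call spawn(prompt); on failure retry with the same prompt, then drop.

    spawn is a hypothetical callable that raises on failure or timeout.
    """
    for _ in range(1 + retries):
        try:
            return {"status": "ok", "output": spawn(prompt)}
        except Exception:
            # Failures and timeouts are treated alike; the prompt is not changed.
            continue
    # Dropped cards are neither signed off nor retired; they stay proposed.
    return {"status": "dropped", "card_status": "proposed", "output": None}
```

A dropped result is what main-thread would record in adjudication.md as requiring manual review before moving on to the remaining cards.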

20. 11th bridge-card field — confidence

v0d.7 codification will add this to the v0d.5 bridge-card template. Bridge cards gain an 11th field: confidence: medium | low | speculative.

  • Set at draft time (Pass 3).
  • Re-verified at sign-off (adjudicator subagent may recommend lowering).
  • Single-source-blocker cards default to low or speculative.
  • Multi-source cross-author-convergence cards default to medium.
  • Subsection writes implementing low-confidence cards must annotate the subsection prose with provisional framing (per §12a).

For the pilot, the 5 cards in run3 do not carry this field. v4 adds an explicit inference rule so adjudication is not a judgment call without scaffolding.

Pilot confidence inference rule

(New in v4 per §20 patch.)

Until v0d.7 codifies the field, the adjudicator infers confidence as follows:

  • medium — evidence-status names ≥2 independent source-anchors of comparable quality AND counterpressure's counter-test is specific and falsifiable
  • low — single-source AND single-source-blocker present AND counterpressure's counter-test is specific
  • speculative — single-source AND (single-source-blocker absent OR counterpressure's counter-test is gestural / could-be-wrong-shaped)
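
The inference rule's default output can be written as a small function. This is a sketch under assumptions: the parameter names are hypothetical, "single-source" is taken to mean exactly one independent anchor, and a "gestural" counter-test is modeled as one that is not specific. Combinations the three bullets do not cover return `None` rather than a guessed label.

```python
from typing import Optional


def infer_confidence(independent_anchors: int,
                     blocker_present: bool,
                     counter_test_specific: bool,
                     counter_test_falsifiable: bool) -> Optional[str]:
    """Mechanical default for the pilot confidence label, or None if the rule is silent."""
    if independent_anchors >= 2:
        if counter_test_specific and counter_test_falsifiable:
            return "medium"
        return None  # multi-source with a weak counter-test: not covered by the rule
    if independent_anchors == 1:
        if blocker_present and counter_test_specific:
            return "low"
        # Blocker absent OR counter-test gestural.
        return "speculative"
    return None  # no source-anchors named: not covered by the rule
```

The output is only a default: it looks at source count, not source quality, so the adjudicator's judgment about anchor quality still sits on top of it.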

Adjudicator's expanded CONFIDENCE-AMBIGUOUS license. (Per Claude's pushback on the binary source-count rule.) If the adjudicator judges that the inference rule's mechanical output mis-calibrates the card — e.g., the rule says medium because there are 2 anchors but one anchor is a tertiary commentary and one is a primary-source passage, making the card closer to low — the adjudicator may mark CONFIDENCE-AMBIGUOUS and recommend rewrite-with-explicit-confidence rather than sign-off. The rule is a default; source-anchor quality (not just count) may override.

v0d.7 template delta. v0d.7 codification adds two subfields to the bridge-card template: (i) confidence: medium | low | speculative, and (ii) counter-test: as an explicit subfield rather than inline within Counterpressure. For pilot, counter-test stays inline in Counterpressure per v0d.5; the inference rule above operates on the inline form.


v4 changelog

Adopted as proposed in Reviewer Z's patch list

  • §3.5 + §5 anchor-string neutralization (verdict-token strip on PASS/FAIL/verdict)
  • §4.5 MP-scholarship-context briefing non-exhaustiveness disclaimer
  • §11 signed-apply-scope ratification (Signed apply scope: line written onto the card)
  • §11 ↔ §16 procedural-only main-thread authority (no substantive escalation)
  • §13.2 criterion 7a/7b split
  • §15 visibility preface (front-loaded apply-mode non-authorization)
  • §16-17 agent-teams reframing with verified isolation properties

Adopted with modification (Claude's pushback applied)

  • §2.1 tool-allowlist: chose option (b), keep Write + PreToolUse hook, with pilot-prep hook smoke test required. Rationale: option (a) "drop Write, main-thread writes" would route all 17 subagent outputs through main-thread's context, degrading the isolation claim that main-thread is not exposed to reviewer reasoning.
  • §2.2 isolation: chose option (a), reframed as taxonomy-informed methodological choice (cross-card view enables #12 motifs-weight tipping detection) rather than apologetic weakening of "dual isolation."
  • §3.5 anchor neutralization: stripped PASS/FAIL/verdict only; kept tier-N and work-test (structural-label over-strip risk).
  • §7 trace integrity: chose two-file hypothesis-note structure (hypothesis-pre + hypothesis-post) + PreToolUse-hook enforcement, not Z's mtime-ordering check (which conflicted with v3 §7's allowed post-verdict revision pathway).
  • §11.5 dependency map: wikilinks-only extraction (not mechanism-name regex — that isn't mechanical).
  • §13.1 pilot controls: planted-defect control only; rejected seeded-clean control because wikiwip lacks ground-truth oracles for "clean card" calibration.
  • §13.2 7a/7b boundary: sharpened test to "could adjudicator have decided correctly given its inputs?" — routes through the adjudicator's input set, which is precisely defined in §11.
  • §18 interpretation-checker: chose option (b) — single I with starter-set prompt framing — rather than spawning 2.
  • §20 confidence inference: adjudicator's expanded CONFIDENCE-AMBIGUOUS license (source-anchor quality, not just count).

Maintainer-added in v4

  • §6 research scope discipline: replaced v3 hard caps (5 reads/card, 8 for Role C Rule 18) with calibrated-budget approach + Scope-Mismatch self-flag at 10+ reads. Rationale: philosophy has no ground-truth oracle; artificial budget caps produce sign-off based on under-verified anchors. Reviewers self-allocate budget to card complexity; the self-flag is informational and feeds §10 diagnosis, not terminal.
  • §7 hypothesis-note confidence tracking: explicit Initial confidence + per-anchor-check / per-raw-read Confidence trace + Final confidence fields. Allows reviewer's confidence to update during ongoing research; a non-moving confidence becomes calibration data for §10.
  • §10 expanded diagnostic options: added (d) card-cohort mis-scoping; added confidence-trajectory and scope-mismatch repeats as triggers alongside same-pattern repeats.

Rejected (with rationale)

  • §13.1 seeded-clean control (Z): assumes ground-truth that the philosophy domain doesn't have. The "clean card" judgment is exactly the philosophical-defensibility judgment the pilot was designed to make; importing the calibration-against-oracle pattern from oracle-having domains is domain-confusion. Planted-defect control alone is asymmetric-by-design.
  • §16 drop entirely (one of Claude's open-question options): option-value of post-pilot escape hatch with explicit weak-guarantees is real. Kept with reframing instead.
  • Z's "pilot vs calibration" preface as written (carried over from v3 rejection): the "unless user explicitly accepts pilot outcome as the bridge-card-specific calibration" clause introduces an authorization path that §15 categorically denies. Replaced with §15-visibility-only patch.
  • Z's criterion 1 rename to "any substantive non-approval event" (carried over from v3 rejection): broadens signal and dilutes taxonomy-bite testing. Replaced with §13.1 planted-defect control.

Reference to v3 changelog

v3 changes from v2 are preserved in wiki/.audit/bridge-card-signoff-plan-2026-05-12.md v3 changelog section and not reproduced here. v4 supersedes v3 as the active plan; v3 remains as audit-trail artifact.