Research Log: Agent-EA v2 Engagement Playbook

Target: production-lines/agent-ea/playbook/engagement.md
Eval criteria: auto-research/eval-criteria.md (12 binary questions)


Iteration #0 — 2026-05-31 (baseline)

Score: 8/12
Status: Baseline

Criteria results

| # | Question (short) | Score | Justification |
|---|---|---|---|
| 1 | Golden rule up front | YES | stated on line 5 |
| 2 | A-codes defined/linked | NO | Routing table lists A251/A170/A370… with no definition or link |
| 3 | Stages in order, Routing at head | NO | stages table starts at Contract; Routing described below, not integrated |
| 4 | Gate + owner per stage | YES | all 5 rows have both |
| 5 | Exact command per automated stage | NO | node commands live in README, not here nor cross-linked |
| 6 | First action of a new request | YES | "Contract scoping — Routing (do this first)" |
| 7 | Data contract is a checklist | YES | concrete - [ ] checklist |
| 8 | Free of TODO/placeholder | YES | parity proof filled, no gaps |
| 9 | Follow-on mandates covered | YES | "Adding a mandate" paragraph |
| 10 | Conventions bind to D1 model | YES | kind object/co, change_type junction, epicId FK |
| 11 | Routing table = canonical bundles | YES | exact match to framework-guide scenarios |
| 12 | Open issues + spec/README links | NO | TES-CO vs PROJ/CHG not surfaced; no spec/plan/README link |

Observations

Structure, fidelity and completeness are strong. The gaps are in operability (no commands) and traceability (A-codes unexplained, Routing missing from the stage table, open modeling issue and source links absent). All four NO results can be fixed by adding content, so regression risk is low.
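
The baseline tally can be reproduced from the YES/NO column of the criteria results. The snippet below is a minimal sketch, not part of the eval tooling: the `baseline` dictionary and the `score` and `failing` helpers are hypothetical names introduced here for illustration, and the answers are copied by hand from the results above.

```python
# Iteration #0 answers, copied by hand from the criteria results
# (True = YES, False = NO). The data structure is an assumption;
# the log does not say how the eval stores its answers.
baseline = {
    1: True, 2: False, 3: False, 4: True, 5: False, 6: True,
    7: True, 8: True, 9: True, 10: True, 11: True, 12: False,
}


def score(results: dict[int, bool]) -> int:
    """Count the criteria answered YES."""
    return sum(results.values())


def failing(results: dict[int, bool]) -> list[int]:
    """Return the question numbers answered NO, in ascending order."""
    return sorted(q for q, ok in results.items() if not ok)


print(f"Score: {score(baseline)}/{len(baseline)}")  # Score: 8/12
print("NO on:", failing(baseline))                  # NO on: [2, 3, 5, 12]
```

The four failing question numbers match the four items listed under the next direction.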

Next direction (iteration #1)

  • Q3: add Routing as stage 0 in the Stages & gates table (gate = A-code bundle selected; owner = Consulting intake).
  • Q5: add the exact node commands per automated stage (or cross-link README quickstart).
  • Q2: add a one-line pointer to the canonical A-code source (framework-guide) + compact legend.
  • Q12: add a short "References & open issues" footer (spec/plan, engine README, TES-CO vs PROJ/CHG note).

Iteration #1 — 2026-05-31

Score: 12/12 (was 8/12, +4)
Status: Kept

Changes applied

  A. Added Scope (Routing) as stage 0 in the Stages & gates table (gate = engagement type → A-code bundle; owner = Consulting intake).
  B. Added a "Running the automated stages (commands)" block: exact seed.mjs / publish.mjs / d1-export.mjs invocations, a README cross-link, and a note that Contract and Review are human gates.
  C. Added an A-code reference line (15 code meanings plus a link to the canonical Macroscope framework-guide.md).
  D. Added a "References & open issues" section (spec/plan, engine README, schema; open issues: TES-CO vs PROJ/CHG duplication, canonical seed location).

Criteria results

| # | Question (short) | Before | After | Delta |
|---|---|---|---|---|
| 1 | Golden rule up front | YES | YES | |
| 2 | A-codes defined/linked | NO | YES | +1 |
| 3 | Stages in order, Routing at head | NO | YES | +1 |
| 4 | Gate + owner per stage | YES | YES | |
| 5 | Exact command per automated stage | NO | YES | +1 |
| 6 | First action of a new request | YES | YES | |
| 7 | Data contract is a checklist | YES | YES | |
| 8 | Free of TODO/placeholder | YES | YES | |
| 9 | Follow-on mandates covered | YES | YES | |
| 10 | Conventions bind to D1 model | YES | YES | |
| 11 | Routing table = canonical bundles | YES | YES | |
| 12 | Open issues + spec/README links | NO | YES | +1 |

Observations

All four fixes were additive, with no regressions on the 8 prior YES answers. The Scope (Routing) row (A) and the commands block (B) together turn the playbook from a "policy doc" into an executable runbook. The A-code legend (C) removes the implicit dependency on tribal knowledge of Macroscope codes.
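
The before/after comparison and the "no regressions" check can be expressed as simple set arithmetic. This is a sketch under stated assumptions: the `compare` helper and the keep rule (keep a change only if the score improves and nothing regresses) are hypothetical. The log records the outcome "Kept" but does not state the rule the loop actually applies.

```python
TOTAL = 12                  # number of binary questions in the eval
NO_BEFORE = {2, 3, 5, 12}   # questions answered NO in iteration #0
NO_AFTER = set()            # iteration #1 recorded no NO answers


def compare(no_before: set[int], no_after: set[int], total: int) -> dict:
    """Summarise score movement between two iterations."""
    before = total - len(no_before)
    after = total - len(no_after)
    return {
        "before": before,
        "after": after,
        "delta": after - before,
        "fixed": sorted(no_before - no_after),      # NO -> YES
        "regressed": sorted(no_after - no_before),  # YES -> NO
    }


result = compare(NO_BEFORE, NO_AFTER, TOTAL)

# Assumed keep rule (not confirmed by the log): the score must improve
# and no previously passing question may start failing.
keep = result["delta"] > 0 and not result["regressed"]

print(result)
print("Status:", "Kept" if keep else "Reverted")  # Status: Kept
```

Tracking regressions separately from the net delta matters because a change that fixes two questions and breaks one would still show a positive delta.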

Next direction

The score is perfect (12/12), so the loop stopped. Future evolution of the eval: if the engine gains more renderer formats (word/pdf/bi, TFD-0029), add a criterion checking whether the playbook documents the manifest.formats selection. Once the TES-CO vs PROJ/CHG open issue is resolved, the resolution should be reflected back here.