Production Lines
Research Log — ai-layer-kit/agents/explorer.md
Research Log — ai-layer-kit/agents/explorer.md
Target:
production-lines/digital-talent/templates/ai-layer-kit/agents/explorer.mdStarted: 2026-05-26
Iteration #0 — 2026-05-26 (baseline)
Score: 10/12
Failing criteria
- Q1 — Frontmatter description started with a noun phrase ("Read-only code/content exploration.") instead of an action verb, weakening trigger-matching for the parent agent.
- Q11 — Body intro restated "You never write, edit, or delete." That theme was already covered by the description AND by Rule 1 (which names Write/Edit/NotebookEdit specifically). One layer was redundant.
Iteration #1 — 2026-05-26
Score: 12/12 (was 10/12, +2) Status: Kept
Changes applied
- Q1 fix — Rewrote the description opener from "Read-only code/content exploration." to "Explore code or content read-only." Verb-first; same information; slightly more agent-actionable when the parent scans descriptions.
- Q11 fix — Removed the sentence "You never write, edit, or delete." from the body intro. The no-write guarantee now lives in (a) the description, (b) the
tools:list omission, and (c) Rule 1 which names the forbidden tools explicitly. Three layers, no redundancy.
Criteria results
| # | Question (short) | Before | After | Delta |
|---|---|---|---|---|
| 1 | Description verb-first | NO | YES | +1 |
| 2 | "Use when" trigger | YES | YES | — |
| 3 | Non-action stated | YES | YES | — |
| 4 | Tools no-write | YES | YES | — |
| 5 | Body names Write/Edit by name | YES | YES | — |
| 6 | Bash allow/forbid defined | YES | YES | — |
| 7 | Rules numbered | YES | YES | — |
| 8 | "When NOT to call" present | YES | YES | — |
| 9 | Output format literal | YES | YES | — |
| 10 | < 50 content lines | YES | YES | — |
| 11 | No body↔description duplication | NO | YES | +1 |
| 12 | Numeric threshold / example | YES | YES | — |
Observations
- Defense-in-depth ≠ duplication. The criteria correctly distinguished intentional belt-and-suspenders (Rule 1 naming the forbidden tools, complementing the
tools:omission) from lazy restating (body intro repeating the description's no-write theme). Worth keeping that distinction in future agent-definition reviews. - Verb-first descriptions look trivial but compound: the parent agent's subagent-selection heuristic seems to weight the leading token. Default this for all future agent definitions in the kit.
Next direction
12/12. Possible criteria extensions if we want to keep pushing:
- "Is the output format compatible with downstream parsing (e.g., distinguishable Markdown sections)?"
- "Does the file declare the model choice's rationale (
model: sonnet— why not haiku?)?" Neither would change the current file; both would extend the eval.