Stage 5: Quality Gate / Testing

Owner: Quinn (QA Engineer)
Blueprint stage: Stage 5 -- Quality Gate / Testing

Trigger

Build artifacts submitted by Pablo with completed assembly checklist (Stage 4 gate passed).

Inputs

Source	What
Pablo (Stage 4)	Complete agent repository, assembly checklist, test data, known limitations
Pablo (Stage 4)	Test data package: sample inputs per skill, synthetic data with planted gaps (for validation skills), expected output examples
Work order	Product-specific test cases (Section 8 or equivalent), acceptance criteria

Prerequisites (Added for Beta)

Phase G documentation package complete:
- User guide
- Configuration guide
- Skill reference card
- Quick-start cheat sheet
QA CANNOT issue a PASS if any Phase G document is missing or incomplete.

Activities

5.0 Test Environment Setup

Start a fresh Claude Code session (no prior conversation context)
Use the model specified in the work order for each skill (or default: Sonnet for analysis, Haiku for execution)
Create a clean test working directory separate from the build directory
Do not reuse outputs between test cases unless testing pipeline integration

5.1 Structural Validation

Verify that CLAUDE.md contains all required sections, every skill in .claude/commands/ matches the solution spec, all templates are present, reference materials are loaded, and documentation is complete.
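Part of this check can be scripted. The following is a minimal sketch, not the team's actual tooling. It assumes each skill is stored as a Markdown file named <skill>.md under .claude/commands/ and that the expected skill names are copied by hand from the solution spec. The function names are hypothetical. It checks only that files exist. Whether CLAUDE.md has all required sections, and whether templates and reference materials are present, still needs a manual review.

```python
from pathlib import Path


def missing_skills(repo_root: Path, expected_skills: set[str]) -> set[str]:
    """Return expected skill names that have no file in .claude/commands/."""
    commands_dir = repo_root / ".claude" / "commands"
    # Assumption: one Markdown file per skill, named <skill>.md.
    if not commands_dir.is_dir():
        return set(expected_skills)
    present = {path.stem for path in commands_dir.glob("*.md")}
    return expected_skills - present


def structural_problems(repo_root: Path, expected_skills: set[str]) -> list[str]:
    """Collect human-readable problems; an empty list means the check passed."""
    problems = []
    if not (repo_root / "CLAUDE.md").is_file():
        problems.append("CLAUDE.md is missing")
    for name in sorted(missing_skills(repo_root, expected_skills)):
        problems.append(f"skill not found in .claude/commands/: {name}")
    return problems
```

Calling structural_problems(Path("."), {"skill-a", "skill-b"}) from the repository root returns a list that can be pasted into the test execution results. The skill names here are placeholders.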

Test data requirement: Pablo must provide test data as part of the Stage 4 handoff. It must include at least one sample input per functional test case, synthetic data with known planted issues for validation/quality skills, and expected output format examples.

5.2 Functional Testing

Test cases are derived from the work order and solution specification. For each capability in scope, verify that the corresponding skill:

Accepts the specified inputs
Produces the expected output format
Follows naming conventions
Uses correct domain terminology and language

Test each skill in isolation first (unit testing). Then test multi-step pipelines end-to-end (integration testing). Record which mode each test used.
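One way to record the test mode alongside each result is a small record type. This is an illustrative sketch only. The field names and the "F1"-style case IDs are assumptions and are not prescribed by this process.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    UNIT = "unit"                # skill tested in isolation
    INTEGRATION = "integration"  # multi-step pipeline tested end-to-end


@dataclass(frozen=True)
class FunctionalResult:
    case_id: str      # hypothetical ID scheme, e.g. "F1"
    skill: str        # skill under test
    mode: Mode        # which mode this test used
    passed: bool
    notes: str = ""   # specific failure details, if any


def pass_rate(results: list[FunctionalResult]) -> float:
    """Percentage of passing cases; 0.0 when no cases were run."""
    if not results:
        return 0.0
    return 100.0 * sum(r.passed for r in results) / len(results)
```

The same pass_rate helper can feed the functional percentage used in scoring.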

5.3 Edge Case Testing

Standard edge cases (apply to all products):

#	Test Case	Expected Behavior	Concrete Scenario
E1	Empty input	Reports missing input, no garbage output	Invoke skill with no file argument, or with a path to an empty file
E2	Ambiguous request	Asks clarifying questions or escalates	Provide input that could apply to 2+ skills
E3	Conflicting requirements	Identifies conflict, flags for human	Input contains contradictory constraints
E4	Large input	Handles without truncation or degradation	Provide input exceeding 5 pages / 2000 words
E5	Wrong skill invoked	Redirects to correct skill or reports mismatch	Call a requirements skill with solution-phase input

Product-specific edge cases from the work order are added to this list.

5.4 Documentation Testing

#	Test Case	Method
D1	User guide accuracy	Follow workflows, verify behavior matches
D2	Skill reference accuracy	Compare descriptions to actual I/O
D3	CLAUDE.md accuracy	Verify skills table matches .claude/commands/

5.5 Scoring

PASS: Functional 100% AND edge case >= 80% AND documentation 100%
CONDITIONAL: Functional 100% AND edge case >= 60% with remediation plan
FAIL: Any functional failure OR edge case < 60%

Note: Product-specific thresholds in the work order override these defaults when specified.
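The default rules can be expressed as a small decision function. This is a sketch under stated assumptions, not an official scoring tool. The written rules do not say what happens when functional is 100% and edge case is at least 60% but no remediation plan exists. The sketch treats that case as FAIL, which is an assumption. The threshold parameters exist so that work-order overrides can be passed in.

```python
def qa_verdict(
    functional_pct: float,
    edge_pct: float,
    docs_pct: float,
    has_remediation_plan: bool,
    pass_edge_min: float = 80.0,         # default PASS threshold
    conditional_edge_min: float = 60.0,  # default CONDITIONAL threshold
) -> str:
    """Return "PASS", "CONDITIONAL", or "FAIL" from the default rules."""
    # FAIL: any functional failure OR edge case below the conditional minimum.
    if functional_pct < 100.0 or edge_pct < conditional_edge_min:
        return "FAIL"
    # PASS: functional 100% AND edge case >= 80% AND documentation 100%.
    if edge_pct >= pass_edge_min and docs_pct >= 100.0:
        return "PASS"
    # CONDITIONAL: functional 100% AND edge case >= 60% with remediation plan.
    if has_remediation_plan:
        return "CONDITIONAL"
    # Assumption: without a remediation plan the build cannot be certified.
    return "FAIL"
```

For example, qa_verdict(100, 70, 100, has_remediation_plan=True) yields "CONDITIONAL", and the same scores without a plan yield "FAIL" under the assumption above.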

5.6 Remediation

On FAIL or CONDITIONAL: Quinn documents each failing test case with its specific failures. Pablo fixes the issues and re-submits with a change delta (what was modified and why). Quinn re-tests only the previously failing cases plus any cases affected by the changes. Each remediation loop increments an iteration count. After 3 failed iterations, escalate to Clara (CTO).
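The re-test scope and the escalation rule can be sketched as follows. This is illustrative only. The function names and case IDs are hypothetical, and deciding which cases a change affects remains a judgment call made from Pablo's change delta.

```python
MAX_FAILED_ITERATIONS = 3  # from the rule above: escalate after 3 failed iterations


def retest_set(previously_failing: set[str], affected_by_changes: set[str]) -> set[str]:
    """Cases to re-run: previous failures plus cases touched by the fix."""
    return previously_failing | affected_by_changes


def next_step(failed_iterations: int) -> str:
    """Return "escalate" once the failed-iteration limit is reached."""
    if failed_iterations >= MAX_FAILED_ITERATIONS:
        return "escalate"  # hand over to Clara (CTO)
    return "remediate"     # back to Pablo for another fix
```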

Outputs

Deliverable	Format	Destination
QA certification	Markdown	PASS -> Diego, FAIL -> Pablo
Test execution results	Markdown	Attached to certification
Remediation plan (if CONDITIONAL)	Markdown	Tracked by Pablo

Quality Gate

Gate: QA PASS certification.
Evidence: QA certification with status, all test results, scoring.