Production Brief — WO-0008: Maya (Conversational KB Talent)
Stage: 1, Intake (internal pilot, dogfooding)
Owner: Pablo (Production Line Architect)
Date: 2026-05-26
Decision ref: TFD-0023
Technical canon: RD-0017 (Karpathy LLM Wiki)
Client Profile (M1 — internal pilot)
| Field | Value |
|---|---|
| Client | Talent Factory (internal, dogfooding) |
| Segment | Camille (CS triage), Riley (R&D retrieval), Oscar (CEO knowledge queries) |
| Domain | Factory knowledge — decisions, requests, KB learnings, HTML deliverables |
| Methodology | Karpathy LLM Wiki pattern (RD-0017) |
| Language | FR primary, EN mixed |
| AI platform | Claude API (Haiku retrieval, Opus/Sonnet synthesis), pluggable per model-config-pattern |
| Corpus path | C:\Projects\talent-factory\ (filtered — see below) |
| Documentation platform | Same repo (markdown + HTML) |
| Naming | maya-* skills, WO-0008 per TFD-0009 |
Discovery Summary
Business Context
The factory has accumulated ~150 markdown files (TFDs, requests, R&D analyses, KB) plus the intranet's 405 Astro pages and ~30 HTML EA deliverables. Camille currently does client triage by manual grep + memory. Riley re-reads the same RD analyses every R&D session. The CEO asks recurring questions whose answers are spread across 4-6 files at a time. JSM-Confluence deflection (CON-0006 Stack A) was the prior plan; TFD-0023 supersedes it with Maya.
Pain Points
- No semantic retrieval over the corpus — keyword search misses bilingual rephrasings and cross-file concepts.
- No connection layer — each query starts from zero; nothing compounds.
- Confluence cost + format mismatch — paid per seat, can't render the factory's HTML deliverables (capsules, diagrams), weak FR/EN.
- No deflection mechanism — every question becomes a CEO interrupt.
Current State
- Markdown corpus: `company/decisions/`, `departments/*/requests/`, `references/videos/RD-*/`, KB lessons.
- HTML deliverables: `intranet/dist/`, EA pages under client OneDrive (out of scope for Maya-Factory M1; in scope for Maya-STM M2).
- Beta-portal live (project memory: `beta-portal`), the host for the Maya widget.
- JSM live on jackson-creek-tech.atlassian.net, the ticket route target for deflection.
Product Definition
Product Type
A digital talent packaged as a deployable bundle: agent definition + skills + widget + corpus config. Two deployment profiles share the same core (see TFD-0023 Action 1):
- M1 (Maya-Factory): corpus = factory repo (filtered). Users = factory team. Host = beta-portal widget.
- M2 (Maya-Client, STM POC): corpus = `OneDrive-STM/agent-ea/`. Users = STM staff via JCT portal. Bundled with EA handover #1.
Architecture (RD-0017 canonical pattern)
maya/
├── raw/ ← read-only sources (symlink or copy from corpus_path)
├── wiki/ ← agent-maintained markdown KB (the synthesized layer)
├── CLAUDE.md ← schema: purpose, folders, ingest workflow, formatting, QA
├── corpus.config.yaml ← {corpus_path, filters, language, deflection_target}
├── manifest.json ← generated index (titles, summaries, paths)
└── widget/ ← embeddable JS (Astro + standalone bundle)
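As a rough illustration of the "generated index (titles, summaries, paths)" idea, the sketch below builds manifest entries from markdown files under `raw/`. It assumes the title is the first `#` heading (falling back to the file name) and the summary is the first non-heading line; the function names `build_manifest` and `write_manifest` and the 200-character cut-off are hypothetical, not taken from order.md.

```python
import json
from pathlib import Path


def build_manifest(raw_dir: Path, summary_chars: int = 200) -> list[dict]:
    """Return one {title, summary, path} entry per markdown file under raw_dir."""
    entries = []
    for md_file in sorted(raw_dir.rglob("*.md")):
        lines = md_file.read_text(encoding="utf-8").splitlines()
        # Title: first '#' heading if present, otherwise the file stem (assumption).
        title = next(
            (ln.lstrip("#").strip() for ln in lines if ln.startswith("#")),
            md_file.stem,
        )
        # Summary: first non-empty, non-heading line, truncated (assumption).
        summary = next(
            (ln.strip() for ln in lines if ln.strip() and not ln.startswith("#")),
            "",
        )[:summary_chars]
        entries.append(
            {
                "title": title,
                "summary": summary,
                # Paths are stored relative to raw/ so citations stay portable.
                "path": md_file.relative_to(raw_dir).as_posix(),
            }
        )
    return entries


def write_manifest(raw_dir: Path, out_file: Path) -> None:
    """Serialize the manifest as UTF-8 JSON, keeping FR accents readable."""
    out_file.write_text(
        json.dumps(build_manifest(raw_dir), ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

In a real ingest the summary would more plausibly come from the model than from the first line; the heuristic here only keeps the sketch self-contained.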
Three behaviors driven by the schema:
- Ingest: on add/update of raw source, extract concepts, update existing wiki pages, create new pages, link, log changes.
- Query: multi-turn FR/EN; consult wiki first (not raw); cite source paths; flag uncertainty.
- Lint (`/lint-wiki`): contradictions, orphan pages, outdated claims, concepts without page. Folded into RD-0031 toolkit-catalog bundle per TFD-0023.
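Two of the lint checks above, orphan pages and concepts without a page, are purely structural and can be sketched without a model call. The example assumes wiki pages reference each other with relative markdown links such as `[Ingest](ingest.md)`; that link convention and the function name `lint_links` are assumptions, since the schema is not frozen yet. Contradictions and outdated claims need semantic comparison and are not covered.

```python
import re
from pathlib import Path

# Relative markdown links to .md files; external http(s) URLs are skipped.
LINK_RE = re.compile(r"\[[^\]]*\]\((?!https?://)([^)#\s]+\.md)(?:#[^)]*)?\)")


def lint_links(wiki_dir: Path) -> dict[str, list[str]]:
    """Report pages nobody links to and link targets that do not exist."""
    root = wiki_dir.resolve()
    pages = {p.relative_to(root).as_posix(): p for p in root.rglob("*.md")}
    linked: set[str] = set()
    missing: set[str] = set()
    for rel, page in pages.items():
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            # Resolve the link relative to the folder of the linking page.
            resolved = (page.parent / target).resolve()
            try:
                key = resolved.relative_to(root).as_posix()
            except ValueError:
                continue  # points outside the wiki; ignored in this sketch
            if key == rel:
                continue  # a self-link does not count as an inbound link
            (linked if key in pages else missing).add(key)
    return {
        "orphans": sorted(set(pages) - linked),
        "missing": sorted(missing),
    }
```

An entry page such as an index would itself show up as an orphan unless something links back to it, so a real lint would likely allow-list it.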
Core Capabilities (from order.md, re-prioritized for M1)
| # | Capability | M1 priority | Notes |
|---|---|---|---|
| 1 | Corpus ingestion → manifest + wiki | Must | Manifest-based, no vector DB <10k docs |
| 2 | Conversational retrieval (multi-turn FR/EN) | Must | Claude long context + manifest |
| 3 | Citation by paragraph | Must | Native Claude API citations |
| 4 | Deflection → Telegram (M1) / JSM (M2) | Must | M1 uses Telegram (existing channel); M2 uses JSM |
| 5 | Embeddable widget | Must | Astro component for beta-portal |
| 6 | Bilingual native | Must | No language toggle |
| 7 | Per-deployment corpus | Must | corpus.config.yaml is the only thing that changes between deployments |
| 8 | Re-indexing on commit | Should | Git hook → manifest refresh <60s |
| 9 | `/lint-wiki` skill | Should | Co-developed with RD-0031 |
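Capabilities 4 and 7 together imply that the deflection channel is just one field of the per-deployment config. A minimal sketch of that idea follows, using the four fields named for `corpus.config.yaml`. The class name `CorpusConfig`, the allowed target strings, the `"fr"` language code and the example filter are illustrative assumptions; the language for the M2 profile in particular is not stated in this brief.

```python
from dataclasses import dataclass, field

# Assumed identifiers for the two channels named in the capability table.
ALLOWED_TARGETS = {"telegram", "jsm"}


@dataclass(frozen=True)
class CorpusConfig:
    """In-memory mirror of corpus.config.yaml (field names from the brief)."""

    corpus_path: str
    language: str
    deflection_target: str
    filters: dict = field(default_factory=dict)  # e.g. {"include": [...], "exclude": [...]}

    def __post_init__(self) -> None:
        # Fail fast on a typo rather than silently dropping deflected questions.
        if self.deflection_target not in ALLOWED_TARGETS:
            raise ValueError(f"unknown deflection_target: {self.deflection_target!r}")


# M1 profile: factory repo, deflection to Telegram.
MAYA_FACTORY = CorpusConfig(
    corpus_path=r"C:\Projects\talent-factory",
    language="fr",  # FR primary, EN mixed
    deflection_target="telegram",
    filters={"exclude": ["intranet/dist/"]},
)

# M2 profile: STM corpus, deflection to JSM (language value is a placeholder).
MAYA_STM = CorpusConfig(
    corpus_path="OneDrive-STM/agent-ea/",
    language="fr",
    deflection_target="jsm",
)
```

Everything else (skills, widget, schema) would be shared between the two profiles, which is what keeps the bundle deployable per client.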
Out of M1 scope
Vector DB, authentication (host portal handles it), analytics dashboard (JSM handles deflection rate), HTML deliverables ingestion beyond markdown extraction (M2 problem).
Scope
- In: Conversational RAG with citations, deflection routing, embeddable widget, FR/EN, manifest-based indexing, wiki ingest workflow, `/lint-wiki`.
- Out (v1): Vector DB, auth, analytics, multi-tenant (each Maya is single-corpus by design).
- Compliance: Each instance reads only its configured corpus. No cross-tenant leakage. Citations always include source path.
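One way the "reads only its configured corpus" rule could be enforced at the file-access layer is a path-confinement check before every read. This is a sketch under that assumption; the helper name `resolve_in_corpus` is hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_in_corpus(corpus_root: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside corpus_root."""
    root = corpus_root.resolve()
    # resolve() collapses '..' segments and follows symlinks, so a request
    # cannot escape the corpus through either.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{requested!r} is outside the configured corpus")
    return candidate
```

The returned path can also be turned back into a corpus-relative string (`candidate.relative_to(root)`) so that every citation carries a source path from inside the configured corpus.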
Feasibility Assessment
Risk: Low. Pattern proven by RD-0017. Stack is factory-native (markdown + Claude Code + Astro). No new infrastructure. Order.md is fully specified (AC + DoD already written). The 4-week sequence per TFD-0023 has slack — week 4 STM POC depends only on M1 + STM corpus access (already available).
Open Questions for Stage 2
- Corpus filters for M1: which paths in `talent-factory/` are in vs out? Proposal: include `company/`, `departments/*/requests/`, `references/videos/`, `production-lines/orders/*/order.md`. Exclude `.claude/`, `node_modules/`, `intranet/dist/`, OneDrive client folders.
- Wiki location: does the wiki live in the repo (`maya/wiki/` committed) or in a sibling folder? Pablo to decide based on git noise tolerance.
- Re-index cadence: git hook (every commit) or scheduled (hourly)? Cheap to try both.
- Widget styling: match Anthropic warm cream / Trustworthy Blue per `docs-design-system`?
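To make the first open question concrete, the proposed include/exclude lists can be tried against sample paths with a few lines of glob matching. The sketch below treats the proposal as-is (it is not a decision), matches repo-relative POSIX paths, and leaves out the OneDrive client folders because no path is given for them. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# Proposal from the open question above; still to be decided in Stage 1.
INCLUDE = [
    "company/",
    "departments/*/requests/",
    "references/videos/",
    "production-lines/orders/*/order.md",
]
EXCLUDE = [".claude/", "node_modules/", "intranet/dist/"]


def _matches(rel_path: str, pattern: str) -> bool:
    # A trailing slash means "everything under this folder".
    # Note: fnmatch's '*' also matches '/', so '*' can span several levels.
    glob = pattern + "*" if pattern.endswith("/") else pattern
    return fnmatchcase(rel_path, glob)


def in_corpus(rel_path: str) -> bool:
    """Exclusions win over inclusions; anything not included is out."""
    if any(_matches(rel_path, p) for p in EXCLUDE):
        return False
    return any(_matches(rel_path, p) for p in INCLUDE)
```

Because patterns are anchored at the repo root, a nested `node_modules/` inside an included folder would not be excluded by this sketch; whether that matters depends on how the filter spec is written in Stage 2.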
4-Week Sequence (per TFD-0023)
| Week | Stage | Owner | Output |
|---|---|---|---|
| 1 (now → 2026-06-02) | Stage 1 close + Stage 2 design | Pablo + Riley | Sandbox proof (3 TFDs) + Stage 2 solution spec |
| 2 (2026-06-03 → 09) | Stage 3 pattern selection + Stage 4 build start | Pablo | Schema CLAUDE.md frozen, ingest + query skills built |
| 3 (2026-06-10 → 16) | Stage 4 build complete + Stage 5 QA | Pablo + Quinn | Maya-Factory live for Camille; QA cert |
| 4 (2026-06-17 → 23) | Stage 6 deploy + Stage 7 delivery (STM POC) | Diego + Dana | Maya-STM bundled with EA handover #1 |
Week-1 Action List (Pablo)
- Run Riley's RD-0017 sandbox (~25 min): create `process/sandbox/{raw,wiki,CLAUDE.md}`, ingest TFD-0019/021/022, run a cross-cutting question + `/lint-wiki`. Capture transcript in `process/sandbox/sandbox-report.md`. This is the Stage 1 acceptance gate.
- Decide the 4 open questions above and drop a one-pager in `process/stage-1-decisions.md`.
- Adapt `CLAUDE.md` schema from Karpathy starter to factory context (FR primary, citation format machine-parseable for widget, deflection to Telegram for M1, JSM for M2).
- Open Stage 2: `process/stage-2-solution-spec.md` covering manifest schema, query loop (multi-turn + citation), widget contract, corpus filter spec, deflection payload format.
- Sync with Diego on RD-0031 bundle convention so `/lint-wiki` ships in the right channel from day 1.
- Sync with Riley on what RD-0017 did not answer; write any residual unknowns as REQ-EXEC tickets, do not absorb silently into the build.
References
- `production-lines/orders/WO-0008-maya-conversational-kb-talent/order.md`: full AC + DoD
- `company/decisions/TFD-0023-maya-load-bearing-infrastructure-talent.md`: sequencing & ownership
- `references/videos/RD-0017/out/flashcard_rd-0017.md`: technical pattern (canon)
- `references/videos/RD-0017/intrants/transcript_rd-0017.md`: source material
- TFD-0009 (request folder standard), TFD-0012 (R&D pipeline), TFD-0021 (toolkit-catalog Go)
- Memory: `maya-rag-wiki-pattern`, `delivery-model-foundry-not-hosted`, `documentation-format`, `model-config-pattern`
Stage 1 Acceptance Gate
Stage 1 closes when:
- Sandbox proof runs end-to-end on 3 TFDs and produces a non-trivial wiki + lint report
- 4 open questions answered in `stage-1-decisions.md`
- Adapted `CLAUDE.md` schema committed under `process/stage-1-claude-md-draft.md`
- Stage 2 solution spec opened (even empty header)
Target close date: 2026-06-02 EOD.