AI-Assisted Development
A prompt-by-prompt playbook for building production-grade applications
Module Overview
Module 5 is where the artifact chain becomes working software. You implement your Dev Tasks with AI through a disciplined loop — plan before code, one task per session, tests from acceptance criteria, read every diff before committing. It pulls the whole program together: the prompting from Module 3 and the context discipline from Module 4 are applied for real on production-grade code.
| At a glance | |
|---|---|
| Covers | The implementation loop (plan → implement → self-review); production boilerplate scaffolding; connecting services via MCP; tests from acceptance criteria; safe refactoring; debugging workflow |
| When it runs | Week 3 Monday (in-person), alongside Module 6 |
| Builds on | Modules 3 and 4, and the approved Week 2 Architecture + Dev Tasks |
| Leads into | Week 3 code production and the Sprint 1 Readiness gate |
What you'll produce
Across Week 3: merged code implementing your Dev Tasks, with unit tests derived from acceptance criteria and the full BRD → PRD → Architecture → Dev Tasks → code → tests traceability chain intact — reviewed at the Sprint 1 Readiness gate.
Prerequisites — What’s Already Done
This walkthrough assumes the upstream artifact chain is complete. Before development begins, the following must already exist:
| Artifact | Location | Produced by |
|---|---|---|
| Business Requirements Document (BRD) | docs/brd/ | Solutions Designer |
| Product Requirements Document (PRD) | docs/prd/ | Product Manager |
| Architecture Document + ADRs | docs/arch-docs/ (ARCH doc + one ADR per epic) | Solutions Architect |
| Dev Tasks (CSV) | docs/dev-tasks/ (one CSV per epic) | Tech Lead |
Notice what is not on this list: AGENTS.md, coding standards, and the knowledge directory. These are generated during Stage 1 alongside the boilerplate. You need to understand the project first — by reading the architecture document — before you can write accurate project rules.
The architecture document is the single input
Everything downstream — the boilerplate, the AGENTS.md, the coding standards, the knowledge directory — is derived from the architecture document. The developer reads it, understands the stack and patterns, and generates everything else from that understanding.
Project Structure After Stage 1
After the boilerplate stage is complete, the project has this structure. Items marked with an arrow are generated during Stage 1, not beforehand.
my-project/ # parent project folder
├── docs/ # artifacts — NOT committed to any repo
│ ├── brd/
│ ├── prd/
│ ├── arch-docs/ # ARCH-<code>-v1.0.md + one ADR per epic
│ ├── dev-tasks/ # one CSV per epic (epic-N-...-tasks.csv)
│ └── generated-figma/
│
└── <app-repo>/ ← generated in Stage 1 (this is what gets committed)
    ├── AGENTS.md ← cross-tool rules (+ CLAUDE.md overrides)
    ├── knowledge/ ← generated in Stage 1
    │ ├── prompts/ # dev/, solutions-architect/, tech-lead/, product/
    │ ├── patterns/ # implement-and-test.md (E→P→A→V chain)
    │ ├── rules/ # coding-standards.md (rules + rationale)
    │ ├── agents/
    │ ├── retros/
    │ └── templates/
    └── ... ← source + config (boilerplate, stack-dependent)
Key Principle
The knowledge directory lives inside the code repo and is version-controlled and shared — when a developer discovers a better prompt pattern or a new coding rule, they contribute it back via PR, and the team’s AI effectiveness compounds over time. The docs/ folder is the opposite: it holds working artifacts (BRD, PRD, Architecture, ADRs, Dev Tasks), sits beside the repo, and is not committed.
The Core Loop
Every task follows the same four-step cycle. The artifacts and knowledge directory provide the context. The CSV task fields provide the prompt.
| Step | What the developer does | Where context comes from |
|---|---|---|
| EVALUATE | Load files, understand the task and surrounding code | Architecture doc + Dev Tasks CSV + existing codebase |
| PLAN | State implementation plan, wait for approval | Task description + acceptance criteria from CSV |
| APPLY | Implement the plan, generate the code | AGENTS.md coding standards |
| VALIDATE | Self-review + tests against acceptance criteria | Acceptance criteria from CSV = the checklist |
The CSV is the prompt
The developer doesn’t write prompts from scratch. The task CSV has the user_story (context), description (what to build), acceptance_criteria (how to verify), and dependencies (what’s already done). These fields ARE the prompt — loaded as context files, not copy-pasted into templates.
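As a rough illustration of why a CSV row can stand in for a hand-written prompt, the sketch below types one task row and renders it as a context block. The column names are the fields named above; the `DevTask` type, the `renderTaskContext` helper, the sample row, and the semicolon separator inside `acceptance_criteria` and `dependencies` are assumptions made for illustration, not the program's actual schema.

```typescript
// Hypothetical shape of one Dev Task row. Column names follow the fields
// named in the text; the ";" separators are an assumption.
interface DevTask {
  task_id: string;
  user_story: string;
  description: string;
  acceptance_criteria: string; // assumed: criteria separated by ";"
  dependencies: string; // assumed: task ids separated by ";"
}

// Split a ";"-separated cell into trimmed, non-empty items.
function splitCell(cell: string): string[] {
  return cell
    .split(";")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Render the row as the context an AI tool would read from the loaded file.
function renderTaskContext(task: DevTask): string {
  const criteria = splitCell(task.acceptance_criteria)
    .map((criterion, index) => `  ${index + 1}. ${criterion}`)
    .join("\n");
  const deps = splitCell(task.dependencies).join(", ") || "none";
  return [
    `Task: ${task.task_id}`,
    `Context: ${task.user_story}`,
    `Build: ${task.description}`,
    `Already done: ${deps}`,
    "Verify:",
    criteria,
  ].join("\n");
}

// Made-up row for demonstration only.
const exampleTask: DevTask = {
  task_id: "TASK-000",
  user_story: "As a user, I want X so that Y",
  description: "Build X",
  acceptance_criteria: "Returns 200 on success; Rejects empty input",
  dependencies: "",
};

console.log(renderTaskContext(exampleTask));
```

Every line of the rendered context already exists in the row, which is why the per-task prompt can shrink to a task id.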
Go back to Modules 3 and 4
Building your project’s source code (the LMS is the worked demo) is where the earlier concepts get applied for real — keep both handbooks open while you work.
Module 3 (Prompt Engineering Fundamentals) is the craft underneath every stage here: prompt structure, chain of thought for debugging, and the refinement loop. Even with the CSV as context, how you prompt still decides output quality — when the AI drifts or an implementation goes sideways, Module 3 is the first place to look.
Module 4 (Context Engineering & Management) is what keeps quality high across a long build: load only what the task needs in dependency order, compact or start a fresh session when quality degrades, and use subagents to keep the main context clean. Most failures deep into an implementation session are context problems, not prompting problems; Module 4 is where you diagnose them and recover.
Stage 1 — Production Boilerplate
Every project starts with a boilerplate that matches the design from day one. Load the architecture document AND the Figma export together. The output is a complete scaffold: infrastructure, configuration, design system, UI components, pages — all on placeholder data. No business logic, no API wiring. Just a production-grade shell that looks and runs like the real app.
Boilerplate only — no feature implementation
Stage 1 produces infrastructure + UI shell. It does NOT implement business logic, API endpoints beyond a health check, data fetching, or form submissions. Components render with placeholder data. Dev task implementation starts in Stage 2.
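One way to picture "components render with placeholder data" is a component that depends only on a typed props contract. The sketch below is framework-free so it stays self-contained: a plain function returning an HTML string stands in for a real UI component. `OrderSummaryProps`, `renderOrderSummary`, and `placeholderOrder` are hypothetical names, and the placeholder values are trusted constants, so no escaping is shown.

```typescript
// Hypothetical props contract, fixed in Stage 1 and reused unchanged in Stage 2.
interface OrderSummaryProps {
  orderId: string;
  buyerName: string;
  totalCents: number;
  state: "default" | "loading" | "empty" | "error";
}

// Stand-in for a UI component: it depends only on its props,
// so it never needs to know where the data came from.
function renderOrderSummary(props: OrderSummaryProps): string {
  if (props.state === "loading") return "<p>Loading order...</p>";
  if (props.state === "empty") return "<p>No orders yet.</p>";
  if (props.state === "error") return '<p role="alert">Could not load order.</p>';
  const total = (props.totalCents / 100).toFixed(2);
  return `<article><h2>Order ${props.orderId}</h2><p>${props.buyerName}: $${total}</p></article>`;
}

// Stage 1: placeholder data, no API call.
const placeholderOrder: OrderSummaryProps = {
  orderId: "ORD-0001",
  buyerName: "Sample Buyer",
  totalCents: 1999,
  state: "default",
};

console.log(renderOrderSummary(placeholderOrder));
console.log(renderOrderSummary({ ...placeholderOrder, state: "loading" }));
```

In Stage 2 only the source of the props changes (fetched data instead of `placeholderOrder`); the component and its interface stay as they are.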
Why AGENTS.md is generated here, not earlier
You cannot write accurate project rules without understanding the project. The architecture document defines the stack, the data model, the error format, the auth strategy, and the API conventions. AGENTS.md and the coding standards are derived from those decisions.
EVALUATE — Understand the Architecture and Design
[EVALUATE]
Load: docs/arch-docs/ (ARCH doc + ADRs)
Attach: Figma export (zip)
I’m setting up this project. Before generating any code,
walk me through:
Architecture:
1. The stack decisions and why they were made
2. The data model — tables, relationships, indexes
3. What infrastructure is needed from day one
4. The middleware stack and auth flow
5. Error response format and error handling strategy
6. Security rules and constraints
Design:
7. Design tokens (colors, typography, spacing, radii, shadows)
8. Component inventory (every component, variants, states)
9. Layout patterns (grid, breakpoints, responsive behavior)
10. Iconography (style, sizes, library)
Don’t generate code yet. Help me understand both the
architecture and the design system.
PLAN — Boilerplate + UI Shell + Project Rules
[PLAN]
Based on the architecture document and Figma design,
plan four things:
1. INFRASTRUCTURE BOILERPLATE
- Project structure — every directory, annotated
- Database schema — from the architecture doc’s data model
- Environment variables — complete list, documented
- Auth flow — as specified in the architecture
- Error handling — using the format from the architecture
2. UI SHELL (from Figma)
- Design system config (tailwind.config.ts, globals.css,
CSS custom properties, font loading)
- Component architecture (name, path, props interface,
variants, composition, accessibility)
- Page layouts (every page from the Figma, structured
with placeholder data)
- Build order (component dependency chain)
3. AGENTS.md (cross-tool project rules)
- Project description and stack summary
- Backend and frontend coding standards
- Security rules, git conventions, quality gates
4. KNOWLEDGE DIRECTORY STRUCTURE
- knowledge/rules/coding-standards.md
- knowledge/prompts/dev/ (initial prompt templates)
- knowledge/patterns/ (implement-and-test chain)
- Design system spec saved to knowledge/
If we have an existing golden path boilerplate, identify what
needs to be configured or extended for this project.
If generating from scratch, output as a blueprint. No code yet.
APPLY — Generate the Complete Boilerplate
[APPLY]
Plan approved. Generate everything in one scaffold:
1. Infrastructure:
.env.example, initial migration, auth middleware,
error handler, health-check endpoint, seed script,
ESLint + Prettier config, README, CI pipeline config.
2. UI shell (matching Figma exactly):
Design system config (tailwind.config.ts, globals.css),
all components with TypeScript props and all visual states
(default, hover, focus, disabled, loading, empty, error),
all page layouts responsive to Figma breakpoints,
semantic HTML + ARIA attributes throughout.
Use placeholder/mock data — not real API calls.
Define prop interfaces so Stage 2 can wire real data
without changing the component.
3. AGENTS.md at project root.
4. knowledge/ directory with initial content.
Package requirements:
- Use the latest STABLE versions of all dependencies
(not beta, not canary, not release candidates)
- Pin exact versions in package.json (no ^ or ~)
- Verify each package is actively maintained
- No deprecated packages or APIs
Production-grade standards:
- No placeholder pages or unused dependencies
- Auth protecting all routes that need it
- Consistent error response format throughout
- Proper logging setup (not console.log)
- Environment-based configuration (dev/staging/prod)
- Security headers and CORS configured
- Database connection pooling
Do NOT implement:
- API endpoints (beyond the health check) or business logic
- Data fetching, form submissions, or backend interactions
- Dev tasks from the CSV (that is Stage 2)
Output as complete files that run with: npm install && npm run dev
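The "environment-based configuration" item in the prompt above can be checked against something like the following sketch, which validates the environment once at startup and fails fast. The `AppConfig` shape, the two variable names, and the log-level default are assumptions; the real list of variables comes from the architecture document.

```typescript
// Hypothetical config shape; the real variable list comes from the
// architecture document, not from this sketch.
interface AppConfig {
  nodeEnv: "development" | "staging" | "production";
  databaseUrl: string;
  logLevel: "debug" | "info";
}

type EnvSource = Record<string, string | undefined>;

// Validate the environment once at startup and fail fast on gaps.
function loadConfig(env: EnvSource): AppConfig {
  const nodeEnv = env["NODE_ENV"];
  const databaseUrl = env["DATABASE_URL"];
  if (!nodeEnv || !databaseUrl) {
    throw new Error("Missing required env vars: NODE_ENV and DATABASE_URL");
  }
  if (nodeEnv !== "development" && nodeEnv !== "staging" && nodeEnv !== "production") {
    throw new Error("NODE_ENV must be development, staging, or production");
  }
  return {
    nodeEnv,
    databaseUrl,
    // Assumed default: verbose logs outside production.
    logLevel: nodeEnv === "production" ? "info" : "debug",
  };
}

const devConfig = loadConfig({
  NODE_ENV: "development",
  DATABASE_URL: "postgres://localhost:5432/app_dev", // placeholder URL
});
console.log(devConfig.nodeEnv, devConfig.logLevel);
```

Passing the environment in as an argument, rather than reading a global inside the function, keeps the check easy to test for each of dev, staging, and prod.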
VALIDATE — Validate Infrastructure + Design Fidelity
[VALIDATE]
Review everything against the architecture doc AND Figma.
Packages:
1. Are all dependencies on their latest stable version?
2. Any deprecated packages or APIs in use?
3. Are versions pinned exactly (no ^ or ~ ranges)?
4. Any known security vulnerabilities? (run npm audit)
Infrastructure:
5. Does the project structure match the architecture?
6. Does the schema match the data model?
7. Are all env vars from the architecture doc present?
8. Is logging production-grade (not console.log)?
9. Are security headers and CORS configured?
10. Is database connection pooling set up?
UI fidelity:
11. Do components match the Figma design?
12. Responsive at 320px, 768px, 1024px, 1440px?
13. All visual states render with placeholder data?
14. Accessibility — keyboard nav, WCAG AA contrast?
15. Prop interfaces typed for Stage 2 wiring?
16. No business logic or API calls in components?
AGENTS.md + knowledge directory:
17. Coding standards match the architecture?
18. Design system spec saved to knowledge/?
Classify: [BLOCKER] / [FIX NOW] / [BACKLOG]
Fix every [BLOCKER] and [FIX NOW].
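Checklist item 3 (exact pins, no ^ or ~ ranges) is mechanical enough to verify in code rather than by eye. The sketch below is one possible check, assuming the input is the `dependencies` map from package.json; `findUnpinned` is a hypothetical helper and the package names and versions are made up.

```typescript
// Return the dependencies whose version is not an exact x.y.z pin.
// Ranges (^, ~) fail the pattern, and so do pre-release tags such as
// canary or rc builds, which the APPLY prompt also rules out.
function findUnpinned(deps: Record<string, string>): string[] {
  const exactVersion = /^\d+\.\d+\.\d+$/;
  return Object.keys(deps)
    .filter((pkg) => !exactVersion.test(deps[pkg] ?? ""))
    .map((pkg) => `${pkg}@${deps[pkg]}`);
}

// Hypothetical dependency map, shaped like the "dependencies" field
// of a package.json. Package names and versions are invented.
const sampleDeps: Record<string, string> = {
  "example-framework": "4.2.1",
  "example-orm": "^2.0.0",
  "example-logger": "~1.3.0",
  "example-canary": "5.0.0-canary.3",
};

console.log(findUnpinned(sampleDeps));
```

Here the three non-exact entries are reported and `example-framework` passes. A check like this covers only the pinning rule; "latest stable" and vulnerability status still need `npm audit` and a registry lookup.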
Stage 1B — Connect Project Services via MCP
After the boilerplate is generated and running, connect your AI tool to the project’s services via Model Context Protocol (MCP). This gives the AI direct access to your database, APIs, and infrastructure — so it can inspect schemas, run queries, and generate code that matches what actually exists.
Supabase
claude mcp add --scope project --transport http supabase "https://mcp.supabase.com/mcp"
After connecting, the AI can inspect your database schema, read table definitions, check Row Level Security (RLS) policies, and generate code that matches your actual data model — instead of guessing from the architecture document alone.
Other Common MCP Connections
| Service | What it gives the AI | Command |
|---|---|---|
| GitHub | Repository access, issues, PRs, commit history | claude mcp add --scope project github |
| Filesystem | Direct file access on the developer’s machine | claude mcp add --scope project filesystem |
| NeonDB | PostgreSQL database access for schema inspection | Check NeonDB docs for current MCP endpoint |
| Google Drive | Access to shared project documentation | claude mcp add --scope project gdrive |
Scope matters
Use --scope project so the MCP connection is saved in the project’s .mcp.json and shared with the team. Use --scope user for personal connections that shouldn’t be committed to the repo.
Stage 2 — Task Execution
This is where developers spend most of their time. The tasks exist in the CSV. The architecture doc and coding standards are in the project. The developer loads the context and executes.
What to Load as Context
Before starting any task, load these files. How you load depends on your tool: file references, project context, or upload. The method varies; what matters is that these files are in the AI’s context window before you start.
| File | Why | Load when |
|---|---|---|
| docs/arch-docs/ (ARCH + ADRs) | Defines the stack, data model, API design, and patterns | Every task |
| docs/dev-tasks/epic-N-...-tasks.csv | Contains every task with description + acceptance criteria | Every task |
| AGENTS.md | Coding standards, error format, security rules | Every task |
| knowledge/rules/coding-standards.md | Rules with rationale — the AI follows these automatically | Every task |
| Relevant source files | The code the task will modify or extend | Per task |
The 4-Step Chain (per task)
This is the implement-and-test pattern from knowledge/patterns/. Each task goes through all four steps.
Step 1 — EVALUATE: Load context and understand
[EVALUATE]
Load: docs/arch-docs/, the epic CSV, AGENTS.md
I am implementing TASK-008: Create useSend API client module.
Before writing any code, help me understand:
- What does the current code do in this area?
- What patterns does the codebase use for API clients?
- What existing utilities should I reuse?
- What could go wrong with this task?
The task CSV provides the context
The AI already has the full task loaded — the user_story, description, acceptance_criteria, and dependencies are all in the CSV file. The developer doesn’t re-type them. They just say “I am implementing TASK-008” and the AI reads the rest from the loaded file.
Step 2 — PLAN: State implementation plan
[PLAN]
Give me an implementation plan for TASK-008 covering:
1. File path and module structure
2. Public API (method signatures, params, return types)
3. Authentication approach (useSend API key handling)
4. Retry logic design (backoff strategy)
5. Error cases to handle (401, 403, 429, 500)
6. How it integrates with email_logs table (from TASK-003)
Don’t write code yet. Plan only.
Review the plan before proceeding
This is the approval gate. Read the plan. Does the retry strategy make sense? Does the module structure follow existing patterns? Push back here — not after the code is written.
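When reviewing the retry strategy in a plan, it can help to see the policy as a few pure functions before any client code exists. The sketch below is one plausible policy, not the one TASK-008 specifies: retrying 429 and 5xx, never retrying 401 or 403, and the 200 ms base and 5000 ms cap are all assumptions for illustration.

```typescript
// Assumed policy for illustration: retry rate limits (429) and server
// errors (5xx); never retry auth failures (401, 403).
function isRetryable(httpStatus: number): boolean {
  return httpStatus === 429 || httpStatus >= 500;
}

// Exponential backoff with a ceiling, so the delay cannot grow into an
// unreasonable wait. baseMs and maxMs are made-up defaults.
function backoffDelayMs(attempt: number, baseMs = 200, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// The full wait schedule for a given retry budget, handy for reviewing
// the worst-case total delay before any code is written.
function retrySchedule(maxRetries: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffDelayMs(attempt));
  }
  return delays;
}

// Delays in ms for six retries: 200, 400, 800, 1600, 3200, 5000
console.log(retrySchedule(6));
```

Seeing the schedule makes the review questions concrete: is roughly eleven seconds of worst-case waiting acceptable for an email send, and should the plan add jitter or honor a Retry-After header?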
Step 3 — APPLY: Implement
[APPLY]
Plan approved. Implement TASK-008.
Follow AGENTS.md standards exactly.
Full file content — no diffs, no ‘rest stays the same.’
That’s it. The AI already knows the description, the acceptance criteria, the coding standards, and the architecture, all from the loaded files. The APPLY prompt is three short lines.
Step 4 — VALIDATE: Self-review against acceptance criteria
[VALIDATE]
Review your implementation of TASK-008 against its
acceptance criteria from the task CSV.
For each criterion, confirm: does the code satisfy it?
Then check for:
1. Auth bypass or injection vectors
2. Missing error handling for API failures
3. Edge cases in retry logic (max retries, backoff overflow)
4. Missing input validation (email format, domain check)
5. Sensitive data in logs (API keys, email content)
Write tests covering happy path + failure cases.
Cite line numbers for every issue found.
The acceptance criteria ARE the checklist
The CSV has 6–12 testable criteria per task. The VALIDATE step walks through each one. If the AI says “looks good” without addressing each criterion specifically, push back: “Walk through each acceptance criterion one by one.”
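Walking the criteria one by one can also be tracked as data: each criterion maps to the tests that claim to cover it, and anything with no test is still open. This is a minimal sketch; the `CriterionCoverage` shape is hypothetical, and the criteria and test names below are invented examples, not the real acceptance criteria of any task.

```typescript
// One acceptance criterion from the task CSV, plus the names of the
// tests that claim to cover it.
interface CriterionCoverage {
  criterion: string;
  tests: string[];
}

// Return the criteria that no test covers yet.
function uncoveredCriteria(coverage: CriterionCoverage[]): string[] {
  return coverage
    .filter((entry) => entry.tests.length === 0)
    .map((entry) => entry.criterion);
}

// Invented criteria and test names, for demonstration only.
const sampleCoverage: CriterionCoverage[] = [
  { criterion: "AC-1 sends with a valid API key", tests: ["sends email on 200"] },
  { criterion: "AC-2 retries on 429", tests: ["retries then succeeds"] },
  { criterion: "AC-3 never logs the API key", tests: [] },
];

console.log(uncoveredCriteria(sampleCoverage));
```

A non-empty result is the concrete version of the push-back: the listed criteria have not been walked through yet.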
Concrete Example: TASK-019
Here’s what a real task execution looks like using CSV fields as the prompt. The developer loads the context files once, then works through the cycle.
The CSV row
| Field | Value |
|---|---|
| task_id | TASK-019 |
| issue_type | Story |
| summary | Implement order notifications — new order (seller) |
| user_story | As a seller, I want to be notified immediately when a buyer places an order so that I can quickly process the order |
| dependencies | TASK-013, TASK-017 |
| role | Fullstack |
| estimate / points | 6h / 3 |
The developer’s actual prompts
[EVALUATE]
Load: docs/arch-docs/, the epic CSV, AGENTS.md
Load: src/services/notification-service.ts (from TASK-013)
Load: src/services/toast-service.ts (from TASK-017)
I’m implementing TASK-019: new order notification for sellers.
Walk me through how the notification service and toast service
work so I understand where to integrate.
[PLAN]
Plan the implementation of TASK-019.
The acceptance criteria from the CSV are my requirements.
Cover: where to trigger, email template selection,
toast message, notification record creation,
preference checking, and cross-platform behavior.
[APPLY]
Plan approved. Implement TASK-019.
Follow existing patterns from notification-service.ts.
[VALIDATE]
Review TASK-019 against its acceptance criteria.
Check each one. Then: auth, error handling, deduplication,
preference bypass, cross-platform parity.
Write tests.
Total prompt effort: ~60 words across four steps. Everything else comes from the loaded files.
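To make two of the VALIDATE checks above concrete (deduplication and preference handling), here is a small sketch of the decision a notification trigger might make. The event and preference shapes, the `channelsToNotify` helper, and the dedup key format are all hypothetical; the real behavior is defined by the task's acceptance criteria and the existing notification service.

```typescript
// Hypothetical event and preference shapes for a "new order" notification.
interface NewOrderEvent {
  orderId: string;
  sellerId: string;
}

interface SellerPreferences {
  emailEnabled: boolean;
  toastEnabled: boolean;
}

type NotifyChannel = "email" | "toast";

// Decide which channels to notify on. `alreadySent` holds dedup keys so a
// replayed event does not notify the seller twice.
function channelsToNotify(
  event: NewOrderEvent,
  prefs: SellerPreferences,
  alreadySent: Record<string, boolean>
): NotifyChannel[] {
  const dedupKey = `new-order:${event.sellerId}:${event.orderId}`;
  if (alreadySent[dedupKey] === true) return [];
  alreadySent[dedupKey] = true;
  const channels: NotifyChannel[] = [];
  if (prefs.emailEnabled) channels.push("email");
  if (prefs.toastEnabled) channels.push("toast");
  return channels;
}

const demoSent: Record<string, boolean> = {};
const demoEvent: NewOrderEvent = { orderId: "ORD-1", sellerId: "SELLER-1" };
const demoPrefs: SellerPreferences = { emailEnabled: true, toastEnabled: false };

console.log(channelsToNotify(demoEvent, demoPrefs, demoSent)); // first delivery: email only
console.log(channelsToNotify(demoEvent, demoPrefs, demoSent)); // replay: nothing to send
```

Tests for the real task would follow the same shape: one case per preference combination, plus a replayed event.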
Stage 3 — Recurring Patterns
These live in knowledge/prompts/dev/ and knowledge/patterns/. Use them directly.
Code Review
From knowledge/prompts/dev/code-review.md:
Load: AGENTS.md, relevant source files
Review this code for: security vulnerabilities, logic errors,
error handling gaps, performance issues.
For each issue: Severity (Critical/High/Medium/Low),
line number, description, suggested fix.
Do not comment on style. Focus on things that break.
Database Migration
[EVALUATE] Load schema. What’s the current state?
[PLAN] New schema, data impact, backward compat, rollback
[APPLY] Migration file + updated schema + seed update
[VALIDATE] Data loss? Table locks? Broken queries? Rollback works?
API Endpoint
[EVALUATE] Load docs/arch-docs/. Which endpoint?
[PLAN] Validation rules, DB queries, error responses
[APPLY] Route handler following AGENTS.md standards + curl test
[VALIDATE] Invalid input? Auth bypass? Response format correct?
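The three VALIDATE questions for an endpoint (invalid input, auth bypass, response format) map onto a handler shaped roughly like the sketch below. It is framework-free so it can run on its own; the error envelope, the error codes, and `createOrderHandler` are assumptions for illustration, and the real format is whatever the architecture document specifies.

```typescript
// Hypothetical error envelope shared by every failure path.
interface ApiErrorBody {
  error: { code: string; message: string };
}

interface HandlerResult<T> {
  httpStatus: number;
  body: T | ApiErrorBody;
}

function errorResult(httpStatus: number, code: string, message: string): HandlerResult<never> {
  return { httpStatus, body: { error: { code, message } } };
}

// Stand-in for a route handler: auth first, then validation, then the
// happy path. No database call is shown.
function createOrderHandler(
  authUserId: string | null,
  input: { quantity?: unknown }
): HandlerResult<{ orderId: string }> {
  if (authUserId === null) {
    return errorResult(401, "UNAUTHENTICATED", "Sign in required");
  }
  const quantity = input.quantity;
  if (typeof quantity !== "number" || quantity % 1 !== 0 || quantity < 1) {
    return errorResult(400, "INVALID_INPUT", "quantity must be a positive integer");
  }
  return { httpStatus: 201, body: { orderId: "ORD-PLACEHOLDER" } };
}

console.log(createOrderHandler(null, { quantity: 1 }).httpStatus); // unauthenticated request
console.log(createOrderHandler("user-1", { quantity: 2 }).httpStatus); // valid request
```

Routing every failure through `errorResult` is what keeps the response format consistent; a reviewer only has to confirm that no branch builds its own error object.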
Bug Fix
[EVALUATE] Observed vs expected, stack trace, suspected code [PLAN] Root cause → fix design → same pattern elsewhere? [APPLY] Implement fix + regression test [VALIDATE] Resolved without side effects? Related paths?
Stage 4 — Pre-Launch
Load: Full project, docs/arch-docs/, AGENTS.md
[EVALUATE] Audit for production readiness. Security, performance, reliability, DevOps, UX.
[PLAN] Prioritize: [BLOCKER] / [FIRST WEEK] / [BACKLOG]
[APPLY] Fix every [BLOCKER]. Provide [FIRST WEEK] fixes for post-launch.
[VALIDATE] Fixes work? New problems? Deploy pipeline tested?
Contributing Back to Knowledge
The knowledge directory grows with the team. After completing tasks, contribute what you learned.
| What you discovered | Where it goes | Format |
|---|---|---|
| A prompt pattern that worked well | knowledge/prompts/dev/ | Prompt template with example input/output |
| A coding rule you learned the hard way | knowledge/rules/ | Rule + rationale (why it exists) |
| A multi-step workflow | knowledge/patterns/ | Step-by-step chain with load instructions |
| A sprint retro insight | knowledge/retros/ | YYYY-MM-DD.md with action items + owners |
Quality standards for contributions
Every prompt must have at least one real example input/output.
Rules must include rationale — why does this rule exist?
Patterns must include which files to load as context.
Submit via PR. Get peer review before merging.
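If the team wants these standards enforced before review, a small pre-merge check is one option. The sketch below assumes each requirement appears as a section heading in the contributed file (`## Example`, `## Rationale`, `## Load`); those heading names and the `missingSection` helper are hypothetical conventions, not something the knowledge directory already defines.

```typescript
type ContributionKind = "prompt" | "rule" | "pattern";

interface Contribution {
  kind: ContributionKind;
  markdown: string;
}

// Assumed convention: each requirement shows up as a section heading.
const REQUIRED_SECTION: Record<ContributionKind, string> = {
  prompt: "## Example",
  rule: "## Rationale",
  pattern: "## Load",
};

// Return the heading a contribution is missing, or null if it passes.
function missingSection(contribution: Contribution): string | null {
  const required = REQUIRED_SECTION[contribution.kind];
  return contribution.markdown.indexOf(required) === -1 ? required : null;
}

// Invented rule text, for demonstration only.
const draftRule: Contribution = {
  kind: "rule",
  markdown: "# Validate input at the boundary\nCheck every request body.",
};

console.log(missingSection(draftRule));
```

The draft above is reported as missing its rationale section, which is the same feedback a peer reviewer would give.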
Quick Reference
| Step | Developer does | Context comes from | Watch out for |
|---|---|---|---|
| EVALUATE | Load files, understand task | Architecture doc + CSV + codebase | AI coding before understanding |
| PLAN | State plan, wait for approval | Task description + acceptance criteria | Plans that skip edge cases |
| APPLY | “Implement TASK-XXX” | AGENTS.md coding standards | Placeholder code, hallucinated APIs |
| VALIDATE | Walk each acceptance criterion | Acceptance criteria from CSV | AI rubber-stamping itself |
Discipline Notes
Load context, don’t re-type it
The architecture doc, task CSV, and coding standards are files. Load them. The AI reads the acceptance criteria, dependencies, and patterns from the files — you don’t paste them into every prompt.
Plan is the approval gate
Read the plan. Push back. The AI will happily build the wrong thing perfectly if you let it. Fix it here, not after APPLY.
APPLY should be two lines
If your APPLY prompt is longer than “Plan approved. Implement TASK-XXX. Follow AGENTS.md.” then your context loading or your PLAN step was incomplete.
Verify against the CSV, not your feelings
Walk each acceptance criterion from the CSV. “Does the code satisfy criterion 1? Criterion 2?” If the AI says “looks good” without citing criteria, push back.
Contribute back
When you find a better prompt, a new rule, or a useful pattern, add it to knowledge/. The team’s AI effectiveness compounds over time. One developer’s insight becomes every developer’s shortcut.