AI Foundational Training

Module 5 · Week 3 Mon

AI-Assisted Development

A prompt-by-prompt playbook for building production-grade applications using the Evaluate → Plan → Apply → Validate cycle.

Module Overview

Module 5 is where the artifact chain becomes working software. You implement your Dev Tasks with AI through a disciplined loop — plan before code, one task per session, tests from acceptance criteria, read every diff before committing. It pulls the whole program together: the prompting from Module 3 and the context discipline from Module 4 are applied for real on production-grade code.

At a glance
Covers	The implementation loop (plan → implement → self-review); production boilerplate scaffolding; connecting services via MCP; tests from acceptance criteria; safe refactoring; debugging workflow
When it runs	Week 3 Monday (in-person), alongside Module 6
Builds on	Modules 3 and 4, and the approved Week 2 Architecture + Dev Tasks
Leads into	Week 3 code production and the Sprint 1 Readiness gate

What you'll produce

Across Week 3: merged code implementing your Dev Tasks, with unit tests derived from acceptance criteria and the full BRD → PRD → Architecture → Dev Tasks → code → tests traceability chain intact — reviewed at the Sprint 1 Readiness gate.

Prerequisites — What’s Already Done

This walkthrough assumes the upstream artifact chain is complete. Before development begins, the following must already exist:

Artifact	Location	Produced by
Business Requirements Document (BRD)	docs/brd/	Solutions Designer
Product Requirements Document (PRD)	docs/prd/	Product Manager
Architecture Document + ADRs	docs/arch-docs/ (ARCH doc + one ADR per epic)	Solutions Architect
Dev Tasks (CSV)	docs/dev-tasks/ (one CSV per epic)	Tech Lead

Notice what is not on this list: AGENTS.md, coding standards, and the knowledge directory. These are generated during Stage 1 alongside the boilerplate. You need to understand the project first — by reading the architecture document — before you can write accurate project rules.

The architecture document is the single input

Everything downstream — the boilerplate, the AGENTS.md, the coding standards, the knowledge directory — is derived from the architecture document. The developer reads it, understands the stack and patterns, and generates everything else from that understanding.

Project Structure After Stage 1

After the boilerplate stage is complete, the project has this structure. Items marked with an arrow are generated during Stage 1, not beforehand.

my-project/                          # parent project folder
├── docs/                             # artifacts — NOT committed to any repo
│   ├── brd/
│   ├── prd/
│   ├── arch-docs/                    #   ARCH-<code>-v1.0.md + one ADR per epic
│   ├── dev-tasks/                    #   one CSV per epic (epic-N-...-tasks.csv)
│   └── generated-figma/
│
└── <app-repo>/                   ← generated in Stage 1 (this is what gets committed)
    ├── AGENTS.md                 ←   cross-tool rules (+ CLAUDE.md overrides)
    ├── knowledge/                ←   generated in Stage 1
    │   ├── prompts/              #     dev/, solutions-architect/, tech-lead/, product/
    │   ├── patterns/             #     implement-and-test.md (E→P→A→V chain)
    │   ├── rules/                #     coding-standards.md (rules + rationale)
    │   ├── agents/
    │   ├── retros/
    │   └── templates/
    └── ...                       ←   source + config (boilerplate, stack-dependent)

Key Principle

The knowledge directory lives inside the code repo and is version-controlled and shared — when a developer discovers a better prompt pattern or a new coding rule, they contribute it back via PR, and the team’s AI effectiveness compounds over time. The docs/ folder is the opposite: it holds working artifacts (BRD, PRD, Architecture, ADRs, Dev Tasks), sits beside the repo, and is not committed.

The Core Loop

Every task follows the same four-step cycle. The artifacts and knowledge directory provide the context. The CSV task fields provide the prompt.

Step	What the developer does	Where context comes from
EVALUATE	Load files, understand the task and surrounding code	Architecture doc + Dev Tasks + existing codebase
PLAN	State implementation plan, wait for approval	Task description + acceptance criteria from CSV
APPLY	Implement the plan, generate the code	AGENTS.md coding standards
VALIDATE	Self-review + tests against acceptance criteria	Acceptance criteria from CSV = the checklist
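
One way to picture the loop is as a small state machine in which APPLY cannot be reached until the plan has been approved. The sketch below is illustrative only: `CycleState`, `nextStep`, and `approvePlan` are hypothetical names, not part of any tool used in this module.

```typescript
type Step = "EVALUATE" | "PLAN" | "APPLY" | "VALIDATE" | "DONE";

interface CycleState {
  step: Step;
  planApproved: boolean;
}

// Advance one step; APPLY is unreachable until the plan has been approved.
function nextStep(state: CycleState): CycleState {
  switch (state.step) {
    case "EVALUATE":
      return { step: "PLAN", planApproved: false };
    case "PLAN":
      if (!state.planApproved) {
        throw new Error("Plan not approved: review it before APPLY");
      }
      return { step: "APPLY", planApproved: true };
    case "APPLY":
      return { step: "VALIDATE", planApproved: true };
    case "VALIDATE":
      return { step: "DONE", planApproved: true };
    default:
      return state; // DONE is terminal
  }
}

// The developer's review is the only way to flip the approval flag.
function approvePlan(state: CycleState): CycleState {
  if (state.step !== "PLAN") {
    throw new Error("Only a plan can be approved");
  }
  return { step: "PLAN", planApproved: true };
}
```

The point of the design is that approval is a separate, explicit action: skipping the review raises an error instead of silently moving on to code generation.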

The CSV is the prompt

The developer doesn’t write prompts from scratch. The task CSV has the user_story (context), description (what to build), acceptance_criteria (how to verify), and dependencies (what’s already done). These fields ARE the prompt — loaded as context files, not copy-pasted into templates.
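
As a rough sketch of why the CSV can stand in for a hand-written prompt, the four fields map directly onto the parts of a task brief. The `DevTask` type, the `renderTaskContext` function, the `;` and `,` separators, and the sample row are all assumptions made for illustration; the real CSV layout is whatever the Tech Lead produced.

```typescript
// Hypothetical shape of one Dev Task row; column names follow the CSV fields named above.
interface DevTask {
  task_id: string;
  user_story: string;          // context
  description: string;         // what to build
  acceptance_criteria: string; // how to verify (assumed ";"-separated)
  dependencies: string;        // what is already done (assumed ","-separated)
}

// Split a delimited cell into trimmed, non-empty items.
function splitCell(cell: string, separator: string): string[] {
  return cell
    .split(separator)
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Render the row as a plain-text brief the AI can read as context.
function renderTaskContext(task: DevTask): string {
  const criteria = splitCell(task.acceptance_criteria, ";");
  const deps = splitCell(task.dependencies, ",");
  const lines: string[] = [
    `Task: ${task.task_id}`,
    `Context: ${task.user_story}`,
    `Build: ${task.description}`,
    "Verify:",
  ];
  criteria.forEach((criterion, index) => {
    lines.push(`  ${index + 1}. ${criterion}`);
  });
  lines.push(`Already done: ${deps.length > 0 ? deps.join(", ") : "none"}`);
  return lines.join("\n");
}

// Sample row with invented values, for illustration only.
const sampleTask: DevTask = {
  task_id: "TASK-000",
  user_story: "As a user, I want an example so that the mapping is clear",
  description: "Build the example module",
  acceptance_criteria: "Returns 200 on success; Returns 400 on invalid input",
  dependencies: "",
};

const taskContext: string = renderTaskContext(sampleTask);
```

In practice the tool reads the CSV file directly; the sketch only shows that nothing in the brief has to be typed by hand.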

Go back to Modules 3 and 4

Building your project’s source code (the LMS is the worked demo) is where the earlier concepts get applied for real — keep both handbooks open while you work.

Module 3 (Prompt Engineering Fundamentals) is the craft underneath every stage here: prompt structure, chain of thought for debugging, and the refinement loop. Even with the CSV as context, how you prompt still decides output quality — when the AI drifts or an implementation goes sideways, Module 3 is the first place to look.

Module 4 (Context Engineering & Management) is what keeps quality high across a long build: load only what the task needs in dependency order, compact or start a fresh session when quality degrades, and use subagents to keep the main context clean. Most failures deep into an implementation session are context problems, not prompting problems; Module 4 is where you diagnose and recover from them.

Stage 1 — Production Boilerplate

Every project starts with a boilerplate that matches the design from day one. Load the architecture document AND the Figma export together. The output is a complete scaffold: infrastructure, configuration, design system, UI components, pages — all on placeholder data. No business logic, no API wiring. Just a production-grade shell that looks and runs like the real app.

Boilerplate only — no feature implementation

Stage 1 produces infrastructure + UI shell. It does NOT implement business logic, API endpoints beyond a health check, data fetching, or form submissions. Components render with placeholder data. Dev task implementation starts in Stage 2.

Why AGENTS.md is generated here, not earlier

You cannot write accurate project rules without understanding the project. The architecture document defines the stack, the data model, the error format, the auth strategy, and the API conventions. AGENTS.md and the coding standards are derived from those decisions.

EVALUATE — Understand the Architecture and Design

[EVALUATE]
Load: docs/arch-docs/ (ARCH doc + ADRs)
Attach: Figma export (zip)

I’m setting up this project. Before generating any code, walk me through:

Architecture:
1. The stack decisions and why they were made
2. The data model: tables, relationships, indexes
3. What infrastructure is needed from day one
4. The middleware stack and auth flow
5. Error response format and error handling strategy
6. Security rules and constraints

Design:
7. Design tokens (colors, typography, spacing, radii, shadows)
8. Component inventory (every component, variants, states)
9. Layout patterns (grid, breakpoints, responsive behavior)
10. Iconography (style, sizes, library)

Don’t generate code yet. Help me understand both the architecture and the design system.

PLAN — Boilerplate + UI Shell + Project Rules

[PLAN]
Based on the architecture document and Figma design, plan four things:

1. INFRASTRUCTURE BOILERPLATE
- Project structure: every directory, annotated
- Database schema: from the architecture doc’s data model
- Environment variables: complete list, documented
- Auth flow: as specified in the architecture
- Error handling: using the format from the architecture

2. UI SHELL (from Figma)
- Design system config (tailwind.config.ts, globals.css, CSS custom properties, font loading)
- Component architecture (name, path, props interface, variants, composition, accessibility)
- Page layouts (every page from the Figma, structured with placeholder data)
- Build order (component dependency chain)

3. AGENTS.md (cross-tool project rules)
- Project description and stack summary
- Backend and frontend coding standards
- Security rules, git conventions, quality gates

4. KNOWLEDGE DIRECTORY STRUCTURE
- knowledge/rules/coding-standards.md
- knowledge/prompts/dev/ (initial prompt templates)
- knowledge/patterns/ (implement-and-test chain)
- Design system spec saved to knowledge/

If we have an existing golden path boilerplate, identify what needs to be configured or extended for this project.
If generating from scratch, output as a blueprint. No code yet.

APPLY — Generate the Complete Boilerplate

[APPLY]
Plan approved. Generate everything in one scaffold:

1. Infrastructure:
.env.example, initial migration, auth middleware, error handler, health-check endpoint, seed script, ESLint + Prettier config, README, CI pipeline config.

2. UI shell (matching Figma exactly):
Design system config (tailwind.config.ts, globals.css), all components with TypeScript props and all visual states (default, hover, focus, disabled, loading, empty, error), all page layouts responsive to Figma breakpoints, semantic HTML + ARIA attributes throughout.
Use placeholder/mock data, not real API calls.
Define prop interfaces so Stage 2 can wire real data without changing the component.

3. AGENTS.md at project root.

4. knowledge/ directory with initial content.

Package requirements:
- Use the latest STABLE versions of all dependencies (not beta, not canary, not release candidates)
- Pin exact versions in package.json (no ^ or ~)
- Verify each package is actively maintained
- No deprecated packages or APIs

Production-grade standards:
- No placeholder pages or unused dependencies
- Auth protecting all routes that need it
- Consistent error response format throughout
- Proper logging setup (not console.log)
- Environment-based configuration (dev/staging/prod)
- Security headers and CORS configured
- Database connection pooling

Do NOT implement:
- API endpoints (beyond the health check) or business logic
- Data fetching, form submissions, or backend interactions
- Dev tasks from the CSV (that is Stage 2)

Output as complete files that run with: npm install && npm run dev
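
The instruction to define prop interfaces so Stage 2 can wire real data without changing the component can be sketched as follows. Everything here is hypothetical: `OrderSummaryProps`, the `ApiOrder` record, and the sample values are invented for illustration, and the component is a plain function returning a string so the sketch runs without a UI framework (a real project would return JSX).

```typescript
// Hypothetical props contract for one component.
interface OrderSummaryProps {
  state: "loading" | "empty" | "error" | "ready";
  orderId?: string;
  buyerName?: string;
  totalCents?: number;
}

// Stand-in for a component: a pure function of its props.
function renderOrderSummary(props: OrderSummaryProps): string {
  switch (props.state) {
    case "loading":
      return "Loading order...";
    case "empty":
      return "No orders yet";
    case "error":
      return "Could not load order";
    default: {
      const total = ((props.totalCents || 0) / 100).toFixed(2);
      return `Order ${props.orderId || "?"} from ${props.buyerName || "?"}: $${total}`;
    }
  }
}

// Stage 1: placeholder data satisfies the contract.
const placeholderProps: OrderSummaryProps = {
  state: "ready",
  orderId: "ORD-0001",
  buyerName: "Sample Buyer",
  totalCents: 1999,
};

// Stage 2 (later): an adapter maps a hypothetical API record onto the same props.
interface ApiOrder {
  id: string;
  buyer: { name: string };
  total_cents: number;
}

function toOrderSummaryProps(order: ApiOrder): OrderSummaryProps {
  return {
    state: "ready",
    orderId: order.id,
    buyerName: order.buyer.name,
    totalCents: order.total_cents,
  };
}
```

Because only the adapter knows the API shape, wiring real data in Stage 2 leaves `renderOrderSummary` untouched.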

VALIDATE — Validate Infrastructure + Design Fidelity

[VALIDATE]
Review everything against the architecture doc AND Figma.

Packages:
1. Are all dependencies on their latest stable version?
2. Any deprecated packages or APIs in use?
3. Are versions pinned exactly (no ^ or ~ ranges)?
4. Any known security vulnerabilities? (run npm audit)

Infrastructure:
5. Does the project structure match the architecture?
6. Does the schema match the data model?
7. Are all env vars from the architecture doc present?
8. Is logging production-grade (not console.log)?
9. Are security headers and CORS configured?
10. Is database connection pooling set up?

UI fidelity:
11. Do components match the Figma design?
12. Responsive at 320px, 768px, 1024px, 1440px?
13. All visual states render with placeholder data?
14. Accessibility: keyboard nav, WCAG AA contrast?
15. Prop interfaces typed for Stage 2 wiring?
16. No business logic or API calls in components?

AGENTS.md + knowledge directory:
17. Coding standards match the architecture?
18. Design system spec saved to knowledge/?

Classify: [BLOCKER] / [FIX NOW] / [BACKLOG]
Fix every [BLOCKER] and [FIX NOW].
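
Check 3 (exact pins, no ^ or ~) is easy to verify mechanically rather than by eye. The helper below is a minimal sketch, assuming the dependency map has already been read from package.json; `findUnpinned` and the package names in the usage example are hypothetical.

```typescript
// Return the names of dependencies whose version is not an exact x.y.z pin.
// Ranges (^, ~), tags ("latest"), and pre-releases (1.0.0-rc.1) are all reported.
function findUnpinned(deps: Record<string, string>): string[] {
  const exactVersion = /^\d+\.\d+\.\d+$/;
  return Object.keys(deps)
    .filter((name) => !exactVersion.test(deps[name]))
    .sort();
}

// Illustrative input; these package names are made up.
const exampleDeps: Record<string, string> = {
  alpha: "1.2.3",
  bravo: "^2.0.0",
  charlie: "~3.1.0",
  delta: "4.0.0-rc.1",
};

const unpinned: string[] = findUnpinned(exampleDeps);
```

Rejecting pre-release suffixes is a deliberate choice here: it also catches the beta, canary, and release-candidate versions the package requirements rule out.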

Stage 1B — Connect Project Services via MCP

After the boilerplate is generated and running, connect your AI tool to the project’s services via Model Context Protocol (MCP). This gives the AI direct access to your database, APIs, and infrastructure — so it can inspect schemas, run queries, and generate code that matches what actually exists.

Supabase

claude mcp add --scope project --transport http \
  supabase "https://mcp.supabase.com/mcp"

After connecting, the AI can inspect your database schema, read table definitions, check Row Level Security (RLS) policies, and generate code that matches your actual data model — instead of guessing from the architecture document alone.

Other Common MCP Connections

Service	What it gives the AI	Command
GitHub	Repository access, issues, PRs, commit history	claude mcp add --scope project github
Filesystem	Direct file access on the developer’s machine	claude mcp add --scope project filesystem
NeonDB	PostgreSQL database access for schema inspection	Check NeonDB docs for current MCP endpoint
Google Drive	Access to shared project documentation	claude mcp add --scope project gdrive

Scope matters

Use --scope project so the MCP connection is saved in the project’s .mcp.json and shared with the team. Use --scope user for personal connections that shouldn’t be committed to the repo.
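
For orientation, here is a sketch of what a project-scoped entry might look like once saved. The shape below (an `mcpServers` map with `type` and `url` per HTTP server) is an assumption about the file format and may differ between tool versions, so check the generated .mcp.json rather than writing it by hand. `McpProjectConfig` and `serverNames` are hypothetical names.

```typescript
// Assumed shape of one HTTP server entry in a project-scoped .mcp.json.
interface McpHttpServer {
  type: "http";
  url: string;
}

interface McpProjectConfig {
  mcpServers: Record<string, McpHttpServer>;
}

// What the Supabase command above is expected to record (an assumption, not captured output).
const projectConfig: McpProjectConfig = {
  mcpServers: {
    supabase: { type: "http", url: "https://mcp.supabase.com/mcp" },
  },
};

// Serialized the way the file would appear in the repo.
const mcpJson: string = JSON.stringify(projectConfig, null, 2);

// List configured server names, e.g. for a quick review before committing.
function serverNames(config: McpProjectConfig): string[] {
  return Object.keys(config.mcpServers).sort();
}
```

Since the file is shared with the team, it is worth reviewing in the PR like any other config: it should contain endpoints, never credentials.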

Stage 2 — Task Execution

This is where developers spend most of their time. The tasks exist in the CSV. The architecture doc and coding standards are in the project. The developer loads the context and executes.

What to Load as Context

Before starting any task, load these files. How you load them depends on your tool: file references, project context, or upload. The method varies; what matters is that these files are in the AI’s context window before you start.

File	Why	Load when
docs/arch-docs/ (ARCH + ADRs)	Defines the stack, data model, API design, and patterns	Every task
docs/dev-tasks/epic-N-...-tasks.csv	Contains every task with description + acceptance criteria	Every task
AGENTS.md	Coding standards, error format, security rules	Every task
knowledge/rules/coding-standards.md	Rules with rationale — the AI follows these automatically	Every task
Relevant source files	The code the task will modify or extend	Per task
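
The table can be read as a fixed list plus a per-task list. A minimal sketch of that idea, assuming a hypothetical `buildLoadList` helper (the paths are copied from the table, including the elided CSV name):

```typescript
// Files loaded for every task, in the order given in the table above.
const ALWAYS_LOAD: string[] = [
  "docs/arch-docs/",
  "docs/dev-tasks/epic-N-...-tasks.csv",
  "AGENTS.md",
  "knowledge/rules/coding-standards.md",
];

// Combine the fixed list with the task's own source files, dropping duplicates
// while keeping the first occurrence so the load order stays stable.
function buildLoadList(taskFiles: string[]): string[] {
  const seen: Record<string, boolean> = {};
  const result: string[] = [];
  for (const path of ALWAYS_LOAD.concat(taskFiles)) {
    if (!seen[path]) {
      seen[path] = true;
      result.push(path);
    }
  }
  return result;
}

// Example: a task that touches one service file and (redundantly) names AGENTS.md.
const loadList: string[] = buildLoadList([
  "src/services/notification-service.ts",
  "AGENTS.md",
]);
```

Keeping the per-task list short is the point: anything not needed for the task stays out of the context window.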

The 4-Step Chain (per task)

This is the implement-and-test pattern from knowledge/patterns/. Each task goes through all four steps.

Step 1 — EVALUATE: Load context and understand

[EVALUATE]
Load: docs/arch-docs/, the epic CSV, AGENTS.md

I am implementing TASK-008: Create useSend API client module.

Before writing any code, help me understand:
- What does the current code do in this area?
- What patterns does the codebase use for API clients?
- What existing utilities should I reuse?
- What could go wrong with this task?

The task CSV provides the context

The AI already has the full task loaded — the user_story, description, acceptance_criteria, and dependencies are all in the CSV file. The developer doesn’t re-type them. They just say “I am implementing TASK-008” and the AI reads the rest from the loaded file.

Step 2 — PLAN: State implementation plan

[PLAN]
Give me an implementation plan for TASK-008 covering:
1. File path and module structure
2. Public API (method signatures, params, return types)
3. Authentication approach (useSend API key handling)
4. Retry logic design (backoff strategy)
5. Error cases to handle (401, 403, 429, 500)
6. How it integrates with email_logs table (from TASK-003)

Don’t write code yet. Plan only.

Review the plan before proceeding

This is the approval gate. Read the plan. Does the retry strategy make sense? Does the module structure follow existing patterns? Push back here — not after the code is written.
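
When reviewing the retry part of a plan, it helps to have a concrete picture of what a reasonable answer looks like. The sketch below is one possible policy, not the useSend client's actual behavior: treating 429 and 500 as retryable and 401 and 403 as permanent is an assumption, and `shouldRetry`, `backoffDelayMs`, and `retrySchedule` are hypothetical names.

```typescript
// Assumed policy: rate limits and server errors are retried; auth failures are not.
const RETRYABLE_STATUSES: number[] = [429, 500];

function shouldRetry(status: number, attempt: number, maxRetries: number): boolean {
  return attempt < maxRetries && RETRYABLE_STATUSES.indexOf(status) !== -1;
}

// Exponential backoff: baseMs * 2^attempt, clamped to capMs.
// The exponent is also clamped so very large attempt counts cannot overflow.
function backoffDelayMs(attempt: number, baseMs: number, capMs: number): number {
  const exponent = Math.min(attempt, 30);
  return Math.min(capMs, baseMs * Math.pow(2, exponent));
}

// The delays a caller would wait before each retry of a request that keeps failing.
function retrySchedule(
  status: number,
  maxRetries: number,
  baseMs: number,
  capMs: number
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; shouldRetry(status, attempt, maxRetries); attempt++) {
    delays.push(backoffDelayMs(attempt, baseMs, capMs));
  }
  return delays;
}
```

With a 500 ms base, a 3000 ms cap, and four retries, a persistent 429 gives the schedule 500, 1000, 2000, 3000, while a 401 gives an empty schedule. A plan that retries auth failures or has no cap is worth pushing back on at this gate.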

Step 3 — APPLY: Implement

[APPLY]
Plan approved. Implement TASK-008.
Follow AGENTS.md standards exactly.
Full file content: no diffs, no ‘rest stays the same.’

That’s it. The AI already knows the description, the acceptance criteria, the coding standards, and the architecture — all from the loaded files. The APPLY prompt is two lines.

Step 4 — VALIDATE: Self-review against acceptance criteria

[VALIDATE]
Review your implementation of TASK-008 against its acceptance criteria from the task CSV.
For each criterion, confirm: does the code satisfy it?

Then check for:
1. Auth bypass or injection vectors
2. Missing error handling for API failures
3. Edge cases in retry logic (max retries, backoff overflow)
4. Missing input validation (email format, domain check)
5. Sensitive data in logs (API keys, email content)

Write tests covering happy path + failure cases.
Cite line numbers for every issue found.

The acceptance criteria ARE the checklist

The CSV has 6–12 testable criteria per task. The VALIDATE step walks through each one. If the AI says “looks good” without addressing each criterion specifically, push back: “Walk through each acceptance criterion one by one.”
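
The difference between a real review and a rubber stamp can be made explicit: a review passes only if every criterion has its own result with evidence. The following is a minimal sketch under that assumption; `CriterionResult`, `unaddressedCriteria`, and `reviewPasses` are hypothetical names, and the evidence string stands in for a line reference or test name.

```typescript
interface CriterionResult {
  criterion: string;
  satisfied: boolean;
  evidence: string; // e.g. a file and line reference, or the name of a test
}

// Criteria with no matching result, or with a result that cites no evidence.
function unaddressedCriteria(criteria: string[], results: CriterionResult[]): string[] {
  return criteria.filter((criterion) => {
    const matches = results.filter(
      (result) => result.criterion === criterion && result.evidence.trim() !== ""
    );
    return matches.length === 0;
  });
}

// A review passes only when every criterion is addressed and every result is satisfied.
function reviewPasses(criteria: string[], results: CriterionResult[]): boolean {
  const allAddressed = unaddressedCriteria(criteria, results).length === 0;
  const allSatisfied = results.every((result) => result.satisfied);
  return allAddressed && allSatisfied;
}
```

A bare “looks good” corresponds to an empty results list, which leaves every criterion unaddressed and fails the review.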

Concrete Example: TASK-019

Here’s what a real task execution looks like using CSV fields as the prompt. The developer loads the context files once, then works through the cycle.

The CSV row

Field	Value
task_id	TASK-019
issue_type	Story
summary	Implement order notifications — new order (seller)
user_story	As a seller, I want to be notified immediately when a buyer places an order so that I can quickly process the order
dependencies	TASK-013, TASK-017
role	Fullstack
estimate / points	6h / 3

The developer’s actual prompts

[EVALUATE]
Load: docs/arch-docs/, the epic CSV, AGENTS.md
Load: src/services/notification-service.ts (from TASK-013)
Load: src/services/toast-service.ts (from TASK-017)

I’m implementing TASK-019: new order notification for sellers.
Walk me through how the notification service and toast service work so I understand where to integrate.

[PLAN]
Plan the implementation of TASK-019.
The acceptance criteria from the CSV are my requirements.
Cover: where to trigger, email template selection, toast message, notification record creation, preference checking, and cross-platform behavior.

[APPLY]
Plan approved. Implement TASK-019.
Follow existing patterns from notification-service.ts.

[VALIDATE]
Review TASK-019 against its acceptance criteria.
Check each one. Then: auth, error handling, deduplication, preference bypass, cross-platform parity.
Write tests.

Total prompt effort: ~60 words across four steps. Everything else comes from the loaded files.
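
Two of the VALIDATE checks above, deduplication and preference bypass, are easier to reason about with a concrete shape in mind. The sketch below is purely illustrative and is not the contents of notification-service.ts: `SellerPrefs`, `NotificationPlan`, `planNewOrderNotification`, and the in-memory `alreadySent` map (a stand-in for a persisted store) are all assumptions.

```typescript
interface SellerPrefs {
  emailEnabled: boolean;
  toastEnabled: boolean;
}

interface NotificationPlan {
  createRecord: boolean;
  sendEmail: boolean;
  showToast: boolean;
}

// Decide what to send for a "new order" event.
// Deduplication: the same seller/order pair is only ever planned once.
// Preferences: a channel the seller has switched off is never used.
function planNewOrderNotification(
  sellerId: string,
  orderId: string,
  prefs: SellerPrefs,
  alreadySent: Record<string, boolean>
): NotificationPlan {
  const key = `new-order:${sellerId}:${orderId}`;
  if (alreadySent[key]) {
    return { createRecord: false, sendEmail: false, showToast: false };
  }
  alreadySent[key] = true;
  return {
    createRecord: true,
    sendEmail: prefs.emailEnabled,
    showToast: prefs.toastEnabled,
  };
}
```

Separating the decision from the sending makes both checks testable without an email provider or a UI.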

Stage 3 — Recurring Patterns

These live in knowledge/prompts/dev/ and knowledge/patterns/. Use them directly.

Code Review

From knowledge/prompts/dev/code-review.md:

Load: AGENTS.md, relevant source files
 
Review this code for: security vulnerabilities,
logic errors, error handling gaps, performance issues.
 
For each issue: Severity (Critical/High/Medium/Low),
line number, description, suggested fix.
 
Do not comment on style. Focus on things that break.

Database Migration

[EVALUATE] Load schema. What’s the current state?
[PLAN] New schema, data impact, backward compat, rollback
[APPLY] Migration file + updated schema + seed update
[VALIDATE] Data loss? Table locks? Broken queries? Rollback works?
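
The “Rollback works?” question can be checked by applying a migration and then reverting it against a copy of the schema. The sketch below models a schema as a map from table name to column names, which is a deliberate simplification; `Migration`, `rollbackRestores`, and the `orders` table are hypothetical, and a real project would run this against its migration tool and a scratch database.

```typescript
type Schema = Record<string, string[]>;

interface Migration {
  name: string;
  up(schema: Schema): Schema;
  down(schema: Schema): Schema;
}

// Copy the schema so migrations never mutate their input.
function cloneSchema(schema: Schema): Schema {
  const copy: Schema = {};
  for (const table of Object.keys(schema)) {
    copy[table] = schema[table].slice();
  }
  return copy;
}

// Illustrative migration: add a "status" column to a hypothetical "orders" table.
const addOrderStatus: Migration = {
  name: "add-order-status",
  up(schema) {
    const next = cloneSchema(schema);
    next["orders"] = (next["orders"] || []).concat(["status"]);
    return next;
  },
  down(schema) {
    const next = cloneSchema(schema);
    next["orders"] = (next["orders"] || []).filter((column) => column !== "status");
    return next;
  },
};

// True when applying and then reverting the migration gives back the original schema.
function rollbackRestores(migration: Migration, before: Schema): boolean {
  const after = migration.down(migration.up(before));
  return JSON.stringify(after) === JSON.stringify(before);
}
```

Run against a schema without an `orders` table, the check fails, because the rollback leaves an empty table behind. That is the kind of gap the VALIDATE step is meant to surface.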

API Endpoint

[EVALUATE] Load docs/arch-docs/. Which endpoint?
[PLAN] Validation rules, DB queries, error responses
[APPLY] Route handler following AGENTS.md standards + curl test
[VALIDATE] Invalid input? Auth bypass? Response format correct?
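
The three VALIDATE questions (invalid input, auth bypass, response format) map onto a handler shape like the one sketched here. It is written as a pure function so it runs without a web framework. The `{ error: { code, message } }` envelope, the status codes chosen, and the `createNote` handler are assumptions for illustration; the real format is whatever the architecture document specifies.

```typescript
// Assumed error envelope.
interface ErrorBody {
  error: { code: string; message: string };
}

interface NoteBody {
  title: string;
  ownerId: string;
}

interface HandlerResult {
  status: number;
  body: ErrorBody | NoteBody;
}

// Every failure goes through one helper so the response format stays consistent.
function fail(status: number, code: string, message: string): HandlerResult {
  return { status, body: { error: { code, message } } };
}

// Hypothetical "create note" handler: auth first, then validation, then the happy path.
function createNote(authUserId: string | null, input: { title?: unknown }): HandlerResult {
  if (authUserId === null) {
    return fail(401, "UNAUTHENTICATED", "Sign in required");
  }
  const title = input.title;
  if (typeof title !== "string" || title.trim() === "") {
    return fail(400, "VALIDATION_ERROR", "title must be a non-empty string");
  }
  return { status: 201, body: { title: title.trim(), ownerId: authUserId } };
}
```

Checking auth before validation means an unauthenticated caller learns nothing about the expected input shape.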

Bug Fix

[EVALUATE] Observed vs expected, stack trace, suspected code
[PLAN] Root cause → fix design → same pattern elsewhere?
[APPLY] Implement fix + regression test
[VALIDATE] Resolved without side effects? Related paths?

Stage 4 — Pre-Launch

Load: Full project, docs/arch-docs/, AGENTS.md
 
[EVALUATE] Audit for production readiness.
Security, performance, reliability, DevOps, UX.
 
[PLAN] Prioritize: [BLOCKER] / [FIRST WEEK] / [BACKLOG]
 
[APPLY] Fix every [BLOCKER]. Provide [FIRST WEEK] fixes for post-launch.
 
[VALIDATE] Fixes work? New problems? Deploy pipeline tested?
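
The prioritization step amounts to a sort plus one gate: nothing ships while a blocker is open. A minimal sketch, assuming a hypothetical `Finding` record and the three labels from the PLAN line:

```typescript
type Label = "BLOCKER" | "FIRST WEEK" | "BACKLOG";

interface Finding {
  label: Label;
  title: string;
  fixed: boolean;
}

const LABEL_ORDER: Label[] = ["BLOCKER", "FIRST WEEK", "BACKLOG"];

// Sort a copy of the findings so blockers come first.
function triage(findings: Finding[]): Finding[] {
  return findings
    .slice()
    .sort((a, b) => LABEL_ORDER.indexOf(a.label) - LABEL_ORDER.indexOf(b.label));
}

// Launch is allowed only when every blocker has been fixed.
function readyToLaunch(findings: Finding[]): boolean {
  return findings.every((finding) => finding.label !== "BLOCKER" || finding.fixed);
}
```

Open [FIRST WEEK] and [BACKLOG] items do not hold up the launch; they carry over into post-launch work.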

Contributing Back to Knowledge

The knowledge directory grows with the team. After completing tasks, contribute what you learned.

What you discovered	Where it goes	Format
A prompt pattern that worked well	knowledge/prompts/dev/	Prompt template with example input/output
A coding rule you learned the hard way	knowledge/rules/	Rule + rationale (why it exists)
A multi-step workflow	knowledge/patterns/	Step-by-step chain with load instructions
A sprint retro insight	knowledge/retros/	YYYY-MM-DD.md with action items + owners

Quality standards for contributions

Every prompt must have at least one real example input/output.

Rules must include rationale — why does this rule exist?

Patterns must include which files to load as context.

Submit via PR. Get peer review before merging.

Quick Reference

Step	Developer does	Context comes from	Watch out for
EVALUATE	Load files, understand task	Architecture doc + CSV + codebase	AI coding before understanding
PLAN	State plan, wait for approval	Task description + acceptance criteria	Plans that skip edge cases
APPLY	“Implement TASK-XXX”	AGENTS.md coding standards	Placeholder code, hallucinated APIs
VALIDATE	Walk each acceptance criterion	Acceptance criteria from CSV	AI rubber-stamping itself

Discipline Notes

Load context, don’t re-type it

The architecture doc, task CSV, and coding standards are files. Load them. The AI reads the acceptance criteria, dependencies, and patterns from the files — you don’t paste them into every prompt.

Plan is the approval gate

Read the plan. Push back. The AI will happily build the wrong thing perfectly if you let it. Fix it here, not after APPLY.

APPLY should be two lines

If your APPLY prompt is longer than “Plan approved. Implement TASK-XXX. Follow AGENTS.md.” then your context loading or your PLAN step was incomplete.

Verify against the CSV, not your feelings

Walk each acceptance criterion from the CSV. “Does the code satisfy criterion 1? Criterion 2?” If the AI says “looks good” without citing criteria, push back.

Contribute back

When you find a better prompt, a new rule, or a useful pattern, add it to knowledge/. The team’s AI effectiveness compounds over time. One developer’s insight becomes every developer’s shortcut.