Case Study

An Agentic Engineering Practice — requirements, build, validation: the loop behind the codebase

Role

  • Staff Software Engineer
  • Agentic Engineering

Stack

  • Claude
  • Claude Code
  • pi
  • Google Agent Development Kit (ADK)
  • Python
  • Laravel
  • Laravel Vapor
  • Pest
  • Vitest
  • PHPStan
  • CI/CD

I rebuilt Draft Slot's entire Laravel codebase with Claude (frontend and backend, multi-sport and real-time) using an agentic engineering practice that runs every change through one loop: requirements first, then an agent build, then validation before merge. It accelerated my delivery 10x, and the quality bar never moved: agent-written code faced the same review gates, security posture, and tests as any other code.

This is how that practice is built. Read the harness before trusting the model. Treat skills as infrastructure, loaded only when a task calls for them. Spend context where it pays. The scaffolding here is era-specific — it dates to the Claude 3.7 Sonnet and Claude Sonnet 4 window, and newer models have absorbed much of it — but the discipline underneath is durable, and it's how I build today.

A Workflow of Its Era

Claude 3.7 Sonnet to Sonnet 4 — and Honest About It

[Figure: the context window drawn as five stacked layers, each labeled by how it changes over a run: system instructions (constant), retrieved knowledge (dynamic), tool definitions (configured), conversation history (accumulates), and working memory (ephemeral)]

This workflow was built in the Claude 3.7 Sonnet and Claude Sonnet 4 window of early to mid 2025. Models of that era could write production code, but they needed scaffolding: orchestration to break work down, validation loops to catch drift, and ruthless context management to stay coherent.

  • An orchestrator session coordinated subagents through discrete, scoped tasks
  • Requirements were written before a single line of code
  • Validation gates ran after every build step
  • Context budgets were managed by hand — nothing loaded that the task didn't need
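
The scaffolding in the list above can be sketched as a small control loop. This is a minimal illustration under stated assumptions, not the orchestrator actually used on the project: `Task`, `run_loop`, the gate names, and `fake_build` (a stand-in for a real agent call) are all hypothetical. It shows a requirement fixed before any build, gates that run after every build step, and failed gate names fed back into the next attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Task:
    """One scoped unit of work; the requirement exists before any build."""
    name: str
    requirement: str


# A gate inspects build output and returns True when it passes.
Gate = Callable[[str], bool]
# A builder receives the task plus the names of gates that failed last time.
Builder = Callable[[Task, list[str]], str]


def run_loop(task: Task, build: Builder, gates: dict[str, Gate],
             max_attempts: int = 3) -> Optional[str]:
    """Build, validate, and retry with feedback; None means nothing merges."""
    failures: list[str] = []
    for _ in range(max_attempts):
        output = build(task, failures)
        # Every gate runs after every build step.
        failures = [name for name, gate in gates.items() if not gate(output)]
        if not failures:
            return output
    return None


def fake_build(task: Task, failures: list[str]) -> str:
    """Hypothetical stand-in for an agent: adds a test only when told to."""
    body = f"// implements: {task.requirement}"
    if "has_test" in failures:
        body += "\n// test: covers requirement"
    return body


gates: dict[str, Gate] = {
    "mentions_requirement": lambda out: "implements:" in out,
    "has_test": lambda out: "test:" in out,
}
task = Task("score-card", "render tiered scoring breakdown")
result = run_loop(task, fake_build, gates)
```

In this sketch the first attempt fails the `has_test` gate, the failure is passed back to the builder, and the second attempt passes. With `max_attempts=1` the same task returns `None`, which models a change that never reaches merge.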

Modern models have absorbed much of that scaffolding, and the workflow itself now lives in the past. Knowing why each piece existed is the durable skill — the next model era will need scaffolding for whatever it can't yet do.

Understand the Harness First

The Layer Between You and the Model

[Figure: the agent harness drawn as an operating system: the agent application on top; prompt presets, tool handling, lifecycle hooks, planning, filesystem access, and sub-agent management in the middle; the model as the CPU and the context window as RAM underneath]

An agent is a model inside a harness: the system prompt, the tools, the context budget, the extension points. Reading harness source taught me more about agent behavior than any prompting guide — and minimal harnesses like pi made that layer legible: small enough to understand end to end, real enough to build working workflows on.

  • System prompt and tool design decide what an agent can do
  • The context budget decides how long it stays coherent
  • Extension points — skills, prompt templates, custom tools — are where workflows live
  • pi bridged the gap: a harness small enough to read, extensible enough to ship with
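
To make the budget point concrete, here is a rough sketch of a harness as plain data with context accounting. It is not pi's or Claude Code's actual design: the `Harness` class, its fields, and the four-characters-per-token estimate are assumptions for illustration (a real harness would use the model's tokenizer). It shows that the system prompt and tool definitions are a fixed cost paid on every turn, and that history consumes whatever is left.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token.
    return max(1, len(text) // 4)


@dataclass
class Harness:
    system_prompt: str
    tools: dict[str, str]          # tool name -> description sent to the model
    budget_tokens: int             # total context the harness allows
    history: list[str] = field(default_factory=list)

    def fixed_cost(self) -> int:
        """Tokens spent before any conversation: prompt plus tool definitions."""
        tool_cost = sum(estimate_tokens(d) for d in self.tools.values())
        return estimate_tokens(self.system_prompt) + tool_cost

    def remaining(self) -> int:
        used = self.fixed_cost() + sum(estimate_tokens(m) for m in self.history)
        return self.budget_tokens - used

    def add_message(self, message: str) -> bool:
        """Append a message only if it fits; the caller decides what to drop."""
        if estimate_tokens(message) > self.remaining():
            return False
        self.history.append(message)
        return True


harness = Harness(
    system_prompt="x" * 400,             # about 100 tokens
    tools={"read_file": "y" * 200},      # about 50 tokens
    budget_tokens=200,
)
fits = harness.add_message("z" * 160)    # about 40 tokens, fits in the 50 left
overflow = harness.add_message("z" * 80) # about 20 tokens, only 10 left
```

Every tool definition added to `tools` raises `fixed_cost()` for the whole session, which is one way to see why tool design and context budget are coupled.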

Skills Are Infrastructure

A Catalog, Not a Manifest

[Figure: a skill loading instantly into a session, with knowledge pulled from the catalog on demand, the way Neo downloads a new skill in The Matrix]

The practice's knowledge lives in skill repos — base commands, design, PHP, frontend, planning, QA — pulled into any codebase on demand from a private catalog. The rule that makes the system work: the catalog is not a manifest. Nothing is fetched until a task asks for it.

  • Skill groups live in versioned repos, and improvements are pushed back to the catalog
  • Each skill is a SKILL.md contract plus cookbook files
  • Cookbooks load only at execution time — context is spent when it pays
  • One command pulls a skill into a codebase; the catalog stays the source of truth

Context windows are the scarcest resource in agentic work. A setup that loads everything up front pays for it in coherence on every task that follows; a catalog that loads on demand keeps the window for the work.
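
The load-on-demand rule can be sketched as a lazy catalog. The directory layout here (one folder per skill holding `SKILL.md` and a `cookbook/` subfolder) and the `Catalog` and `Skill` classes are assumptions made for illustration; the source only states that a skill is a SKILL.md contract plus cookbook files. The sketch models the reading behavior only, not how skills are pulled from the private catalog.

```python
import tempfile
from pathlib import Path


class Skill:
    def __init__(self, root: Path):
        self.root = root
        # Loading a skill reads only its contract.
        self.contract = (root / "SKILL.md").read_text(encoding="utf-8")
        self._cookbooks: dict[str, str] = {}

    def cookbook(self, name: str) -> str:
        """Read a cookbook file the first time a task actually needs it."""
        if name not in self._cookbooks:
            path = self.root / "cookbook" / f"{name}.md"
            self._cookbooks[name] = path.read_text(encoding="utf-8")
        return self._cookbooks[name]

    def loaded_cookbooks(self) -> list[str]:
        return sorted(self._cookbooks)


class Catalog:
    def __init__(self, root: Path):
        self.root = root
        self._skills: dict[str, Skill] = {}

    def names(self) -> list[str]:
        """List what exists without reading any file contents."""
        return sorted(p.parent.name for p in self.root.glob("*/SKILL.md"))

    def load(self, name: str) -> Skill:
        if name not in self._skills:
            self._skills[name] = Skill(self.root / name)
        return self._skills[name]

    def loaded(self) -> list[str]:
        return sorted(self._skills)


# Demo against a throwaway directory with two hypothetical skills.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "qa" / "cookbook").mkdir(parents=True)
    (root / "qa" / "SKILL.md").write_text("QA contract", encoding="utf-8")
    (root / "qa" / "cookbook" / "regression.md").write_text(
        "Regression steps", encoding="utf-8")
    (root / "php").mkdir()
    (root / "php" / "SKILL.md").write_text("PHP contract", encoding="utf-8")

    catalog = Catalog(root)
    listed = catalog.names()                 # both skills are visible
    loaded_before = catalog.loaded()         # but nothing has been fetched
    qa = catalog.load("qa")                  # contract only
    cookbooks_before = qa.loaded_cookbooks()
    recipe = qa.cookbook("regression")       # read at execution time
    cookbooks_after = qa.loaded_cookbooks()
    loaded_after = catalog.loaded()
```

Listing the catalog costs nothing in context; only `load` and `cookbook` put text in front of the model, and the `php` skill in the demo is never read at all.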

The Loop, Applied

Requirements, Build, Validation — on a Real Codebase

[Figure: Draft Slot's tiered scoring breakdown, production UI from a codebase written entirely with Claude]

Draft Slot is where the practice met production. It was already a shipped, server-rendered Laravel app; the loop rebuilt its frontend and backend in place — every change validated against the spec before any merge.

A showcase environment let every UI component run the full loop: written requirements in, agent build, validation against the spec before anything merged. The orchestrator decomposed the work, dispatched subagents, and collected results.

  • Conversion of an existing app, not a from-scratch build
  • Migration of the Laravel app to Laravel Vapor, driven through the loop
  • Showcase: every UI component from requirements to validated build
  • Orchestrator session dispatching subagents per scoped task
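
The decompose, dispatch, and collect pattern can be sketched in a few lines. Everything named here is hypothetical: the component names, the `subagent` placeholder (which stands in for a real agent session), and the use of a thread pool, which is simply one convenient way to run scoped tasks side by side and may not match how the real orchestrator session worked.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedTask:
    component: str
    requirement: str


@dataclass(frozen=True)
class Result:
    component: str
    validated: bool
    output: str


def subagent(task: ScopedTask) -> Result:
    """Placeholder for an agent session; it sees only its own scoped task."""
    output = f"<{task.component}> satisfies: {task.requirement}"
    # Validation against the spec, reduced here to a trivial check.
    return Result(task.component, task.requirement in output, output)


def orchestrate(spec: dict[str, str], max_workers: int = 4) -> list[Result]:
    # Decompose: one scoped task per component in the written spec.
    tasks = [ScopedTask(name, req) for name, req in spec.items()]
    # Dispatch and collect: Executor.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(subagent, tasks))


spec = {
    "score-card": "show tiered scoring breakdown",
    "draft-board": "list available picks",
}
results = orchestrate(spec)
mergeable = [r.component for r in results if r.validated]
```

Because each subagent receives only its `ScopedTask`, no task pays the context cost of another, and the orchestrator alone decides what is mergeable.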

Why This Matters

The Skill Is the Discipline

Teams adopting agents don't fail on tooling — they fail on process. The engineers who will be most useful over the next few years are the ones who can hold velocity and rigor at the same time: who treat an agent as a collaborator whose work gets reviewed, not an oracle whose output gets merged.

That is the practice I bring to a team: a harness understood end to end, context spent where it pays, guardrails written down — and the velocity gains reinvested in design reviews, architectural documentation, and mentorship, so the whole team gets faster, not just one engineer.