Case Study

An Agentic Engineering Practice

Role: Staff Software Engineer
Agentic Engineering
Stack: Claude
Claude Code
pi
Google Agent Development Kit (ADK)
Python
Laravel
Laravel Vapor
Pest
Vitest
PHPStan
CI/CD

An agentic engineering practice isn't designed in one sitting. Mine evolved across model generations — in parallel on Draft Slot's production codebase and University of Maryland platform work — until every change ran through one loop: requirements first, an agent build, validation before merge. It accelerated my delivery 10x, and the bar never moved: the same review gates, security posture, and tests as any other code.

This is that evolution, stage by stage. Each layer of scaffolding existed because the models of its moment needed it — and each was retired when they didn't. Knowing why every layer existed is the durable skill, and it's how I build today.

Just Ask

Claude 3.5 Sonnet and a Prompt Box

The practice started small: Claude 3.5 Sonnet and a prompt. A scoped function, a tricky migration, a block of boilerplate — pasted in, written back out. For code that fit in one window, it was already faster than writing it by hand.

Single-shot prompts for scoped, self-contained code
Worked: small functions, tests, boilerplate, translations between stacks
Broke: anything spanning files, anything that needed the project's context

The lesson that set up everything after it: the model wasn't the bottleneck. The missing context was.

Feed the Window

Context Engineering with Agents

The context window drawn as five stacked layers — system instructions (constant), retrieved knowledge (dynamic), tool definitions (configured), conversation history (accumulates), and working memory (ephemeral) — each labeled by how it changes over a run

The fix for missing context wasn't longer prompts — it was deciding, deliberately, what an agent sees. Context engineering: system instructions, retrieved knowledge, tool definitions, history, and working memory, assembled per task instead of dumped in wholesale.

Requirements written before a single line of code
Context budgets managed by hand — nothing loaded that the task didn't need
Validation gates after every build step to catch drift

Coherence was the scarce resource. An agent given exactly what a task needs stays sharp for the whole run; an agent given everything degrades by the file.

Skills Are Infrastructure

A Catalog, Not a Manifest

A skill loading instantly into a session — knowledge pulled from the catalog on demand, the way Neo downloads a new skill in The Matrix

Context discipline scaled by writing the knowledge down. The practice's know-how lives in skill repos — base commands, design, PHP, frontend, planning, QA — pulled into any codebase on demand from a private catalog. The same skills serve Draft Slot one day and University of Maryland platform work the next; the catalog is what made the practice portable.

Skill groups as repos, versioned and pushed back when improved
Each skill is a SKILL.md contract plus cookbook files
Cookbooks load only at execution time — context is spent when it pays
One command pulls a skill into a codebase; the catalog stays the source of truth

The rule that makes the system work: the catalog is not a manifest. Nothing is fetched until a task asks for it — a setup that loads everything up front pays for it in coherence on every task that follows.

Build the Harness

Tailoring the Layer Between You and the Model

The agent harness drawn as an operating system: the agent application on top; prompt presets, tool handling, lifecycle hooks, planning, filesystem access, and sub-agent management in the middle; the model as the CPU and the context window as RAM underneath

An agent is a model inside a harness: the system prompt, the tools, the context budget, the extension points. Reading harness source taught me more about agent behavior than any prompting guide — and minimal harnesses like pi made that layer legible: small enough to understand end to end, real enough to build working workflows on.

Understanding the layer led to building on it — harnesses tailored to the workflow, so the agent fit the codebase instead of the codebase bending to the tool.

System prompt and tool design decide what an agent can do
The context budget decides how long it stays coherent
Extension points — skills, prompt templates, custom tools — are where workflows live
pi bridged the gap: a harness small enough to read, extensible enough to ship with

The Loop, Orchestrated

Multiple Agent Workflows, One Orchestrator

Orchestrated build assembly — a spec feeds an orchestrator, which dispatches three subagents on scoped tasks; each passes a validation check before the work integrates and ships, with a hand-set context budget gauge alongside

Where the practice landed: not one agent, but multiple agent workflows looped under an orchestrator. Work arrives as requirements; the orchestrator decomposes it, dispatches subagents through build and validation, and integrates the results.

An orchestrator decomposes work and dispatches subagents per scoped task
Every dispatched task carries its own requirements, build, and validation gate
Workflows loop — results feed the next round until the spec is satisfied
Conversion of an existing app, not a from-scratch build

This is the practice that rebuilt Draft Slot's shipped Laravel app in place — frontend and backend, the Vapor migration, a showcase environment that ran every UI component from written requirements to validated build — and the same discipline runs through University of Maryland platform work.

Read the Draft Slot case study

Why This Matters

The Skill Is the Discipline

Every stage of this arc was scaffolding for what that era's model couldn't yet do — and the next era will need scaffolding for whatever comes after. Teams adopting agents don't fail on tooling — they fail on process. The engineers who will be most useful over the next few years are the ones who can hold velocity and rigor at the same time: who treat an agent as a collaborator whose work gets reviewed, not an oracle whose output gets merged.

That is the practice I bring to a team: a harness understood end to end, context spent where it pays, guardrails written down — and the velocity gains reinvested in design reviews, architectural documentation, and mentorship, so the whole team gets faster, not just one engineer.

Read the resume

Next case study

Agents in Production

Grounded search · URL ingestion · triage · content · n8n workflows