Before: The Copy-Paste Trap

You've probably done this. You open a chat with an AI assistant, describe your problem, and paste the code it returns into your editor. For a quick one-off task, it's magical. For anything with real complexity, such as a new API endpoint, a refactor that touches six files, or a feature that has to fit into your existing architecture, it falls apart fast.

The AI doesn't know your codebase. It doesn't know the decisions you made six months ago or why you structured things the way you did. Every session starts cold, and you spend half your time re-explaining context that should just *be there*.

We lived in that world for a while. It wasn't terrible, but it kept hitting the same ceiling. The more complex the task, the more the AI would drift toward generic solutions that worked in isolation but clashed with how we actually built things.

---

The Turning Point: It's Not an AI Problem

Here's the thing we eventually realized: the issue wasn't Claude's capability. Claude is exceptional at following structured processes when given clear guidance. The issue was that we weren't giving it any.

We were treating every task like a chat thread — freeform, context-free, ephemeral. Of course the output was inconsistent. We were asking an intelligent system to make architectural decisions it had no basis for making. The fundamental problem was context, not capability.

So we stopped trying to prompt our way out of it and started building a proper process instead.

---

The Messy Middle: Adapting FDD for AI Collaboration

We built our process around Feature-Driven Development — a methodology created by Jeff De Luca and Peter Coad in the late 1990s. FDD defines five core processes:

  • Develop an Overall Model — define the domain, actors, and relationships
  • Build a Feature List — translate the model into implementable features
  • Plan by Feature — assign ownership, dependencies, and timelines
  • Design by Feature — detailed technical design per feature
  • Build by Feature — implement, test, and verify

Classical FDD was designed for human teams. Adapting it for an AI agent required rethinking what each step actually produces and how an agent transitions between them. We ended up with a state machine: a pipeline of governed steps, each with entry criteria, a checklist, knowledge artifacts to load, and exit criteria before advancing.
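To make the shape of such a state machine concrete, here is a minimal sketch in Python. It is an illustration, not our actual implementation: the names `Step`, `GateError`, and `run_pipeline`, and the idea of passing a shared `context` dict between steps, are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A check is a predicate over a shared task context (hypothetical shape).
Check = Callable[[Dict], bool]


@dataclass
class Step:
    """One governed step: gates on the way in and out, plus what to load."""
    name: str
    entry_criteria: List[Check] = field(default_factory=list)
    checklist: List[str] = field(default_factory=list)
    artifacts: List[str] = field(default_factory=list)
    exit_criteria: List[Check] = field(default_factory=list)


class GateError(Exception):
    """Raised when a step's entry or exit criteria are not met."""


def run_pipeline(steps: List[Step], context: Dict, do_work) -> List[str]:
    """Run steps in order and refuse to advance past a failed gate."""
    completed = []
    for step in steps:
        if not all(check(context) for check in step.entry_criteria):
            raise GateError(f"entry criteria failed for {step.name}")
        # Record which knowledge artifacts this step is expected to load.
        context["loaded_artifacts"] = list(step.artifacts)
        do_work(step, context)
        if not all(check(context) for check in step.exit_criteria):
            raise GateError(f"exit criteria failed for {step.name}")
        completed.append(step.name)
    return completed
```

The design choice worth noting is that a failed gate raises instead of logging a warning, so the agent cannot drift into the next step with unmet criteria.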

Every task enters the pipeline the same way, regardless of size. A two-line bug fix and a new feature both enter through the same gates; for a small change, most steps are then pruned based on scope. The discipline is the point. Once the process is consistent, the AI's behavior becomes far more predictable.
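One way to picture scope-based pruning is sketched below. The scope labels ("bugfix", "feature") and the mapping of scopes to required steps are hypothetical, chosen only to illustrate the idea that every step is still visited and a skipped step is recorded as pruned, not silently dropped.

```python
from typing import List, Tuple

# The five FDD processes, in order.
FDD_STEPS = [
    "develop_overall_model",
    "build_feature_list",
    "plan_by_feature",
    "design_by_feature",
    "build_by_feature",
]

# Hypothetical mapping from task scope to the steps that must actually run.
REQUIRED_BY_SCOPE = {
    "bugfix": {"build_by_feature"},
    "feature": set(FDD_STEPS),
}


def plan_steps(scope: str) -> List[Tuple[str, str]]:
    """Return every step with an explicit decision: 'run' or 'pruned'."""
    # An unknown scope falls back to running everything, the safe default.
    required = REQUIRED_BY_SCOPE.get(scope, set(FDD_STEPS))
    return [
        (name, "run" if name in required else "pruned")
        for name in FDD_STEPS
    ]
```

Calling `plan_steps("bugfix")` yields all five steps, with only the last marked "run". The pruning decision becomes part of the record, which keeps the small task on the same path as the large one.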

The hardest part wasn't designing the pipeline. It was breaking the habit of skipping steps when things felt "obvious." The times we skipped steps were almost always the times things went sideways.

---

Knowledge Before Code

One principle made more difference than anything else: load all applicable governance before touching the implementation.

Before any coding step, the process requires loading the relevant artifacts from our knowledge base — coding standards, naming conventions, architectural guidelines, proven solutions for common problems. The agent doesn't just get a task description. It gets the full context of how we've solved similar problems before, what constraints apply, and what patterns to follow.

This sounds tedious, but it's not. It takes seconds, and it eliminates an entire class of drift. When Claude knows that we use server-side data loading exclusively, it doesn't suggest client-side fetches. When it knows our error handling conventions, it follows them without being reminded. Context accumulated in the knowledge base compounds over time — every solved problem becomes a guardrail for the next one.
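As a rough sketch of "knowledge before code", the function below assembles the context for a task and refuses to proceed if a required artifact is missing. The in-memory dictionary standing in for the knowledge base, the artifact names, and the names `build_context` and `MissingGovernance` are assumptions for illustration only.

```python
from typing import Dict, List


class MissingGovernance(Exception):
    """Raised when a required artifact is absent from the knowledge base."""


def build_context(task: str, required: List[str], knowledge_base: Dict[str, str]) -> str:
    """Place governance artifacts ahead of the task description."""
    missing = [name for name in required if name not in knowledge_base]
    if missing:
        raise MissingGovernance(
            f"cannot start {task!r}: missing {', '.join(missing)}"
        )
    sections = [f"## {name}\n{knowledge_base[name]}" for name in required]
    # The task comes last, after every applicable rule has been loaded.
    return "\n\n".join(sections + [f"## task\n{task}"])


# Hypothetical knowledge base entries.
knowledge_base = {
    "data-loading": "Use server-side data loading exclusively.",
    "error-handling": "Follow the shared error handling conventions.",
}

context = build_context(
    "Add an orders endpoint",
    ["data-loading", "error-handling"],
    knowledge_base,
)
```

Failing loudly on a missing artifact is deliberate: a gap in the knowledge base surfaces as an error to fix, not as a guess by the agent.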

The insight here is almost embarrassingly simple: LLMs are excellent at following structured processes when given clear governance rules. The problem was never the model. It was us, handing it a blank page and expecting it to intuit a year of architectural decisions.

---

After: Verified Deliverables, Not Rough Drafts

The last principle is accountability. When a task is complete, the agent doesn't hand it to you to check. It builds the project, runs the tests, verifies that the implementation matches the specification, and commits with a traceable record. You get a verified deliverable: something you can approach as a reviewer, not a babysitter.
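The gate described above could look something like the following sketch. The function names and the shape of the check list are hypothetical, and the actual build, test, and commit commands are left for the caller to supply, since they depend on the project.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def run_command(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited successfully."""
    return subprocess.run(list(cmd)).returncode == 0


def verify_and_commit(
    checks: List[Tuple[str, Sequence[str]]],
    commit_cmd: Sequence[str],
    run: Callable[[Sequence[str]], bool] = run_command,
) -> str:
    """Run each verification in order; commit only if all of them pass."""
    for label, cmd in checks:
        if not run(cmd):
            # Stop at the first failure so nothing unverified is committed.
            return f"blocked: {label} failed"
    if not run(commit_cmd):
        return "blocked: commit failed"
    return "committed"
```

The `run` parameter is injectable so the gate itself can be tested without invoking real tools. In practice `checks` would hold whatever build, test, and spec-verification commands the project uses.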

This shift matters more than it sounds. When you trust that the thing in your PR has already been built, tested, and self-reviewed, you can spend your review time on the decisions that actually need human judgment. Architecture tradeoffs. Business logic. Edge cases that need domain knowledge. Not "did you remember to handle the null case?"

Is this about replacing human judgment? No. It's about making sure that when human judgment is required, it's applied to work that's already verified — not to rough drafts that need a round of cleanup first.

---

What Actually Changed

Looking back, the shift wasn't really about AI at all. We built a software delivery process that happens to work exceptionally well with an AI agent following it. The governance artifacts, the knowledge base, the state machine — those are good engineering practice regardless of who's doing the work.

The AI made us more disciplined about encoding what we knew. We couldn't rely on tribal knowledge anymore. If a constraint wasn't written down somewhere an agent could find it, it might as well not exist. That pressure to make implicit knowledge explicit has made the whole system more coherent — for human contributors too.

If you're hitting the same ceiling we were — good AI, inconsistent results, constant context re-explanation — the answer probably isn't a better prompt. It's a better process.