Spec-first strategies for production-grade AI coding
Context engineering moves from buzzword to blueprint: a spec-first, context-disciplined workflow that scales AI coding in complex, brownfield codebases. The method hinges on research-plan-implement phases, frequent intentional compaction, and sub-agents to keep context under control and teams aligned. Reported results include a one-shot fix in a 300,000-line Rust codebase and an estimated 1–2 weeks of work compressed into 7 hours.
Key points
- Dex, founder of HumanLayer (YC F24), popularized the term “context engineering”: he published “12-factor agents: principles of reliable LLM applications” on April 22 and reframed the talk as “context engineering” on June 4.
- Influences include Sean Grove’s “the new code” and a Stanford study showing that AI coding increases rework and struggles with brownfield and complex tasks; Amjad from Replit noted that agents shine for prototypes, not production.
- The catalyst: repeated 20,000-line Go PRs made code review untenable, forcing a shift to spec-first development; within ~8 weeks the team embraced plans and tests over line-by-line review.
- Stated goals: work in large, complex codebases, solve complex problems, eliminate slop, ship production code, maintain team alignment—and intentionally spend tokens for quality.
- Core method: keep context utilization under 40% via frequent intentional compaction, a progress file (instead of /compact), and a three-phase loop—research, plan, implement—with explicit tests and verification.
- Sub-agents handle inline compaction (e.g., finding flows, files, and line numbers) so the parent agent’s context stays small; structured returns avoid “telephone game” errors between agents.
- KPI: a one-shot fix landed in a 300,000-line Rust codebase (BAML), merged by the CTO without knowing it was a live experiment.
- KPI: 35,000 lines shipped in 7 hours with the Boundary CEO, compressing an estimated 1–2 weeks of work; author shipped six PRs in a single day relying on specs.
- Team productivity: an intern shipped two PRs on day one and roughly ten by day eight using the workflow.
- Context economics: with ~170,000 tokens available, spending fewer of them on “work” improves outcomes; Geoffrey Huntley’s “Ralph Wiggum as a software engineer” looped-prompt approach underscores the power of context discipline.
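The 40% ceiling and the progress-file compaction described above can be sketched as simple bookkeeping. This is a minimal illustration, not the speaker's tooling: the `ContextBudget` class, its method names, and the token counts in the usage note are hypothetical, and the only figures taken from the summary are the ~170,000-token window and the 40% target.

```python
from dataclasses import dataclass

# Figures from the summary above; everything else here is a hypothetical sketch.
CONTEXT_WINDOW_TOKENS = 170_000
MAX_UTILIZATION_PERCENT = 40


@dataclass
class ContextBudget:
    """Tracks estimated token usage against a utilization ceiling."""

    window: int = CONTEXT_WINDOW_TOKENS
    ceiling_percent: int = MAX_UTILIZATION_PERCENT
    used: int = 0

    @property
    def limit(self) -> int:
        # Integer arithmetic avoids floating-point rounding surprises.
        return self.window * self.ceiling_percent // 100

    def add(self, tokens: int) -> None:
        """Record tokens consumed by a tool call, file read, or reply."""
        self.used += tokens

    def should_compact(self) -> bool:
        """True once usage reaches the ceiling and a compaction is due."""
        return self.used >= self.limit

    def compact(self, progress_file_tokens: int) -> None:
        """Model a fresh context seeded only with the progress file."""
        self.used = progress_file_tokens
```

With the defaults, the ceiling works out to 68,000 tokens. A session that has consumed 70,000 tokens would report `should_compact()` as true; after writing a progress file of, say, 5,000 tokens and restarting from it, usage drops back to 5,000.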
Takeaways
Start with research, write the plan, then let the agent implement—and please stop yelling at it like it’s your terminal from 2009. Keep your context under 40%, compact intentionally with a progress file, and use sub-agents to hunt code paths so your main agent isn’t drowning in JSON soup. Review plans instead of 2,000-line diffs, measure PR throughput and merge quality, and yes, spend the tokens—because wasting engineer hours is the most expensive “optimization” of all (said every finance team, ever).
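One way to keep the main agent out of the JSON soup is to have each sub-agent return a small, validated structure that is rendered into a few lines before it reaches the parent context. The sketch below assumes the sub-agent replies with JSON containing a `question` and a list of `locations`; that schema, the class names, and the example file name are all hypothetical and not taken from the talk.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CodeLocation:
    """One finding from a research sub-agent (hypothetical schema)."""

    file: str
    line: int
    note: str


@dataclass(frozen=True)
class ResearchResult:
    question: str
    locations: Tuple[CodeLocation, ...]

    def to_parent_context(self) -> str:
        """Render a compact summary, one line per location."""
        lines = [f"Q: {self.question}"]
        lines += [f"- {loc.file}:{loc.line} {loc.note}" for loc in self.locations]
        return "\n".join(lines)


def parse_subagent_reply(raw: str) -> ResearchResult:
    """Validate a sub-agent's JSON reply.

    Raises json.JSONDecodeError on malformed JSON and KeyError on a
    missing field, so a garbled reply fails loudly instead of being
    passed along to the parent agent.
    """
    data = json.loads(raw)
    locations = tuple(
        CodeLocation(
            file=str(item["file"]),
            line=int(item["line"]),
            note=str(item["note"]),
        )
        for item in data["locations"]
    )
    return ResearchResult(question=str(data["question"]), locations=locations)
```

The parent agent then sees only the rendered summary, for example a question line followed by `- client.go:42 retry loop`, rather than the sub-agent's full search transcript.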