Andrew Ng on architecting multi-agent AI: the orchestration layer, 5 trends, and why applications win


How agentic AI is reshaping the stack and strategy

Andrew Ng argues that agentic AI is the defining trend of the moment, shifting value decisively to the application layer and introducing an orchestration layer that coordinates multi-step workflows. He outlines five forces reshaping teams and roadmaps (coding assistance, fast prototyping, visual AI, the voice stack, and data engineering) while urging architectures that maximize model optionality and speed of iteration. The strategic playbook: prototype fast in safe sandboxes, unlock unstructured data (especially PDFs), prepare for voice, and design for rapid model swaps.
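As a rough illustration of an orchestration function coordinating a multi-step workflow while keeping the model swappable, consider the sketch below. Everything in it is an assumption made for illustration: the `Model` signature, the prompt wording, the "reply OK" stopping convention, and `stub_model`, which stands in for a real model call so the example runs offline. It is not Ng's implementation or any vendor's API.

```python
from typing import Callable

# Assumption: a "model" is any function from prompt text to response text.
# Keeping this interface narrow is what makes the model easy to swap.
Model = Callable[[str], str]


def run_workflow(task: str, model: Model, max_rounds: int = 3) -> str:
    """Draft once, then critique and revise until the critique approves."""
    draft = model(f"Write a first draft for: {task}")
    for _ in range(max_rounds):
        critique = model(
            f"Critique this draft. Reply OK if it is good enough:\n{draft}"
        )
        if critique.strip().upper() == "OK":
            break  # the critic is satisfied, stop iterating
        draft = model(
            f"Revise the draft using this critique:\n{critique}\n\nDraft:\n{draft}"
        )
    return draft


def stub_model(prompt: str) -> str:
    """Hypothetical offline stand-in for a real model call."""
    if prompt.startswith("Critique"):
        # Approve only drafts that have already been revised once.
        return "OK" if "(revised)" in prompt else "Add a concrete example."
    if prompt.startswith("Revise"):
        # Return the original draft with a marker appended.
        return prompt.split("Draft:\n", 1)[1] + " (revised)"
    return "First draft."


if __name__ == "__main__":
    print(run_workflow("summarize a report", stub_model))
```

Because `run_workflow` depends only on the `Model` signature, trying a different model means passing a different function; the workflow logic itself does not change.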

Key points

  • Andrew Ng frames the AI stack as semiconductors → clouds → foundation models → applications, emphasizing that the application layer will capture most of the value.
  • Agentic AI (multi-step, iterative workflows) outperforms single-pass prompting by planning, searching, drafting, and revising in loops to raise output quality.
  • A new agentic orchestration layer is emerging to coordinate multi-call workflows; foundation model switching costs are low, but orchestration switching costs are higher, shaping vendor lock-in dynamics.
  • Architectures should preserve model optionality: teams can run evals and switch to a better model in roughly 2–3 days.
  • Coding assistance is accelerating: tools ranging from GitHub Copilot to Cursor, Windsurf, Anthropic’s Claude Code, and OpenAI’s coding tools are making engineers more productive, while demanding thoughtful “rapid engineering” rather than “vibe coding.”
  • Productivity gains differ by task: roughly 30–50% lift for production/legacy work versus 10x or more for quick prototypes thanks to relaxed requirements and richer building blocks.
  • Visual AI is unlocking PDFs at scale; an invoice-matching demo (e.g., Office Depot and Staples) was assembled in under half a day via agentic document extraction.
  • Voice applications face a latency–accuracy tradeoff (early builds saw up to 9-second lag); techniques like natural “stalling” fillers help keep perceived latency low, while guardrails keep answers policy-compliant.
  • Data engineering is shifting toward unstructured data, as “data gravity” wanes: egress can be ~$0.10/GB while GenAI processing runs ~$30–$40/GB, enabling globally distributed, best-of-breed pipelines.
  • Agent avatar design is application-dependent, from disembodied orbs to cartoonified or photorealistic faces, each with trade-offs in relationship-building, trust, distraction, and ethics.
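The data-gravity point in the list above rests on simple arithmetic, which can be sketched as follows. The only inputs are the approximate per-GB figures quoted in that bullet; the function name and the choice to express egress as a share of the combined per-GB cost are assumptions made for illustration.

```python
# Approximate figures quoted in the key points above (USD per GB).
EGRESS_USD_PER_GB = 0.10
PROCESSING_USD_PER_GB = (30.0, 40.0)  # low and high end of the quoted range


def egress_share(egress: float, processing: float) -> float:
    """Fraction of the combined per-GB cost that goes to egress."""
    return egress / (egress + processing)


for processing in PROCESSING_USD_PER_GB:
    share = egress_share(EGRESS_USD_PER_GB, processing)
    print(f"processing ${processing:.0f}/GB -> egress is {share:.2%} of the total")
```

At these figures egress comes to well under 1% of the combined cost (about 0.33% at $30/GB and 0.25% at $40/GB), which is why moving data to whichever service processes it best costs little relative to the processing itself.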

Takeaways

Start where the value is: applications. Use agentic orchestration to stitch together best-of-breed services, then keep your options open with evals and two-to-three-day model swaps. Prototype recklessly, but inside sandboxes, because 18 throwaways for 2 winners is a bargain, and yes, your PDFs are finally useful. Teach everyone just enough coding to be dangerous (in a good way), prepare for voice without promising instant refunds, and stop treating prototype code like a sacred relic: you can rewrite it next week while the coffee’s still warm.
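One way to picture "keep your options open with evals" is a small harness that scores each candidate model on a fixed eval set and reports the best one. This is a minimal sketch under stated assumptions: the `Model` signature, exact-match scoring, the two toy eval items, and both candidate models are hypothetical stand-ins, not anything described in the talk.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]      # assumption: prompt text in, answer text out
EvalSet = List[Tuple[str, str]]   # (prompt, expected answer) pairs


def accuracy(model: Model, evals: EvalSet) -> float:
    """Share of eval prompts answered with an exact match (a simplification)."""
    hits = sum(model(prompt).strip() == expected for prompt, expected in evals)
    return hits / len(evals)


def pick_best(candidates: Dict[str, Model], evals: EvalSet) -> str:
    """Name of the candidate with the highest accuracy on the eval set."""
    return max(candidates, key=lambda name: accuracy(candidates[name], evals))


# Hypothetical eval set and stand-in models so the sketch runs offline.
EVALS: EvalSet = [("2+2", "4"), ("capital of France", "Paris")]
ANSWERS = dict(EVALS)

CANDIDATES: Dict[str, Model] = {
    "model_a": lambda prompt: "4",                       # right on one item only
    "model_b": lambda prompt: ANSWERS.get(prompt, "?"),  # right on both items
}

if __name__ == "__main__":
    for name, model in CANDIDATES.items():
        print(name, accuracy(model, EVALS))
    print("best:", pick_best(CANDIDATES, EVALS))
```

In practice the scoring function would be task-specific, but the shape stays the same: as long as every candidate fits one interface and the eval set is fixed, comparing a newly released model is a matter of adding one entry to `CANDIDATES`.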
