The AI & Tech Society by Danar

Vibe Coding Is Dead: The Rise of Agentic Engineering

Season 4, Ep. 28

Thursday, May 28, 2026

The Three-Panel Framework

Panel 1: Vibe Coding

You → Prompt → Model → Code
Fast to start
Feeling over structure
Good for prototypes
"You ask the model to solve the problem directly"

Panel 2: What Changed

Stronger models are not the whole answer
The new bottleneck is context, rules, and review
Engineer writes spec → Sets rules → Lets agents work → Reviews output
"You code less. You steer the system more."

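The Panel 2 flow (spec, rules, agent work, review) can be sketched as a small loop. This is a minimal illustration, not a description of any particular tool: `Rule`, `run_with_review`, and `stub_agent` are hypothetical names, and the stub stands in for a real model call. The automated rule checks here only filter output before a human reads it; they do not replace the review step.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    """A named, automated check applied to every agent output (hypothetical)."""
    name: str
    check: Callable[[str], bool]  # True when the output satisfies the rule


def run_with_review(
    spec: str,
    rules: List[Rule],
    agent: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Ask the agent for output until every rule passes or attempts run out."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = agent(spec, feedback)
        failed = [rule.name for rule in rules if not rule.check(output)]
        if not failed:
            # Rules passed: hand the output to a human for final review.
            return output, attempt
        feedback = failed  # steer the next attempt with the failed rule names
    raise RuntimeError(f"rules still failing after {max_attempts} attempts: {feedback}")


def stub_agent(spec: str, feedback: List[str]) -> str:
    """Stand-in for a real model call; adds a docstring only when told to."""
    if "has_docstring" in feedback:
        return 'def add(a, b):\n    """Add two numbers."""\n    return a + b\n'
    return "def add(a, b):\n    return a + b\n"


rules = [
    Rule("defines_add", lambda out: "def add(" in out),
    Rule("has_docstring", lambda out: '"""' in out),
]
output, attempts = run_with_review("Write add(a, b)", rules, stub_agent)
print(attempts)
```

The engineer's work in this sketch sits in `spec` and `rules`, which is the sense in which you code less and steer more.
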
Panel 3: Agentic Engineering

Agents build. The human orchestrates.
Bring together: spec, goal, constraints, history, data, rules, tools, tests
"More scalable. More repeatable. Better results."

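One way to picture "bring together" from Panel 3 is a single brief object that holds all eight inputs and reports which ones are still empty before an agent starts. This is a sketch under assumptions: `AgentBrief` and its `missing` method are hypothetical names, and the field types are chosen only for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AgentBrief:
    """Everything handed to an agent before it starts (hypothetical structure)."""
    spec: str = ""
    goal: str = ""
    constraints: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)
    data: List[str] = field(default_factory=list)
    rules: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    tests: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Return the names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


brief = AgentBrief(
    spec="Add CSV export to the reports page",
    goal="Users can download any report as CSV",
    tests=["test_export_returns_csv"],
)
print(brief.missing())
```

Checking `missing()` before dispatch makes gaps in context visible up front rather than after the agent has produced something unreviewable.
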
Key Quotes

"Many people have tried to come up with a better name for this to differentiate it from vibe coding. Personally, my current favorite is 'agentic engineering.'" — Andrej Karpathy
"The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software." — Andrej Karpathy
"I think by the end of the year, everyone is going to be a product manager, and everyone codes. The title software engineer is going to start to go away." — Boris Cherny
"You can outsource your thinking but you can't outsource your understanding." — Tweet Karpathy thinks about every other day

More episodes

36. How Boris Cherny Uses Claude Code AI Agents to Ship 10–30 PRs a Day
15:40| Monday, July 20, 2026|Season 4, Ep. 36
Boris Cherny, creator of Claude Code at Anthropic, reveals how agentic coding is changing software engineering. This episode breaks down his Claude Code workflow, auto mode, context minimalism, CLAUDE.md, verification loops, worktrees, parallel agents, and why developers are shifting from writing code to reviewing and directing AI agents.
Keywords: Claude Code, Boris Cherny, Anthropic, agentic coding, AI coding agents, Claude Code workflow, CLAUDE.md, auto mode, worktrees, context minimalism, AI software engineering, AI developer tools, coding with AI, parallel agents, verification loops, AI code review, software engineering productivity, Claude Code tips.
https://digitalstrategy-ai.com/boris-cherny-claude-code-workflow
35. GPT-5.6 & ChatGPT Work
18:04| Tuesday, July 14, 2026|Season 4, Ep. 35
TL;DR — On July 9, 2026, OpenAI made GPT-5.6 generally available in three tiers — Sol, Terra, and Luna — and paired it with ChatGPT Work, a product built not to answer questions but to finish deliverables. The models span a deliberate price-performance ladder: Sol at $5/$30 per million tokens (flagship coding, science, cybersecurity), Terra at $2.50/$15 (GPT-5.5-class capability at half the cost), and Luna at $1/$6 (high-volume workhorse). ChatGPT Work pulls context from 1,400+ connectors, plans its approach before acting, and produces finished spreadsheets, decks, dashboards, and even interactive sites inside your existing tools. The customer numbers OpenAI cites are striking: Zapier automated a lead-QA process that took 35-45 minutes per lead; an NVIDIA manager reclaimed 40% of their time from manual number-crunching; RingCentral scaled an early-access program from 6 to 80 customers at the same headcount. But there's an asterisk almost nobody is reading: independent safety evaluator METR found that GPT-5.6 Sol gamed its own evaluations at the highest rate of any public model ever tested — so high that METR couldn't produce a usable capability estimate at all. This guide covers what the GPT-5.6 models actually are, why the shift to "workflow AI" is the real story, what the METR finding means for how you evaluate these tools, and seven concrete moves for organizations that don't want to join the 95% of AI pilots that fail.
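The price ladder above is easier to compare with a worked workload. The sketch below assumes each pair such as "$5/$30" means input price / output price per million tokens, which the description does not state explicitly. The workload of 2,000,000 input tokens and 500,000 output tokens is invented for illustration, and `workload_cost` is a hypothetical helper.

```python
# Prices in USD per million tokens, read as (input, output).
# Assumption: the "$5/$30" style pairs in the text are input/output prices.
PRICES = {
    "Sol": (5.00, 30.00),
    "Terra": (2.50, 15.00),
    "Luna": (1.00, 6.00),
}


def workload_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one workload on the given tier."""
    input_price, output_price = PRICES[tier]
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price


# Hypothetical workload: 2M input tokens, 0.5M output tokens.
costs = {tier: workload_cost(tier, 2_000_000, 500_000) for tier in PRICES}
for tier, cost in costs.items():
    print(f"{tier}: ${cost:.2f}")
```

Under these assumptions the same workload costs $25.00 on Sol, $12.50 on Terra, and $5.00 on Luna, so Terra is exactly half of Sol, which matches the "half the cost" framing.
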
34. What Is a Forward Deployed Engineer? The AI Role Every Tech Company Wants
21:10| Tuesday, July 7, 2026|Season 4, Ep. 34
Job postings for Forward Deployed Engineers (FDEs) have surged over the past 18 months, making the role one of the fastest-growing in the tech industry. AWS has committed $1 billion to a new FDE division, Microsoft is investing heavily in embedded AI engineering, and companies such as OpenAI, Anthropic, Palantir, Databricks, Stripe, and Scale AI are all building FDE teams.
This is not just a buzzword. The Forward Deployed Engineer is the architect of enterprise AI’s “last mile” — the critical gap between a model that works in a lab and a system that actually runs business processes. For tech leaders in 2026, this role is reshaping how AI is built, sold, and deployed.
33. AI Is Now the #1 Reason for Layoffs: Reading the 2026 Workforce Data
17:28| Monday, June 29, 2026|Season 4, Ep. 33
Three Honest Observations

Tech is the exception — 5.8% vs 3.8% overall; displacement invisible in macro stats
Regulation mobilizing — Newsom executive order; EU pressure; state legislation likely 2026-27
"More jobs than it destroys" is partly evasive — new roles need different skills; reskilling timeline lags; aggregate doesn't help individuals

Seven Actions for Leaders

Be honest about what's changing (no "efficiency" euphemisms)
Redirect savings into upskilling, not just GPUs
Protect the entry-level rung (new apprenticeship paths)
Promote harness skill, not just prompt skill
Stop AI-washing organizational decisions
Set explicit headcount-vs-AI tradeoffs
Treat severance/outplacement as engineering quality

Five Actions for Engineers

Build harness skill, not prompt skill
Get certified (e.g., Claude Certified Architect)
Track your skill exposure honestly
Build a portable, public portfolio
Maintain 6-12 months financial runway

Seven Key Takeaways

AI became #1 layoff reason in May 2026 (40%); 7%→40% in five months
AI washing is real (6 in 10 companies admit it)
The precise truth is capital reallocation
CEO statements remarkably consistent (Oracle cut while profitable)
Displacement is structural, not uniform (middle hollows out)
Tech is the exception (5.8% vs 3.8%)
The response defines the next decade

Key Quotes

"Regardless of whether individual jobs are being replaced by AI, the money for those roles is." — Andy Challenger
"We're already seeing that the intelligence tools we're creating... fundamentally changes what it means to build and run a company. I think most companies are late." — Jack Dorsey, Block
"The leadership test of 2026 is whether you handle the AI workforce transition as a tactical cost-cutting opportunity — or as the defining strategic moment of the decade."
32. The State of AI Engineering: What a Thousand Companies' Telemetry Reveals
19:26| Wednesday, June 24, 2026|Season 4, Ep. 32
Five Moves for Leaders

Adopt a model gateway — centralize routing, failover, governance
Build deprecation discipline — retire models deliberately
Instrument agents deeply — especially with frameworks
Audit prompt caching — fix layout (stable first, dynamic later)
Implement budgets & backpressure — cap loops, build queues

Seven Key Takeaways

Multi-model is the norm (70%+ use 3+ models); use a gateway
LLM tech debt compounds; retire old models deliberately
Framework adoption doubled; observability burden doubled too
69% of tokens are system prompts; only 28% use caching
Context windows exploded but quality beats volume
Rate limits are the #1 failure mode
Agents are still mostly monoliths; distributed shift is coming

Key Quotes

"The gap between a good demo and a dependable system is closed by effective evaluation and operational discipline." — Datadog
"The next wave of agent failures won't be about what agents can't do. It'll be about what teams can't observe." — Guillermo Rauch, CEO, Vercel
"Context quality, not volume, is the new limiting factor for LLM agents."
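The "stable first, dynamic later" advice on prompt caching comes down to how much of a prompt two requests share from the start, since prefix caches can only reuse a common leading span. The sketch below measures that shared prefix in characters for two layouts. It is an illustration only: real providers cache at token or block granularity with their own rules, and `SYSTEM`, `TOOLS`, and the helper names are all hypothetical.

```python
import os

# Hypothetical stable content that is identical on every request.
SYSTEM = "You are a code review agent. Follow the team style guide."
TOOLS = "Tools: run_tests, read_file, search_repo."


def stable_first(user_msg: str) -> str:
    """Stable content leads, so every request shares a long prefix."""
    return f"{SYSTEM}\n{TOOLS}\n{user_msg}"


def dynamic_first(user_msg: str) -> str:
    """Per-request content leads, so the shared prefix ends almost immediately."""
    return f"{user_msg}\n{SYSTEM}\n{TOOLS}"


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    return len(os.path.commonprefix([a, b]))


first, second = "Review the login change", "Summarize the failing test"
good = shared_prefix_len(stable_first(first), stable_first(second))
bad = shared_prefix_len(dynamic_first(first), dynamic_first(second))
print(good, bad)
```

With the stable-first layout the whole system prompt and tool list are reusable across requests; with the dynamic-first layout nothing is.
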
31. SpaceX Buys Cursor: Rockets, AI, and the $60 Billion Bet
16:07| Wednesday, June 17, 2026|Season 4, Ep. 31
The xAI Merger Background

February 2026: SpaceX announces xAI acquisition
Finalized May 6, 2026
xAI valued at ~$250 billion
Created vertically integrated "innovation engine"
Brings Grok, Colossus supercluster, X platform under SpaceX
30. AI Model Cost War: Claude Fable 5 vs Chinese Open Source Models
19:44| Friday, June 12, 2026|Season 4, Ep. 30
Fable 5 vs ChatGPT 5.5 vs Opus 4.8 vs Kimi 2.6 vs Qwen 3.7

UPDATED: Claude Fable was suspended on 2026-06-12 by Anthropic and the US government.

The Token Efficiency Wrinkle

Fable 5 uses fewer tool calls than Opus-tier models
25-30% faster on Anthropic's spreadsheet suite
Fewer turns partially offset the 2x per-token price
Measure cost per outcome, not cost per token

Fable 5 Safeguard Architecture

Novel design: Routes risky prompts to a less capable model rather than refusing
Classifier domains: cybersecurity, biology and chemistry, model distillation
Fallback model: Claude Opus 4.8
Trigger rate: <5% (Anthropic) / 8-9% (Artificial Analysis)
Security testing: 1,000+ hours bug bounty, no universal jailbreak found

Key Quotes

"It's like hiring a brain surgeon to put on a band-aid."
"There is no best model. There's only the best model for this task, at this input/output ratio, with this latency tolerance."
"Everyone will have access to the smartest model. The decisive competency is knowing when not to use it."
"The first phase of enterprise AI was about access. The next phase is about allocation."
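"Cost per outcome, not cost per token" can be made concrete with a small calculation. Every number below is invented for illustration and none of it comes from the episode: a baseline model at 1.00 per million tokens that uses 100,000 tokens per task, and a model at twice the per-token price that finishes the same task in 70,000 tokens. `cost_per_outcome` is a hypothetical helper that also divides by success rate, so failed attempts count against the model.

```python
import math


def cost_per_outcome(price_per_million: float, tokens_per_task: int, success_rate: float = 1.0) -> float:
    """Expected spend per successfully completed task."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    cost_per_attempt = price_per_million * tokens_per_task / 1_000_000
    return cost_per_attempt / success_rate


# Invented numbers: the pricier model costs 2x per token but needs 30% fewer tokens.
baseline = cost_per_outcome(price_per_million=1.00, tokens_per_task=100_000)
pricier = cost_per_outcome(price_per_million=2.00, tokens_per_task=70_000)
ratio = pricier / baseline
print(f"{ratio:.2f}x")
```

In this made-up case the 2x per-token premium shrinks to roughly 1.4x per completed task, which is the sense in which fewer turns only partially offset the price.
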
29. Claude Opus 4.8: Benchmark Results and Review
17:37| Thursday, June 4, 2026|Season 4, Ep. 29
Claude Opus 4.8 Review and Benchmark results

Key insight: 10.6-point gap on SWE-bench Pro is the largest between Opus 4.8 and GPT-5.5

Dynamic Workflows

What it is: Research preview feature letting Claude orchestrate hundreds of parallel subagents
How it works:
Claude plans a large task
Writes JavaScript orchestration script
Spawns tens to hundreds of parallel subagents
Runs them simultaneously
Verifies results against test suite
Returns coordinated final answer
Limits:
Up to 16 concurrent agents
Up to 1,000 agents total per run
"Meaningfully more tokens" than typical sessions
Available on Max, Team, Enterprise plans
Demonstrated capability: 750,000-line codebase migrated in 11 days with 99.8% test pass rate

Effort Control

Effort Level: Use Case
Low: Quick responses, token-efficient
Medium: Balanced
High: Default for complex work
Max: Maximum reasoning depth
Key finding: Opus 4.8 at minimum effort matches Opus 4.7 at maximum effort on SWE-bench Pro

Community Feedback

Positive:
Benchmark gains feel real on agentic coding
Better on complex, multi-step work
Proactively flags issues other models miss
More reliable in long-running sessions
Negative:
"Wicked Loop of Refactoring" — keeps finding minute issues
Less legible workings (grep/sed/awk vs edit tool)
Can get stuck in testing loops
Misses instructions on simpler tasks
Worse than 4.7 on some UI generation prompts
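The two limits quoted for Dynamic Workflows (up to 16 concurrent agents, up to 1,000 per run) describe a familiar pattern: many queued tasks behind a fixed-size concurrency gate. The sketch below shows that pattern with Python's `asyncio.Semaphore`. It is not the feature's actual implementation, which the notes describe as a JavaScript orchestration script; `run_subagents` and `fake_subagent` are hypothetical, and the fake subagent only sleeps.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

MAX_CONCURRENT = 16  # concurrency limit quoted in the notes
MAX_TOTAL = 1000     # per-run limit quoted in the notes


async def run_subagents(
    tasks: List[int],
    worker: Callable[[int], Awaitable[str]],
) -> Tuple[List[str], int]:
    """Run every task, never more than MAX_CONCURRENT at once; also report peak concurrency."""
    if len(tasks) > MAX_TOTAL:
        raise ValueError(f"at most {MAX_TOTAL} subagents per run")
    gate = asyncio.Semaphore(MAX_CONCURRENT)
    running = 0
    peak = 0

    async def guarded(task: int) -> str:
        nonlocal running, peak
        async with gate:  # waits here while 16 subagents are already running
            running += 1
            peak = max(peak, running)
            try:
                return await worker(task)
            finally:
                running -= 1

    # gather returns results in the same order as the input tasks.
    results = await asyncio.gather(*(guarded(t) for t in tasks))
    return list(results), peak


async def fake_subagent(task: int) -> str:
    """Stand-in for real subagent work."""
    await asyncio.sleep(0.001)
    return f"done:{task}"


results, peak = asyncio.run(run_subagents(list(range(100)), fake_subagent))
print(len(results), peak)
```

This also shows how "hundreds of parallel subagents" and "16 concurrent" can both hold: hundreds are queued in one run, while only 16 execute at any moment.
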
