

AI News & Strategy Daily with Nate B. Jones

Your AI Agent Doesn't Need A Better Prompt. It Needs A Judge.

What's really happening when AI agents take real actions in production, and why do better prompts keep failing to stop them?


The common story is that prompt engineering and human approval will keep AI agents safe — but the reality is that frontier-model agents now need their own manager: a separate LLM-as-judge that guards your intent at the action boundary.


In this video, I share the inside scoop on the architectural pattern that's quietly replacing prompt-based guardrails in serious agentic systems:


• Why prompts and manual approval both break under real agent workloads

• How Lindy redesigned its system after agents started sending unauthorized emails

• What the four action-risk classes mean for read, write, and high-stakes calls

• Where correlated judgment fails and frontier models change the calculus


Builders shipping agents without a judge layer are gambling on every tool call — the teams who classify actions, instrument a four-way decision scope, and put a frontier model in the judge seat are the ones whose agents will actually be trusted to do real work.
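To make the pattern concrete, here is a minimal sketch of a judge sitting at the action boundary. It is an illustration under stated assumptions, not the design described in the episode: the show notes name read, write, and high-stakes calls but do not list all four risk classes or the four decisions, so the `Risk` and `Decision` members, the `RISK_BY_TOOL` table, and `stub_judge` (a stand-in for a call to a separate frontier model) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Risk(Enum):
    # Hypothetical risk classes; only read, write, and high-stakes
    # are named in the show notes. UNKNOWN is a placeholder.
    READ = "read"
    WRITE = "write"
    HIGH_STAKES = "high_stakes"
    UNKNOWN = "unknown"


class Decision(Enum):
    # Hypothetical four-way decision; the names are placeholders.
    ALLOW = "allow"
    REVISE = "revise"
    ESCALATE = "escalate"
    BLOCK = "block"


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


# Hypothetical mapping from tool name to risk class.
RISK_BY_TOOL = {
    "search_inbox": Risk.READ,
    "draft_email": Risk.WRITE,
    "send_email": Risk.HIGH_STAKES,
}

# A judge receives the user's stated intent and the proposed call.
Judge = Callable[[str, ToolCall], Decision]


def classify(call: ToolCall) -> Risk:
    """Look up the risk class; unlisted tools are treated as UNKNOWN."""
    return RISK_BY_TOOL.get(call.tool, Risk.UNKNOWN)


def gate(intent: str, call: ToolCall, judge: Judge) -> Decision:
    """Decide at the action boundary, before the tool actually runs."""
    risk = classify(call)
    if risk is Risk.READ:
        return Decision.ALLOW      # reads pass without a judge call
    if risk is Risk.UNKNOWN:
        return Decision.ESCALATE   # unclassified tools go to a human
    return judge(intent, call)     # writes and high-stakes calls are judged


def stub_judge(intent: str, call: ToolCall) -> Decision:
    """Stand-in for a separate LLM judge: block sends the user never asked for."""
    if call.tool == "send_email" and "send" not in intent.lower():
        return Decision.BLOCK
    return Decision.ALLOW


if __name__ == "__main__":
    call = ToolCall("send_email", {"to": "someone@example.com"})
    print(gate("Summarize my inbox", call, stub_judge))
```

The point of the sketch is the placement: the agent's prompt is never consulted, and the check runs on the proposed action itself against the user's original intent. In a real system `stub_judge` would be replaced by a call to a separate model, which is where the episode's concern about correlated judgment would apply.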


Subscribe for daily AI strategy and news.

For deeper playbooks and analysis: https://natesnewsletter.substack.com/

More episodes


  • AI Build Buy Hire Wait Decision Matrix for Teams

    27:45
    What's really happening inside AI investment decisions at most companies? The common story is that you need an AI strategy, but the reality is more complicated. In this video, I share the inside scoop on how to allocate capital across build, buy, hire, and wait for AI agents and workflows:
      • Why workflow shape, not AI strategy, drives investment
      • How to pick between automate, build, buy, hire, wait
      • What separates a real AI hire from a unicorn
      • Where most agentic AI projects quietly fail
    For operators and executives, the agentic era opens unprecedented upside, but only if you stop chasing a singular AI strategy and start making disciplined capital allocation decisions one workflow at a time.
  • Claude Recovered $400K in Bitcoin. That's Not Even the Big Story.

    24:17
    What's really happening inside the AI agent ecosystem this week? The common story is that the model launches are the main event, but the reality is more complicated. In this video, I share the inside scoop on five AI agent stories reshaping how real work gets done:
      • How Notion turned its workspace into an agent platform
      • Why Claude usage limits are breaking the subscription model
      • What Anthropic passing OpenAI on business customers signals
      • Where Mythos and GPT 5.5 push AI cybersecurity next
    For operators and builders, the agent era is opening real workflow leverage, but it also forces hard choices on pricing, security posture, and which AI stack to commit to.
  • SaaS Agent Licensing: What Your 2026 Renewal Will Look Like

    16:22
    What's really happening inside SaaS pricing as AI agents take over the work? The common story is that agents will just replace seats, but the reality is more complicated. In this video, I share the inside scoop on how the agent era is rewriting SaaS economics and what to negotiate before your next renewal:
      • Why seat-based pricing is breaking under AI agents
      • How Salesforce, Microsoft, and ServiceNow meter agentic work
      • What separates a fair agent license from rent-seeking pricing
      • Where SAP-style API policies could lock out your agents
    For operators and builders, the agentic shift is a real opportunity, but only if you negotiate the meter, the caps, and the access path before usage gets embedded and your leverage disappears.
    Chapters:
      00:00 Agentforce hits $800M run rate
      00:55 Four questions before your next renewal
      01:45 Why the seat model is breaking
      02:50 Salesforce Flex Credits and work units
      03:40 Microsoft Copilot credits and hybrid pricing
      04:45 The 8 billion token developer story
      05:30 ServiceNow Action Fabric and operational metering
      06:30 SAP 2026 API policy and agent lock-out
      07:45 Pricing follows platform control
      08:40 Fair license versus rent-seeking patterns
      10:00 What builders must know about cost structure
      11:30 Negotiating agent access before usage embeds
      13:00 The commercial unit of software is changing
  • The Enterprise AI Deployment Layer: Why Model Access Isn't Enough

    25:52
    What's really happening inside the AI agent implementation war? The common story is that the AI agent battle is between OpenAI and Anthropic on raw model quality, but the reality is that private equity, hyperscalers, consultancies, and systems of record are all converging on the implementation layer where trillions of dollars actually live. In this video, I share the inside scoop on why generic enterprise AI is getting squeezed from four directions at once:
      • Why frontier labs are moving down the stack into deployment
      • How private equity became a distribution channel for AI agents
      • What the implementation layer actually contains for AI agents
      • Where the real defensibility lives in agentic workflows
    Builders, buyers, and PE all need to get specific about workflow design, data access, authority, evals, and audit trails; generic AI wrappers will not survive the squeeze that is now hitting enterprise agentic workflows.
    Listen to this video as a podcast:
      - Spotify: https://open.spotify.com/show/0gkFdjd1wptEKJKLu9LbZ4
      - Apple Podcasts: https://podcasts.apple.com/us/podcast/ai-news-strategy-daily-with-nate-b-jones/id1877109372
  • RAG for AI Agents: Knowledge Layer Architecture Guide

    20:08
    What's really happening inside the AI agent memory infrastructure war? The common story is that bigger context windows and better vector search will solve it, but the reality is every serious infrastructure vendor is racing to fix a deeper problem that classic RAG can't touch. In this video, I share the inside scoop on why memory is now the real battleground for production AI agents:
      • Why classic RAG was built for chatbots, not agents
      • How Pinecone, PageIndex, SAP, and GraphRAG attack different shapes
      • What a retrieval contract actually looks like for AI agents
      • Where most agent builds quietly waste their token budget
    Builders who write down what their agent needs before picking a database will ship reliable systems; the ones who shop vendor-first will keep paying for rediscovery on every run.
  • Agentic Commerce Is A Protocol War. Here's Who's Fighting.

    18:41
    What's really happening inside the agentic commerce protocol war? The common story is that AI agents will just plug into existing checkout, but the reality is that six camps are fighting over who carries the responsibility when an agent spends your money. In this video, I share the inside scoop on the six layers where AI agents, merchants, and payment networks are battling for control:
      • Why ACP and UCP answer completely different merchant questions
      • How AP2 and Stripe authorization create the agent permission layer
      • What stablecoins and x402 unlock for machine-to-machine payments
      • Where AWS Bedrock Agent Core fits as the governance runtime
    Agentic commerce is the biggest internet economy shift since the 1990s; operators who understand the layers will shape it, and those who don't will get sidelined by it.
  • Enterprise AI Buying Process: Why Roadmaps Fail in the Build Room

    20:47
    What's really happening with AI agent security, and what does it mean for your AI roadmap? The common story is that McKinsey's Lilli platform had a security lapse, but the reality is a procurement and organizational design failure that most companies are quietly repeating right now. In this video, I share the inside scoop on why AI agent exploits are a strategy problem, not a tech hygiene problem:
      • Why 22 unauthenticated endpoints signal culture, not carelessness
      • How traditional SaaS procurement breaks down with AI agents
      • What every vendor announced this week and why it matters
      • Where to start if your AI stack can't distinguish humans from agents
    If your team is buying or building AI software this quarter, the cheapest move is bringing your developers to the table before you sign, not after.
  • Codex Plugins: Why the AI Bottleneck Moved to Workflow

    27:12
    What's really happening with Codex plugins, skills, prompts, and MCPs as agents start doing real work? The common story is that plugins are just app store add-ons, but the reality is more complicated. In this video, I share the inside scoop on the agentic scaffolding that actually makes AI useful:
      • Why prompts work for one-offs but break under repeated workflows
      • How skills encode your house style across any LLM you use
      • What plugins package up and why they're bigger than MCPs
      • Where hooks, scripts, and connectors fit inside the larger system
    For operators and builders, the leverage in 2026 lives in knowing which part of your workflow belongs in a prompt, a skill, a plugin, or an MCP, and packaging the right ones so your team can actually reuse them.