#5 — Stop Your Agent Before It Breaks Prod - B.O.R.I.S

Imagine your agent just deleted a production database — could you have stopped it? The hosts argue that yes, three lines of bash in a single hook could have prevented it, and yet most teams have never configured one. In this episode, Andrey, Vladimir, and Fernando pull apart the agentic loop — the repeating cycle of reason, act, observe that makes coding agents appear to "think" — and show exactly where hooks slot in to give humans deterministic or automated harness-level control over non-deterministic AI behavior, depending on the hook type. Along the way, they unpack Anthropic's rapid-fire release week (Managed Agents, Routines, Opus 4.7, and the advisor tool pattern), debate whether LLM provider commoditization is real or an illusion, and warn that the same hook mechanism that protects your infrastructure can be weaponized through supply chain attacks.

Spotify Apple Podcasts RSS

Key Topics

News: Anthropic’s April Blitz — Managed Agents, Routines, Opus 4.7, and the Advisor Tool

Fernando kicks off the news with Anthropic’s launch of Managed Agents — a hosted infrastructure service that lets users run Claude-based agents on Anthropic’s servers without keeping a laptop on. For technical audiences used to CI/CD, the reaction is a collective shrug: scheduling agents is something they could already do by installing Claude Code or OpenCode as a GitHub Action, setting a cron schedule, and pointing it at a token provider. But Fernando points out that for non-technical users, the ability to schedule an agent that “did something by itself” can genuinely feel like magic. Fernando also mentions Routines, a related feature that lets users define workflows where the agent reasons through which tools to call rather than following a rigid step-by-step sequence.

Vladimir draws a useful distinction between two types of workflows: conventional scheduled automation is deterministic — predictable sequences that happen at exact times, repeatedly — while Routines enable non-deterministic workflows where conditions change too frequently to predict the right sequence in advance. Managed Agents and Routines are separate offerings — Managed Agents provides the hosted infrastructure API for running agents at scale without local resources, while Routines is a Claude Code feature for defining workflows where the agent reasons through which tools to use. Both deterministic and non-deterministic approaches can be combined, and as Fernando notes, the non-deterministic flavor connects directly to the episode’s main topic: if you cannot predict what the agent will do, you need guardrails to control it.

The hosts also note the release of Opus 4.7. Andrey flags it as more capable but costlier in practice — per-token pricing is unchanged from Opus 4.6, but a new tokenizer can map the same input text to more tokens, and higher-effort agentic runs may produce more thinking and output tokens, raising real-world per-request costs. Vladimir frames it as part of a trend across the 4.5, 4.6, and 4.7 releases where each version does more behind the scenes per API call — added reasoning, parallel execution, best-of-N behavior — which is good for reducing the number of calls but raises the per-call cost.

Andrey introduces the advisor tool pattern as the most practically interesting announcement for agentic workloads. Rather than orchestrating two separate agents (one doing work, one critiquing) with all the harness complexity that entails, the advisor tool lets a cheap executor model like Haiku call Opus for advice as a server-side tool within a single API request. Haiku handles all the tool calling and input processing; Opus provides the strategic intelligence. As Andrey puts it, it is “like with teenagers — sometimes you have to come into the room and take the phone away.” The advisor pattern avoids the chattiness problem of multi-agent setups while delivering close to Opus-level quality at a fraction of the cost.

Vladimir adds a cautionary note from his research experience: server-side tools may use caching, which can return stale results. While the advisor tool’s reasoning process is likely too dynamic to cache, the broader principle stands — be aware of what optimizations the provider applies behind the scenes.

The Commoditization Myth: Platform Lock-In Is Real

Andrey challenges the popular narrative that LLM providers are interchangeable. On the surface, swapping one model for another seems easy. But advanced platform features — prompt caching, adaptive thinking, server-side tools like the advisor pattern, interleaved thinking — are provider-specific capabilities that do not transfer. He draws a direct parallel to the cloud debate: “It’s the same talk we have when people say ‘I can just jump clouds,’ but then you’re limiting yourself to the base.” Vladimir adds that OpenAI’s new, more efficient endpoint reinforces the point — deep platform knowledge unlocks efficiencies that surface-level usage misses. Andrey also mentions Anthropic’s skills API for storing and sharing skills server-side, a feature Vladimir was unaware of, illustrating how quickly provider capabilities expand beyond what most users track.

What Is the Agentic Loop?

Fernando provides the clearest explanation: the agentic loop is the cycle that creates the illusion of AI “thinking.” A user sends a prompt. The LLM — a probabilistic model calculating the next token — generates a response. If the response includes a tool call, the harness executes the tool and captures the output. That output goes back to the LLM, which processes it and decides the next action. The cycle repeats until the LLM decides the task is complete.

Andrey recommends a GitHub repository by applerom called bash-agent-loop that implements the entire concept in roughly one to two hundred lines of bash — a minimal, readable demonstration of the five core steps: receive a task, send it to the LLM, execute the returned tool call, observe the output, and repeat.

Andrey then maps the agentic loop’s stages to hook attachment points: per-session events (session start, session end), per-turn events (user prompt submit, stop), and per-tool-call events (pre-tool use, post-tool use, post-tool use failure, permission request). Each stage represents a point where the harness — not the model — can inject its own logic, whether deterministic (command and HTTP hooks) or inferential (prompt and agent hooks that rely on model judgment).

Hooks: Where Human Logic Meets the Agentic Loop

Vladimir introduces hooks from his practical experience, noting he first encountered the concept in Cursor before it spread to Claude Code, OpenCode, Codex, and others. His position is firm: if a harness does not support hooks, it is time to move to one that does.

Andrey reads from the Anthropic documentation to formalize the concept: “Hooks are user-defined logic that fires at specific points in the agentic loop. They are not part of the model. They are part of the harness. The model cannot disable them.” Claude Code supports four handler types:

Command hooks — Shell scripts where exit codes control behavior (exit code 2 blocks blockable events such as PreToolUse, UserPromptSubmit, and Stop; for post-action events like PostToolUse the tool has already run and exit code 2 cannot undo the action, only feed back stderr)
HTTP hooks — POST requests to external validators or audit systems
Prompt hooks — Single-turn LLM calls for yes/no safety decisions
Agent hooks — Subagents with tool access for complex validation

Not every event supports all four handler types — for example, session start supports only command hooks — so check the event-specific docs before assuming a given hook type works at a given lifecycle point.

Command and HTTP hooks enforce deterministic harness logic. Prompt and agent hooks are automatic harness controls but use model judgment — their decisions are inferential, not guaranteed.

Vladimir shares his go-to pattern: a prompt-based hook that runs a cheap LLM call before every shell execution, asking simply “Does this command look safe to execute?” It sounds naive, he admits, but it works as an independent judge without the full session context. For more targeted control, he uses command-based hooks with matchers — for example, intercepting any command containing the terraform binary and blocking it if it includes apply or -auto-approve.

Practical Hook Use Cases

The hosts catalog real-world hook applications they have encountered:

Block dangerous commands — Pre-tool use with bash matchers, the most common use case (Vladimir’s Terraform example)
Protect secrets — Pre-tool use hooks that grep for API keys or block access to home directories containing dotfiles with credentials
Token compression — RTK (Rust Token Killer), an open-source tool that uses hooks to rewrite shell commands through the RTK proxy, which compresses output before it reaches the agent context. Discussed in episode #3 when Vladimir first teased the hooks topic. Andrey reports thirty to fifty percent savings in practice versus the claimed sixty to ninety percent, and notes it can break tools like jq parsing
Output size control — B.O.R.I.S uses hooks to check tool output size before it reaches the agent, preventing the context window overflow problem discussed in episode #2 where ten megabytes of logs can overwhelm an agent
Auto-formatting — Post-tool use hooks that run a formatter after every edit
Desktop notifications — OSAScript on macOS or notify-send on Linux for alerting when tasks complete
Session context injection — Fernando’s approach of using session-start hooks to pull in external decision logs from other repositories, giving the agent awareness of the broader platform ecosystem beyond the current codebase
Auto-staging — Running git add after post-tool use edit events
Analytics — Embedding usage tracking in tool call hooks to understand which tools are called most frequently
Stop-event overrides — When the model thinks it is done, a hook can instruct it to double-check its work — useful when teams find themselves repeatedly asking the model to verify after completion

Andrey connects hooks back to the feed-forward and feedback framework introduced in episode #4: pre-tool use is feed-forward (guiding before action), post-tool use is feedback (correcting after), user prompt submit is turn-level feed-forward, and stop events are turn-level feedback.

Stop Conditions: The Overlooked Design Decision

Vladimir raises a fundamental challenge: defining when an agent should stop. Chat interfaces like ChatGPT and Codex have built-in limits — Codex stops at ninety-three percent context window usage, ChatGPT’s research mode caps at ten to fifteen minutes. But custom harnesses require explicit stop criteria, and those criteria must be defined before the work begins. Vladimir frames this as a challenge for everyone building with agents: articulating the goal clearly enough that the agent knows when it has finished. “You have to think a bit before you run it,” he says. “What is actually the goal here?”

Hooks Are Not Sandboxed: The Security Trade-Off

Andrey delivers the episode’s sharpest warning: hooks run with the user’s full permissions and are not sandboxed. This mirrors the Git hooks model — and carries the same risks. Hooks can be distributed as part of plugins, meaning installing a third-party plugin from the internet could introduce hooks that exfiltrate secrets, open backdoors, or pull remote code onto your machine. Andrey draws a direct parallel to supply chain attacks on PyPI and npm, where post-installation scripts — effectively hooks — have been used to steal credentials. Fernando adds that hooks could call arbitrary HTTP endpoints, making silent data exfiltration trivial. The technology is early, security is an afterthought, and as Andrey puts it: “If you like the Wild West, you should accept the risk someone’s going to shoot you.”

Resources

Claude Code Hooks Reference — Official Anthropic documentation covering all hook events, handler types (command, HTTP, prompt, agent), matcher patterns, and JSON input/output schemas.
Anthropic Advisor Tool Documentation — API reference for the advisor tool pattern, enabling a cheap executor model to consult a more capable advisor model server-side within a single API request.
Anthropic Managed Agents — Coverage of Anthropic’s hosted agent infrastructure service, providing sandboxing, permissions, state management, and error recovery for running agents without maintaining local infrastructure.
Claude Code Routines Documentation — Official docs covering Routines as part of the Claude Code overview — saved configurations combining a prompt, repositories, and connectors that can be scheduled, API-triggered, or kicked off by GitHub events.
bash-agent-loop — Minimal agentic loop implemented in roughly 200 lines of bash by applerom, demonstrating the five core steps (receive, decide, act, observe, repeat) without framework dependencies.
RTK — Rust Token Killer — Open-source CLI proxy that uses hooks to rewrite shell commands through its proxy, compressing output to reduce LLM token consumption by 60-90% on common development commands.
Artificial Intelligence: A Modern Approach — The Russell and Norvig textbook (4th edition, 2022) that Vladimir references as containing the theoretical foundations for many patterns now being implemented in agentic AI systems.
The Advisor Strategy: Give Agents an Intelligence Boost — Anthropic’s blog post explaining the “small executor, big advisor” pattern and its benchmarks showing Haiku with an Opus advisor more than doubling standalone performance at 85% lower cost.

Join B.O.R.I.S Slack Playground

#5 — Stop Your Agent Before It Breaks Prod