#3 — Skills, Powers, SOPs - B.O.R.I.S

What happens when your AI coding tool quietly starts billing like a cloud service — and your team burns through a thousand dollars in a week? Vladimir returns as the hosts share their sticker-shock moment with Cursor's new pricing before diving into agent skills. From Claude Code skills to Kiro Powers to AWS Strands SOPs, the naming varies but the idea is the same — plugging structured knowledge into an agent's brain on demand.

Key Topics

The Cursor pricing wake-up call

The episode opens with a cautionary tale. The team’s annual Cursor subscription renewed and shifted from generous included request allotments toward stricter usage-based overages tied to actual API costs. Within a single week, costs hit roughly a thousand dollars — per developer. Andrey notes they are “late to the party” since monthly subscribers felt the pain back in July 2025, but annual subscribers only got hit now. Fernando stresses the importance of watching automatic recharge settings: “You might get a surprise at the end of the month — or like happened to us, after a week.”

The team is now spreading its work across multiple tools. Andrey moved back to Kiro, which still offers a generous credit allowance. The team also picked up a Claude Code team subscription, though Vladimir reports burning through its limits in roughly an hour of non-intensive work. Fernando estimates that running an AI developer on model costs alone (no IDE seat, no human) would come to five to six thousand dollars per month. The consensus: the era of “just twenty bucks” is over, and teams need to budget for AI tooling like real infrastructure.

One bright spot from Cursor that the hosts want to keep: hooks — the ability to run shell or Python scripts before the AI executes a tool or command. Vladimir calls it his favorite feature and teases that hooks deserve their own episode. For CI/CD code review, the team replaced Cursor with Claude Code, primarily because of its straightforward AWS Bedrock integration. Vladimir also flags OpenCode as worth watching, since it supports Bedrock API keys and could enable temporary-role-based access — important for organizations with strict credential policies.
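
As a rough illustration of what a pre-execution hook might look like, the sketch below inspects a proposed shell command and returns an allow-or-deny decision. The payload shape (a `command` field in, a `permission` field out), the deny-list, and the function names are assumptions made for this example, not Cursor's documented hook contract; check the tool's own documentation for the real format.

```python
import json
import re

# Hypothetical deny-list for illustration; tune it to your own policy.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),        # wiping the filesystem root
    re.compile(r"\bgit\s+push\b.*--force\b"),   # force-pushing over history
    re.compile(r"\bcurl\b.*\|\s*(sh|bash)\b"),  # piping a download into a shell
]


def decide(event: dict) -> dict:
    """Return an allow/deny decision for the command in `event`.

    `event` is an assumed payload with a "command" key.
    """
    command = event.get("command", "")
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(command):
            return {
                "permission": "deny",
                "reason": f"blocked by pattern: {pattern.pattern}",
            }
    return {"permission": "allow", "reason": ""}


def handle(raw_payload: str) -> str:
    """Parse a JSON payload (empty input means no command) and return JSON."""
    event = json.loads(raw_payload) if raw_payload.strip() else {}
    return json.dumps(decide(event))


# Small demonstration with made-up commands.
for sample in ("ls -la", "git push origin main --force"):
    print(sample, "->", decide({"command": sample})["permission"])
```

Wired up as a real hook, a script like this would typically read the payload from standard input and print the result of `handle`. Keeping the decision logic in a pure function makes the policy easy to test outside the editor.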

A leaked model and a reason to prioritize security

Andrey brings up recently leaked documents suggesting Anthropic is close to releasing a model above the Opus class. The notable detail: Anthropic itself reportedly flagged the model’s “advanced cyber capabilities” and plans to give defenders early access so they can harden infrastructure before the model becomes widely available. Fernando frames it plainly — attackers previously constrained by time and resources can now automate at scale. “Forget about the script kiddies,” Andrey adds. Vladimir points out the implications extend to real-time phishing, where attackers could dynamically replicate legitimate UIs as they change. The pricing for such a model remains unknown, but the security implications provide a natural lead-in to the episode’s main topic.

What are agent skills?

Fernando reaches for The Matrix: skills are like plugging Kung Fu into Neo’s brain. An agent already has broad general knowledge, but a skill gives it a structured methodology for a specific task — “now you know how to do it, maybe better than you did before.” Vladimir adds that skills close a knowledge gap by sharing human experience and encoding company- or individual-specific processes. Andrey emphasizes that a good skill defines the desired outcome and structure, then lets the agent figure out the execution.

The most common use case is explaining internal tooling that the agent would otherwise not know how to use. But critically — and this connects directly to the context management discussion from episode #2 — skills use progressive disclosure. Unlike AGENTS.md or CLAUDE.md files that load at session start and immediately consume context, skills only surface their names and descriptions up front; the full skill content and supporting files load only when relevant. This means teams can maintain a large library of skills without bloating every session’s context window.
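
Progressive disclosure can be sketched in a few lines. The example below is a simplified model, not how any particular tool implements it: it assumes each skill lives in its own folder with a SKILL.md file whose header block carries `name` and `description` lines, reads only that header when building the index, and loads the full body on request. The folder layout, the `postmortem` skill, and the naive header parser (a stand-in for a real YAML parser) are all assumptions for illustration.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path


def read_summary(skill_file: Path) -> SkillSummary:
    """Read only the header lines between the two '---' markers."""
    fields = {}
    with skill_file.open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            raise ValueError(f"{skill_file} has no header block")
        for line in handle:
            if line.strip() == "---":
                break  # stop before the body; it is not needed yet
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return SkillSummary(
        name=fields.get("name", skill_file.parent.name),
        description=fields.get("description", ""),
        path=skill_file,
    )


def build_index(root: Path) -> list[SkillSummary]:
    """Cheap step done at session start: names and descriptions only."""
    return [read_summary(p) for p in sorted(root.glob("*/SKILL.md"))]


def load_body(summary: SkillSummary) -> str:
    """Expensive step done on demand: the full instructions."""
    text = summary.path.read_text(encoding="utf-8")
    _, _, rest = text.partition("---\n")
    _, _, body = rest.partition("---\n")
    return body.strip()


# Demonstration with a throwaway skill folder.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "postmortem"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\n"
        "name: postmortem\n"
        "description: Draft a postmortem in the team format\n"
        "---\n"
        "# Steps\n"
        "1. Collect the incident timeline.\n",
        encoding="utf-8",
    )
    index = build_index(Path(tmp))
    body = load_body(index[0])

print([(s.name, s.description) for s in index])
```

Only the index entries would sit in context for every session; `load_body` runs when a skill is actually needed.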

Anatomy of a skill: more than just Markdown

At its core, a skill is a Markdown file with plain-English instructions. But a plugin — the distribution container — can bundle much more: multiple skill files, shell scripts, text files, MCP server configuration, and even sub-agent definitions. Scripts are particularly useful when deterministic output matters. As Fernando explains, instead of the agent fumbling through commands one by one, a script can collect information quickly and hand back structured results for the agent to reason over. This combination of natural language guidance and deterministic scripts helps reduce hallucinations and keeps outputs consistent.
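
To make the role of a bundled script concrete, here is a minimal sketch of the kind of helper a skill might ship: it gathers a few facts in one pass and returns them as structured JSON for the agent to reason over. The specific fields and the function name are hypothetical choices for this example; a real skill would collect whatever its task needs.

```python
import json
import platform
import shutil


def collect_diagnostics(path: str = ".") -> dict:
    """Gather a small, fixed set of facts about the local environment."""
    usage = shutil.disk_usage(path)
    used_percent = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "python_version": platform.python_version(),
        "os": platform.system(),
        "disk_free_gb": round(usage.free / 1e9, 2),
        "disk_used_percent": used_percent,
    }


# Emit one JSON document so the agent gets the same shape every time.
print(json.dumps(collect_diagnostics(), indent=2, sort_keys=True))
```

The values vary by machine, but the keys and types do not, which is the property that keeps the agent's downstream reasoning consistent.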

The security minefield of third-party skills

The hosts are blunt: installing skills from the internet is risky. Vladimir draws the analogy of giving a stranger the keys to your house. Unlike a Docker image that at least runs in a container, downloaded skill scripts execute with your permissions and full filesystem access. Fernando references the OpenClaw ecosystem, where agents can discover and install skills autonomously. Vladimir cautions that public skill hubs already contain confirmed malware — credential stealers, contact stealers, and worse — and that third-party skills should be treated as untrusted code. Independent audits reinforce the concern: Lakera’s research found a coordinated malware campaign and widespread insecure patterns across public skill registries, including rampant command injection vulnerabilities and OAuth over-provisioning. Fernando’s advice: understand the concepts, then write your own skills so you maintain control.

Building a private skill marketplace

The safe alternative is what Claude Code calls a “plugin marketplace” — which, despite the grand name, is simply a private GitHub repository containing a JSON index of your plugins. The hosts recommend organizing by team: each team owns a plugin where they manage their own skills, MCP configurations, and sub-agents without polluting the shared repository. When a skill proves broadly useful, it can be promoted to a default plugin available to everyone. The distribution mechanism is still raw — teams may need to build automation around updates — but the pattern is sound and keeps control within the organization.
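
The team-per-plugin layout can be sketched as a small generator for the index file. The field names (`name`, `plugins`, `source`, `description`), the `plugins/<team>` directory layout, and the team names are illustrative assumptions, not the schema Claude Code expects; consult the official documentation for the actual format.

```python
import json


def build_index(marketplace_name: str, team_plugins: dict) -> dict:
    """Build a JSON-serializable index with one plugin entry per team."""
    plugins = []
    for team, description in sorted(team_plugins.items()):
        plugins.append(
            {
                "name": f"{team}-plugin",
                "source": f"./plugins/{team}",  # hypothetical repo layout
                "description": description,
            }
        )
    return {"name": marketplace_name, "plugins": plugins}


# Hypothetical teams; "default" holds skills promoted for everyone.
marketplace = build_index(
    "internal-marketplace",
    {
        "default": "Skills promoted for the whole organization",
        "platform": "Platform team skills and MCP configuration",
        "payments": "Payments team skills and sub-agents",
    },
)
print(json.dumps(marketplace, indent=2))
```

Generating the index from a single mapping is one way to start on the update automation the hosts mention, since the file can be rebuilt in CI whenever a team adds or promotes a plugin.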

Skills, Powers, SOPs — different names, same idea

The ecosystem is fragmented in naming but converging in concept. Claude Code has skills. Amazon’s Kiro has Powers. AWS Strands agents use SOPs (Standard Operating Procedures). Vladimir notes that despite efforts to draw distinctions, the comparisons he has found show minimal real difference — “just calling the same thing, different names.”

SOPs are particularly natural for DevOps workflows. Andrey gives two concrete B.O.R.I.S examples. First: a legacy service that periodically misbehaves. Instead of manually repeating the same diagnostic steps, an SOP encodes the runbook — what to check, in what order, and what constitutes a real problem. Second: generating postmortems. B.O.R.I.S has access to Slack, metrics, AWS, and code. An SOP defines each team’s preferred postmortem format and storage location, so the agent produces a document that actually fits the team’s workflow.
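
SOPs themselves are written in natural language, so the Python below is only a model of the structure the first example describes: what to check, in what order, and what counts as a real problem. The metric names and thresholds are invented for illustration and do not come from the episode.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    is_problem: Callable[[dict], bool]


# Ordered runbook: hypothetical metrics and thresholds.
RUNBOOK = [
    Check("error rate", lambda m: m["error_rate"] > 0.05),
    Check("p99 latency", lambda m: m["p99_latency_ms"] > 2000),
    Check("queue depth", lambda m: m["queue_depth"] > 1000),
]


def run_runbook(metrics: dict) -> list[str]:
    """Run every check in order and return the names of those that fail."""
    return [check.name for check in RUNBOOK if check.is_problem(metrics)]


sample = {"error_rate": 0.01, "p99_latency_ms": 3500, "queue_depth": 10}
print(run_runbook(sample))
```

Writing the thresholds down, whether in prose or in code, is what separates "the service looks odd" from an agreed definition of a real problem.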

Skills versus static configuration files

The hosts clarify an important distinction. Files like AGENTS.md, CLAUDE.md, Cursor rules, or Kiro steering rules load at session start and immediately occupy context. They are best used as a lightweight reference (what the project is about, where things live) that gives the agent a head start. Skills, by contrast, use progressive disclosure: their names and descriptions are read at startup, but the full content and supporting files load only when the agent determines they are relevant, whether through hints in the skill definition or explicit slash commands. This is what makes large skill libraries practical without the context-window penalties discussed in episode #2.

Fernando adds that skills can live at the user level (available across all projects on a machine) or the project level (scoped to a specific codebase). Some tools, such as OpenCode, also support Claude-compatible SKILL.md skills, while others, like Kiro and Strands, define their own formats.

Resources

Claude Code Skills Documentation — Official Anthropic documentation covering skill creation, the SKILL.md format, plugin structure, and marketplace setup.
Anthropic Skills Repository — Anthropic’s public GitHub repository of first-party agent skills.
Introducing Strands Agent SOPs — AWS blog post on the SOP format for AI agent workflows, including parameterized inputs and constraint-based execution.
Strands Agent SOP GitHub Repository — Open-source SOP SDK and MCP server, compatible with Strands Agents, Kiro, Claude Code, Cursor, and other MCP-enabled tools.
OpenCode — Open-source terminal-based coding agent supporting AWS Bedrock, Anthropic, OpenAI, and other providers, with no code or context data stored externally.
The Agent Skill Ecosystem: When AI Extensions Become a Malware Delivery Channel — Lakera’s audit of 4,310 published OpenClaw skills, identifying 44 tied to the ClawHavoc malware campaign (with 12,559+ downloads) alongside widespread insecure patterns including command injection vulnerabilities in 43% and OAuth over-provisioning in 70% of analyzed skills.
Cursor Pricing Explained 2026 — Detailed breakdown of Cursor’s mid-2025 pricing overhaul from fixed request allotments to usage-based billing tied to actual API costs.
Anthropic “Mythos” Model Leak — Fortune’s reporting on the leaked next-generation Anthropic model with unprecedented cyber capabilities, and the plan to give defenders early access.

