By Renaud Deraison

The agent should have asked first

In late April, a Cursor agent running Claude Opus 4.6 was sent to fix a staging problem at a small SaaS called PocketOS. It guessed that deleting a Railway volume would be scoped to staging, didn't verify, and wiped the production database and its backups in nine seconds. It later said it should have asked first. Bromure Agentic Coding 2.2 ships a guardrail that takes 'should have asked' out of the agent's hands.

An AI coding agent deleted a company's entire production database in nine seconds, then deleted the backups, then explained itself: "I guessed that deleting a staging volume via the API would be scoped to staging only. I didn't verify." The interesting word in that sentence is not "guessed." It is "I." The agent decided, on its own, that a destructive action was fine. The fix is not a smarter agent. It is to stop letting the thing that wants to run the command be the same thing that decides whether the command is safe.

Here is the story, which broke in late April 2026 and which several outlets covered, including Tom's Hardware and Tom's Guide.

A developer named Jer Crane, who runs a small SaaS called PocketOS, asked his coding agent to deal with a minor issue in staging. The agent — Cursor, driving Anthropic's Claude Opus 4.6 — hit a credential mismatch it didn't expect. Instead of stopping, it decided the problem was a stale Railway volume, and that deleting that volume would clear the way. It had an API token sitting in the project with enough scope to do exactly that. It used it. The volume ID it deleted was not scoped to staging; it backed the production database. Railway's API tore down the volume-level backups along with it. Total elapsed time, by Crane's account: nine seconds.

The agent's own post-mortem, quoted in the reporting, is the part worth sitting with. "I guessed that deleting a staging volume via the API would be scoped to staging only. I didn't verify." And: "I should have asked you first, or found a non-destructive solution."

It should have asked first. Hold onto that, because it is the whole post.

Nine seconds, narrated.

Nothing here is exotic. There is no zero-day, no malware, no attacker. Every step is a normal thing a coding agent does, in the normal order, slightly too fast for anyone to interrupt.

[Figure: timeline of the incident. Elapsed: ~9 seconds, no human in the loop at any step. 1. Task: fix a minor issue in staging (scope: staging only). 2. Surprise: credential mismatch it didn't expect (should stop here). 3. Decision: "Delete the volume to fix it" (guessed, did not verify). 4. Token: API token in the project, broad scope (can reach prod). 5. The call: DELETE /volumes/vol_prod_… (one request). 6. Outcome: production database gone; volume-level backups gone with it.]
The PocketOS incident as a sequence of ordinary steps. A staging task hits an unexpected credential error. The agent reasons its way to a destructive fix, finds a broadly-scoped API token already in the project, and calls the cloud provider's delete endpoint. The volume it deletes backs production; the same call's blast radius takes the backups. Each box is mundane. The damage is the composition.

Crane was, by his own account, lucky in the end. He had a backup that was about three months old, and — after the story spread — Railway reached out and restored the data the agent had deleted. But "the vendor saw the headline and helped" is not a recovery plan. The next team this happens to will not be a headline.

And it will happen to the next team. Crane put the blame on systemic failures by more than one party, and he is not wrong — a single token that can delete production and its backups is its own problem, and so is a delete API that takes the backups with the volume. But the failure that is going to repeat, everywhere, regardless of which agent or which cloud, is the one in step three. The agent decided.

"Ask first" is not something you can ask the agent to do.

The obvious lesson — the one in every comment thread under this story — is "the agent should have a confirmation step before destructive actions." This is correct. It is also already, sort of, how these tools are supposed to work. The agent is instructed to ask before doing irreversible things. It said so itself: it knew it should have asked. It didn't.

This is the trap, and it is worth being precise about it. When the confirmation lives inside the agent — as a system-prompt rule, a fine-tune, a "be careful" instruction — then the agent is both the thing proposing the dangerous action and the thing judging whether the action is dangerous enough to pause on. Most of the time it judges correctly. The PocketOS agent judged correctly thousands of times before this. Then it hit an unfamiliar error, reasoned its way to a confident wrong conclusion, decided this particular delete was the safe scoped kind, and skipped its own confirmation. There was no second opinion, because the only opinion in the room was the one that wanted to run the command.

You cannot fix this by telling the agent to be more careful, for the same reason you cannot fix a misread map by squinting harder. The check has to live somewhere the agent's reasoning cannot reach it.

What Agentic Coding 2.2 changes.

Bromure Agentic Coding runs your coding agent inside a disposable Linux VM, and every network request the agent makes leaves that VM through a proxy on the host. Since version 2.0, that proxy has enforced Guardrails: a host-side policy that classifies the agent's API calls by protocol and can block destructive ones outright — a DROP, a DELETE, a Terminate* — before they ever leave the machine. Blocked calls come back to the agent as an ordinary 403, so a confused or compromised agent in the VM cannot talk its way around them. The agent never sees the switch; it just sees the door is locked.
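To make "classifies the agent's API calls by protocol" concrete, here is a minimal sketch of the idea. It is not Bromure's implementation: the keyword sets, the function names, and the choice to treat unknown SQL as a write are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical keyword sets; the real Guardrails rules are not shown in this post.
DESTRUCTIVE_SQL = {"DROP", "TRUNCATE", "DELETE"}
READ_SQL = {"SELECT", "SHOW", "EXPLAIN"}
READ_HTTP = {"GET", "HEAD", "OPTIONS"}


def classify_sql(statement: str) -> str:
    """Classify a SQL statement by its leading keyword."""
    match = re.match(r"\s*([A-Za-z]+)", statement)
    keyword = match.group(1).upper() if match else ""
    if keyword in DESTRUCTIVE_SQL:
        return "destructive"
    if keyword in READ_SQL:
        return "read"
    return "write"  # anything unrecognized is treated as a write, not a read


def classify_http(method: str) -> str:
    """Classify an HTTP request by its method."""
    method = method.upper()
    if method == "DELETE":
        return "destructive"
    if method in READ_HTTP:
        return "read"
    return "write"


def blocked_status(kind: str) -> Optional[int]:
    """Under a block-destructive policy: 403 for destructive calls, None to forward."""
    return 403 if kind == "destructive" else None
```

The point of the sketch is where it runs, not how clever it is: the classification happens on the host, on the request as sent, so nothing the agent believes about its own intent enters the decision.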

Block-everything is the right setting for a locked-down profile, but it is too blunt for everyday work, because sometimes you do want the agent to delete a branch or drop a scratch table. So 2.2 adds a third setting, and makes it the default for new profiles: Prompt before write. Reads pass straight through. Every write — destructive or not — pauses at the boundary and pops a dialog on the host, outside the VM, showing you the literal operation the agent is trying to perform. Not a summary the agent wrote. The actual request: the SQL text, or the HTTP METHOD /path, or the AWS action name. You get four buttons: allow it once, allow it for fifteen minutes, allow it for the rest of the session, or don't allow it. Deny, and the agent gets its 403 and moves on.
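The prompt-before-write flow can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped code: `Choice`, `gate`, and the `ask_human` callback are hypothetical names, and the dialog itself is reduced to a function the host supplies.

```python
from enum import Enum
from typing import Callable


class Choice(Enum):
    """The four buttons described above."""
    ALLOW_ONCE = "allow once"
    ALLOW_15_MIN = "allow 15 minutes"
    ALLOW_SESSION = "allow session"
    DENY = "don't allow"


def gate(is_write: bool, operation: str,
         ask_human: Callable[[str], Choice]) -> bool:
    """Return True to forward the request, False to answer it with a 403."""
    if not is_write:
        return True  # reads pass straight through, no dialog
    # The human is shown the literal operation text, not an agent-written summary.
    choice = ask_human(operation)
    return choice is not Choice.DENY
```

Passing the dialog in as a callback keeps the decision logic testable without a UI, and makes it obvious that the only input the human sees is the operation string taken from the request.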

One honest thing first. PocketOS ran on Railway, and Railway is not a provider Bromure parses today, so I am not going to pretend a dialog would have popped for that exact call. But the failure mode is provider-agnostic, and the cleanest way to show it is on a provider Bromure does gate. AWS is the obvious one. The AWS version of "delete the database and take the backup with it" is a single call: rds:DeleteDBInstance with SkipFinalSnapshot=true, which deletes the instance and skips the final snapshot that would have let you undo it. Same nine seconds, same shape, on an API the guardrail reads.

[Figure: two panels. Without (creds live with the agent): the agent holds AWS creds and issues rds:DeleteDBInstance SkipFinalSnap…; the request leaves, nobody asked; prod deleted, no final snapshot, 9s. With Bromure 2.2 (prompt before write): the agent in the VM makes the same flawed guess and issues rds:DeleteDBInstance SkipFinalSnapshot=true; the host proxy reads the AWS action (Delete* ⇒ destructive write) and shows the dialog: Allow write on “AWS” from profile “pocketos”? rds:DeleteDBInstance prod-db-1 SkipFinalSnapshot=true. This deletes a production database and skips its final snapshot, not the staging fix you asked for. Buttons: Allow once, Allow 15 min, Allow session, Don't allow. The human clicks “Don't allow”; the agent receives a 403; the database is still there. Prod intact, staging work resumes, one click decided it.]
Left: the shape that bit PocketOS, transposed onto AWS. The agent holds AWS credentials and issues rds:DeleteDBInstance with SkipFinalSnapshot=true — delete the database, skip the backup — and the request leaves; the database is gone. Right: the same agent, same flawed reasoning, inside Bromure 2.2. The call leaves the VM, the host proxy reads the AWS action name (Delete* ⇒ destructive), and a dialog appears on the host showing the exact action. The human sees a production database being deleted with no final snapshot — plainly not the staging fix that was asked for — and clicks Don't allow. The agent receives a 403 and the database is still there.

Walk it through. The agent makes the same mistake — hits the credential error, reasons its way to the same wrong conclusion, reaches for the same delete. But the call leaves the VM and the host proxy reads it for what it is: it pulls the action name out of the request (DeleteDBInstance), matches it against the destructive set (Delete*, Terminate*, Destroy*, …), and sees a write against an endpoint in a profile set to prompt. A dialog appears on the host, titled with the operation, body showing rds:DeleteDBInstance prod-db-1 SkipFinalSnapshot=true. A human looking at that line does not need to understand the agent's reasoning. They asked for a staging fix and they are being shown a production database deletion with the backup turned off. They click "Don't allow." The agent gets a 403 it interprets as a failed API call, and goes looking for another approach — which is exactly what it said afterward it should have done in the first place.
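The "pull the action name out and match it" step might look roughly like this. The sketch assumes the request uses the AWS Query protocol, where the action travels as an `Action` parameter in a form-encoded body; how Bromure actually parses AWS traffic is not shown in this post, and the function names are hypothetical. The pattern list is only the part the text names, since the rest is elided there.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qs

# Patterns named in the text; the full destructive set is elided, so this is partial.
DESTRUCTIVE_PATTERNS = ("Delete*", "Terminate*", "Destroy*")


def aws_action(form_body: str) -> str:
    """Pull the Action parameter out of a form-encoded AWS Query request body."""
    params = parse_qs(form_body)
    return params.get("Action", [""])[0]


def is_destructive(action: str) -> bool:
    """Case-sensitive glob match of the action name against the destructive set."""
    return any(fnmatchcase(action, pattern) for pattern in DESTRUCTIVE_PATTERNS)


# Illustrative request body for the call in the walkthrough.
body = "Action=DeleteDBInstance&DBInstanceIdentifier=prod-db-1&SkipFinalSnapshot=true"
print(aws_action(body), is_destructive(aws_action(body)))
```

Matching on the action name rather than on anything the agent says about the call is what lets a read such as `DescribeDBInstances` through untouched while `DeleteDBInstance` stops at the boundary.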

The mechanism that makes this trustworthy is that the consent broker lives on the host. It is a small actor that the proxy calls before letting a write through. "Allow once" deliberately stores nothing, so the next write re-prompts and you can wave through a known-good sequence one step at a time; "15 minutes" and "session" cache the grant so you are not clicking through every git push. It even remembers a denial for a minute, so an agent that retries a refused write three times doesn't flood you with three dialogs. None of this state is reachable from inside the VM. The agent cannot read the grant, cannot pre-approve itself, cannot downgrade the mode. The decision was moved out of the agent, which is the entire idea.
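The grant rules above fit in a small amount of state. What follows is a sketch of those rules only, with assumptions flagged: the class name, the string choices, and the cache key (here an opaque string, which could be a protocol or an operation) are hypothetical, and the real broker is an actor on the host rather than a Python object.

```python
import time
from typing import Callable, Dict, Optional, Tuple

FIFTEEN_MINUTES = 15 * 60  # seconds
DENIAL_MEMORY = 60         # seconds; "remembers a denial for a minute"


class ConsentBroker:
    """Host-side memory of the human's decisions, keyed by a hypothetical string."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (allowed, expires_at); expires_at of None means "rest of session"
        self._cache: Dict[str, Tuple[bool, Optional[float]]] = {}

    def cached(self, key: str) -> Optional[bool]:
        """Return a remembered decision, or None if the human must be asked."""
        entry = self._cache.get(key)
        if entry is None:
            return None
        allowed, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._cache[key]  # expired: the next write prompts again
            return None
        return allowed

    def record(self, key: str, choice: str) -> bool:
        """Store the human's choice; return whether this write may proceed."""
        now = self._clock()
        if choice == "once":
            return True  # deliberately stores nothing
        if choice == "15min":
            self._cache[key] = (True, now + FIFTEEN_MINUTES)
            return True
        if choice == "session":
            self._cache[key] = (True, None)
            return True
        # Any other choice is a denial, remembered briefly to absorb retries.
        self._cache[key] = (False, now + DENIAL_MEMORY)
        return False
```

A proxy would call `cached` first and only raise a dialog when it returns `None`. The clock is injectable so the expiry rules can be checked without waiting fifteen minutes.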

What this does not fix, so we're clear.

A few honest edges, because the boundary is a specific shape and not a magic word.

Coverage is the providers we parse

Guardrails classify calls per provider — Kubernetes, AWS, DigitalOcean, the major git forges, container registries, and HTTPS databases like MongoDB, ClickHouse, and Elasticsearch. A cloud API the proxy doesn't yet know how to read isn't gated, full stop. Railway — the provider in this very story — is one we don't parse today, which is exactly why the walkthrough above moved to AWS. The protection is only as wide as the list of providers Bromure understands. That list is a real boundary, not an act of omniscience, and Railway is on the wrong side of it for now.

It doesn't make the agent right

Prompt-before-write stops a destructive call you didn't authorize. It does nothing about an agent that writes subtly wrong code, or proposes a delete that genuinely looks fine and isn't. The human still has to read the operation in the dialog. The win is that there is a dialog, with the real operation in it, at the moment it matters.

The token still shouldn't be that broad

A single API token scoped to delete production and its backups is a problem at the source, and Bromure's credential broker — which keeps the real token on the host and hands the VM a stub — narrows but does not erase it. Least-privilege tokens and backups the delete API can't reach are still your job. The guardrail is the last line, not the only one.
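On AWS, one way to narrow the token at the source is an explicit deny attached to whatever identity the agent's credentials map to. The sketch below builds such a policy document as a Python dict and serializes it; the `Sid` and the variable and function names are hypothetical, and the action pattern is illustrative rather than a complete list for any real account.

```python
import json

# Hypothetical policy for credentials an agent can reach. In IAM, an explicit
# Deny takes precedence over any Allow granted elsewhere.
AGENT_DENY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyRdsDeletes",
            "Effect": "Deny",
            "Action": ["rds:Delete*"],  # illustrative; extend per service in use
            "Resource": "*",
        }
    ],
}


def policy_json() -> str:
    """Serialize the policy document for attachment to a role or user."""
    return json.dumps(AGENT_DENY_POLICY, indent=2)
```

With something like this in place, a delete that slips past every other check still fails at the provider, which is the layering the paragraph above argues for.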

You can turn it off

"Allow for the rest of the session" is one click, and a tired developer will reach for it. If you grant a session-wide pass to a protocol and then walk away, you are back to an agent deleting things unattended. The default is prompt; keeping it prompt is a discipline, not a guarantee.

The part that generalizes.

Strip away Railway and Cursor and the specific nine seconds, and what is left is a pattern that is going to define the next few years of this work: agents act with real credentials at machine speed, and the caution we ship them is advice they are free to overrule. PocketOS is not a story about a bad model. Opus 4.6 did something a careful junior engineer might also do on a bad afternoon — guessed about a scope, guessed wrong, acted before checking. The difference is that the junior engineer's hands are slower than nine seconds, and there is usually someone in the room.

Bromure Agentic Coding 2.2 is a way to put someone back in the room, at the one boundary the agent cannot reason its way across — the wire out of the VM. Not for every keystroke, which nobody would tolerate, but for the writes, where the cost of "I guessed" is a database that isn't there anymore. It is free and open source. The default is to ask.