Back to all posts
Published on · by Renaud Deraison

Why Bromure Agentic Coding is not a sandbox

A sandbox asks a developer to trade away the speed that makes a coding agent worth running — pre-approve every dependency, maintain an allowlist of domains, never touch a package the org hasn't vetted. So developers turn it off. Bromure Agentic Coding refuses that trade. It does not constrain what the agent does; it draws one hard line at the hypervisor and lets you do anything on the inside. This is the foundational case for why a boundary beats a sandbox, and the three guarantees the boundary makes true: no credentials to steal, wide tokens narrowed at the wire, and supply-chain attacks stopped before the tarball lands — plus the fourth the line now makes true: prompt injections caught in the content the agent reads, before the model obeys them.

A sandbox offers a trade: give up some of what makes a coding agent useful, and in return it will keep you safe. Developers refuse that trade every time — they turn the sandbox off, or never turn it on — and they are right to. The job of an agent is to move fast through messy, unvetted territory. Bromure Agentic Coding does not constrain that. It draws one hard line, at the hypervisor, and lets you do anything you want on the inside of it.

A foundational post — written once, for the rest to point at. Not about an incident. About why we built Bromure Agentic Coding the way we did, and why "it's a sandbox" is the description we keep correcting.

A developer is not a sysadmin.

The reason to put a coding agent on your machine is speed — the concrete kind: pulling a package nobody at your company has vetted, at eleven at night, to see if an idea holds. Try the library, run its example, throw it away if it doesn't fit. That isn't a flaw to discipline out of people. It is the workflow, and the whole reason the agent is worth having.

So any model that makes you pre-approve each dependency, or makes someone maintain a list of domains the agent may reach, is at war with the thing it protects. It slows the one activity it was deployed to enable — and a control that gets in the way of shipping has exactly one fate: it gets switched off. A control a developer turns off to get work done was never a control.

"Run a private, vetted mirror" misses the point the same way: the packages that matter for experimentation are the ones nobody has vetted yet. A mirror of approved dependencies is a mirror of yesterday's ideas.

So the question is not how to stop developers touching unvetted code. They have to; that's the job. It's how to let them touch all of it, freely, without their keychain being part of the deal.

The sandbox built for production doesn't fit the bench.

The first reach is a network sandbox — the hardened containers, NVIDIA's, the Docker-Compose recipes that make the rounds. The shape is always the same: enumerate the hosts the agent may reach, deny the rest. It works beautifully when the agent has a known, narrow job. A production agent talking to three internal services and one model endpoint lives behind an allowlist forever, because the list is short and stops changing.

Ideation has no short list, and it never stops changing. Nobody wants to maintain a whitelist of every registry, CDN, Git host, and one-off API an experiment might reach — it's never done, it grows every afternoon, and the day it blocks something legitimate is the day the developer disables it to unblock themselves. The allowlist isn't wrong; it secures an agent whose scope is already known. The whole value of the bench is that the scope is not known yet.

Take a concrete afternoon. A developer has Node config files that have grown to tens of megabytes, and the YAML parser they're on chokes on them. There are several candidates — js-yaml, yaml, yaml-js — and the only way to know which survives a 40 MB file without blowing the heap is to install all three and throw the file at each. Filing a ticket to get three libraries into the private mirror so they can be benchmarked is exactly backwards: the developer wants to test first and promote the winner into the mirror, not the other way round. Pre- vetting is the gate the experiment exists to walk through.

There's a quieter problem underneath. These tools harden the agent in production — deployed, scoped, supervised. But a supply-chain compromise lands on the developer's laptop, mid-experiment, with real credentials in ~/.aws and ~/.npmrc because that's where developers keep them. The model is strongest where the attacks aren't, and absent where they are.

The cat-and-mouse sandbox loses the second round.

The other instinct is to leave the agent on the host and jail the process — block the claude process from reading ~/.ssh and ~/.aws. It feels airtight, until you remember the agent's job is to write code, and the code runs on the same machine.

The agent scaffolds a project. Then you — or the agent, or a second tool — run npm install in the repo it just wrote. That npm install is not the jailed process. Its postinstall hook reads ~/.aws/credentials like any other program on your laptop, because that's exactly what it is: another program, sharing the one filesystem and the one keychain with everything else you run.

YOUR LAPTOP — one filesystem, one keychainPROCESS JAILclaude✗ read ~/.ssh✗ read ~/.awsblocked — feels airtight./repo — just written by the agentpackage.json "postinstall": "node x.js"SECOND SHELL — not jailed$ npm install runs postinstalla different process the jail never namedKEYCHAIN~/.aws/credentials~/.ssh/id_rsa~/.npmrcreal tokens✗ jail refuses this pathreads real creds — nobody jailed this
The cat-and-mouse hole. A process jail blocks the claude process from the keychain (the dashed path, refused). But the agent's actual output — a repo with a postinstall hook — runs in a sibling process the jail never heard of. Open a second shell, run npm install, and its postinstall reads ~/.aws/credentials freely and exfiltrates, because on one host every process shares one filesystem and one keychain. The jail drew its boundary around the wrong thing: the danger was never the named process, it was the shared machine.

The jail drew its boundary around one process. The danger was never the process; it was that every process on the machine shares one filesystem and one keychain. So you patch — jail the shell too, then node, then the package manager — and the attacker keeps finding a door you didn't lock, because on one host there's always another process, another path to the same secrets. That's the game, and the house always has another door.

Draw the line once, with the hypervisor.

Bromure stops playing by moving the boundary down a level. The agent, the shells it spawns, the packages it installs, and any code those packages run all live inside a per-profile Linux VM. Your real credentials never enter it.

The difference is the kind of line. A process jail is a policy about a process, and a sibling process sidesteps it. The VM boundary is the wall between guest and host, enforced by the hypervisor — the same wall that stops one VM from reading another's memory. There is no syscall for crossing it. Code inside cannot reason its way to your keychain, because the keychain is not in its world at all.

And inside that line, you are free. Install the unvetted package. Run its postinstall. Let the agent rewrite your shell config, fill the disk, break the toolchain. The VM is disposable and never held anything that matters. That freedom is the whole point — exactly what a sandbox takes away and what a clean boundary gives back. You don't get safety by making the bench less useful; you get it by making the bench a place where nothing valuable was in the room to begin with.

That same line is where three guarantees stop being advice the agent can overrule and become facts enforced below it — because everything the VM does to the outside has to cross it.

Three things the line makes true.

PER-PROFILE VM — install anything, run anything, break anythingagent · spawned shells · npm / pip · untrusted postinstall codefilesystem holds only stubs — aws_secret = stub-… _authToken = stub-… id_rsa = stub-…nothing here is real, so nothing here is worth stealingfetch a packageuse a credentialpush / drop / deleteHYPERVISOR BOUNDARY — the one line the code inside cannot cross① Block malicious packagesscan: OSV + socket.devcooldown: < 2 days heldmalicious tarball never lands② No creds to stealstub-token ⇄ real tokenswapped at the wirehost sweep finds placeholders③ Ask before mutatingread passes · write pausesprompt on the hostyou see the literal callHOST — real tokens, behind the broker~/.aws · ~/.ssh · ~/.npmrc — never enter the VM, swapped in only at the wire, only for a call you allowed
One boundary, three controls, enforced below the agent. Inside the VM the agent does whatever it wants. But every crossing is mediated by the hypervisor: a package fetch is scanned (OSV + socket.dev) and held if it's younger than the cooldown; a credential use meets a stub that the host proxy swaps for the real token at the wire and swaps back out; a state-changing write pauses for a prompt on the host. The real keychain sits below the line and never enters the VM. The agent cannot route around any of it, because there is no path across the boundary except through it.

No credentials to steal — the secret was never in the room.

Real tokens stay on the host. The VM gets stubs: syntactically valid credential files whose contents mean nothing on the public internet. When the agent makes a legitimate call that needs a real token, a host proxy swaps the stub for the real secret at the wire and swaps it back out of the response. A compromised dependency sweeping the filesystem for ~/.aws/credentials, ~/.npmrc, or id_rsa finds placeholders. This is token swap: the credential exists, the agent uses it for what it's for, and the copy that could be stolen exists nowhere the agent's world can reach.

Narrow the wide token — ask before use, ask before mutating.

Real tokens are usually broader than the task. The broker keeps grants short-lived and scoped, and — the part that earns its keep — treats a read and a write differently. A read passes. A state-changing call — a git push, a DROP TABLE, an AWS Terminate* — pauses at the boundary and asks you, on the host, showing the literal operation, not a summary the agent wrote.

That changes what a broad token means. One that could delete production can do so only when a human saw the exact call and said yes; the scope printed on the credential stops being the blast radius. The agent can't pre-approve itself, downgrade the mode, or read the grant — the decision lives on the far side of the line. When an agent deleted a production database in nine seconds, the principle was the same: the thing that wants to run the command should not be the thing that decides it's safe.

Don't be the first to find out — supply chain stopped at the door.

The best moment to stop a poisoned package is before it lands. Every fetch crosses the boundary, so the proxy scans it — OSV for known CVEs, socket.dev for what the databases haven't caught yet: rogue install scripts, typosquats, the compromise published an hour ago. And it enforces a cooldown: any release from the last two days (tunable) is simply not installable while the ecosystem catches up. A worm's whole window is the gap between publish and yank; refusing day-old packages is refusing to be the canary. postinstall hooks are stripped from the tarball on the way in, hash fixed so the install still verifies — so the package that lands lands inert. None of this asks the developer to vet anything. They pull whatever they want; the boundary is what waits.

Where everything else stops short

Most tools cover one layer. Bromure covers all of them.

Isolation, keeping secrets out of the agent, scoping how those secrets get used, scanning the supply chain, catching prompt injection — the field tends to pick one. Here's the same agent threat model run across the tools people reach for, and where each one ends.

Protection
Dev ContainerVS Code
nonokernel sandbox
agent-vaultoctokraft
Agent VaultInfisical
Docker SandboxesmicroVM
BromureAgentic Coding
Isolation boundary
Where the blast radius stops
Same container, shared kernel
Kernel allow-lists, no own kernel
Agent runs in place
Proxy only; agent unboxed
microVM, its own kernel
Hardware VM, its own kernel
Keep secrets out of the agent
Can it ever read the real credential?
Forwards SSH agent + git creds
Blocks key files; proxies some
Piped in; no read path
Proxy attaches on the wire
Host proxy injects headers
Stub swapped at the wire
Credential scope & approval
Per-use limits, read-only, expiry, consent
No per-use scoping
Approval flow + egress filter
Per-secret TTL; blocks shells
Egress filter per endpoint
Domain allow-list; in-VM code can still use it
Per-destination consent + TTL
Supply-chain scanning
Catching malicious / vulnerable packages
No registry scanning
Signing only, no pkg scan
Out of scope
Out of scope
No package scanning
Age-gate, OSV, socket.dev
Prompt-injection detection
Scanning untrusted content & rules files
PromptGuard + ModernBERT
Audit trail
Recording what the agent did
Container logs only
Immutable local audit
Request logging
Request logging
Full session trace, encrypted
Supply-chain inventory(Enterprise)
A record of every package fetched
Every dependency + verdict, searchable
Token usage(Enterprise)
Which files burn the most tokens
Per file, repo, and model
Full — built in, enforced Partial — limited or optional None — not addressed

Hiding a token isn't the same as governing its use. Docker Sandboxes keeps the raw value out of the VM — but its proxy still attaches that credential to any outbound request the sandbox makes, so a compromised package installed on the side can spend it against an allow-listed domain without ever seeing it. Only Bromure scans the package before it runs and gates each use — consent, read-only, a TTL — enforcing all five controls at one boundary the agent can't reach around.

Compiled from each project's public documentation, June 2026. Here, agent-vault refers to octokraft/agent-vault (pipe-based secret injection), distinct from Infisical's Agent Vault (HTTP credential proxy). Docker Sandboxes is an experimental preview whose brokered credentials stay usable by anything inside the VM. Bromure's fleet-wide package inventory and token-usage rollups are surfaced in Bromure Enterprise Manager. These tools move fast — see something out of date? Let us know.

The fourth thing: an instruction in the data is not an order.

The three guarantees above share an assumption worth surfacing: they all defend against code that takes something — a credential, a token, a fresh tarball's chance to run. There is an attack that takes nothing. It just tells the agent what to do. A line buried in a README the agent reads, a string in a fetched page, a sentence in a tool's output, a directive hidden in the CLAUDE.md the agent treats as standing orders — the model ingests it as context and obeys it as instruction. Leak the file. Weaken the check. Skip the test. A sandbox has no opinion about any of this, because nothing crossed a wall it watches: the instruction arrived as data, in content the agent was supposed to read.

But it did cross the line — everything the model sees does. So as of 2.4.0, the boundary reads it first, on-device, on the host side. A local PromptGuard classifier scores the untrusted content flowing to the model — file reads, web fetches, tool output — for instructions that have no business being there. And the rules files an agent obeys without question — CLAUDE.md, AGENTS.md, GROK.md — get a sterner double pass: a deterministic scan for invisible Unicode, bidirectional-text tricks, and "ignore previous instructions"-style meta-directives, plus a fine-tuned ModernBERT classifier for the calmly worded abuse a keyword filter misses. Per profile you pick the teeth: log it to the Security Log, ask and see the flagged text, or block the request before the model ever sees the poisoned span. Nothing leaves the Mac.

The placement is the same argument as the other three. An agent that has already swallowed an injection cannot be trusted to report it — the injection's first instruction is usually some version of don't mention this. The detector doesn't ask the agent. It reads the traffic on the far side of the line, where the agent's persuasion doesn't reach.

Where the line does not save you.

A boundary is a specific shape, not a magic word. Four honest edges:

The profile is long-lived, so persistence persists.

A Bromure profile is not a disposable disk. A payload that writes itself into a startup path can wake up in the next session — to a guest with no host keys and a broker that only speaks short-lived, prompted, scoped tokens. Presence in a room with nothing in it, but presence all the same.

A write you approve is a write that happens.

The prompt catches the call the agent didn't tell you about. It does not read your diff. Approve a git push and Bromure forwards it — including, in principle, a poisoned workflow you didn't notice. It moves the decision to you and shows the real operation; reading it is still your job.

The cooldown is a window, not a wall.

Two days is tuned to the observed publish-to-yank gap. A patient attacker can sit on a compromised version past the cooldown and be installable on day three. It starves same-day worms; it does not vouch for a package that merely got old. socket.dev and OSV still have to do their part.

Scope the broker on purpose.

Isolation contains the blast; scoping decides how big it could have been. A profile that only reads a repo shouldn't hold a token that writes it; one that never publishes should hold no publish token. The line keeps secrets out of the VM — which secrets exist at all is still your call.

The line we'll hold.

Here is the commitment. A developer should not have to become a sysadmin — maintain an allowlist, pre-clear every dependency, give up the speed that made an agent worth running — to keep their keychain. A sandbox makes safety a trade against usefulness, and developers sensibly keep choosing usefulness. A boundary refuses the trade: do anything on the inside, because the inside is expendable, and the four things that matter — your credentials, the scope of your tokens, the packages that reach you, the instructions that reach your model — are decided at a line the code inside can't argue with.

That is why "it's a sandbox" is the description we'll keep correcting. A sandbox constrains the agent. Bromure constrains the boundary, and sets the agent free. Bromure Agentic Coding is free, open-source, and shipped today.