A Cursor agent wiped 37GB of data on a machine where no OS security policy stood in its way. The forensic breakdown reveals four failure points every team using AI agents needs to fix.
A developer set up a Cursor agent to clean a project directory. It had file system access - that felt fine at setup time. Forty minutes later, 37GB of data was gone.
The forensic report doesn't point to a single dramatic failure. It shows four ordinary decisions that each looked reasonable on their own and proved catastrophic in combination.
This is the pattern in nearly every serious AI agent incident: not a single reckless choice, but a chain of defaults that nobody audited together.
When teams grant file system access, they think in semantic terms - "let it work in the project directory." That's not a technical boundary.
What actually gets configured is permission to read and write everything under a path. The agent was granted write access to a directory that turned out to be several levels up from where the team expected it to work. The directory tree below it held 37GB of files.
The developer who set this up wasn't being careless. They used the path they had open in their terminal at the time. Nobody checked how many files lived below that path.
The forensic report identifies four distinct places where this should have been caught.
Step 1: Scope was granted, not bounded. Permission and technical boundary are not the same thing. The agent was told it could access the directory - but nothing enforced that it had to stay within the intended subdirectory. Permission without constraint is a loaded gun on the table.
Step 2: No boundary enforcement layer. The agent traversed outside the working directory the team expected it to use. Nothing prevented this. No path restriction, no chroot, no symlink guard. The permission granted at setup time was the only control.
Step 3: OS security policies were not active. Access controls such as AppArmor on Linux, and TCC on macOS for protected user folders, exist specifically to create hard ceilings on process file access, even for processes running with user credentials. Dev machines almost never have these configured. The assumption is that developers know what they're running - and that assumption breaks when an agent, not the developer, is executing the operations.
Step 4: No review gate before irreversible action. The agent operated autonomously from start to finish. No confirmation prompt. No dry-run preview. No human approval before bulk deletion. The irreversible operation was indistinguishable in the workflow from any other operation.
Warning
Each of these four failures is independently recoverable. The problem is that all four appear together in most default agent configurations.
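The boundary check missing in Step 2 can be sketched in a few lines of shell. This is a minimal sketch under stated assumptions, not how Cursor or any agent runtime works: `is_within_scope` is a hypothetical helper, and it assumes a `realpath` command is installed (GNU coreutils or a recent macOS). It canonicalizes both paths before comparing them, so `../` hops and symlinks that lead out of the scope are rejected.

```shell
# Hypothetical helper: succeed only if TARGET resolves to a path inside SCOPE.
# Both paths are canonicalized first, so "../" segments and symlinks that
# point outside the scope are caught rather than trusted.
is_within_scope() {
  local scope target
  scope=$(realpath -- "$1") || return 2    # 2 = path could not be resolved
  target=$(realpath -- "$2") || return 2
  case "$target/" in
    "$scope"/*) return 0 ;;                # inside (or equal to) the scope
    *)          return 1 ;;                # anywhere else
  esac
}
```

A wrapper around a delete could then call something like `is_within_scope "$scope_dir" "$path" || exit 1` before touching `$path`. A check like this narrows the gap but does not replace OS-level enforcement, since the path can still change between the check and the operation.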
The third failure point deserves specific attention because it's the one most developers don't think about.
macOS TCC permissions (which guard protected locations such as Documents and Desktop), AppArmor and SELinux on Linux - these exist to create hard ceilings even for processes running with user credentials. A process can have full user permissions and still be restricted to specific directories by OS policy. Dev machines almost never have these configured.
The assumption was that developers know what they're running. That assumption breaks when the developer isn't executing the operations - an agent is. An agent runs with the effective permissions of the process that invoked it. It doesn't hesitate at large deletions the way a developer would. It doesn't read the warning dialog. It executes.
Tip
A dedicated low-privilege OS user for agent processes, restricted to a specific directory and file size threshold, would have limited this incident to a few MB regardless of the other failures.
The fourth failure point is the one most teams can fix today without infrastructure changes.
An agent that executes irreversible operations without human sign-off requires extraordinary justification. The missing review gate was a deliberate configuration choice to reduce friction, but the risk it carried was never weighed. Nobody sat down and said "we accept the risk of a bulk deletion with no confirmation." They just never asked the question.
The direct lesson: require confirmation before any bulk irreversible operation above a threshold. 10 files. 100MB. Pick a number. The specific threshold matters less than the existence of one.
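A threshold of this kind can be sketched as a small shell function. Everything here is illustrative: `needs_confirmation`, `MAX_FILES`, and `MAX_KB` are hypothetical names, and the defaults simply mirror the "10 files, 100MB" example above. Note that `du` reports disk usage rather than apparent file size, which is close enough for a tripwire.

```shell
# Hypothetical thresholds; the point is that some number exists.
MAX_FILES=${MAX_FILES:-10}
MAX_KB=${MAX_KB:-102400}   # 100 MB expressed in KiB

# Succeeds (exit 0) when deleting the given paths would exceed either
# threshold, meaning a human should confirm before anything runs.
needs_confirmation() {
  local files kb
  # $(( ... )) strips the padding some wc implementations add
  files=$(( $(find "$@" -type f | wc -l) ))
  # -c appends a grand-total line; take its first (size) column
  kb=$(du -skc "$@" | tail -n 1 | cut -f1)
  [ "$files" -gt "$MAX_FILES" ] || [ "$kb" -gt "$MAX_KB" ]
}
```

A caller might use it as `if needs_confirmation "$target"; then ...ask a human...; fi`, and proceed without asking only when the function fails.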
Four controls. Any one of them breaks the chain.
Scoped path binding, not broad permission. Write access to /project/src/temp specifically, not a parent directory three levels up. The agent stays within the intended working area because the boundary is technical, not semantic.
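OS-level process restrictions. A dedicated agent user, confined to a specific subtree by an OS policy such as an AppArmor profile, means even a runaway agent can't touch files outside that subtree. On macOS, TCC covers only protected locations, so confining an agent to an arbitrary subtree needs an additional mechanism.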
Dry-run with a confirmation threshold. Any operation touching more than N files should pause and surface the list. Not an error - a checkpoint. The agent shows its work before executing.
Review gate for bulk irreversible actions. An approval workflow for bulk deletions, even in otherwise autonomous workflows. The same pattern as a merge review, applied to file system operations.
Note
The 5-minute cooldown in MergeShield auto-merge config exists for exactly this reason: a window between decision made and action executed where a human or automated re-check can catch something wrong.
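The dry-run and review-gate controls can be combined into a plan-then-apply pattern, sketched below. This is one possible shape under stated assumptions, not a feature of any particular agent: `plan_delete`, `apply_delete`, and the `.approved` file convention are all hypothetical. Approval is bound to the exact manifest contents, so a plan that changes after review is refused. File names containing newlines are not handled.

```shell
# plan_delete MANIFEST PATH...
# Dry run: record every file that would be deleted, delete nothing.
plan_delete() {
  local manifest="$1"
  shift
  find "$@" -type f > "$manifest"
  echo "DRY RUN: $(( $(wc -l < "$manifest") )) files listed in $manifest" >&2
}

# apply_delete MANIFEST
# Deletes only if a reviewer approved this exact manifest by copying it to
# MANIFEST.approved (hypothetical convention). A missing or stale approval
# makes cmp fail, and nothing is removed.
apply_delete() {
  local manifest="$1" file
  if ! cmp -s "$manifest" "$manifest.approved"; then
    echo "refusing: $manifest has no matching approval" >&2
    return 1
  fi
  while IFS= read -r file; do
    rm -f -- "$file"
  done < "$manifest"
}
```

In this sketch the agent only ever calls `plan_delete`; a human reads the manifest, runs `cp plan.txt plan.txt.approved`, and only then does `apply_delete plan.txt` do anything irreversible.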
```shell
# Check scope BEFORE granting agent access
# (WORKING_DIR is the path you are about to hand to the agent)
find "$WORKING_DIR" -type f | wc -l
# Example output: 847293 - scope is way too broad

# Correct: bind to the specific subdirectory
export AGENT_SCOPE="/project/src/components"

# OS-level: run agent as a restricted user
# (the user name and flags below are illustrative; check your agent's CLI
# for the options it actually provides)
sudo -u cursor-agent \
  cursor-agent \
    --working-dir "$AGENT_SCOPE" \
    --max-files 500 \
    --dry-run-threshold 50
```

This incident involved Cursor. The four failure points show up in nearly every AI agent incident with file system impact, regardless of tool.
The permission model developers use for their own tooling doesn't translate to autonomous agents. When you run a command yourself there's friction - you read it, you hesitate before large deletions. Agents don't have that friction. Every control that historically relied on human judgment at execution time has to be replaced with explicit technical enforcement.
For code changes specifically, agent trust scoring provides a behavioral layer on top of attribution. Patterns in what changed, which files were touched, and how the scope compared to past PRs build a risk signal. Applied to an incident like the 37GB wipe, risk policies scoped to detect bulk deletions or unusually large file changes would have triggered a review requirement before execution.
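A bulk-deletion policy for code changes can be approximated with plain git output. The sketch below is an assumption-laden illustration, not MergeShield's pipeline: `bulk_deletion_check` is a hypothetical function that reads `git diff --name-status` output on stdin, where each deleted file appears as a line starting with `D`, and fails when the count exceeds a limit.

```shell
# bulk_deletion_check LIMIT
# Reads `git diff --name-status` output from stdin, counts deleted files
# (status letter D), and fails when the count exceeds LIMIT.
bulk_deletion_check() {
  local limit="$1" deleted
  # grep -c prints 0 but exits non-zero when nothing matches; keep the 0
  deleted=$(grep -c '^D' || true)
  echo "deleted files: $deleted (limit $limit)"
  [ "$deleted" -le "$limit" ]
}
```

In a CI step this might run as `git diff --name-status main...HEAD | bulk_deletion_check 20`, where the branch name and the limit of 20 are placeholders. A non-zero exit would then route the change to human review instead of letting it proceed automatically.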
The 37GB wipe is a filesystem incident. The governance lesson applies anywhere an agent can make irreversible changes without a human in the loop. Build the review gate before you need it.