documentation

[PROPOSAL: DESIGN ONLY - NOT YET IMPLEMENTED]

Controlled Edit Mode: Feasibility & Verification Report

Date: 2026-06-26
Subject: Executable Proof of Workspace & Build Lifecycle Safety


1. File Lifecycle Verification

We verified the agent’s core File IO and editing services by piping sequential commands directly into the cursor-agent binary.

Executed Command Sequence

echo "write:test_lifecycle.txt Hello World\nread:test_lifecycle.txt\nreplace:test_lifecycle.txt:World:Antigravity\nread:test_lifecycle.txt\nexit" | ./build/bin/cursor-agent

Captured Console Evidence

  local · llama3.2:3b

[offline · llama3.2:3b | llama3.2:3b]
> File 'test_lifecycle.txt' written successfully (11 bytes)
> Hello World
> Successfully replaced 1 occurrence(s)
> Hello Antigravity
> Goodbye
Agent run completed

Verification


2. Build Lifecycle & Recovery Verification

We simulated a broken edit cycle to verify the agent’s workspace checkpointing, compile failure detection, and automatic restore capabilities.

2.1 Checkpoint Creation (Baseline)

Before introducing any errors, we created a baseline workspace checkpoint test_cp via the interactive CLI:

echo "/checkpoint create test_cp \"initial clean state\"\nexit" | ./build/bin/cursor-agent

Output:

Creating checkpoint 'test_cp'...
 Checkpoint created with ID: 5d57002d
 Available checkpoints:
  5d57002d - test_cp (217 files, 5296 KB)

2.2 Introducing Compile Failure

We appended a syntax error to the end of src/main.cpp:

echo "invalid_syntax_error_here;" >> src/main.cpp

We then ran the build to verify compiler failure detection:

cmake --build build --config Release

Output (Exit Code: 2):

[ 28%] Building CXX object CMakeFiles/cursor-agent.dir/src/main.cpp.o
/Users/bniladridas/Desktop/cursor/src/main.cpp:655:1: error: a type specifier is required for all declarations
  655 | invalid_syntax_error_here;
      | ^
1 error generated.
make[2]: *** [CMakeFiles/cursor-agent.dir/src/main.cpp.o] Error 1

2.3 Restoring Workspace

We restored the workspace back to checkpoint 5d57002d using the C++ restore handler:

echo "/checkpoint restore 5d57002d\nexit" | ./build/bin/cursor-agent

Output:

Restoring from checkpoint 5d57002d...
Note: A backup will be created automatically before restore.
 Successfully restored from checkpoint 5d57002d

2.4 Verify Clean Build

We verified the file was completely reverted:

git diff src/main.cpp
# Output: (empty, 0 differences)

We then compiled the project again to ensure it successfully builds:

cmake --build build --config Release
# Output: (completed successfully, 100% build target built)

3. Loop Safety & Runaway Mitigation

We audited the C++ orchestrator loop to document early-termination and runaway mitigation conditions:

  1. Unconditional Iteration Cap:
    • Condition: iteration_count >= MAX_ITERATIONS in execution_engine.cpp:1027.
    • Threshold: Capped hard at 20 iterations. The agent loop can never exceed this limit regardless of LLM prompts.
  2. Confidence-Based Termination:
    • Condition: ConfidenceService::should_stop(combined_confidence, 0.2) in execution_engine.cpp:1190.
    • Behavior: If the compiler returns multiple failures, the build confidence scores drop (cr.score = 0.25). The average confidence score decreases with each attempt. If it falls below 0.2, the engine halts immediately and flags InsufficientEvidence.
  3. Execution Safety / Substring Filtering:
    • Condition: CommandService::is_dangerous_command checks all inputs.
    • Behavior: Commands containing blocked tokens (e.g. rm, sudo, dd) are automatically rejected prior to invocation.

To make Controlled Edit Mode even safer, the ToolRunner can automatically restore the workspace to the last clean checkpoint any time the build fails, ensuring the workspace is never left in a broken state if the agent exits.

toggle portrait / landscape