Sandboxes

Longshot workers execute in isolated sandboxes — ephemeral cloud environments that are created, used once, and destroyed. This prevents conflicts, ensures reproducibility, and enables massive parallelization.

Ephemeral Model

Unlike persistent worker pools, Longshot uses a create-use-destroy lifecycle for each task:

// From worker-pool.ts:132-217
async assignTask(task: Task): Promise<Handoff> {
  const worker = {
    id: `ephemeral-${task.id}`,
    currentTask: task,
    startedAt: Date.now(),
  };
  
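  // Spawn a Python subprocess that manages the Modal sandbox.
  // `payload` is the task payload for this sandbox; its construction is not shown in this excerpt.
  const handoff = await this.runSandboxStreaming(task.id, task.branch, payload);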
  
  // Sandbox self-terminates after handoff is written
  return handoff;
}

There is no persistent pool. start() and stop() are no-ops. Each task spawns its own isolated environment.

Sandbox Lifecycle

Step-by-Step

Orchestrator receives task from planner
Python spawner (infra/spawn_sandbox.py) provisions Modal sandbox
Modal creates isolated container with Node.js + git
Sandbox clones target repository to /workspace/repo
Worker runner (packages/sandbox/src/worker-runner.ts) starts Pi agent session
Pi agent executes task using full toolset (read, write, edit, bash, grep, find, ls)
Safety-net commit captures all uncommitted changes
Build check runs tsc --noEmit to detect type errors
Git push uploads task branch to remote
Handoff written to /workspace/result.json
Modal terminates container
Orchestrator receives handoff and enqueues branch for merge

Sandbox Environment

File Structure

/workspace/
  ├── task.json          # Input: task payload from orchestrator
  ├── result.json        # Output: handoff from worker
  ├── AGENTS.md          # Worker system prompt
  └── repo/              # Cloned target repository
      ├── .git/
      ├── src/
      ├── package.json
      └── ...

Task Payload

interface TaskPayload {
  task: Task;              // Task definition
  systemPrompt: string;    // Worker agent instructions
  llmConfig: {
    endpoint: string;      // LLM API endpoint
    model: string;         // Model name
    maxTokens: number;
    temperature: number;
    apiKey?: string;
  };
  repoUrl?: string;        // Git repository URL
  trace?: {                // Distributed tracing context
    traceId: string;
    parentSpanId: string;
  };
}

Written to /workspace/task.json before worker starts.
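
As a rough illustration of how a worker might load this file, the sketch below parses the JSON and checks the fields it cannot run without. The names parseTaskPayload, loadTaskPayload and MinimalTaskPayload, as well as the specific checks, are assumptions made for this example and are not taken from worker-runner.ts.

```typescript
import { readFileSync } from "node:fs";

// Reduced, hypothetical view of TaskPayload: only the fields this sketch checks.
interface MinimalTaskPayload {
  task: { id: string };
  systemPrompt: string;
  llmConfig: { endpoint: string; model: string };
  repoUrl?: string;
}

// Parse a JSON string and verify the required fields are present.
function parseTaskPayload(raw: string): MinimalTaskPayload {
  const data = JSON.parse(raw) as Partial<MinimalTaskPayload>;
  if (!data.task || typeof data.task.id !== "string") {
    throw new Error("task payload is missing task.id");
  }
  if (typeof data.systemPrompt !== "string") {
    throw new Error("task payload is missing systemPrompt");
  }
  if (
    !data.llmConfig ||
    typeof data.llmConfig.endpoint !== "string" ||
    typeof data.llmConfig.model !== "string"
  ) {
    throw new Error("task payload is missing llmConfig.endpoint or llmConfig.model");
  }
  return data as MinimalTaskPayload;
}

// Read the payload from disk; the default path comes from the file structure above.
function loadTaskPayload(path: string = "/workspace/task.json"): MinimalTaskPayload {
  return parseTaskPayload(readFileSync(path, "utf-8"));
}
```

Failing fast on a malformed payload keeps the error close to its cause, instead of surfacing later as an unrelated agent failure.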

Handoff Output

interface Handoff {
  taskId: string;
  status: "complete" | "partial" | "blocked" | "failed";
  summary: string;
  diff: string;
  filesChanged: string[];
  concerns: string[];
  suggestions: string[];
  buildExitCode?: number;  // tsc --noEmit exit code
  metrics: {
    linesAdded: number;
    linesRemoved: number;
    filesCreated: number;
    filesModified: number;
    tokensUsed: number;
    toolCallCount: number;
    durationMs: number;
  };
}

Written to /workspace/result.json after worker completes.
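
To make the role of status and buildExitCode concrete, here is a minimal sketch of how a consumer of the handoff could decide on a next step. The triageHandoff function, the NextStep values and the policy itself are hypothetical; this page does not describe how the orchestrator actually weighs these fields.

```typescript
type HandoffStatus = "complete" | "partial" | "blocked" | "failed";

// Only the fields this sketch inspects; the full Handoff carries more.
interface HandoffSummary {
  taskId: string;
  status: HandoffStatus;
  buildExitCode?: number; // undefined when no build check ran
}

type NextStep = "merge" | "review" | "retry";

// Hypothetical policy, not Longshot's actual merge logic.
function triageHandoff(h: HandoffSummary): NextStep {
  if (h.status === "failed") return "retry";
  // A missing exit code means no tsconfig.json was found, so there is nothing to fail.
  const buildOk = h.buildExitCode === undefined || h.buildExitCode === 0;
  if (h.status === "complete" && buildOk) return "merge";
  // Partial or blocked work, or a failing type check, needs a closer look.
  return "review";
}
```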

Worker Agent Session

Inside the sandbox, a Pi coding agent session is created with full capabilities:

// From worker-runner.ts:194-203
const { session } = await createAgentSession({
  cwd: WORK_DIR,
  model,
  tools: fullPiTools,  // All 7 Pi tools
  authStorage,
  modelRegistry,
  sessionManager: SessionManager.inMemory(),
  settingsManager: SettingsManager.inMemory(),
  thinkingLevel: "off",
});

Full Pi Toolset

// All 7 built-in Pi tools — gives workers full filesystem and search
const fullPiTools = [
  ...codingTools,  // read, bash, edit, write
  grepTool,        // ripgrep-powered content search
  findTool,        // glob-based file search
  lsTool,          // directory listing
];

Workers have unrestricted access to the repository within their sandbox. They can read any file, run any command, modify anything. Isolation prevents cross-task interference.

Safety-Net Commit

After the Pi agent finishes, a safety-net commit captures any uncommitted changes:

// From worker-runner.ts:284-293
if (!isEmptyResponse) {
  safeExec("git add -A", WORK_DIR);
  const stagedFiles = safeExec("git diff --cached --name-only", WORK_DIR);
  if (stagedFiles) {
    safeExec(
      `git commit -m "feat(${task.id}): auto-commit uncommitted changes"`,
      WORK_DIR
    );
  }
}

This ensures all work is saved even if the agent forgot to commit. The orchestrator always gets a complete diff.
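
The excerpt relies on a safeExec helper that is not shown. Because the result of git diff is used directly in an if check, the helper presumably returns the command's output as a string and does not throw. One plausible shape, offered as an assumption and not as the real implementation:

```typescript
import { execSync } from "node:child_process";

// Hypothetical stand-in for safeExec: returns trimmed stdout,
// or an empty string if the command exits with a non-zero status.
function safeExec(command: string, cwd: string): string {
  try {
    return execSync(command, {
      cwd,
      encoding: "utf-8",
      stdio: ["ignore", "pipe", "pipe"], // capture stderr so it does not leak into worker logs
    }).trim();
  } catch {
    return "";
  }
}
```

With this shape, an empty staging area and a failed git command both yield an empty string, so the commit step is skipped in either case.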

Post-Task Build Check

If a tsconfig.json exists, the sandbox runs a build check after task completion:

// From worker-runner.ts:296-307
if (!isEmptyResponse && existsSync(`${WORK_DIR}/tsconfig.json`)) {
  try {
    execSync("npx tsc --noEmit", { cwd: WORK_DIR, timeout: 60_000 });
    buildExitCode = 0;
  } catch (buildErr) {
    buildExitCode = buildErr.status ?? 1;
  }
}

The exit code is included in the handoff. The planner and reconciler use this to detect type errors early.

Artifact Filtering

Generated files are excluded from diffs to reduce noise:

const ARTIFACT_PATTERNS = [
  /^node_modules\//,
  /^\.next\//,
  /^dist\//,
  /^build\//,
  /^out\//,
  /^\.turbo\//,
  /^\.tsbuildinfo$/,
  /^package-lock\.json$/,
  /^pnpm-lock\.yaml$/,
  /^yarn\.lock$/,
];

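The worker also makes sure these paths are ignored by git: it creates .gitignore if the file is missing, and appends the essential entries if the existing file does not mention node_modules: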

// From worker-runner.ts:268-278
const gitignorePath = `${WORK_DIR}/.gitignore`;
if (!existsSync(gitignorePath)) {
  writeFileSync(gitignorePath, `${GITIGNORE_ESSENTIALS}\n`, "utf-8");
} else {
  const existing = readFileSync(gitignorePath, "utf-8");
  if (!existing.includes("node_modules")) {
    appendFileSync(gitignorePath, `\n${GITIGNORE_ESSENTIALS}\n`, "utf-8");
  }
}
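
The page does not show how ARTIFACT_PATTERNS is applied to a diff. A minimal sketch, assuming the patterns are tested against repository-relative paths, could look like this; isArtifact and filterArtifacts are hypothetical helper names.

```typescript
// Patterns copied from the list above.
const ARTIFACT_PATTERNS: RegExp[] = [
  /^node_modules\//,
  /^\.next\//,
  /^dist\//,
  /^build\//,
  /^out\//,
  /^\.turbo\//,
  /^\.tsbuildinfo$/,
  /^package-lock\.json$/,
  /^pnpm-lock\.yaml$/,
  /^yarn\.lock$/,
];

// True if the repository-relative path matches any artifact pattern.
function isArtifact(path: string): boolean {
  return ARTIFACT_PATTERNS.some((pattern) => pattern.test(path));
}

// Keep only the files that should appear in the reported diff.
function filterArtifacts(files: string[]): string[] {
  return files.filter((file) => !isArtifact(file));
}
```

Every pattern is anchored with ^, so only top-level paths match. A nested path such as packages/app/dist/main.js would pass through this filter unchanged.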

Streaming Progress

Sandbox output is streamed line-by-line to the orchestrator:

// From worker-pool.ts:268-286
const rl = createInterface({ input: proc.stdout });

rl.on("line", (line: string) => {
  stdoutLines.push(line);
  this.forwardWorkerLine(taskId, line);
  
  // Emit trace events for key milestones
  if (line.includes("sandbox created")) {
    workerSpan.event("sandbox.created");
  } else if (line.includes("repo cloned")) {
    workerSpan.event("sandbox.cloned");
  } else if (line.includes("starting worker agent")) {
    workerSpan.event("sandbox.workerStarted");
  }
});

Intermediate logs (tool calls, progress updates) appear in real-time in the dashboard.
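
The milestone detection in the line handler can also be expressed as a lookup table, which makes the mapping easy to test in isolation. This is a restructuring sketch only; MILESTONES and milestoneFor are hypothetical names, while the substrings and event names are the ones in the excerpt above.

```typescript
// Substring to look for in a log line, paired with the trace event to emit.
const MILESTONES: ReadonlyArray<readonly [string, string]> = [
  ["sandbox created", "sandbox.created"],
  ["repo cloned", "sandbox.cloned"],
  ["starting worker agent", "sandbox.workerStarted"],
];

// Return the event name for the first matching milestone, or undefined if none match.
function milestoneFor(line: string): string | undefined {
  const hit = MILESTONES.find(([needle]) => line.includes(needle));
  return hit?.[1];
}
```

Inside the handler, this would reduce the if/else chain to a single lookup followed by workerSpan.event(name) when a name is returned.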

Timeout Handling

Sandboxes have a configurable timeout, workerTimeout, given in seconds (the default varies by deployment):

// From worker-pool.ts:246-259
const timer = setTimeout(() => {
  if (settled) return;
  settled = true;
  proc.kill("SIGKILL");
  this.timedOutBranches.push(branchName);
  reject(
    new Error(
      `Sandbox timed out after ${this.config.workerTimeout}s for task ${taskId}`
    )
  );
}, this.config.workerTimeout * 1000);

Timed-out branches are tracked and can be retried or escalated.
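
How retries and escalation are decided is not described here. As one possible approach, a small tracker could count timeouts per branch and escalate once a retry budget is exhausted. The TimeoutTracker class, its maxRetries parameter and the two action names are all assumptions for this sketch.

```typescript
type TimeoutAction = "retry" | "escalate";

// Hypothetical helper: counts timeouts per branch and picks the next action.
class TimeoutTracker {
  private readonly attempts = new Map<string, number>();
  private readonly maxRetries: number;

  constructor(maxRetries: number) {
    this.maxRetries = maxRetries;
  }

  // Record one timeout for the branch and decide what to do next.
  record(branch: string): TimeoutAction {
    const count = (this.attempts.get(branch) ?? 0) + 1;
    this.attempts.set(branch, count);
    // Retry while the number of timeouts is within budget, then escalate.
    return count <= this.maxRetries ? "retry" : "escalate";
  }
}
```

With maxRetries set to 2, the first two timeouts on a branch return "retry" and the third returns "escalate"; other branches are counted independently.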

Empty Response Detection

If the LLM returns zero tokens and the agent makes zero tool calls, the worker is treated as having produced no useful work:

// From worker-runner.ts:263-266
const isEmptyResponse = tokensUsed === 0 && toolCallCount === 0;
if (isEmptyResponse) {
  log("WARNING: LLM returned empty response. Marking task as failed.");
}

This prevents scaffold-only diffs (.gitignore, AGENTS.md) from being treated as successful completions.

Resource Allocation

Sandbox resource limits are configurable:

interface SandboxConfig {
  imageTag: string;      // Docker image version
  cpuCores: number;      // CPU allocation
  memoryMb: number;      // Memory limit
  idleTimeout: number;   // Seconds before auto-termination
}

Modal handles actual provisioning and resource enforcement.
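
Since Modal enforces the limits, it can help to reject obviously invalid values before a sandbox is requested. The sketch below shows one way to do that. The example values and the validateSandboxConfig function are illustrative assumptions; this page does not state real defaults or allowed ranges.

```typescript
interface SandboxConfig {
  imageTag: string;
  cpuCores: number;
  memoryMb: number;
  idleTimeout: number; // seconds
}

// Illustrative values only, not documented defaults.
const exampleConfig: SandboxConfig = {
  imageTag: "latest",
  cpuCores: 2,
  memoryMb: 4096,
  idleTimeout: 300,
};

// Return a list of human-readable problems; an empty list means the config looks sane.
function validateSandboxConfig(config: SandboxConfig): string[] {
  const problems: string[] = [];
  if (config.imageTag.trim() === "") problems.push("imageTag must not be empty");
  if (!(config.cpuCores > 0)) problems.push("cpuCores must be positive");
  if (!Number.isInteger(config.memoryMb) || config.memoryMb <= 0) {
    problems.push("memoryMb must be a positive integer");
  }
  if (!(config.idleTimeout > 0)) problems.push("idleTimeout must be positive");
  return problems;
}
```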

Distributed Tracing

Sandboxes participate in distributed tracing:

// From worker-runner.ts:143-152
if (payload.trace) {
  const tracer = Tracer.fromPropagated(payload.trace);
  workerSpan = tracer.startSpan("sandbox.worker", {
    taskId: task.id,
    agentId: `sandbox-${task.id}`,
  });
}

Traces flow from orchestrator → sandbox → worker agent, enabling end-to-end performance analysis.

Advantages of Ephemeral Sandboxes

No State Pollution: Each task starts from a clean repository clone. No leftover files, no merge conflicts from previous work.
Massive Parallelism: Scale to hundreds of concurrent workers without resource contention.
Reproducibility: Same input task always produces same environment. Debugging is deterministic.
Cost Efficiency: Pay only for compute time used. Idle workers cost nothing.
Security: Workers cannot interfere with each other or access orchestrator state.

Limitations

Startup Latency: Each task incurs ~10-30s overhead for sandbox provisioning and repo cloning.
No Caching: Dependencies are re-installed for every task. Build artifacts don’t persist.
Network Dependency: Requires reliable internet access for git clone and LLM API calls.

Future optimization: Pre-warmed sandboxes with cached dependencies and incremental clones.
