Enki uses signal files as a lightweight IPC mechanism for async communication between worker agents and the coordinator. When a worker calls an MCP tool that changes orchestrator state, the tool writes a JSON file to .enki/events/. The coordinator polls this directory every 3 seconds, processes all signal files, and deletes them.

Why Signal Files?

Enki’s architecture separates concerns:
  • Workers are subprocesses running ACP agents (claude-code or compatible clients)
  • Coordinator is the main enki process running the orchestrator state machine
  • MCP server runs as a subprocess (enki mcp) and can’t directly call into the coordinator
Signal files provide a simple, file-based IPC mechanism:
  • No sockets, no shared memory, no complex IPC
  • Workers write JSON, coordinator reads and deletes
  • Fire-and-forget: workers don’t block waiting for responses
  • Coordinator processes signal files in batches on its polling tick
This design trades real-time responsiveness (3s poll interval) for simplicity and robustness. Workers don’t need to maintain connections or handle backpressure.

Signal File Format

Signal files are written to .enki/events/sig-<id>.json where <id> is a unique ID generated by Id::new("sig").

Common Structure

All signal files are JSON objects with a type field:
{
  "type": "task_created",
  "task_id": "task-01JXXZ..."
}
The coordinator reads the type field and dispatches to the appropriate handler.
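To make that dispatch concrete, here is a minimal sketch in Rust. The handler descriptions are placeholders, and `signal_type` is a deliberately naive string scan used only for illustration — the real coordinator parses the JSON properly:

```rust
// Naive extraction of the "type" field from a signal file's JSON body.
// Illustration only: the real coordinator uses a proper JSON parser.
fn signal_type(json: &str) -> Option<String> {
    let key = "\"type\"";
    let start = json.find(key)? + key.len();
    let rest = &json[start..];
    let open = rest.find('"')? + 1;
    let close = open + rest[open..].find('"')?;
    Some(rest[open..close].to_string())
}

// Hypothetical dispatch table; returns a description of the action taken.
fn dispatch(json: &str) -> &'static str {
    match signal_type(json).as_deref() {
        Some("task_created") => "load task, add to scheduler",
        Some("execution_created") => "build DAG, schedule ready tasks",
        Some("stop_all") => "kill all workers",
        _ => "unknown signal: log and skip",
    }
}
```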

Signal File Lifecycle

  1. Write: Worker calls MCP tool → handler writes .enki/events/sig-<random-id>.json
  2. Poll: Coordinator wakes on its 3s tick, reads all sig-*.json files in the directory
  3. Process: Coordinator parses each signal, updates orchestrator state, produces events
  4. Delete: Coordinator deletes processed signal files
Implementation: crates/cli/src/commands/mcp/handlers.rs:14-23
pub(super) fn write_signal_file(signal: &Value) -> Result<(), String> {
    // Resolve the project's .enki directory and ensure .enki/events/ exists.
    let enki_dir = super::super::enki_dir().map_err(|e| e.to_string())?;
    let events_dir = enki_dir.join("events");
    std::fs::create_dir_all(&events_dir).map_err(|e| e.to_string())?;
    // Unique "sig-…" filename, so concurrent writers never collide.
    let filename = format!("{}.json", Id::new("sig"));
    let path = events_dir.join(filename);
    let content = serde_json::to_string(signal).map_err(|e| e.to_string())?;
    std::fs::write(&path, content).map_err(|e| e.to_string())?;
    Ok(())
}
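The function above covers the write half of the lifecycle. As a rough, std-only sketch of the read half (list `sig-*.json` files, read each, delete after reading), assuming only the `.enki/events/` layout described above — the real logic lives in the coordinator's polling loop:

```rust
use std::fs;
use std::path::Path;

// One read-and-delete pass over the events directory, as the coordinator's
// polling tick would do. Returns the raw JSON bodies it consumed.
fn drain_signal_files(events_dir: &Path) -> std::io::Result<Vec<String>> {
    let mut signals = Vec::new();
    if !events_dir.exists() {
        return Ok(signals); // nothing has been written yet
    }
    for entry in fs::read_dir(events_dir)? {
        let path = entry?.path();
        let name = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
        // Only consume signal files; ignore anything else in the directory.
        if name.starts_with("sig-") && name.ends_with(".json") {
            signals.push(fs::read_to_string(&path)?);
            fs::remove_file(&path)?; // delete only after a successful read
        }
    }
    Ok(signals)
}
```

Deleting after reading means a crash mid-pass leaves unprocessed files on disk rather than losing them.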

Signal File Types

Enki uses several signal types for different operations:

execution_created

Written by: enki_execution_create tool
When: Planner creates a new multi-step execution
Payload:
{
  "type": "execution_created",
  "execution_id": "exec-01JXXZ..."
}
Coordinator action: Polls DB for new tasks, builds DAG, schedules ready tasks
Source: crates/cli/src/commands/mcp/handlers.rs:241

task_created

Written by: enki_task_create tool, enki_task_retry tool
When: Planner creates a standalone task or retries a failed task
Payload:
{
  "type": "task_created",
  "task_id": "task-01JXXZ..."
}
Coordinator action: Loads task from DB, adds to scheduler, spawns worker if tier capacity available
Source: crates/cli/src/commands/mcp/handlers.rs:71, handlers.rs:477

steps_added

Written by: enki_execution_add_steps tool
When: Planner dynamically adds steps to a running execution (e.g., after a checkpoint pause)
Payload:
{
  "type": "steps_added",
  "execution_id": "exec-01JXXZ..."
}
Coordinator action: Reloads execution DAG, schedules newly-ready tasks
Source: crates/cli/src/commands/mcp/handlers.rs:390-393

pause

Written by: enki_pause tool
When: Planner pauses an execution or a single step
Payload (pause execution):
{
  "type": "pause",
  "execution_id": "exec-01JXXZ..."
}
Payload (pause step):
{
  "type": "pause",
  "execution_id": "exec-01JXXZ...",
  "step_id": "build"
}
Coordinator action: Marks execution/step as paused, stops spawning new workers, lets running workers finish
Source: crates/cli/src/commands/mcp/handlers.rs:491-495

cancel

Written by: enki_cancel tool
When: Planner cancels an execution or step (kills running workers)
Payload (cancel execution):
{
  "type": "cancel",
  "execution_id": "exec-01JXXZ..."
}
Payload (cancel step):
{
  "type": "cancel",
  "execution_id": "exec-01JXXZ...",
  "step_id": "auth"
}
Coordinator action: Marks task(s) as cancelled, kills worker subprocesses, cascades cancel to dependents
Source: crates/cli/src/commands/mcp/handlers.rs:507-511

stop_all

Written by: enki_stop_all tool
When: Planner or worker requests immediate halt of all running workers
Payload:
{
  "type": "stop_all"
}
Coordinator action: Kills all worker subprocesses, shuts down gracefully
Source: crates/cli/src/commands/mcp/handlers.rs:525
stop_all also writes a legacy .enki/stop file for backward compatibility. The coordinator checks both the stop file and stop_all signal.
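A sketch of that dual check, assuming only the paths described in this document (`should_stop` and its substring match are illustrative, not Enki's actual code):

```rust
use std::path::Path;

// True if either the legacy stop file exists or any pending signal body
// carries a stop_all signal. `pending` holds raw JSON bodies already
// read from .enki/events/. The substring match is a naive stand-in for
// real JSON parsing, covering both compact and spaced serializations.
fn should_stop(enki_dir: &Path, pending: &[String]) -> bool {
    enki_dir.join("stop").exists()
        || pending.iter().any(|s| {
            s.contains(r#""type":"stop_all""#) || s.contains(r#""type": "stop_all""#)
        })
}
```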

resume

Written by: enki_resume tool
When: Planner resumes a paused execution or step (e.g., after reviewing checkpoint output)
Payload (resume execution):
{
  "type": "resume",
  "execution_id": "exec-01JXXZ..."
}
Payload (resume step):
{
  "type": "resume",
  "execution_id": "exec-01JXXZ...",
  "step_id": "deploy"
}
Coordinator action: Unpauses execution/step, reschedules ready tasks
Source: crates/cli/src/commands/mcp/handlers.rs:418-425

worker_report

Written by: enki_worker_report tool
When: Worker reports its current high-level activity (“analyzing codebase”, “running tests”, etc.)
Payload:
{
  "type": "worker_report",
  "task_id": "task-01JXXZ...",
  "status": "implementing auth middleware"
}
Coordinator action: Updates task.current_activity in DB, displayed in TUI
Source: crates/cli/src/commands/mcp/handlers.rs:535-539

mail

Written by: enki_mail_send, enki_mail_reply tools
When: Worker or planner sends a message
Payload:
{
  "type": "mail",
  "message_id": "msg-01JXXZ...",
  "from": "worker/task-01JXXZ...",
  "to": "coordinator",
  "subject": "Need clarification on requirements",
  "priority": "high"
}
Coordinator action: Notifies recipient, displays in TUI mail panel
Source: crates/cli/src/commands/mcp/handlers.rs:707-714, handlers.rs:832-839

Coordinator Polling Loop

The coordinator runs a tokio::select! loop with a 3s tick:
loop {
    tokio::select! {
        _ = tick_interval.tick() => {
            // 1. Read all .enki/events/sig-*.json files
            // 2. Parse and dispatch by type
            // 3. Call orchestrator.process_events()
            // 4. Handle spawned/finished workers
            // 5. Delete signal files
        }
        Some(msg) = agent_rx.recv() => {
            // Handle agent streaming updates
        }
    }
}
Key points:
  • Signal files are batched (all processed in one tick)
  • process_events() may produce new events in a cascade (e.g., spawning a worker that immediately fails)
  • The coordinator drains events in a while !events.is_empty() loop
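The drain loop can be sketched as a simple work queue, where handling one event may enqueue follow-ups (the `Event` variants here are illustrative, not Enki's actual event enum):

```rust
enum Event {
    TaskReady(u32),
    WorkerSpawned(u32),
    WorkerFailed(u32),
    TaskFailed(u32),
}

// Drain events until none remain; a handler may push follow-up events,
// e.g. a failed worker cascades into marking its task failed.
fn drain(mut events: Vec<Event>) -> Vec<&'static str> {
    let mut log = Vec::new();
    while !events.is_empty() {
        // Take the current batch; handlers push any new events for the
        // next iteration, so a cascade keeps the loop running.
        let batch: Vec<Event> = events.drain(..).collect();
        for ev in batch {
            match ev {
                Event::TaskReady(id) => {
                    log.push("spawn worker");
                    events.push(Event::WorkerSpawned(id));
                }
                Event::WorkerSpawned(_) => log.push("worker running"),
                Event::WorkerFailed(id) => {
                    log.push("worker failed");
                    events.push(Event::TaskFailed(id));
                }
                Event::TaskFailed(_) => log.push("mark task failed"),
            }
        }
    }
    log
}
```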

When Signal Files Are Used

Signal files are used for state-changing operations that affect the orchestrator:
  • Creating tasks or executions
  • Pausing, resuming, cancelling tasks
  • Retrying failed tasks
  • Adding steps to running executions
  • Reporting worker activity
  • Sending inter-agent mail
  • Stopping all workers
Signal files are not used for:
  • Read-only operations (enki_status, enki_task_list)
  • Operations handled entirely by the DB (enki_mail_read marks a message as read in SQLite, no coordination needed)

Debugging Signal Files

Signal files are ephemeral (deleted after processing), but you can observe them:
# Watch signal files being created and deleted
watch -n 0.5 'ls -lh .enki/events/'

# Copy signal files before the coordinator deletes them
# (copying is read-only and safe; moving or deleting them would break IPC)
mkdir -p /tmp/signal-capture
cp .enki/events/sig-*.json /tmp/signal-capture/ 2>/dev/null
Do not manually delete or modify signal files while the coordinator is running. This will cause lost updates or undefined behavior.

Design Rationale

Why not use a more sophisticated IPC mechanism?
Simplicity:
  • No protocol negotiation, no versioning, no handshakes
  • Workers don’t need to manage connections or handle failures
  • Coordinator can restart without draining queues or re-establishing connections
Robustness:
  • Workers crash? Signal files persist until coordinator processes them
  • Coordinator crash? Signal files remain on disk for recovery (though current coordinator doesn’t implement recovery-on-restart)
  • No risk of socket exhaustion, connection leaks, or backpressure
Debuggability:
  • Signal files are human-readable JSON
  • Easy to inspect with cat .enki/events/sig-*.json
  • Can replay signal files by copying them back to .enki/events/
Trade-offs:
  • 3s poll latency (not real-time)
  • File I/O overhead (negligible for Enki’s workload)
  • No guaranteed delivery if coordinator exits between polling cycles (acceptable for Enki’s single-session model)
If you need lower-latency coordination, consider reducing the poll interval in the coordinator loop. 3s is a conservative default that works well for multi-minute task execution times.
