Quick Diagnostics
Before diving into specific issues, check:
- Main log file: ~/.enki/logs/enki.log (most recent session at bottom)
- Error messages: grep "ERROR\|WARN" ~/.enki/logs/enki.log
- Session-specific logs: ~/.enki/logs/sessions/<label>.log
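The checks above can be bundled into a small shell helper. This is a minimal sketch, not part of Enki: the function name enki_log_summary is hypothetical, and it assumes log lines contain the literal strings ERROR or WARN, as the grep pattern above implies.

```shell
# Hypothetical helper (not part of Enki): summarize ERROR/WARN lines in a log.
# Usage: enki_log_summary [logfile] [tail_count]
enki_log_summary() {
    local log="${1:-$HOME/.enki/logs/enki.log}"
    local n="${2:-20}"
    local count
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep -c prints 0 (and exits non-zero) when nothing matches.
    count=$(grep -c -e 'ERROR' -e 'WARN' "$log" || true)
    echo "matches: $count"
    # The most recent session is at the bottom, so show the last few matches.
    { grep -e 'ERROR' -e 'WARN' "$log" || true; } | tail -n "$n"
}
```

Called with no arguments it reads the main log; pass a session log path to inspect a single session instead.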
Common Problems
Workers not starting or failing immediately
Symptoms: Tasks stuck in “Pending” or immediately fail after spawn
Common causes:
- Missing Claude Code installation
  - Workers are spawned as ACP (Agent Client Protocol) subprocesses
  - Verify Claude Code is installed and accessible
  - Check DEBUG logs for subprocess spawn errors
- Node.js not available
  - Some ACP agents require Node.js runtime
  - Verify: node --version
- Infrastructure broken flag set
  - If a cp command fails during worker spawn, the infra_broken flag is set
  - All subsequent spawns auto-fail to avoid repeated errors
  - Check logs for “copy failed” or “infra_broken” messages
  - Solution: Fix the underlying issue (disk space, permissions) and restart Enki
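The prerequisite checks above can be run in one pass. The sketch below is a hypothetical helper, not part of Enki. The usage line assumes the Claude Code executable is named claude, which the text above does not confirm; substitute whatever name your installation uses.

```shell
# Hypothetical helper (not part of Enki): report which commands are missing from PATH.
# Returns 0 if every command is found, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: the Claude Code executable is named `claude`; adjust if yours differs.
# check_tools claude node

# Look for the messages that indicate the infra_broken flag was set:
# grep -n -e 'copy failed' -e 'infra_broken' ~/.enki/logs/enki.log
```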
Merge conflicts during task completion
Symptoms: Worker completes successfully but merge fails, task enters Failed state
What happens:
- When a worker completes, Enki fetches its task/<id> branch from the worker’s copy
- The refinery attempts to merge it into your main branch
- If there’s a conflict, Enki spawns a merger agent with minimal tools to resolve it:
  - Conflict detected → MergeNeedsResolution event
  - Separate ACP session spawned with MERGER_TOOLS (minimal tool access)
  - Works in a shared temp clone (kept alive via CleanupGuard + std::mem::forget)
  - Agent resolves conflict and commits
  - Merge completes → MergeDone advances the DAG
If the merge still fails:
- Check the .enki/verify.sh hook (if present): a non-zero exit fails the merge
- Review merge request logs in the database
- Check session logs: ~/.enki/logs/sessions/<merger-session>.log
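Because a non-zero exit from the verify hook fails the merge, it can help to run the hook by hand and see its exit status. The helper below is a hypothetical sketch, not part of Enki. It assumes the hook runs from the project root and can be run with sh; Enki may invoke it with a different working directory, shell, or environment.

```shell
# Hypothetical helper (not part of Enki): run the verify hook manually
# and report its exit status.
# Assumption: the hook is run from the project root via `sh`.
run_verify_hook() {
    local root="${1:-.}" status=0
    local hook="$root/.enki/verify.sh"
    if [ ! -f "$hook" ]; then
        echo "no verify hook at $hook"
        return 0
    fi
    (cd "$root" && sh ./.enki/verify.sh) || status=$?
    echo "verify.sh exited with status $status"
    return "$status"
}
```

A non-zero status here is a strong hint that the hook, rather than the conflict resolution, is what failed the merge.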
Copy-on-Write failures
Symptoms: “copy failed” errors, workers not getting isolated copies
About CoW copies:
- Each worker gets a full filesystem copy at .enki/copies/<task_id>/
- Uses platform-specific CoW for instant, space-efficient clones:
  - macOS/APFS: cp -Rc
  - Linux: cp --reflink=auto -a
- Excludes .enki/ from copies
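To reproduce a copy failure outside Enki, the same platform-specific flags can be tried by hand. This is a minimal sketch, not Enki's implementation: the function name cow_copy is hypothetical, it does not exclude .enki/, and it expects the destination not to exist yet.

```shell
# Hypothetical helper (not part of Enki): copy a directory tree using the
# platform-specific CoW flags listed above.
# Usage: cow_copy <src_dir> <new_dst_dir>
cow_copy() {
    local src="$1" dst="$2"
    case "$(uname -s)" in
        Darwin) cp -Rc "$src" "$dst" ;;                # clone files on APFS
        *)      cp --reflink=auto -a "$src" "$dst" ;;  # reflink where supported, else a full copy
    esac
}
```

If this command fails on a scratch directory, the error it prints (disk full, permission denied, unsupported option) usually points at the underlying cause.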
Common causes:
- Insufficient disk space
  - Even with CoW, metadata and unique files consume space
  - Check: df -h
- Filesystem doesn’t support CoW
  - Linux: Requires btrfs or XFS (with recent kernel)
  - macOS: Requires APFS
  - On unsupported filesystems, falls back to full copy (slow + space-heavy)
- Permission issues
  - Check .enki/copies/ directory permissions
  - Ensure user can write to project directory
Stale workers / timeout issues
Symptoms: Workers appear stuck, no progress for extended period
How monitoring works:
- The Monitor state machine tracks worker activity via ACP updates
- Workers with no update for STALE_CANCEL_SECS get cancelled:
  - Standard tier: 120 seconds default
- Retry budget: up to 3 retries per task (MAX_TASK_RETRIES)
- After max retries, task blocks to prevent infinite loops
If workers are genuinely stuck:
- Use MCP enki_stop_all tool to cancel all running workers
- Check for deadlocks in session logs
- Review task complexity tier assignment (heavy tasks get more time)
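The monitoring rules above amount to a small decision table. The sketch below is an illustration only, not Enki's code. It assumes a worker counts as stale once its idle time reaches STALE_CANCEL_SECS, and that a cancelled worker is retried while retry budget remains. The text implies both but states neither. The function name stale_action is hypothetical.

```shell
STALE_CANCEL_SECS=120   # standard-tier default from the text above
MAX_TASK_RETRIES=3

# Hypothetical sketch of the decision, not Enki's actual code.
# Args: seconds since the last ACP update, retries already used.
# Prints one of: ok | cancel-and-retry | blocked
stale_action() {
    local idle="$1" retries="$2"
    if [ "$idle" -lt "$STALE_CANCEL_SECS" ]; then
        echo ok
    elif [ "$retries" -lt "$MAX_TASK_RETRIES" ]; then
        echo cancel-and-retry
    else
        echo blocked
    fi
}
```

For example, stale_action 500 3 prints blocked: the worker is stale and the retry budget is spent.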
Database corruption or state issues
Symptoms: Crashes on startup, inconsistent task states, migration errors
Database details:
- SQLite in WAL mode: .enki/db.sqlite
- DAG stored as JSON blob in executions table
- Auto-migration on every DB open (no version files)
Recovery steps:
- Back up the database first
- Check integrity
- If corrupted, re-initialize (destroys task history)
- Abandoned tasks (DB-only state):
  - Set on session exit for in-flight tasks
  - Never enters the DAG runtime
  - Safe to ignore or manually clean up
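One way to take the backup is sketched below. It is a hypothetical helper, not an Enki command: the function name and the backup location under .enki/ are assumptions. Because the database runs in WAL mode, recent writes may sit in the db.sqlite-wal side file, so the sketch copies the side files too. Stop Enki first so the files are not changing mid-copy.

```shell
# Hypothetical helper (not part of Enki): copy db.sqlite plus its WAL/SHM
# side files into a timestamped backup directory and print that directory.
backup_enki_db() {
    local root="${1:-.}" f
    local db="$root/.enki/db.sqlite"
    local dest="$root/.enki/db-backup-$(date +%Y%m%d-%H%M%S)"
    if [ ! -f "$db" ]; then
        echo "no database at $db" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # Copy the main file and whichever side files exist.
    for f in "$db" "$db-wal" "$db-shm"; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest/" || return 1
        fi
    done
    echo "$dest"
}

# Integrity check with the sqlite3 CLI, if it is installed;
# a healthy database prints "ok":
# sqlite3 .enki/db.sqlite "PRAGMA integrity_check;"
```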
Permission denied errors
Symptoms: “Permission denied” when accessing files, creating copies, or running workers
Common causes:
- Project directory permissions
  - Enki needs write access to .enki/ subdirectory
  - Check: ls -la .enki/
- Worker copy permissions
  - CoW copies preserve original file permissions
  - If source files are read-only for the user, workers can’t modify them
- Git repository permissions
  - Workers create branches and commits
  - Check: git config user.name && git config user.email
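The directory check above can be scripted so it gives a yes/no answer instead of a listing to read. The function name below is hypothetical and not part of Enki. Note that the writability test is always true for root, so run it as the user who starts Enki.

```shell
# Hypothetical helper (not part of Enki): check that .enki/ exists and is
# writable by the current user.
check_enki_writable() {
    local dir="${1:-.}/.enki"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 1
    fi
    echo "writable: $dir"
}

# List project files the current user cannot write, which worker copies
# would inherit (GNU find only; -writable is not available in BSD find):
# find . -path ./.enki -prune -o -type f ! -writable -print
```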
Signal file processing issues
Symptoms: MCP tool calls succeed but nothing happens, tasks don’t start
How signals work:
- MCP server writes .enki/events/sig-*.json files
- Coordinator polls on 3-second tick and processes them
- Files are deleted after processing
- No filesystem notifications (no fsnotify)
Common issues:
- Coordinator not running (TUI exited)
- Permission issues on .enki/events/ directory
- Malformed JSON in signal file (MCP server bug)
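Since signal files are deleted after processing and the coordinator polls every 3 seconds, a signal file that lingers for more than a minute or so suggests nothing is consuming it. The helper below is a hypothetical sketch, not part of Enki. The age threshold is an arbitrary choice, and find's -mmin works in whole minutes.

```shell
# Hypothetical helper (not part of Enki): list signal files that have been
# sitting unprocessed for more than about a minute.
pending_signals() {
    local events="${1:-.}/.enki/events"
    if [ ! -d "$events" ]; then
        echo "missing: $events" >&2
        return 1
    fi
    # A file this old has survived many 3-second polling ticks.
    find "$events" -name 'sig-*.json' -type f -mmin +1
}
```

Any path it prints is worth opening: either the coordinator is not running, or the file's contents could not be parsed.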
