
Overview

This guide covers debugging techniques for evolution failures, signal extraction issues, gene selection problems, and validation errors.

Log Files & Locations

Evolution State

memory/evolution/
├── evolution_state.json              # Cycle counter and last run timestamp
├── evolution_solidify_state.json     # Last run/solidify state (for loop gating)
├── memory_graph.jsonl                # Causal graph (signals → outcomes)
└── dormant_hypothesis.json          # Saved state before backoff

GEP Assets

assets/gep/
├── genes.json          # Gene definitions
├── capsules.json       # Success capsules
├── events.jsonl        # Evolution event log (append-only)
├── failed_capsules.jsonl  # Failed evolution records
└── candidates.jsonl   # Capability candidates (extracted from logs)

Session Logs

~/.openclaw/agents/main/sessions/
├── channel_123456_session1.jsonl
├── channel_123456_session2.jsonl
└── general_session.jsonl

Workspace Memory

memory/
├── MEMORY.md                    # Global memory (or scoped at scopes/<scope>/MEMORY.md)
├── scopes/
│   ├── channel_123456/
│   │   └── MEMORY.md
│   └── channel_789012/
│       └── MEMORY.md
└── 2024-01-15.md                # Daily log

Debugging Evolution Failures

Step 1: Check Last Event

tail -1 assets/gep/events.jsonl | jq .
Key fields:
{
  "outcome": {
    "status": "failed",  // success | failed
    "score": 0.2
  },
  "meta": {
    "constraints_ok": false,
    "constraint_violations": [
      "max_files exceeded: 25 > 12"
    ],
    "validation_ok": false,
    "validation": [
      {"cmd": "npm test", "ok": false}
    ],
    "protocol_ok": false,
    "protocol_violations": [
      "missing_or_invalid_mutation"
    ]
  }
}

Step 2: Identify Failure Reason

tail -1 assets/gep/events.jsonl | jq -r '.meta.constraint_violations[]?, .meta.protocol_violations[]?'
Common failure reasons:
| Violation | Meaning | Fix |
| --- | --- | --- |
| max_files exceeded: 25 > 12 | Blast radius too large | Increase the gene’s max_files or reduce the scope |
| forbidden_path touched: .env | Modified a protected file | Remove .env from the changes |
| validation_failed: npm test | Tests failed | Fix the code or improve the validation |
| missing_or_invalid_mutation | No mutation object | Ensure the Hand Agent emits a mutation |
| critical_path_modified: skills/evolver/src/evolve.js | Self-modification blocked | Set EVOLVE_ALLOW_SELF_MODIFY=true or avoid modifying that path |
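The checks in Steps 1 and 2 can also be combined in a short Node.js script. This is a minimal sketch, assuming events carry the meta fields shown in the sample event above; summarizeFailure is a hypothetical helper, not part of the evolver.

```javascript
const fs = require("fs");

// Hypothetical helper: collect every recorded failure reason from one event.
// Field names follow the sample event shown in Step 1.
function summarizeFailure(event) {
  const meta = event.meta || {};
  const reasons = [];
  for (const v of meta.constraint_violations || []) reasons.push(`constraint: ${v}`);
  for (const v of meta.protocol_violations || []) reasons.push(`protocol: ${v}`);
  for (const r of meta.validation || []) {
    if (!r.ok) reasons.push(`validation: ${r.cmd}`);
  }
  return reasons;
}

// Print the reasons for the most recent event, if the log exists.
const EVENTS = "assets/gep/events.jsonl";
if (fs.existsSync(EVENTS)) {
  const lines = fs.readFileSync(EVENTS, "utf8").split("\n").filter((l) => l.trim());
  if (lines.length > 0) {
    console.log(summarizeFailure(JSON.parse(lines[lines.length - 1])).join("\n"));
  }
}
```

Missing arrays are treated as empty, so the script also works on successful events, where it prints nothing.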

Step 3: Check Validation Details

tail -1 assets/gep/events.jsonl | jq '.meta.validation'
Output:
[
  {"cmd": "node scripts/validate.js", "ok": true},
  {"cmd": "npm test", "ok": false}
]
Drill into validation report:
tail -1 assets/gep/events.jsonl | jq '.meta.validation_report.results'

Step 4: Analyze Blast Radius

tail -1 assets/gep/events.jsonl | jq '{blast_radius, blast_breakdown: .meta.blast_breakdown}'
Output:
{
  "blast_radius": {"files": 25, "lines": 450},
  "blast_breakdown": [
    {"dir": "src/gep", "files": 12},
    {"dir": "src/ops", "files": 8},
    {"dir": "scripts", "files": 5}
  ]
}
Interpretation: The evolution touched 25 files, about half of them (12) in src/gep. If the gene’s max_files is 12, the change exceeds the limit and is reported as a constraint violation.
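The same comparison can be scripted. The sketch below assumes the event layout shown above (blast_radius at the top level, blast_breakdown under meta). checkBlastRadius is a hypothetical helper, and the file limit is passed in by the caller rather than read from genes.json.

```javascript
// Hypothetical helper: compare an event's blast radius with a file limit
// and rank the directories that contributed to it.
function checkBlastRadius(event, maxFiles) {
  const files = (event.blast_radius || {}).files || 0;
  const breakdown = ((event.meta || {}).blast_breakdown || [])
    .slice()
    .sort((a, b) => b.files - a.files)
    .map((d) => ({ dir: d.dir, files: d.files, share: files ? d.files / files : 0 }));
  return {
    files,
    maxFiles,
    exceeded: files > maxFiles,
    over: Math.max(0, files - maxFiles),
    breakdown,
  };
}

// Values taken from the sample output above.
const sampleEvent = {
  blast_radius: { files: 25, lines: 450 },
  meta: {
    blast_breakdown: [
      { dir: "src/gep", files: 12 },
      { dir: "src/ops", files: 8 },
      { dir: "scripts", files: 5 },
    ],
  },
};
const result = checkBlastRadius(sampleEvent, 12);
console.log(result.exceeded, result.over); // true 13
```

The share field gives each directory's fraction of the touched files, which helps decide whether to raise the limit or to narrow the scope of the change.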

Step 5: Check Estimate vs. Actual Drift

tail -1 assets/gep/events.jsonl | jq '.meta.blast_estimate_comparison'
Output:
{
  "estimateFiles": 5,
  "actualFiles": 25,
  "ratio": 5.0,
  "drifted": true,
  "message": "Estimate drift: actual 25 files is 5.0x the estimated 5. Agent did not plan accurately."
}
Interpretation: Agent estimated 5 files but changed 25. This indicates poor planning or unintended side effects.
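To see whether drift is a recurring problem rather than a one-off, the whole event log can be scanned. This is a sketch under the assumption that every event stores the comparison under meta.blast_estimate_comparison with the fields shown above; findDriftedEvents is a hypothetical helper.

```javascript
const fs = require("fs");

// Hypothetical helper: list events flagged as drifted, worst ratio first.
// `index` is the event's position (0-based) in the log.
function findDriftedEvents(events) {
  return events
    .map((e, index) => ({ index, cmp: (e.meta || {}).blast_estimate_comparison }))
    .filter((x) => x.cmp && x.cmp.drifted === true)
    .sort((a, b) => b.cmp.ratio - a.cmp.ratio)
    .map((x) => ({
      index: x.index,
      estimateFiles: x.cmp.estimateFiles,
      actualFiles: x.cmp.actualFiles,
      ratio: x.cmp.ratio,
    }));
}

const EVENTS = "assets/gep/events.jsonl";
if (fs.existsSync(EVENTS)) {
  const events = fs
    .readFileSync(EVENTS, "utf8")
    .split("\n")
    .filter((l) => l.trim())
    .map((l) => JSON.parse(l));
  console.table(findDriftedEvents(events));
}
```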

Debugging Signal Extraction

Check Extracted Signals

tail -1 assets/gep/events.jsonl | jq '.signals'
Output:
["log_error", "api_timeout", "recent_error_count_high"]

Verify Signal Sources

Signals come from:
  1. Session logs: Recent errors, LLM errors, tool failures
  2. MEMORY.md: Memory anomalies, context drift
  3. USER.md: User feedback, feature requests
  4. Event history: Repair loops, saturation patterns
Manually inspect sources:
# Recent session errors
tail -100 ~/.openclaw/agents/main/sessions/*.jsonl | grep -i error

# Memory anomalies
grep -iE 'error|todo|fix' memory/MEMORY.md

# Recent failed evolutions
jq -c 'select(.outcome.status == "failed")' assets/gep/events.jsonl | tail -5

Debug Signal Extraction Logic

Run signal extraction manually:
node -e '
const { extractSignals } = require("./src/gep/signals");
const fs = require("fs");
const inputs = {
  recentSessionTranscript: fs.readFileSync("memory/test_session.txt", "utf8"),
  todayLog: "",
  memorySnippet: fs.readFileSync("memory/MEMORY.md", "utf8"),
  userSnippet: "",
  recentEvents: []
};
const signals = extractSignals(inputs);
console.log(JSON.stringify(signals, null, 2));
'

Debugging Gene Selection

Check Selector Decision

tail -1 assets/gep/events.jsonl | jq '.meta.selector'
Output:
{
  "selected_gene_id": "gene_repair_001",
  "selected_by": "memory_graph+selector",
  "score": 0.87,
  "candidates": [
    {"id": "gene_repair_001", "score": 0.87, "reason": "high_confidence_path"},
    {"id": "gene_repair_002", "score": 0.65, "reason": "signal_match"},
    {"id": "gene_optimize_003", "score": 0.45, "reason": "category_mismatch"}
  ],
  "memory_advice": {
    "preferredGeneId": "gene_repair_001",
    "bannedGeneIds": ["gene_innovate_005"],
    "confidence": 0.92
  }
}
Interpretation:
  • Selector chose gene_repair_001 with score 0.87
  • Memory graph advised this gene (high confidence path)
  • gene_innovate_005 is banned (low success rate)
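The interpretation above can be derived mechanically from the selector record. The following sketch assumes the meta.selector layout shown in the sample output; explainSelection is a hypothetical helper, not an evolver API.

```javascript
// Hypothetical helper: summarize why a gene was selected.
function explainSelection(selector) {
  const advice = selector.memory_advice || {};
  const banned = new Set(advice.bannedGeneIds || []);
  const ranked = (selector.candidates || []).slice().sort((a, b) => b.score - a.score);
  return {
    selected: selector.selected_gene_id,
    topCandidate: ranked.length ? ranked[0].id : null,
    // True when the selected gene is the one the memory graph preferred.
    followsMemoryAdvice: advice.preferredGeneId === selector.selected_gene_id,
    // Banned genes that still appear in the candidate list.
    bannedCandidates: ranked.filter((c) => banned.has(c.id)).map((c) => c.id),
    // Score gap between the best and second-best candidate.
    margin: ranked.length > 1 ? ranked[0].score - ranked[1].score : null,
  };
}

// Values taken from the sample output above.
const summary = explainSelection({
  selected_gene_id: "gene_repair_001",
  candidates: [
    { id: "gene_repair_001", score: 0.87 },
    { id: "gene_repair_002", score: 0.65 },
    { id: "gene_optimize_003", score: 0.45 },
  ],
  memory_advice: { preferredGeneId: "gene_repair_001", bannedGeneIds: ["gene_innovate_005"] },
});
console.log(summary.selected, summary.followsMemoryAdvice); // gene_repair_001 true
```

A small margin combined with followsMemoryAdvice set to false suggests the selection was close and not backed by the memory graph, which is worth checking when the wrong gene is picked.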

Debug Selector Scoring

Run selector manually:
node -e '
const { selectGeneAndCapsule } = require("./src/gep/selector");
const { loadGenes, loadCapsules } = require("./src/gep/assetStore");
const genes = loadGenes();
const capsules = loadCapsules();
const signals = ["log_error", "api_timeout"];
const result = selectGeneAndCapsule({ genes, capsules, signals, memoryAdvice: null, driftEnabled: false, failedCapsules: [] });
console.log(JSON.stringify(result, null, 2));
'

Check Memory Graph Advice

jq -c 'select(.type == "Advice")' memory/evolution/memory_graph.jsonl | tail -1 | jq .
Output:
{
  "type": "Advice",
  "signals": ["log_error", "api_timeout"],
  "preferredGeneId": "gene_repair_001",
  "bannedGeneIds": ["gene_innovate_005"],
  "confidence": 0.92,
  "reason": "high_confidence_path"
}

Debugging Memory Graph

Query Memory Graph

# Find all attempts with a specific gene
jq -r 'select(.type == "Attempt" and .gene_id == "gene_repair_001")' memory/evolution/memory_graph.jsonl

# Find all outcomes for a signal key
jq -r 'select(.type == "Outcome" and .signal_key == "sig_abc123")' memory/evolution/memory_graph.jsonl

# Count success rate by gene
jq -r 'select(.type == "Outcome") | [.gene_id, .status] | @tsv' memory/evolution/memory_graph.jsonl | \
  sort | uniq -c
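For an actual rate rather than raw counts, a short Node.js script can aggregate the Outcome records. This is a sketch that assumes Outcome records carry gene_id and status, and that a successful outcome has status "success" (as in the event log); successRateByGene is a hypothetical helper.

```javascript
const fs = require("fs");

// Hypothetical helper: success rate per gene from memory graph records.
// Assumes Outcome.status is "success" for a successful attempt.
function successRateByGene(records) {
  const stats = {};
  for (const r of records) {
    if (r.type !== "Outcome") continue;
    if (!stats[r.gene_id]) stats[r.gene_id] = { total: 0, success: 0, rate: 0 };
    const s = stats[r.gene_id];
    s.total += 1;
    if (r.status === "success") s.success += 1;
    s.rate = s.success / s.total;
  }
  return stats;
}

const GRAPH = "memory/evolution/memory_graph.jsonl";
if (fs.existsSync(GRAPH)) {
  const records = fs
    .readFileSync(GRAPH, "utf8")
    .split("\n")
    .filter((l) => l.trim())
    .map((l) => JSON.parse(l));
  console.table(successRateByGene(records));
}
```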

Inspect Causal Chain

# Find a specific signal snapshot
jq -c 'select(.type == "SignalSnapshot" and any(.signals[]; contains("api_timeout")))' memory/evolution/memory_graph.jsonl | tail -1

# Follow its hypothesis
SIGNAL_KEY=$(jq -r 'select(.type == "SignalSnapshot") | .signal_key' memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Hypothesis\" and .signal_key == \"$SIGNAL_KEY\")" memory/evolution/memory_graph.jsonl

# Follow its attempt
HYPO_ID=$(jq -r "select(.type == \"Hypothesis\" and .signal_key == \"$SIGNAL_KEY\") | .id" memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Attempt\" and .hypothesis_id == \"$HYPO_ID\")" memory/evolution/memory_graph.jsonl

# Follow its outcome
ATTEMPT_ID=$(jq -r "select(.type == \"Attempt\" and .hypothesis_id == \"$HYPO_ID\") | .id" memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Outcome\" and .attempt_id == \"$ATTEMPT_ID\")" memory/evolution/memory_graph.jsonl
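The four lookups above can be done in one pass with Node.js. The sketch below assumes the same linking fields used in the jq queries (signal_key, hypothesis_id, attempt_id) and takes the most recent match at each step, like tail -1; traceChain is a hypothetical helper.

```javascript
const fs = require("fs");

// Hypothetical helper: follow SignalSnapshot -> Hypothesis -> Attempt -> Outcome
// for one signal key, taking the last matching record at each step.
function traceChain(records, signalKey) {
  const last = (pred) => {
    const hits = records.filter(pred);
    return hits.length ? hits[hits.length - 1] : null;
  };
  const snapshot = last((r) => r.type === "SignalSnapshot" && r.signal_key === signalKey);
  const hypothesis = last((r) => r.type === "Hypothesis" && r.signal_key === signalKey);
  const attempt = hypothesis
    ? last((r) => r.type === "Attempt" && r.hypothesis_id === hypothesis.id)
    : null;
  const outcome = attempt
    ? last((r) => r.type === "Outcome" && r.attempt_id === attempt.id)
    : null;
  return { snapshot, hypothesis, attempt, outcome };
}

// Usage: node trace.js <signal_key>   (trace.js is whatever name you save this as)
const GRAPH = "memory/evolution/memory_graph.jsonl";
if (fs.existsSync(GRAPH) && process.argv[2]) {
  const records = fs
    .readFileSync(GRAPH, "utf8")
    .split("\n")
    .filter((l) => l.trim())
    .map((l) => JSON.parse(l));
  console.log(JSON.stringify(traceChain(records, process.argv[2]), null, 2));
}
```

A null at any step shows where the chain breaks, for example a hypothesis that never produced an attempt.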

Visualize Memory Graph

Export to GraphViz DOT format:
node -e '
const fs = require("fs");
const lines = fs.readFileSync("memory/evolution/memory_graph.jsonl", "utf8").split("\n").filter(Boolean);
const nodes = lines.map(l => JSON.parse(l));

console.log("digraph MemoryGraph {");
for (const n of nodes) {
  if (n.type === "Hypothesis") console.log(`  "${n.id}" [label="${n.gene_id}"]`);
  if (n.type === "Attempt" && n.hypothesis_id) console.log(`  "${n.hypothesis_id}" -> "${n.id}"`);
  if (n.type === "Outcome" && n.attempt_id) console.log(`  "${n.attempt_id}" -> "${n.id}" [label="${n.status}"]`);
}
console.log("}");
' > graph.dot

dot -Tpng graph.dot -o graph.png

Debugging Validation Failures

Run Validation Manually

# Find the gene's validation commands
# (assumes each gene object has "id" and "validation" fields)
jq -r '.. | objects | select(.id? == "gene_repair_001") | .validation[]?' assets/gep/genes.json

# Run them manually
npm test
node scripts/validate.js

Check Validation Safety

Test if a command would be allowed:
node -e '
const { isValidationCommandAllowed } = require("./src/gep/solidify");
const cmd = "bash scripts/deploy.sh";  // Replace with your command
console.log(isValidationCommandAllowed(cmd) ? "ALLOWED" : "BLOCKED");
'

Inspect Validation Report

tail -1 assets/gep/events.jsonl | jq '.meta.validation_report'
Output:
{
  "type": "ValidationReport",
  "id": "vr_1234567890",
  "gene_id": "gene_repair_001",
  "commands": ["npm test"],
  "results": [
    {
      "cmd": "npm test",
      "ok": false,
      "duration_ms": 3200,
      "out": "...",
      "err": "Test suite failed. See above for errors."
    }
  ],
  "env_fingerprint": {"platform": "linux", "node_version": "v18.16.0"},
  "started_at": "2024-01-15T10:30:00.000Z",
  "finished_at": "2024-01-15T10:30:03.200Z"
}
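A report with many commands is easier to read once reduced to its failures. This is a minimal sketch assuming the ValidationReport fields shown above; summarizeValidationReport is a hypothetical helper.

```javascript
// Hypothetical helper: reduce a ValidationReport to its failed commands
// and the wall-clock time between started_at and finished_at.
function summarizeValidationReport(report) {
  const results = report.results || [];
  const failed = results
    .filter((r) => !r.ok)
    .map((r) => ({ cmd: r.cmd, duration_ms: r.duration_ms, err: r.err }));
  return {
    gene_id: report.gene_id,
    passed: results.length - failed.length,
    failed,
    elapsedMs: Date.parse(report.finished_at) - Date.parse(report.started_at),
  };
}

// Values taken from the sample report above.
const reportSummary = summarizeValidationReport({
  gene_id: "gene_repair_001",
  results: [
    { cmd: "npm test", ok: false, duration_ms: 3200, err: "Test suite failed. See above for errors." },
  ],
  started_at: "2024-01-15T10:30:00.000Z",
  finished_at: "2024-01-15T10:30:03.200Z",
});
console.log(reportSummary.failed.length, reportSummary.elapsedMs); // 1 3200
```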

Debugging Loop Gating

Check Pending State

jq . memory/evolution/evolution_solidify_state.json
Output:
{
  "last_run": {
    "run_id": "run_1234567890",
    "started_at": "2024-01-15T10:30:00.000Z",
    "signals": ["log_error"],
    "selected_gene_id": "gene_repair_001"
  },
  "last_solidify": {
    "run_id": "run_1234567890",
    "solidified_at": "2024-01-15T10:32:00.000Z",
    "status": "success"
  }
}
Interpretation:
  • last_run.run_id == last_solidify.run_id: No pending evolution
  • last_run.run_id != last_solidify.run_id: Evolution pending, evolver will back off
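The run_id comparison can be expressed as a small check. This sketch assumes the state file layout shown above; isEvolutionPending is a hypothetical helper, and its handling of missing sections (no run means not pending, a run without any solidify means pending) is an assumption rather than documented evolver behavior.

```javascript
const fs = require("fs");

// Hypothetical helper: true when the last run has not been solidified.
function isEvolutionPending(state) {
  const run = state.last_run;
  const solidify = state.last_solidify;
  if (!run || !run.run_id) return false; // assumption: nothing has run yet
  if (!solidify) return true;            // assumption: a run exists but was never solidified
  return run.run_id !== solidify.run_id;
}

const STATE = "memory/evolution/evolution_solidify_state.json";
if (fs.existsSync(STATE)) {
  const state = JSON.parse(fs.readFileSync(STATE, "utf8"));
  console.log(isEvolutionPending(state) ? "PENDING" : "IDLE");
}
```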

Check Dormant Hypothesis

jq . memory/evolution/dormant_hypothesis.json
Output:
{
  "backoff_reason": "active_sessions_exceeded",
  "active_sessions": 15,
  "queue_max": 10,
  "created_at": "2024-01-15T10:30:00.000Z",
  "ttl_ms": 3600000
}
Interpretation: The evolver backed off because 15 active sessions exceeded the queue_max of 10. It resumes after the 1-hour TTL (ttl_ms: 3600000) expires or when the session count drops.
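To work out when a dormant hypothesis expires, add ttl_ms to created_at. The sketch below assumes the TTL is counted from created_at, which matches the 1-hour interpretation but is not confirmed by the source; dormantStatus is a hypothetical helper.

```javascript
// Hypothetical helper: compute expiry for a dormant hypothesis.
// Assumes the TTL starts at created_at.
function dormantStatus(hypothesis, now = Date.now()) {
  const expiresAtMs = Date.parse(hypothesis.created_at) + hypothesis.ttl_ms;
  return {
    expiresAt: new Date(expiresAtMs).toISOString(),
    expired: now >= expiresAtMs,
    remainingMs: Math.max(0, expiresAtMs - now),
    // How far the session count was above the limit at backoff time.
    overLimitBy: hypothesis.active_sessions - hypothesis.queue_max,
  };
}

// Values taken from the sample output above.
const status = dormantStatus({
  backoff_reason: "active_sessions_exceeded",
  active_sessions: 15,
  queue_max: 10,
  created_at: "2024-01-15T10:30:00.000Z",
  ttl_ms: 3600000,
});
console.log(status.expiresAt); // 2024-01-15T11:30:00.000Z
```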

Asset Log Commands

Evolver tracks asset usage in memory/evolution/asset_call_log.jsonl:

Query Asset Usage

# Find all invocations of a specific gene
jq -r 'select(.asset_id == "gene_repair_001")' memory/evolution/asset_call_log.jsonl

# Count gene usage frequency
jq -r '.asset_id' memory/evolution/asset_call_log.jsonl | sort | uniq -c | sort -rn

# Find most recent capsule reuse
jq -r 'select(.asset_type == "Capsule")' memory/evolution/asset_call_log.jsonl | tail -1

Common Debug Scenarios

Scenario 1: Evolution runs but nothing happens

Check:
  1. Bridge enabled: echo $EVOLVE_BRIDGE (should be true or unset)
  2. Hand Agent spawned: grep 'sessions_spawn' logs/*.log
  3. Prompt generated: EVOLVE_PRINT_PROMPT=true node index.js

Scenario 2: Evolution always fails validation

Check:
  1. Gene validation commands (assuming each gene object has a validation array): jq -r '.. | objects | .validation[]?' assets/gep/genes.json
  2. Run validation manually: npm test
  3. Check for environment-specific issues: Compare env_fingerprint in validation report

Scenario 3: Wrong gene selected

Check:
  1. Selector decision: tail -1 assets/gep/events.jsonl | jq '.meta.selector'
  2. Memory advice: tail -1 assets/gep/events.jsonl | jq '.meta.selector.memory_advice'
  3. Signal extraction: tail -1 assets/gep/events.jsonl | jq '.signals'
