Debugging - Evolver

Overview

This guide covers debugging techniques for evolution failures, signal extraction issues, gene selection problems, and validation errors.

Log Files & Locations

Evolution State

memory/evolution/
├── evolution_state.json              # Cycle counter and last run timestamp
├── evolution_solidify_state.json     # Last run/solidify state (for loop gating)
├── memory_graph.jsonl                # Causal graph (signals → outcomes)
└── dormant_hypothesis.json          # Saved state before backoff

GEP Assets

assets/gep/
├── genes.json          # Gene definitions
├── capsules.json       # Success capsules
├── events.jsonl        # Evolution event log (append-only)
├── failed_capsules.jsonl  # Failed evolution records
└── candidates.jsonl   # Capability candidates (extracted from logs)

Session Logs

~/.openclaw/agents/main/sessions/
├── channel_123456_session1.jsonl
├── channel_123456_session2.jsonl
└── general_session.jsonl

Workspace Memory

memory/
├── MEMORY.md                    # Global memory (or scoped at scopes/<scope>/MEMORY.md)
├── scopes/
│   ├── channel_123456/
│   │   └── MEMORY.md
│   └── channel_789012/
│       └── MEMORY.md
└── 2024-01-15.md                # Daily log

Debugging Evolution Failures

Step 1: Check Last Event

tail -1 assets/gep/events.jsonl | jq .

Key fields:

{
  "outcome": {
    "status": "failed",  // success | failed
    "score": 0.2
  },
  "meta": {
    "constraints_ok": false,
    "constraint_violations": [
      "max_files exceeded: 25 > 12"
    ],
    "validation_ok": false,
    "validation": [
      {"cmd": "npm test", "ok": false}
    ],
    "protocol_ok": false,
    "protocol_violations": [
      "missing_or_invalid_mutation"
    ]
  }
}

Step 2: Identify Failure Reason

tail -1 assets/gep/events.jsonl | jq -r '.meta.constraint_violations[]?, .meta.protocol_violations[]?'

Common failure reasons:

Violation	Meaning	Fix
`max_files exceeded: 25 > 12`	Blast radius too large	Increase gene’s `max_files` or reduce scope
`forbidden_path touched: .env`	Modified protected file	Remove `.env` from changes
`validation_failed: npm test`	Tests failed	Fix the failing code, or correct the validation command if it is wrong
`missing_or_invalid_mutation`	No mutation object	Ensure Hand Agent emits mutation
`critical_path_modified: skills/evolver/src/evolve.js`	Self-modification blocked	Set `EVOLVE_ALLOW_SELF_MODIFY=true` or avoid modifying this file
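
The same failure reasons can be gathered in one pass from Node. This is a minimal sketch, assuming the event shape shown in Step 1 (`meta.constraint_violations`, `meta.protocol_violations`, `meta.validation`). The helper name `collectFailureReasons` is hypothetical and not part of Evolver.

```javascript
// Hypothetical helper: gather every failure reason recorded on one event.
// Assumes the event shape shown in Step 1; missing fields are treated as empty.
function collectFailureReasons(event) {
  const meta = (event && event.meta) || {};
  const reasons = [];
  for (const v of meta.constraint_violations || []) reasons.push(`constraint: ${v}`);
  for (const v of meta.protocol_violations || []) reasons.push(`protocol: ${v}`);
  for (const r of meta.validation || []) {
    if (!r.ok) reasons.push(`validation: ${r.cmd}`);
  }
  return reasons;
}

// Example event, copied from the key fields in Step 1.
const exampleEvent = {
  outcome: { status: "failed", score: 0.2 },
  meta: {
    constraints_ok: false,
    constraint_violations: ["max_files exceeded: 25 > 12"],
    validation_ok: false,
    validation: [{ cmd: "npm test", ok: false }],
    protocol_ok: false,
    protocol_violations: ["missing_or_invalid_mutation"],
  },
};

console.log(collectFailureReasons(exampleEvent).join("\n"));
```

Each reason is prefixed with the gate that produced it, so the output can be matched against the table above.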

Step 3: Check Validation Details

tail -1 assets/gep/events.jsonl | jq '.meta.validation'

Output:

[
  {"cmd": "node scripts/validate.js", "ok": true},
  {"cmd": "npm test", "ok": false}
]

Drill into validation report:

tail -1 assets/gep/events.jsonl | jq '.meta.validation_report.results'

Step 4: Analyze Blast Radius

tail -1 assets/gep/events.jsonl | jq '{blast_radius, blast_breakdown: .meta.blast_breakdown}'

Output:

{
  "blast_radius": {"files": 25, "lines": 450},
  "blast_breakdown": [
    {"dir": "src/gep", "files": 12},
    {"dir": "src/ops", "files": 8},
    {"dir": "scripts", "files": 5}
  ]
}

Interpretation: Evolution touched 25 files, 12 of them in src/gep/. If the gene’s max_files is 12, this exceeds the limit by 13 files.
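
The limit check and each directory's share can be worked out from the same output. The sketch below assumes the `blast_radius` and `blast_breakdown` shapes shown above. `summarizeBlast` is a hypothetical helper, and the `maxFiles` argument stands in for the gene's `max_files`.

```javascript
// Hypothetical helper: compare a blast radius with a gene's max_files and
// report each directory's share of the touched files, largest first.
function summarizeBlast(blastRadius, breakdown, maxFiles) {
  const total = blastRadius.files;
  const shares = breakdown
    .map((d) => ({ dir: d.dir, files: d.files, share: d.files / total }))
    .sort((a, b) => b.files - a.files);
  return {
    total,
    exceeded: total > maxFiles,
    overBy: Math.max(0, total - maxFiles),
    shares,
  };
}

// Values copied from the example output above; max_files of 12 as in the text.
const blastSummary = summarizeBlast(
  { files: 25, lines: 450 },
  [
    { dir: "src/gep", files: 12 },
    { dir: "src/ops", files: 8 },
    { dir: "scripts", files: 5 },
  ],
  12
);

console.log(`exceeded=${blastSummary.exceeded} overBy=${blastSummary.overBy}`);
for (const s of blastSummary.shares) {
  console.log(`${s.dir}: ${s.files} files (${Math.round(s.share * 100)}%)`);
}
```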

Step 5: Check Estimate vs. Actual Drift

tail -1 assets/gep/events.jsonl | jq '.meta.blast_estimate_comparison'

Output:

{
  "estimateFiles": 5,
  "actualFiles": 25,
  "ratio": 5.0,
  "drifted": true,
  "message": "Estimate drift: actual 25 files is 5.0x the estimated 5. Agent did not plan accurately."
}

Interpretation: Agent estimated 5 files but changed 25. This indicates poor planning or unintended side effects.
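
The ratio in this report is the actual file count divided by the estimate. The sketch below reproduces that arithmetic. `compareBlastEstimate` is a hypothetical helper, and its `driftThreshold` default of 2 is an assumption, since this guide does not state the threshold Evolver uses.

```javascript
// Hypothetical reconstruction of the estimate-vs-actual comparison.
// driftThreshold is an ASSUMED value; the real threshold is not documented here.
function compareBlastEstimate(estimateFiles, actualFiles, driftThreshold = 2) {
  if (!(estimateFiles > 0)) {
    // No usable estimate: a ratio cannot be computed.
    return { estimateFiles, actualFiles, ratio: null, drifted: false };
  }
  const ratio = actualFiles / estimateFiles;
  return { estimateFiles, actualFiles, ratio, drifted: ratio >= driftThreshold };
}

// Numbers from the example output above: 25 actual vs. 5 estimated.
const driftCheck = compareBlastEstimate(5, 25);
console.log(`ratio=${driftCheck.ratio.toFixed(1)} drifted=${driftCheck.drifted}`);
```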

Debugging Signal Extraction

Check Extracted Signals

tail -1 assets/gep/events.jsonl | jq '.signals'

Output:

["log_error", "api_timeout", "recent_error_count_high"]
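
A single event shows only one cycle's signals. To see which signals recur, tally them over the most recent events. This is a sketch that assumes each line of events.jsonl carries a `signals` array as shown above. `parseJsonl` and `countSignals` are hypothetical helpers, and the sample lines are made up for illustration.

```javascript
// Parse JSONL text, skipping blank or malformed lines.
function parseJsonl(text) {
  const records = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue;
    try {
      records.push(JSON.parse(line));
    } catch (_) {
      // Ignore a partially written or corrupt line.
    }
  }
  return records;
}

// Count how often each signal appears in the last `lastN` events, most frequent first.
function countSignals(events, lastN = 20) {
  const counts = {};
  for (const ev of events.slice(-lastN)) {
    for (const s of ev.signals || []) counts[s] = (counts[s] || 0) + 1;
  }
  return Object.entries(counts).sort((a, b) => b[1] - a[1]);
}

// Made-up sample lines; in practice, read assets/gep/events.jsonl with fs.readFileSync.
const sampleLog = [
  '{"signals":["log_error","api_timeout"]}',
  '{"signals":["log_error","api_timeout","recent_error_count_high"]}',
  '{"signals":["log_error"]}',
].join("\n");

for (const [signal, count] of countSignals(parseJsonl(sampleLog))) {
  console.log(`${count}\t${signal}`);
}
```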

Verify Signal Sources

Signals come from:

Session logs: Recent errors, LLM errors, tool failures
MEMORY.md: Memory anomalies, context drift
USER.md: User feedback, feature requests
Event history: Repair loops, saturation patterns

Manually inspect sources:

# Recent session errors
tail -100 ~/.openclaw/agents/main/sessions/*.jsonl | grep -i error

# Memory anomalies
grep -iE 'error|todo|fix' memory/MEMORY.md

# Recent failed evolutions
jq -c 'select(.outcome.status == "failed")' assets/gep/events.jsonl | tail -5

Debug Signal Extraction Logic

Run signal extraction manually:

node -e '
const { extractSignals } = require("./src/gep/signals");
const fs = require("fs");
const inputs = {
  recentSessionTranscript: fs.readFileSync("memory/test_session.txt", "utf8"),
  todayLog: "",
  memorySnippet: fs.readFileSync("memory/MEMORY.md", "utf8"),
  userSnippet: "",
  recentEvents: []
};
const signals = extractSignals(inputs);
console.log(JSON.stringify(signals, null, 2));
'

Debugging Gene Selection

Check Selector Decision

tail -1 assets/gep/events.jsonl | jq '.meta.selector'

Output:

{
  "selected_gene_id": "gene_repair_001",
  "selected_by": "memory_graph+selector",
  "score": 0.87,
  "candidates": [
    {"id": "gene_repair_001", "score": 0.87, "reason": "high_confidence_path"},
    {"id": "gene_repair_002", "score": 0.65, "reason": "signal_match"},
    {"id": "gene_optimize_003", "score": 0.45, "reason": "category_mismatch"}
  ],
  "memory_advice": {
    "preferredGeneId": "gene_repair_001",
    "bannedGeneIds": ["gene_innovate_005"],
    "confidence": 0.92
  }
}

Interpretation:

Selector chose gene_repair_001 with score 0.87
Memory graph advised this gene (high confidence path)
gene_innovate_005 is banned (low success rate)
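
The same reading can be automated when comparing many events. This is a minimal sketch, assuming the `meta.selector` shape shown above. `explainSelection` is a hypothetical helper, not an Evolver API.

```javascript
// Hypothetical helper: summarize a selector decision from meta.selector.
function explainSelection(selector) {
  const advice = selector.memory_advice || {};
  const banned = new Set(advice.bannedGeneIds || []);
  // Copy before sorting so the original candidate order is left untouched.
  const ranked = [...(selector.candidates || [])].sort((a, b) => b.score - a.score);
  const top = ranked[0] || null;
  return {
    selected: selector.selected_gene_id,
    isTopScored: top !== null && top.id === selector.selected_gene_id,
    matchesMemoryAdvice: advice.preferredGeneId === selector.selected_gene_id,
    bannedCandidates: ranked.filter((c) => banned.has(c.id)).map((c) => c.id),
    runnerUp: ranked[1] ? ranked[1].id : null,
  };
}

// Selector block copied from the example output above.
const selectorExample = {
  selected_gene_id: "gene_repair_001",
  selected_by: "memory_graph+selector",
  score: 0.87,
  candidates: [
    { id: "gene_repair_001", score: 0.87, reason: "high_confidence_path" },
    { id: "gene_repair_002", score: 0.65, reason: "signal_match" },
    { id: "gene_optimize_003", score: 0.45, reason: "category_mismatch" },
  ],
  memory_advice: {
    preferredGeneId: "gene_repair_001",
    bannedGeneIds: ["gene_innovate_005"],
    confidence: 0.92,
  },
};

console.log(JSON.stringify(explainSelection(selectorExample), null, 2));
```

If `isTopScored` is true but `matchesMemoryAdvice` is false, the selector overrode the memory graph. The reverse suggests the advice outweighed the raw score.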

Debug Selector Scoring

Run selector manually:

node -e '
const { selectGeneAndCapsule } = require("./src/gep/selector");
const { loadGenes, loadCapsules } = require("./src/gep/assetStore");
const genes = loadGenes();
const capsules = loadCapsules();
const signals = ["log_error", "api_timeout"];
const result = selectGeneAndCapsule({ genes, capsules, signals, memoryAdvice: null, driftEnabled: false, failedCapsules: [] });
console.log(JSON.stringify(result, null, 2));
'

Check Memory Graph Advice

jq -c 'select(.type == "Advice")' memory/evolution/memory_graph.jsonl | tail -1 | jq .

Output:

{
  "type": "Advice",
  "signals": ["log_error", "api_timeout"],
  "preferredGeneId": "gene_repair_001",
  "bannedGeneIds": ["gene_innovate_005"],
  "confidence": 0.92,
  "reason": "high_confidence_path"
}

Debugging Memory Graph

Query Memory Graph

# Find all attempts with a specific gene
jq -r 'select(.type == "Attempt" and .gene_id == "gene_repair_001")' memory/evolution/memory_graph.jsonl

# Find all outcomes for a signal key
jq -r 'select(.type == "Outcome" and .signal_key == "sig_abc123")' memory/evolution/memory_graph.jsonl

# Count outcomes per gene and status (compare success vs. failed counts to get a rate)
jq -r 'select(.type == "Outcome") | [.gene_id, .status] | @tsv' memory/evolution/memory_graph.jsonl | \
  sort | uniq -c
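
To get an actual rate per gene rather than raw counts, the Outcome records can be aggregated in Node. This sketch assumes Outcome records carry `gene_id` and `status`, as the queries above do. `successRateByGene` is a hypothetical helper, and the records are made up for illustration.

```javascript
// Hypothetical helper: success rate per gene from memory graph records.
// Assumes Outcome records carry gene_id and status ("success" | "failed").
function successRateByGene(records) {
  const stats = {};
  for (const r of records) {
    if (r.type !== "Outcome") continue;
    const s = stats[r.gene_id] || (stats[r.gene_id] = { success: 0, total: 0 });
    s.total += 1;
    if (r.status === "success") s.success += 1;
  }
  for (const id of Object.keys(stats)) {
    stats[id].rate = stats[id].success / stats[id].total;
  }
  return stats;
}

// Made-up records; in practice, parse memory/evolution/memory_graph.jsonl line by line.
const outcomeRecords = [
  { type: "Outcome", gene_id: "gene_repair_001", status: "success" },
  { type: "Outcome", gene_id: "gene_repair_001", status: "failed" },
  { type: "Outcome", gene_id: "gene_innovate_005", status: "failed" },
  { type: "Attempt", gene_id: "gene_repair_001" },
];

console.log(successRateByGene(outcomeRecords));
```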

Inspect Causal Chain

# Find a specific signal snapshot
jq -c 'select(.type == "SignalSnapshot" and any(.signals[]?; contains("api_timeout")))' memory/evolution/memory_graph.jsonl | tail -1

# Follow its hypothesis
SIGNAL_KEY=$(jq -r 'select(.type == "SignalSnapshot") | .signal_key' memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Hypothesis\" and .signal_key == \"$SIGNAL_KEY\")" memory/evolution/memory_graph.jsonl

# Follow its attempt
HYPO_ID=$(jq -r "select(.type == \"Hypothesis\" and .signal_key == \"$SIGNAL_KEY\") | .id" memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Attempt\" and .hypothesis_id == \"$HYPO_ID\")" memory/evolution/memory_graph.jsonl

# Follow its outcome
ATTEMPT_ID=$(jq -r "select(.type == \"Attempt\" and .hypothesis_id == \"$HYPO_ID\") | .id" memory/evolution/memory_graph.jsonl | tail -1)
jq -r "select(.type == \"Outcome\" and .attempt_id == \"$ATTEMPT_ID\")" memory/evolution/memory_graph.jsonl
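
The four lookups above can be chained in a single pass once the file is parsed. This is a sketch using the same field names as the jq queries (`signal_key`, `hypothesis_id`, `attempt_id`). `followChain` is a hypothetical helper, and the records are made up for illustration.

```javascript
// Hypothetical helper: walk Hypothesis -> Attempt -> Outcome for one signal key,
// taking the most recent record at each step (like `tail -1` in the shell version).
function followChain(records, signalKey) {
  const last = (list) => (list.length ? list[list.length - 1] : null);
  const hypothesis = last(
    records.filter((r) => r.type === "Hypothesis" && r.signal_key === signalKey)
  );
  const attempt = hypothesis
    ? last(records.filter((r) => r.type === "Attempt" && r.hypothesis_id === hypothesis.id))
    : null;
  const outcome = attempt
    ? last(records.filter((r) => r.type === "Outcome" && r.attempt_id === attempt.id))
    : null;
  return { hypothesis, attempt, outcome };
}

// Made-up records for illustration only.
const chainRecords = [
  { type: "SignalSnapshot", signal_key: "sig_abc123", signals: ["api_timeout"] },
  { type: "Hypothesis", id: "hyp_1", signal_key: "sig_abc123", gene_id: "gene_repair_001" },
  { type: "Attempt", id: "att_1", hypothesis_id: "hyp_1", gene_id: "gene_repair_001" },
  { type: "Outcome", id: "out_1", attempt_id: "att_1", status: "failed" },
];

const chain = followChain(chainRecords, "sig_abc123");
console.log(chain.outcome ? chain.outcome.status : "no outcome recorded");
```

A `null` at any step shows where the chain breaks, for example an Attempt with no recorded Outcome.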

Visualize Memory Graph

Export to GraphViz DOT format:

node -e '
const fs = require("fs");
const lines = fs.readFileSync("memory/evolution/memory_graph.jsonl", "utf8").split("\n").filter(Boolean);
const nodes = lines.map(l => JSON.parse(l));

console.log("digraph MemoryGraph {");
for (const n of nodes) {
  if (n.type === "Hypothesis") console.log(`  "${n.id}" [label="${n.gene_id}"]`);
  if (n.type === "Attempt" && n.hypothesis_id) console.log(`  "${n.hypothesis_id}" -> "${n.id}"`);
  if (n.type === "Outcome" && n.attempt_id) console.log(`  "${n.attempt_id}" -> "${n.id}" [label="${n.status}"]`);
}
console.log("}");
' > graph.dot

dot -Tpng graph.dot -o graph.png

Debugging Validation Failures

Run Validation Manually

# Find the gene's validation commands (matches the gene by id at any nesting depth)
jq -r '.. | objects | select(.id == "gene_repair_001") | .validation[]?' assets/gep/genes.json

# Run them manually
npm test
node scripts/validate.js

Check Validation Safety

Test if a command would be allowed:

node -e '
const { isValidationCommandAllowed } = require("./src/gep/solidify");
const cmd = "bash scripts/deploy.sh";  // Replace with your command
console.log(isValidationCommandAllowed(cmd) ? "ALLOWED" : "BLOCKED");
'

Inspect Validation Report

tail -1 assets/gep/events.jsonl | jq '.meta.validation_report'

Output:

{
  "type": "ValidationReport",
  "id": "vr_1234567890",
  "gene_id": "gene_repair_001",
  "commands": ["npm test"],
  "results": [
    {
      "cmd": "npm test",
      "ok": false,
      "duration_ms": 3200,
      "out": "...",
      "err": "Test suite failed. See above for errors."
    }
  ],
  "env_fingerprint": {"platform": "linux", "node_version": "v18.16.0"},
  "started_at": "2024-01-15T10:30:00.000Z",
  "finished_at": "2024-01-15T10:30:03.200Z"
}
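
A report with several commands is easier to scan as a summary. The sketch below assumes the ValidationReport shape shown above. `summarizeValidation` is a hypothetical helper.

```javascript
// Hypothetical helper: condense a ValidationReport into pass/fail and timing.
function summarizeValidation(report) {
  const results = report.results || [];
  const failed = results.filter((r) => !r.ok);
  return {
    ok: failed.length === 0,
    failedCommands: failed.map((r) => r.cmd),
    // Sum of per-command durations.
    commandMs: results.reduce((sum, r) => sum + (r.duration_ms || 0), 0),
    // Wall-clock time between the report's start and finish timestamps.
    wallMs: Date.parse(report.finished_at) - Date.parse(report.started_at),
  };
}

// Report fields copied from the example output above.
const reportExample = {
  type: "ValidationReport",
  gene_id: "gene_repair_001",
  commands: ["npm test"],
  results: [
    { cmd: "npm test", ok: false, duration_ms: 3200, out: "...", err: "Test suite failed. See above for errors." },
  ],
  started_at: "2024-01-15T10:30:00.000Z",
  finished_at: "2024-01-15T10:30:03.200Z",
};

console.log(summarizeValidation(reportExample));
```

A `wallMs` far larger than `commandMs` would point to time spent outside the commands themselves.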

Debugging Loop Gating

Check Pending State

jq . memory/evolution/evolution_solidify_state.json

Output:

{
  "last_run": {
    "run_id": "run_1234567890",
    "started_at": "2024-01-15T10:30:00.000Z",
    "signals": ["log_error"],
    "selected_gene_id": "gene_repair_001"
  },
  "last_solidify": {
    "run_id": "run_1234567890",
    "solidified_at": "2024-01-15T10:32:00.000Z",
    "status": "success"
  }
}

Interpretation:

last_run.run_id == last_solidify.run_id: No pending evolution
last_run.run_id != last_solidify.run_id: Evolution pending, evolver will back off
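
The comparison can be expressed as a small check. This sketch assumes the state shape shown above. `hasPendingEvolution` is a hypothetical helper, and treating a state with no `last_run` as "not pending" is an assumption.

```javascript
// Hypothetical helper: true when the last run has not been solidified yet.
function hasPendingEvolution(state) {
  const runId = state && state.last_run && state.last_run.run_id;
  const solidifiedId = state && state.last_solidify && state.last_solidify.run_id;
  if (!runId) return false; // ASSUMPTION: no recorded run means nothing is pending
  return runId !== solidifiedId;
}

// State copied from the example output above: both run IDs match.
const stateExample = {
  last_run: { run_id: "run_1234567890", started_at: "2024-01-15T10:30:00.000Z" },
  last_solidify: { run_id: "run_1234567890", status: "success" },
};

console.log(hasPendingEvolution(stateExample) ? "pending" : "no pending evolution");
```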

Check Dormant Hypothesis

jq . memory/evolution/dormant_hypothesis.json

Output:

{
  "backoff_reason": "active_sessions_exceeded",
  "active_sessions": 15,
  "queue_max": 10,
  "created_at": "2024-01-15T10:30:00.000Z",
  "ttl_ms": 3600000
}

Interpretation: Evolver backed off because 15 active sessions exceeded the queue_max of 10. It will resume after the 1-hour TTL (ttl_ms: 3600000) expires or when the session count drops.
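
The expiry time follows from `created_at` plus `ttl_ms`. This is a sketch of that arithmetic, assuming the fields shown above. `dormantExpiry` is a hypothetical helper, and it makes no claim about how Evolver itself checks the TTL.

```javascript
// Hypothetical helper: when does a dormant hypothesis expire, and by how many
// sessions was the queue limit exceeded?
function dormantExpiry(dormant, now = Date.now()) {
  const expiresAtMs = Date.parse(dormant.created_at) + dormant.ttl_ms;
  return {
    expiresAt: new Date(expiresAtMs).toISOString(),
    expired: now >= expiresAtMs,
    sessionsOverLimit: dormant.active_sessions - dormant.queue_max,
  };
}

// Values copied from the example output above.
const dormantExample = {
  backoff_reason: "active_sessions_exceeded",
  active_sessions: 15,
  queue_max: 10,
  created_at: "2024-01-15T10:30:00.000Z",
  ttl_ms: 3600000,
};

// Evaluate at a fixed time (30 minutes after creation) so the result is reproducible.
const halfwayThrough = Date.parse("2024-01-15T11:00:00.000Z");
console.log(dormantExpiry(dormantExample, halfwayThrough));
```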

Asset Log Commands

Evolver tracks asset usage in memory/evolution/asset_call_log.jsonl:

Query Asset Usage

# Find all invocations of a specific gene
jq -r 'select(.asset_id == "gene_repair_001")' memory/evolution/asset_call_log.jsonl

# Count gene usage frequency
jq -r '.asset_id' memory/evolution/asset_call_log.jsonl | sort | uniq -c | sort -rn

# Find most recent capsule reuse
jq -r 'select(.asset_type == "Capsule")' memory/evolution/asset_call_log.jsonl | tail -1

Common Debug Scenarios

Scenario 1: Evolution runs but nothing happens

Check:

Bridge enabled: echo $EVOLVE_BRIDGE (should be true or unset)
Hand Agent spawned: grep 'sessions_spawn' logs/*.log
Prompt generated: EVOLVE_PRINT_PROMPT=true node index.js

Scenario 2: Evolution always fails validation

Check:

Gene validation commands: jq -r '.. | objects | .validation[]?' assets/gep/genes.json
Run validation manually: npm test
Check for environment-specific issues: Compare env_fingerprint in validation report

Scenario 3: Wrong gene selected

Check:

Selector decision: tail -1 assets/gep/events.jsonl | jq '.meta.selector'
Memory advice: tail -1 assets/gep/events.jsonl | jq '.meta.selector.memory_advice'
Signal extraction: tail -1 assets/gep/events.jsonl | jq '.signals'
