This guide covers common issues encountered when operating IOTA nodes, along with their solutions.
## Node Won’t Start

### Missing Configuration File

**Symptom:** Error message about a missing or invalid configuration file.

**Solution:**

```shell
# Verify the config file exists
ls -l /path/to/node.yaml

# Validate YAML syntax
yamllint /path/to/node.yaml

# Check file permissions
chmod 644 /path/to/node.yaml
```
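If `yamllint` is not installed, a couple of quick shell checks catch the most common problems (an unreadable file, or tab indentation, which YAML forbids). This is only a rough sketch, not a full parser, and `node.yaml` here is a stand-in path written locally for illustration:

```shell
# Sketch: minimal config sanity checks without yamllint.
# The here-doc creates a dummy node.yaml; point the checks at your real config.
cat > node.yaml <<'EOF'
p2p-config:
  seed-peers: []
EOF

# YAML forbids tab indentation, so any tab character is suspicious.
if [ -r node.yaml ] && ! grep -q "$(printf '\t')" node.yaml; then
  echo "basic checks passed"
else
  echo "config unreadable or tab-indented"
fi
```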
### Invalid Keypair Files

**Symptom:** Panic with a message about an invalid keypair file.

**Solution:**

```shell
# Verify keypair files exist
ls -l /path/to/keys/

# Check file permissions
chmod 600 /path/to/keys/*.key

# Regenerate keypairs if corrupted
iota keytool generate ed25519
```

> **Warning:** Never share or expose your keypair files. Store them securely with restricted permissions.
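A small loop can tighten permissions on every key file at once and confirm the result. This is a sketch: the `keys/` directory and the two `.key` files are created here purely for illustration; substitute your node's real key directory:

```shell
# Sketch: restrict every key file to owner read/write (600) and verify.
# ./keys with dummy files stands in for your real key directory.
mkdir -p keys && touch keys/authority.key keys/protocol.key
chmod 644 keys/*.key   # simulate overly-open permissions

for f in keys/*.key; do
  chmod 600 "$f"
  # stat -c is GNU/Linux; stat -f is the BSD/macOS fallback
  printf '%s %s\n' "$f" "$(stat -c '%a' "$f" 2>/dev/null || stat -f '%Lp' "$f")"
done
```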
### Genesis File Issues

**Symptom:** Error loading the genesis file.

**Solution:**

```shell
# Verify the genesis file exists
ls -l /path/to/genesis.blob

# Re-download the genesis file for your network
wget https://github.com/iotaledger/iota/raw/main/crates/iota-genesis-builder/genesis/mainnet.blob

# Verify the checksum (if provided)
sha256sum genesis.blob
```
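When a checksum is published for your network, compare it programmatically rather than by eye. In this sketch the "expected" value is computed from a dummy file so the example is self-contained; in practice, paste the checksum published for your network:

```shell
# Sketch: compare the genesis blob's SHA-256 against a published value.
# The dummy file and EXPECTED value below are illustrative only.
printf 'dummy-genesis-bytes' > genesis.blob
EXPECTED=$(sha256sum genesis.blob | awk '{print $1}')  # substitute the published checksum
ACTUAL=$(sha256sum genesis.blob | awk '{print $1}')

if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "genesis checksum OK"
else
  echo "checksum mismatch: got $ACTUAL, expected $EXPECTED"
fi
```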
## Database Issues

### Corrupted Database

**Symptom:** Node crashes on startup with database errors.

**Solution:**

First, back up the existing database:

```shell
cp -r /var/lib/iota/db /var/lib/iota/db.backup
```

Then clear the database and resync:

```shell
# Stop the node
docker stop iota-node

# Remove the database
rm -rf /var/lib/iota/db

# Restart the node (it will resync from the network)
docker start iota-node
```

> **Note:** Resyncing from scratch can take several hours to days, depending on the blockchain size and your network speed.
### Disk Space Full

**Symptom:** Database write errors; the node stops processing.

**Solution:**

Check disk usage:

```shell
df -h /var/lib/iota
```

Enable aggressive pruning in `node.yaml`:

```yaml
authority-store-pruning-config:
  num-epochs-to-retain: 0
  num-epochs-to-retain-for-checkpoints: 2
```

Alternatively, expand the available disk space.
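Before pruning or expanding the disk, it helps to see which database subdirectories dominate. A sketch using stand-in directories (in practice, point `du` at your real database path, e.g. `/var/lib/iota/db`):

```shell
# Sketch: rank database subdirectories by size, largest first.
# ./db with dummy contents stands in for /var/lib/iota/db.
mkdir -p db/checkpoints db/objects
head -c 65536 /dev/zero > db/objects/example.sst   # dummy 64 KiB table file

du -sk db/* | sort -rn | head -5
```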
### Write Stall

**Symptom:** Node becomes unresponsive; high disk I/O wait.

**Solution:**

For fullnodes, disable the write stall in `node.yaml`:

```yaml
enable-db-write-stall: false
```

For validators, optimize disk I/O instead:

- Use NVMe SSDs
- Tune RocksDB settings
- Enable periodic compaction:

```yaml
authority-store-pruning-config:
  periodic-compaction-threshold-days: 1
```
## Network Connectivity Issues

### No Peers Connected

**Symptom:** Node can’t sync; peer count is zero.

**Solution:**

Check firewall rules:

```shell
# Allow the P2P port
ufw allow 8084/tcp

# Verify the port is listening
netstat -tlnp | grep 8084
```

Configure seed peers in `node.yaml`:

```yaml
p2p-config:
  seed-peers:
    - address: "/dns/seed1.iota.org/tcp/8084"
    - address: "/dns/seed2.iota.org/tcp/8084"
```

Ensure your external address is reachable:

```yaml
p2p-config:
  external-address: "/dns/your-node.example.com/tcp/8084"
```
### Slow Synchronization

**Symptom:** Node syncing slower than expected.

**Solution:**

Increase state-sync concurrency in `node.yaml`:

```yaml
p2p-config:
  state-sync:
    checkpoint-header-download-concurrency: 800
    checkpoint-content-download-concurrency: 800
    checkpoint-content-download-tx-concurrency: 100000
```
### Connection Timeouts

**Symptom:** Frequent timeout errors in the logs.

**Solution:**

Increase the state-sync timeouts (in milliseconds) in `node.yaml`:

```yaml
p2p-config:
  state-sync:
    timeout-ms: 20000
    checkpoint-content-timeout-ms: 120000
```
## High CPU Usage

**Symptom:** CPU constantly at 100%.

**Diagnosis:**

```shell
# Check which process is using the CPU
top -p $(pgrep iota-node)

# Review metrics
curl http://localhost:9184/metrics | grep -E "(scope|future|task)"
```

**Solution:**

- Verify the hardware meets the requirements
- Check for thread stalls in the metrics
- Reduce concurrent operations:

```yaml
checkpoint-executor-config:
  checkpoint-execution-max-concurrency: 20 # Reduce from the default of 40
```
## High Memory Usage

**Symptom:** Node using excessive RAM; potential OOM kills.

**Solution:**

Reduce the cache sizes in `node.yaml`:

```yaml
execution-cache-config:
  writeback-cache:
    max-cache-size: 50000 # Default: 100000
    transaction-cache-size: 50000
    object-cache-size: 50000
```

Or use environment variables:

```shell
export IOTA_CACHE_WRITEBACK_SIZE_MAX=50000
export IOTA_CACHE_WRITEBACK_SIZE_TRANSACTION=50000
export IOTA_CACHE_WRITEBACK_SIZE_OBJECT=50000
```
## Thread Stalls

**Symptom:** The `thread_stall_duration_sec` metric is increasing.

**Diagnosis:**

```promql
# Check stall frequency
rate(thread_stall_duration_sec_count[5m])

# Check average stall duration
rate(thread_stall_duration_sec_sum[5m]) / rate(thread_stall_duration_sec_count[5m])
```

**Solution:**

- Review system load and I/O wait
- Check for disk I/O bottlenecks
- Ensure adequate CPU resources
- Review logs for blocking operations
## Validator-Specific Issues

### Not Participating in Consensus

**Symptom:** Validator is not signing checkpoints.

**Diagnosis:**

Verify the validator status:

```shell
curl http://localhost:9000 -X POST \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"iota_getValidators"}' | jq
```

Check the stake amount: ensure your validator has sufficient stake and is in the active validator set.

Verify that the correct keypairs are configured:

```yaml
authority-key-pair:
  path: /path/to/authority.key
protocol-key-pair:
  path: /path/to/protocol.key
```
### Consensus Database Growing Too Large

**Symptom:** The consensus DB is consuming excessive disk space.

**Solution:**

```yaml
consensus-config:
  # Reduce retention
  db-retention-epochs: 1
  # Prune more frequently
  db-pruner-period-secs: 1800 # 30 minutes
```
### High Transaction Rejection Rate

**Symptom:** Many transactions are being rejected.

**Diagnosis:**

```promql
# Check load-shedding metrics
rate(grpc_requests{status!="Ok"}[5m])
```

**Solution:**

Adjust the overload thresholds:

```yaml
authority-overload-config:
  max-transaction-manager-queue-length: 150000 # Increase from 100000
  execution-queue-latency-soft-limit: 2s # Increase tolerance
```
## API Issues

### JSON-RPC Not Responding

**Symptom:** RPC requests time out or fail.

**Diagnosis:**

```shell
# Test the JSON-RPC endpoint
curl -v http://localhost:9000 -X POST \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"rpc.discover"}'

# Check whether the port is listening
netstat -tlnp | grep 9000
```

**Solution:**

- Verify `json-rpc-address` in the configuration
- Check firewall rules
- Review the RPC-specific overload settings:

```yaml
execution-cache-config:
  writeback-cache:
    backpressure-threshold-for-rpc: 150000
```
### gRPC Connection Refused

**Symptom:** gRPC clients cannot connect.

**Solution:**

Ensure the gRPC API is enabled in `node.yaml`:

```yaml
enable-grpc-api: true
grpc-api-config:
  address: "0.0.0.0:50051"
  max-message-size-bytes: 134217728
```

Check that the port is accessible:

```shell
telnet localhost 50051
```
## Metrics and Monitoring Issues

### Metrics Endpoint Not Accessible

**Symptom:** Cannot access `http://localhost:9184/metrics`.

**Solution:**

Verify the metrics address in `node.yaml`:

```yaml
metrics-address: "0.0.0.0:9184"
```

```shell
# Check the port binding
netstat -tlnp | grep 9184

# Test locally
curl http://127.0.0.1:9184/metrics
```
### Metrics Push Failing

**Symptom:** Warnings about being unable to push metrics.

**Solution:**

```yaml
metrics:
  push-url: "https://valid-endpoint.example.com/push"
  push-interval-seconds: 120 # Increase the interval
```

Check the logs for specific error messages about the push endpoint.
## Log Analysis

### Finding Errors in Logs

```shell
# For Docker deployments
docker logs iota-node | grep -i error
docker logs iota-node | grep -i panic
docker logs iota-node | grep -i fatal

# For systemd services
journalctl -u iota-node | grep -i error

# Filter by time
docker logs --since 1h iota-node | grep -i error
```
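Beyond eyeballing `grep` output, a short pipeline can rank error messages by frequency, which quickly surfaces the dominant failure. A sketch over an inline sample log; in practice, feed it `docker logs iota-node 2>&1` instead of the here-doc:

```shell
# Sketch: count occurrences of each ERROR message, most frequent first.
# The inline sample stands in for real node logs.
cat > sample.log <<'EOF'
2024-01-01T00:00:01Z INFO  checkpoint executed
2024-01-01T00:00:02Z ERROR Failed to load genesis
2024-01-01T00:00:03Z ERROR Connection refused
2024-01-01T00:00:04Z ERROR Connection refused
EOF

grep -i error sample.log \
  | sed 's/^[^ ]* *//' \
  | sort | uniq -c | sort -rn   # strip the timestamp, then tally messages
```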
### Common Error Messages

| Error | Cause | Solution |
|---|---|---|
| "Failed to load genesis" | Missing or invalid genesis file | Re-download `genesis.blob` |
| "Invalid keypair file" | Corrupted or wrong key format | Regenerate the keypairs |
| "Database corruption" | Disk failure or improper shutdown | Restore from backup or resync |
| "Connection refused" | Port not open or service not running | Check firewall and service status |
| "Out of memory" | Insufficient RAM | Reduce cache sizes or add RAM |
## Emergency Recovery

### Node Completely Unresponsive

Collect diagnostics:

```shell
# Save the current logs
docker logs iota-node > node-crash-$(date +%Y%m%d-%H%M%S).log

# Save a metrics snapshot
curl http://localhost:9184/metrics > metrics-$(date +%Y%m%d-%H%M%S).txt
```

Force a restart:

```shell
docker stop -t 30 iota-node # Allow 30s for a graceful shutdown
docker start iota-node
```
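The log and metrics snapshots can be folded into one timestamped archive that is easy to attach to a support request. This sketch uses stand-in files so it is self-contained; swap the `echo` lines for the real `docker logs` and `curl` commands:

```shell
# Sketch: bundle logs and metrics into a single diagnostics archive.
STAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p diag
echo "sample log line" > diag/node-crash.log   # stand-in for `docker logs iota-node`
echo "up 1"           > diag/metrics.txt       # stand-in for the metrics scrape

tar -czf "diagnostics-$STAMP.tar.gz" diag
tar -tzf "diagnostics-$STAMP.tar.gz"   # list the archive contents to confirm
```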
### Data Corruption After Crash

If database corruption occurs repeatedly, there may be an underlying hardware issue (a failing disk, bad RAM, etc.).

Recovery procedure:

1. Stop the node
2. Back up the corrupted database
3. Remove the database directory
4. Restore from the latest checkpoint (if available)
5. Otherwise, resync from the network
## Getting Help

When seeking help, provide:

- Node version: `iota-node --version`
- Configuration (with sensitive data removed)
- Recent logs showing the error
- Relevant metrics snapshots
- System information: OS, CPU, RAM, disk
- Network: mainnet or testnet
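The system details in that list can be gathered with a short script. A sketch for Linux hosts (`/proc/meminfo` and `nproc` are Linux-specific; the `sysinfo.txt` filename is just an example):

```shell
# Sketch: collect basic system facts to include in a support request (Linux).
{
  echo "os: $(uname -sr)"
  echo "cpu: $(nproc) cores"
  echo "mem: $(awk '/MemTotal/ {print $2 " kB"}' /proc/meminfo)"
  echo "disk: $(df -h / | awk 'NR==2 {print $4 " free"}')"
} | tee sysinfo.txt
```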
Where to ask:

- IOTA Discord: technical support channels
- GitHub Issues: bug reports and feature requests
- Documentation: official IOTA documentation
## Preventive Measures

- **Regular backups:** back up keypairs and critical data
- **Monitoring:** set up alerts for critical metrics
- **Updates:** keep the node software up to date
- **Resource monitoring:** track CPU, memory, and disk usage trends
- **Log rotation:** configure log rotation to prevent the disk from filling up
- **Disaster recovery plan:** document recovery procedures
## Next Steps