Error Handling Architecture
The Proxmox VE Helper Scripts implement a comprehensive error handling system that ensures proper cleanup, detailed logging, and telemetry reporting for all failures.
Core Components
Error Handler Function
The main error handler is triggered by the ERR trap and provides detailed error context:
error_handler.func:232-381
Signal Handlers
EXIT Trap (on_exit)
Runs on every script termination to catch orphaned records:
error_handler.func:507-533
Why it’s critical: Prevents records from being stuck in “installing” or “configuring” states forever.
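A minimal sketch of the idea behind an EXIT trap that catches runs which never reported a final status. The flag `FINAL_STATUS_SENT`, the function name `demo_on_exit`, and the message are assumptions made for illustration; the real on_exit may track state differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: names and messages are illustrative only.
demo_on_exit() {
  local exit_code=$?   # status the script is exiting with
  # Only act if no final status was recorded before the script ended.
  if [ "${FINAL_STATUS_SENT:-false}" != "true" ]; then
    echo "reporting final status: failed (exit code ${exit_code})"
  fi
  exit "$exit_code"
}

demo_status=0
demo_output=$(
  trap demo_on_exit EXIT
  echo "installing..."
  exit 3   # script ends without ever reporting a final status
) || demo_status=$?

printf 'status=%s\n%s\n' "$demo_status" "$demo_output"
```

Because the EXIT trap fires on every termination path, it acts as a last line of defense when no other handler updated the record.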
INT Trap (on_interrupt) - Ctrl+C
Handles user interruption via Ctrl+C:
error_handler.func:543-558
Exit code: 130 (128 + SIGINT signal number 2)
TERM Trap (on_terminate) - Process Kill
Handles process termination via kill command:
error_handler.func:568-583
Exit code: 143 (128 + SIGTERM signal number 15)
HUP Trap (on_hangup) - SSH Disconnect
Handles terminal disconnection (SSH session closed):
error_handler.func:596-605
Exit code: 129 (128 + SIGHUP signal number 1)
Why it’s critical: This handler was previously missing, so container processes became orphans on SSH disconnect, which was the #1 cause of stuck “installing” and “configuring” states.
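The three signal handlers above share one convention: the exit code is 128 plus the signal number. A minimal sketch of that pattern follows. The handler names match the document, but their bodies and the helper `signal_exit_code` are hypothetical. The demo sends SIGTERM to a subshell so nothing outside it is affected.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: handler bodies are placeholders, not the real ones.
signal_exit_code() {
  # Shell convention: exit code = 128 + signal number.
  echo $((128 + $1))
}

on_interrupt() { echo "interrupted by user"; exit "$(signal_exit_code 2)"; }
on_terminate() { echo "terminated";          exit "$(signal_exit_code 15)"; }
on_hangup()    { echo "terminal hung up";    exit "$(signal_exit_code 1)"; }

demo_status=0
demo_output=$(
  trap on_interrupt INT
  trap on_terminate TERM
  trap on_hangup HUP
  kill -TERM "$BASHPID"   # simulate 'kill <pid>' sent from another terminal
  echo "not reached"
) || demo_status=$?

printf 'status=%s output=%s\n' "$demo_status" "$demo_output"
```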
Initialization
catch_errors()
Initializes error handling and sets up all traps:
error_handler.func:627-638
- set -Ee: Exit on error, inherit ERR trap in functions
- set -o pipefail: Pipeline fails if any command fails
- set -u: (optional) Exit on undefined variable if STRICT_UNSET=1
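A minimal sketch of an initializer that applies the shell options listed above and registers the five traps. The function name `demo_catch_errors` and the placeholder trap bodies are assumptions; only the options and the `STRICT_UNSET=1` switch come from the description.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an initializer; trap bodies are placeholders.
demo_catch_errors() {
  set -Ee -o pipefail
  # Opt-in strict handling of unset variables.
  if [ "${STRICT_UNSET:-0}" = "1" ]; then
    set -u
  fi
  trap 'echo "ERR trap: exit code $?" >&2' ERR
  trap 'echo "EXIT trap: exit code $?" >&2' EXIT
  trap 'exit 130' INT
  trap 'exit 143' TERM
  trap 'exit 129' HUP
}

# Inspect the resulting shell state in a subshell so the caller's
# options and traps stay untouched.
demo_state=$(
  STRICT_UNSET=1
  demo_catch_errors
  trap - EXIT   # keep the demo output clean
  echo "flags: $-"
  if shopt -qo pipefail; then echo "pipefail: on"; fi
)

printf '%s\n' "$demo_state"
```

In the `flags:` line, `e` corresponds to errexit, `E` to errtrace (ERR trap inherited by functions), and `u` to nounset.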
Exit Code Reference
Exit Code Categories
Generic/Shell Errors (1-10, 124-146)
| Code | Description |
|---|---|
| 1 | General error / Operation not permitted |
| 2 | Misuse of shell builtins (syntax error) |
| 3 | General syntax or argument error |
| 10 | Docker / privileged mode required |
| 124 | Command timed out |
| 126 | Command invoked cannot execute |
| 127 | Command not found |
| 130 | Aborted by user (SIGINT/Ctrl+C) |
| 137 | Killed (SIGKILL / Out of memory) |
| 139 | Segmentation fault |
| 143 | Terminated (SIGTERM) |
curl/wget Errors (4-95)
| Code | Description |
|---|---|
| 6 | DNS resolution failed (could not resolve host) |
| 7 | Failed to connect (network unreachable) |
| 22 | HTTP error returned (404, 429, 500+) |
| 28 | Operation timeout (network slow or server not responding) |
| 35 | SSL/TLS handshake failed (certificate error) |
| 51 | SSL peer certificate verification failed |
| 52 | Empty reply from server |
| 56 | Receive error (connection reset by peer) |
Package Manager Errors (100-102)
| Code | Description |
|---|---|
| 100 | APT: Package manager error (broken packages) |
| 101 | APT: Configuration error (bad sources.list) |
| 102 | APT: Lock held by another process |
Script Validation & Setup (103-123)
| Code | Description |
|---|---|
| 103 | Validation: Shell is not Bash |
| 104 | Validation: Not running as root |
| 105 | Validation: Proxmox VE version not supported |
| 106 | Validation: Architecture not supported (ARM/PiMox) |
| 107 | Validation: Kernel key parameters unreadable |
| 108 | Validation: Kernel key limits exceeded |
| 109 | Proxmox: No available container ID after max attempts |
| 115 | Download: install.func download failed |
| 116 | Proxmox: Default bridge vmbr0 not found |
| 117 | LXC: Container did not reach running state |
| 118 | LXC: No IP assigned to container after timeout |
| 121 | LXC: Container network not ready (no IP) |
| 122 | LXC: No internet connectivity — user declined |
Proxmox Custom Codes (200-231)
| Code | Description |
|---|---|
| 203 | Proxmox: Missing CTID variable |
| 205 | Proxmox: Invalid CTID (less than 100) |
| 206 | Proxmox: CTID already in use |
| 207 | Proxmox: Password contains unescaped special characters |
| 209 | Proxmox: Container creation failed |
| 211 | Proxmox: Timeout waiting for template lock |
| 214 | Proxmox: Not enough storage space |
| 215 | Proxmox: Container created but not listed (ghost state) |
| 222 | Proxmox: Template download failed |
Database Errors (170-193)
| Code | Description |
|---|---|
| 170 | PostgreSQL: Connection failed |
| 171 | PostgreSQL: Authentication failed |
| 180 | MySQL/MariaDB: Connection failed |
| 181 | MySQL/MariaDB: Authentication failed |
| 190 | MongoDB: Connection failed |
| 191 | MongoDB: Authentication failed |
Application Errors (250-254)
| Code | Description |
|---|---|
| 250 | App: Download failed or version not determined |
| 251 | App: File extraction failed (corrupt archive) |
| 252 | App: Required file or resource not found |
| 253 | App: Data migration required — update aborted |
| 254 | App: User declined prompt or input timed out |
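A lookup like the tables above can be expressed as a `case` statement. The sketch below uses a hypothetical function name, `describe_exit_code`, and covers only a handful of codes. The descriptions are copied from the tables, and the "Unknown error" fallback is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical lookup helper; descriptions are copied from the tables above.
describe_exit_code() {
  case "$1" in
    1)   echo "General error / Operation not permitted" ;;
    6)   echo "DNS resolution failed (could not resolve host)" ;;
    100) echo "APT: Package manager error (broken packages)" ;;
    104) echo "Validation: Not running as root" ;;
    127) echo "Command not found" ;;
    130) echo "Aborted by user (SIGINT/Ctrl+C)" ;;
    137) echo "Killed (SIGKILL / Out of memory)" ;;
    143) echo "Terminated (SIGTERM)" ;;
    206) echo "Proxmox: CTID already in use" ;;
    250) echo "App: Download failed or version not determined" ;;
    *)   echo "Unknown error" ;;   # assumed fallback for unlisted codes
  esac
}

for code in 127 206 99; do
  printf 'exit code %s: %s\n' "$code" "$(describe_exit_code "$code")"
done
```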
Telemetry Integration
Failure Reporting
The error handler automatically reports failures to the telemetry API:
error_handler.func:388-469
- Works in both host and container contexts
- Collects last 200 log lines for diagnosis
- Includes exit code explanation in error text
- Retries once on failure
- Never blocks script execution
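The properties above (bounded log tail, a single retry, never blocking) can be sketched as follows. The endpoint URL, the plain-text payload format, and the function names `send_payload` and `report_failure` are all assumptions; the real telemetry API and payload are not shown in this document. The demo swaps in a stub sender so it runs without network access.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: endpoint, payload format and names are assumptions.
TELEMETRY_URL="${TELEMETRY_URL:-https://telemetry.example.invalid/report}"

send_payload() {
  # Short timeouts so a slow endpoint cannot stall the script.
  curl -fsS --connect-timeout 3 --max-time 5 \
    --data-binary "$1" "$TELEMETRY_URL" >/dev/null 2>&1
}

report_failure() {
  local exit_code="$1" log_file="$2" payload attempt
  # Keep only the last 200 log lines for diagnosis.
  payload="exit_code=${exit_code}"$'\n'"$(tail -n 200 "$log_file" 2>/dev/null || true)"
  for attempt in 1 2; do          # first try plus one retry
    if send_payload "$payload"; then
      return 0
    fi
  done
  return 0                        # telemetry failures never propagate
}

# Demo: replace the sender with a stub that fails once, then succeeds.
send_attempts=0
last_payload=""
send_payload() {
  send_attempts=$((send_attempts + 1))
  last_payload="$1"
  [ "$send_attempts" -ge 2 ]
}

demo_log=$(mktemp)
for i in {1..300}; do echo "$i"; done >"$demo_log"
report_failure 100 "$demo_log"
report_status=$?
rm -f "$demo_log"
echo "attempts=${send_attempts} status=${report_status}"
```

Returning 0 unconditionally is what keeps a telemetry outage from changing the script's own exit code.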
Orphaned Container Prevention
The Problem
When an installation script is running and the SSH session disconnects (SIGHUP), the container process becomes orphaned and continues running. This causes:
- Host script exits without updating telemetry status
- Container continues installation and sends “configuring” status
- Record becomes stuck in “configuring” state forever
The Solution
error_handler.func:485-490
Conditions checked:
- Installation is in progress (CONTAINER_INSTALLING=true)
- Container ID is set (CTID variable exists)
- We’re on the Proxmox host (pct command available)
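A minimal sketch of a guard that combines the three conditions above. The function names are hypothetical, and the use of `pct stop` as the resulting action is an assumption, since the referenced listing is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: function names and the 'pct stop' action are assumptions.
should_stop_container() {
  [ "${CONTAINER_INSTALLING:-false}" = "true" ] || return 1   # install in progress
  [ -n "${CTID:-}" ] || return 1                              # container ID known
  command -v pct >/dev/null 2>&1 || return 1                  # on the Proxmox host
  return 0
}

on_hangup_demo() {
  if should_stop_container; then
    # Stop the container so it cannot keep installing unattended.
    pct stop "$CTID" >/dev/null 2>&1 || true
  fi
  exit 129
}

if should_stop_container; then
  echo "would stop container ${CTID}"
else
  echo "nothing to stop"
fi
```

Checking for `pct` last keeps the guard cheap inside the container, where the first two conditions usually already fail.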
Error Context Collection
Log Display
When an error occurs, the last 20 lines of the active log are displayed:
error_handler.func:295-300
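Displaying the tail of a log comes down to `tail -n`. A minimal sketch, assuming a hypothetical helper name `show_log_tail` and an illustrative banner line:

```bash
#!/usr/bin/env bash
# Hypothetical helper; the banner text is illustrative.
show_log_tail() {
  local log_file="$1" lines="${2:-20}"
  # Nothing to show if the log is missing or empty.
  [ -s "$log_file" ] || return 0
  echo "--- last ${lines} lines of ${log_file} ---"
  tail -n "$lines" "$log_file"
}

demo_log=$(mktemp)
for i in {1..50}; do echo "line $i"; done >"$demo_log"
demo_tail=$(show_log_tail "$demo_log")
rm -f "$demo_log"
printf '%s\n' "$demo_tail"
```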
Container Error Flags
Inside the container, error information is written to flag files:
error_handler.func:305-309
The host can read these files with pct exec to determine the exact failure point.
Best Practices
- Always call catch_errors() early: Place it immediately after sourcing core.func and error_handler.func to ensure all errors are caught.
- Use specific exit codes: Return meaningful exit codes from functions to aid in debugging.
- Never suppress errors silently: Always use proper error handling, not || true unless intentional.
- Test signal handling: Verify scripts handle Ctrl+C, SSH disconnects, and kills gracefully.
- Include context in errors: Use msg_error with descriptive messages, not just exit codes.
Error Recovery
Automatic Container Cleanup
When an error occurs during container creation:
- 60-second timeout starts
- User prompt appears: “Remove broken container ? (Y/n)”
- Default action (Y): Container stopped and destroyed
- User can override (n): Container kept for debugging
- Auto-remove on timeout: Container removed if no response
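The prompt behavior above (default Y, override with n, auto-remove on timeout) maps onto bash's `read -t`. A minimal sketch, assuming a hypothetical function name `prompt_remove_container`. It only returns the decision and leaves stopping and destroying the container to the caller.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: returns 0 to remove the container, 1 to keep it.
prompt_remove_container() {
  local timeout="${1:-60}" reply=""
  # 'read -t' returns non-zero on timeout (or end of input);
  # treat that the same as the default answer.
  if ! read -r -t "$timeout" -p "Remove broken container? (Y/n) " reply; then
    reply="y"
  fi
  case "$reply" in
    [nN]*) return 1 ;;   # user override: keep container for debugging
    *)     return 0 ;;   # Y, empty input, or timeout: remove
  esac
}

# Demo with a scripted answer instead of an interactive terminal.
if prompt_remove_container 60 <<<"n"; then
  echo "removing container"
else
  echo "keeping container for debugging"
fi
```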
Manual Recovery
If a container is left in a broken state:
The error handling system is designed to fail safely. Even if telemetry fails, log collection fails, or cleanup fails, the script will exit cleanly with the correct exit code.
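For a container left in a broken state, the original recovery commands are not reproduced here. As a hedged sketch, the hypothetical helper below prints standard `pct` commands for review instead of running them. The CTID check mirrors exit code 205 (CTID less than 100), and the choice of commands is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints commands for review, does not execute them.
manual_recovery_commands() {
  local ctid="$1"
  # Reject non-numeric IDs and IDs below 100.
  case "$ctid" in
    ''|*[!0-9]*) echo "invalid CTID: ${ctid}" >&2; return 1 ;;
  esac
  if [ "$ctid" -lt 100 ]; then
    echo "invalid CTID: ${ctid}" >&2
    return 1
  fi
  printf 'pct status %s\n' "$ctid"    # check the current state first
  printf 'pct stop %s\n' "$ctid"      # stop it if it is still running
  printf 'pct destroy %s\n' "$ctid"   # then remove the container
}

manual_recovery_commands 105
```

Run the printed commands on the Proxmox host only after confirming that the CTID belongs to the broken container.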