Overview
rs-tunnel uses a background worker called the “reaper” to automatically detect and clean up stale tunnels. This ensures that tunnels whose clients have died or disconnected are properly removed along with their DNS records.
Reaper Worker
The reaper worker runs periodically to sweep for stale leases and process queued cleanup jobs.
Configuration
From apps/api/src/config/env.ts:40:
REAPER_INTERVAL_SEC: z.coerce.number().int().positive().default(30)
The reaper runs every 30 seconds by default.
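To make the validation rule concrete, here is a dependency-free sketch of roughly what that schema accepts. parsePositiveInt is a hypothetical helper, not part of rs-tunnel, and it only approximates zod's coercion and error reporting.

```typescript
// Hypothetical helper approximating z.coerce.number().int().positive().default(n).
// Not part of rs-tunnel; zod's real error reporting is richer than this.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) {
    return fallback; // the default applies only when the variable is unset
  }
  const value = Number(raw); // z.coerce.number() coerces with Number(...)
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`Expected a positive integer, got "${raw}"`);
  }
  return value;
}

const intervalSec = parsePositiveInt(undefined, 30); // unset: falls back to 30
const customIntervalSec = parsePositiveInt('45', 30); // set: parsed as 45
```

Values such as 0, 1.5, or abc are rejected rather than silently replaced by the default, so a misconfigured REAPER_INTERVAL_SEC should surface at startup.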
Worker Implementation
From apps/api/src/workers/reaper.worker.ts:
export class ReaperWorker {
  private intervalHandle?: NodeJS.Timeout;
  constructor(
    private readonly cleanupService: CleanupService,
    private readonly intervalSec: number,
  ) {}
  start(): void {
    if (this.intervalHandle) {
      return;
    }
    this.intervalHandle = setInterval(() => {
      this.tick().catch((error) => {
        logger.error('Reaper tick failed', error);
      });
    }, this.intervalSec * 1000);
    void this.tick();
  }
  stop(): void {
    if (this.intervalHandle) {
      clearInterval(this.intervalHandle);
      this.intervalHandle = undefined;
    }
  }
  private async tick(): Promise<void> {
    await this.cleanupService.sweepStaleLeases();
    await this.cleanupService.processQueuedJobs();
  }
}
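Calling start() schedules a tick every intervalSec seconds and also runs one tick immediately. Each tick sweeps stale leases first, then processes queued cleanup jobs. Calling start() while the worker is already running has no effect.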
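In the excerpt above, the immediate first tick is started with void this.tick(), so a rejection from it is not passed to the same error handler as the interval ticks. The following is a minimal sketch of one possible variation that routes every tick through a single handler. SketchReaper, CleanupLike, and the onError callback are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for CleanupService: only the two methods a tick calls.
interface CleanupLike {
  sweepStaleLeases(): Promise<void>;
  processQueuedJobs(): Promise<void>;
}

class SketchReaper {
  private handle?: ReturnType<typeof setInterval>;

  constructor(
    private readonly cleanup: CleanupLike,
    private readonly intervalSec: number,
    private readonly onError: (error: unknown) => void,
  ) {}

  start(): void {
    if (this.handle) {
      return; // a second start() is a no-op
    }
    const safeTick = (): void => {
      this.tick().catch(this.onError);
    };
    this.handle = setInterval(safeTick, this.intervalSec * 1000);
    safeTick(); // immediate first tick, with the same error handling
  }

  stop(): void {
    if (this.handle) {
      clearInterval(this.handle);
      this.handle = undefined;
    }
  }

  private async tick(): Promise<void> {
    await this.cleanup.sweepStaleLeases(); // sweep first
    await this.cleanup.processQueuedJobs(); // then drain the queue
  }
}

// Usage with a stub that records which operations ran.
const calls: string[] = [];
const stubCleanup: CleanupLike = {
  async sweepStaleLeases() { calls.push('sweep'); },
  async processQueuedJobs() { calls.push('process'); },
};
const sketchReaper = new SketchReaper(stubCleanup, 30, (error) => console.error(error));
sketchReaper.start();
sketchReaper.start(); // ignored: already running
sketchReaper.stop();
```

Because the second start() returns early, only one immediate tick runs, and stop() clears the interval before any scheduled tick fires.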
Starting the Reaper
From apps/api/src/index.ts:32-35:
const reaper = new ReaperWorker(cleanupService, env.REAPER_INTERVAL_SEC);
if (env.NODE_ENV !== 'test') {
  reaper.start();
}
The reaper is started automatically when the API server starts (except in test mode).
Stale Lease Detection
Lease Timeout
From apps/api/src/config/env.ts:39:
LEASE_TIMEOUT_SEC: z.coerce.number().int().positive().default(60)
Tunnels are considered stale if their lease has not been renewed within 60 seconds (default).
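As a small worked example of the timeout rule, the sketch below derives a lease's expiry from its last renewal. It assumes a lease records the time of its last renewal and that a lease counts as stale at exactly the timeout boundary. Both are assumptions, since the real schema and comparison are not shown here, and the function names are hypothetical.

```typescript
// Hypothetical model: expiry = last renewal + timeout.
function leaseExpiresAt(lastRenewedAt: Date, leaseTimeoutSec: number): Date {
  return new Date(lastRenewedAt.getTime() + leaseTimeoutSec * 1000);
}

// Assumption: a lease is stale once its expiry is at or before "now".
function isLeaseStale(lastRenewedAt: Date, now: Date, leaseTimeoutSec = 60): boolean {
  return leaseExpiresAt(lastRenewedAt, leaseTimeoutSec).getTime() <= now.getTime();
}

const renewedAt = new Date('2024-01-01T00:00:00Z');
const after45s = isLeaseStale(renewedAt, new Date('2024-01-01T00:00:45Z')); // false: within 60 s
const after90s = isLeaseStale(renewedAt, new Date('2024-01-01T00:01:30Z')); // true: 90 s without renewal
```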
Sweeping Stale Leases
From apps/api/src/services/cleanup.service.ts:12-17:
async sweepStaleLeases(): Promise<void> {
  const staleTunnelIds = await this.repository.findStaleTunnelIds(new Date());
  await Promise.all(
    staleTunnelIds.map((tunnelId) => this.repository.enqueueCleanupJob(tunnelId, 'stale_lease')),
  );
}
1. Find stale tunnels: query the database for tunnels whose lease expiry time has passed.
2. Enqueue cleanup jobs: create a cleanup job for each stale tunnel with reason stale_lease.
Stale lease detection compares lease expiry times against the API server’s current time (new Date()), so it relies on an accurate system clock. Ensure your server’s clock is synchronized.
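The two sweep steps can be sketched against in-memory data. LeaseRow and CleanupJobRequest are hypothetical shapes standing in for the database rows, and passing now as a parameter makes the dependence on the clock explicit.

```typescript
// Hypothetical in-memory stand-ins for the lease table and the cleanup queue.
interface LeaseRow {
  tunnelId: string;
  expiresAt: Date;
}

interface CleanupJobRequest {
  tunnelId: string;
  reason: 'stale_lease';
}

function findStaleTunnelIds(leases: LeaseRow[], now: Date): string[] {
  return leases
    .filter((lease) => lease.expiresAt.getTime() <= now.getTime())
    .map((lease) => lease.tunnelId);
}

// One job request per stale tunnel, mirroring enqueueCleanupJob(tunnelId, 'stale_lease').
function sweep(leases: LeaseRow[], now: Date): CleanupJobRequest[] {
  return findStaleTunnelIds(leases, now).map(
    (tunnelId): CleanupJobRequest => ({ tunnelId, reason: 'stale_lease' }),
  );
}

const leases: LeaseRow[] = [
  { tunnelId: 't1', expiresAt: new Date('2024-01-01T00:01:00Z') }, // already expired
  { tunnelId: 't2', expiresAt: new Date('2024-01-01T00:03:00Z') }, // still live
];
const sweepJobs = sweep(leases, new Date('2024-01-01T00:02:00Z'));
// sweepJobs contains a single request, for t1
```

Note that the sweep only enqueues work. The tunnel itself is stopped later, when the job is processed.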
Cleanup Job Processing
Cleanup jobs are queued when:
- A lease expires (stale_lease)
- Tunnel deletion fails due to active connections (active_connections)
- Cloudflare API deletion fails (deletion_failed)
Processing Jobs
From apps/api/src/services/cleanup.service.ts:19-48:
async processQueuedJobs(): Promise<void> {
  const now = new Date();
  const jobs = await this.repository.claimDueJobs(now, 25);
  for (const job of jobs) {
    try {
      await this.tunnelService.stopTunnelById(job.tunnelId, `cleanup:${job.reason}`);
      await this.repository.markCleanupJobDone(job.id);
    } catch (error) {
      const attemptCount = job.attemptCount + 1;
      const backoffSeconds = calculateCleanupBackoffSeconds(attemptCount);
      const nextAttemptAt = addSeconds(now, backoffSeconds);
      const message = error instanceof Error ? error.message : 'Unknown cleanup failure';
      await this.repository.markCleanupJobFailed({
        jobId: job.id,
        attemptCount,
        nextAttemptAt,
        message,
      });
      logger.error('Cleanup job failed', {
        jobId: job.id,
        tunnelId: job.tunnelId,
        attemptCount,
        message,
      });
    }
  }
}
1. Claim due jobs: retrieve up to 25 cleanup jobs that are ready to be processed (based on nextAttemptAt).
2. Stop tunnel: call stopTunnelById to delete the DNS record, delete the Cloudflare tunnel, and mark the tunnel as stopped.
3. Handle success: mark the job as done and remove it from the queue.
4. Handle failure: increment the attempt count, calculate the exponential backoff delay, and reschedule the job. A failure in one job does not stop the remaining jobs in the batch from being processed.
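The first step, claiming due jobs, can be illustrated in memory. This is a sketch only: QueuedJob is a hypothetical shape, the oldest-first ordering is an assumption, and the real claimDueJobs runs against the database and presumably also marks the jobs as claimed.

```typescript
// Hypothetical in-memory job shape; the real table columns are not shown here.
interface QueuedJob {
  id: number;
  tunnelId: string;
  nextAttemptAt: Date;
}

function claimDueJobsSketch(queue: QueuedJob[], now: Date, limit: number): QueuedJob[] {
  return queue
    .filter((job) => job.nextAttemptAt.getTime() <= now.getTime()) // due jobs only
    .sort((a, b) => a.nextAttemptAt.getTime() - b.nextAttemptAt.getTime()) // oldest first (assumption)
    .slice(0, limit); // at most `limit` jobs per tick
}

const queue: QueuedJob[] = [
  { id: 1, tunnelId: 't1', nextAttemptAt: new Date('2024-01-01T00:00:10Z') },
  { id: 2, tunnelId: 't2', nextAttemptAt: new Date('2024-01-01T00:05:00Z') }, // backed off, not yet due
  { id: 3, tunnelId: 't3', nextAttemptAt: new Date('2024-01-01T00:00:05Z') },
];
const claimed = claimDueJobsSketch(queue, new Date('2024-01-01T00:01:00Z'), 25);
const claimedIds = claimed.map((job) => job.id); // [3, 1]
```

Job 2 stays in the queue until a later tick whose timestamp is at or past its nextAttemptAt.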
Retry Logic
Cleanup jobs use exponential backoff when they fail:
const backoffSeconds = calculateCleanupBackoffSeconds(attemptCount);
const nextAttemptAt = addSeconds(now, backoffSeconds);
Failed jobs are automatically retried with increasing delays between attempts.
The system can handle transient failures gracefully. Jobs will be retried until they succeed or reach a maximum attempt limit.
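The body of calculateCleanupBackoffSeconds is not shown in this document. As an illustration of exponential backoff in general, the sketch below assumes a 30-second base delay that doubles per attempt and is capped at one hour. All three numbers and the function name sketchBackoffSeconds are assumptions, and the local addSeconds is a stand-in for the helper used in the excerpt.

```typescript
// Hypothetical policy: 30 s base delay, doubling per attempt, capped at 1 hour.
// The real calculateCleanupBackoffSeconds may use different values.
function sketchBackoffSeconds(attemptCount: number, baseSec = 30, maxSec = 3600): number {
  const exponent = Math.max(0, attemptCount - 1);
  return Math.min(maxSec, baseSec * 2 ** exponent);
}

// Local stand-in for the addSeconds helper used in the excerpt.
function addSeconds(date: Date, seconds: number): Date {
  return new Date(date.getTime() + seconds * 1000);
}

const delays = [1, 2, 3, 4, 10].map((attempt) => sketchBackoffSeconds(attempt));
// delays: [30, 60, 120, 240, 3600]

const failedAt = new Date('2024-01-01T00:00:00Z');
const thirdRetryAt = addSeconds(failedAt, sketchBackoffSeconds(3)); // 2024-01-01T00:02:00Z
```

The cap keeps a long-failing job from being pushed arbitrarily far into the future.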
What Happens When a Tunnel is Reaped
When the reaper processes a cleanup job for a tunnel, the following actions occur in order:
1. DNS Record Deletion
From apps/api/src/services/tunnel.service.ts:181-183:
if (tunnel.cfDnsRecordId) {
  await this.cloudflareService.deleteDnsRecord(tunnel.cfDnsRecordId);
}
The CNAME record pointing to the Cloudflare tunnel is deleted from the DNS zone.
2. Cloudflare Tunnel Deletion
From apps/api/src/services/tunnel.service.ts:185-216:
if (tunnel.cfTunnelId) {
  const result = await this.cloudflareService.deleteTunnelWithRetry(tunnel.cfTunnelId);
  if (!result.success) {
    const cleanupReason = result.reason === 'active_connections' ? 'active_connections' : 'deletion_failed';
    await this.repository.enqueueCleanupJob(tunnel.id, cleanupReason);
    if (result.reason === 'active_connections') {
      logger.info('Tunnel has active connections, will retry via cleanup job', {
        tunnelId: tunnel.id,
        cfTunnelId: tunnel.cfTunnelId,
      });
      throw new AppError(
        503,
        'TUNNEL_STOP_PENDING_ACTIVE_CONNECTIONS',
        'Tunnel has active connections and will be stopped once they drain.',
      );
    }
    logger.error('Failed to delete tunnel from Cloudflare', {
      tunnelId: tunnel.id,
      cfTunnelId: tunnel.cfTunnelId,
      reason: result.reason,
      message: result.message,
    });
    throw new AppError(
      502,
      'TUNNEL_CLOUDFLARE_DELETION_FAILED',
      result.message ?? 'Failed to delete tunnel from Cloudflare; cleanup will be retried.',
    );
  }
}
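The Cloudflare tunnel is deleted. If deletion fails for any reason, a cleanup job is enqueued to retry later, with reason active_connections or deletion_failed, and an error is thrown (503 for active connections, 502 otherwise).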
Tunnels with active connections cannot be deleted immediately. The reaper will retry cleanup jobs until all connections drain.
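The branching in the excerpt above can be summarized as a small pure function. This is a sketch: the DeletionResult shape is inferred from how result is used in the excerpt, FailureOutcome and classifyDeletionFailure are hypothetical names, and rate_limited is an invented example of a non-connection failure reason.

```typescript
// Shape inferred from the excerpt; the real type may carry more fields.
type DeletionResult =
  | { success: true }
  | { success: false; reason: string; message?: string };

interface FailureOutcome {
  cleanupReason: 'active_connections' | 'deletion_failed';
  httpStatus: 502 | 503;
  errorCode: string;
}

// Maps a deletion result to the cleanup reason and error it leads to.
function classifyDeletionFailure(result: DeletionResult): FailureOutcome | null {
  if (result.success) {
    return null; // nothing to retry
  }
  if (result.reason === 'active_connections') {
    return {
      cleanupReason: 'active_connections',
      httpStatus: 503,
      errorCode: 'TUNNEL_STOP_PENDING_ACTIVE_CONNECTIONS',
    };
  }
  return {
    cleanupReason: 'deletion_failed',
    httpStatus: 502,
    errorCode: 'TUNNEL_CLOUDFLARE_DELETION_FAILED',
  };
}

const draining = classifyDeletionFailure({ success: false, reason: 'active_connections' });
// 'rate_limited' is a made-up reason used only to exercise the fallback branch.
const otherFailure = classifyDeletionFailure({ success: false, reason: 'rate_limited' });
```

Every failure reason other than active_connections collapses into deletion_failed, which matches the three queue reasons listed earlier.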
3. Lease Deletion
From apps/api/src/services/tunnel.service.ts:219:
await this.repository.deleteLease(tunnel.id);
The lease record is removed from the database.
4. Tunnel Status Update
From apps/api/src/services/tunnel.service.ts:220:
await this.repository.markTunnelStopped(tunnel.id);
The tunnel status is changed to stopped with a timestamp.
5. Audit Log
From apps/api/src/services/tunnel.service.ts:222-229:
await this.repository.createAuditLog({
  userId: tunnel.userId,
  action: 'tunnel.stopped',
  metadata: {
    tunnelId: tunnel.id,
    reason,
  },
});
An audit log entry is created documenting the cleanup action and reason.
Cleanup Reasons
Cleanup jobs track the reason for cleanup:
| Reason | Description |
|---|---|
| stale_lease | Tunnel lease expired (no heartbeat received) |
| user_requested | User explicitly stopped the tunnel |
| active_connections | Cloudflare tunnel had active connections during deletion |
| deletion_failed | Cloudflare API deletion failed for other reasons |
| cleanup:stale_lease | Reaper processing a stale lease job |
| cleanup:active_connections | Reaper retrying after connections drain |
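Cleanup reasons prefixed with cleanup: indicate the job is being processed by the reaper worker. The reaper builds them by prepending cleanup: to the job's reason when it calls stopTunnelById, so a deletion_failed job is processed as cleanup:deletion_failed.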
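When reading stop reasons back, for example from audit log metadata, the prefix can be split off to tell reaper-driven stops from direct ones. parseStopReason and ParsedReason are hypothetical names for this sketch. rs-tunnel itself may not include such a helper.

```typescript
const REAPER_PREFIX = 'cleanup:';

interface ParsedReason {
  viaReaper: boolean; // true when the reaper processed a cleanup job
  reason: string; // the underlying reason without the prefix
}

function parseStopReason(raw: string): ParsedReason {
  return raw.startsWith(REAPER_PREFIX)
    ? { viaReaper: true, reason: raw.slice(REAPER_PREFIX.length) }
    : { viaReaper: false, reason: raw };
}

const fromReaper = parseStopReason('cleanup:stale_lease'); // viaReaper: true, reason: 'stale_lease'
const fromUser = parseStopReason('user_requested'); // viaReaper: false, reason: 'user_requested'
```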
Monitoring Cleanup Operations
Cleanup operations are logged for observability:
logger.error('Cleanup job failed', {
  jobId: job.id,
  tunnelId: job.tunnelId,
  attemptCount,
  message,
});
Monitor your logs for:
- Reaper tick failed: The reaper encountered an error during a sweep
- Cleanup job failed: A specific cleanup job failed and will be retried
- Tunnel has active connections: Deletion deferred until connections drain
- Failed to delete tunnel from Cloudflare: Cloudflare API error
Set up alerts for repeated cleanup failures to detect potential issues with Cloudflare API access or network connectivity.
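One way to detect repeated failures is to look at the attemptCount field of the Cleanup job failed entries. The sketch below assumes the log entries have already been parsed into objects. CleanupFailureLog, tunnelsNeedingAttention, the string jobId type, and the threshold of 5 are all assumptions for illustration.

```typescript
// Hypothetical parsed log entry, using the fields from the 'Cleanup job failed' log call.
interface CleanupFailureLog {
  jobId: string;
  tunnelId: string;
  attemptCount: number;
  message: string;
}

// Returns the tunnel IDs whose highest logged attemptCount reached the threshold.
function tunnelsNeedingAttention(entries: CleanupFailureLog[], threshold: number): string[] {
  const maxAttempts = new Map<string, number>();
  for (const entry of entries) {
    const previous = maxAttempts.get(entry.tunnelId) ?? 0;
    maxAttempts.set(entry.tunnelId, Math.max(previous, entry.attemptCount));
  }
  return Array.from(maxAttempts.entries())
    .filter(([, attempts]) => attempts >= threshold)
    .map(([tunnelId]) => tunnelId);
}

const failureLogs: CleanupFailureLog[] = [
  { jobId: 'j1', tunnelId: 't1', attemptCount: 1, message: 'timeout' },
  { jobId: 'j1', tunnelId: 't1', attemptCount: 5, message: 'timeout' },
  { jobId: 'j2', tunnelId: 't2', attemptCount: 2, message: 'active connections' },
];
const flagged = tunnelsNeedingAttention(failureLogs, 5); // ['t1']
```

A single failure is expected from time to time. Alerting on a high attempt count filters out transient errors that the retry logic already absorbs.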