What is Uptime Monitoring?
Uptime monitoring performs regular health checks on your websites:
- HTTP checks: Verify your site responds correctly
- Response time tracking: Monitor TTFB and total request time
- SSL certificate monitoring: Track expiration dates
- Status code validation: Ensure proper HTTP responses
- Geographic probing: Checks from multiple regions
- Downtime detection: Alert on failures and outages
How It Works
Databuddy uses QStash (by Upstash) for distributed scheduling:
- Schedule creation: You define check frequency and URL
- Distributed execution: QStash triggers checks from edge locations
- HTTP probe: Databuddy’s uptime service fetches your URL
- Data collection: Records response time, status, SSL info
- Event streaming: Sends results to Kafka → ClickHouse
- Analytics: View uptime metrics in dashboard
Uptime checks are performed from different geographic regions to detect regional outages.
Setting Up Monitors
Via Dashboard
- Navigate to Uptime in the sidebar
- Click Create Monitor
- Enter the URL to monitor
- Configure check settings
- Click Create
Monitor Configuration
| Field | Description | Default |
|---|---|---|
| URL | Website URL to monitor | Required |
| Check Interval | How often to check (minutes) | 5 minutes |
| Timeout | Max time to wait for response (ms) | 30,000 ms |
| Cache Bust | Add random query param to prevent caching | false |
| Website ID | Link to analytics website | Optional |
Example: Basic Monitor
Checks example.com every 5 minutes.
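The fields in the configuration table can be pictured as a typed object. The sketch below is illustrative only: the `MonitorConfig` type, its field names, and the `withDefaults` helper are hypothetical and are not Databuddy's actual API. Only the default values (5-minute interval, 30,000 ms timeout, cache busting off) come from the table above.

```typescript
// Hypothetical shape for a monitor configuration; field names are assumptions.
interface MonitorConfig {
  url: string; // required
  checkIntervalMinutes: number;
  timeoutMs: number;
  cacheBust: boolean;
  websiteId?: string; // optional link to an analytics website
}

type MonitorInput = Pick<MonitorConfig, "url"> & Partial<MonitorConfig>;

// Fill in the documented defaults for any field the caller omits.
function withDefaults(input: MonitorInput): MonitorConfig {
  // new URL() throws a TypeError on malformed input, so bad URLs fail early.
  new URL(input.url);
  return {
    checkIntervalMinutes: 5,
    timeoutMs: 30_000,
    cacheBust: false,
    ...input,
  };
}

// Basic monitor: example.com every 5 minutes, everything else defaulted.
const basic = withDefaults({ url: "https://example.com" });
```

Passing only `url` yields the documented defaults; any field supplied by the caller overrides them.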
Example: Advanced Monitor
Check Intervals
Choose how frequently to monitor:
- 1 minute: Critical services, APIs
- 5 minutes: Standard websites (recommended)
- 15 minutes: Low-priority sites
- 30 minutes: Backup monitoring
- 1 hour: Cost-effective monitoring
Timeout Configuration
Set maximum wait time for responses:
- 10 seconds: Fast APIs, CDN-backed sites
- 30 seconds: Default (recommended)
- 60 seconds: Slow backends, heavy pages
Cache Busting
Prevent false positives from cached responses:
- Disabled (default): Standard checks
- Enabled: Adds ?_cb=<random> to the URL
Enable for:
- CDN-cached pages
- Aggressive browser caching
- Testing origin server health
Leave disabled for:
- Standard monitoring
- APIs that don’t cache
- Conserving bandwidth
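As a rough illustration of what cache busting does to the request URL, the sketch below appends a `_cb` parameter while preserving any existing query string. The `addCacheBust` name and the token format are assumptions; the source only states that a random `_cb` value is added.

```typescript
// Append a random `_cb` query parameter so caches treat each check as a new URL.
function addCacheBust(rawUrl: string): string {
  const url = new URL(rawUrl);
  // Token format is an assumption; any value that varies per request would do.
  const token =
    Date.now().toString(36) + Math.floor(Math.random() * 1e9).toString(36);
  url.searchParams.set("_cb", token);
  return url.toString();
}
```

Using `URL.searchParams` rather than string concatenation keeps existing parameters intact and handles URLs that already contain a `?`.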
Monitored Metrics
Each check records:
Response Metrics
- HTTP Status Code: 200, 404, 500, etc.
- TTFB (Time to First Byte): Server response time in ms
- Total Time: Complete request duration in ms
- Response Size: Bytes received
- Content Hash: SHA-256 hash for change detection
SSL Metrics
- Certificate Expiry: Unix timestamp of expiration
- Certificate Valid: Boolean indicating validity
Network Metrics
- Redirect Count: Number of HTTP redirects followed
- Probe Region: Geographic location of check
- Probe IP: IP address of monitoring probe
Status
- UP: Successful check (2xx or 3xx status)
- DOWN: Failed check (4xx, 5xx, timeout, or network error)
Databuddy follows up to 10 redirects automatically. Redirect loops are detected and marked as failures.
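One way to picture the redirect rule is a loop that counts hops and remembers visited URLs. This is a minimal sketch, not Databuddy's implementation: `followRedirects`, the `Hop` type, and the injected `fetchHop` callback (standing in for a single request with automatic redirects disabled) are all hypothetical, and a real probe would be asynchronous.

```typescript
type Hop = { status: number; location?: string };

interface RedirectResult {
  finalUrl: string;
  status: number;
  redirectCount: number;
  failed: boolean; // true on a redirect loop or too many redirects
}

function followRedirects(
  startUrl: string,
  fetchHop: (url: string) => Hop, // hypothetical single-request helper
  maxRedirects = 10,
): RedirectResult {
  let url = new URL(startUrl).toString();
  const seen = new Set<string>([url]);
  let redirectCount = 0;
  for (;;) {
    const hop = fetchHop(url);
    const location = hop.location;
    if (hop.status < 300 || hop.status >= 400 || location === undefined) {
      return { finalUrl: url, status: hop.status, redirectCount, failed: false };
    }
    // Resolve relative Location headers against the current URL.
    const next = new URL(location, url).toString();
    if (seen.has(next) || redirectCount >= maxRedirects) {
      return { finalUrl: url, status: hop.status, redirectCount, failed: true };
    }
    seen.add(next);
    url = next;
    redirectCount += 1;
  }
}
```

A result with `failed: true` would be recorded as DOWN; otherwise the final status code decides.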
SSL Certificate Monitoring
Automatic SSL/TLS certificate tracking:
Certificate Checks
- Expiration date: When certificate expires
- Validity: Whether certificate is currently valid
- Issuer: Certificate authority (extracted from cert)
Expiration Alerts
Get notified before certificates expire:
- 30 days before expiration
- 14 days before expiration
- 7 days before expiration
- Day of expiration
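The alert schedule above can be sketched as a comparison between the stored expiry timestamp and the current time. The function names are hypothetical, and the sketch assumes the Unix timestamp is in seconds and that an alert fires on the day the remaining whole days equal a threshold; neither detail is confirmed by the source.

```typescript
// Thresholds from the list above; 0 represents the day of expiration.
const ALERT_DAYS = [30, 14, 7, 0];

// Whole days remaining until expiry (negative once the certificate has expired).
function daysUntilExpiry(expiryUnixSeconds: number, nowMs: number = Date.now()): number {
  return Math.floor((expiryUnixSeconds * 1000 - nowMs) / 86_400_000);
}

// Returns the matching threshold, or null when no alert is due today.
function dueAlert(expiryUnixSeconds: number, nowMs: number = Date.now()): number | null {
  const days = daysUntilExpiry(expiryUnixSeconds, nowMs);
  return ALERT_DAYS.includes(days) ? days : null;
}
```

Passing `nowMs` explicitly keeps the functions deterministic, which makes the schedule easy to test.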
HTTP Status Handling
Success (UP)
- 2xx: OK (200, 201, 204, etc.)
- 3xx: Redirects (301, 302, 307, 308)
Failure (DOWN)
- 4xx: Client errors (404, 403, etc.)
- 5xx: Server errors (500, 502, 503, etc.)
- 0: Timeout or network error
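The mapping from status code to monitor status reduces to a single range check. The `classifyStatus` name is hypothetical; treating anything outside 2xx and 3xx (including 1xx, which the source does not mention) as DOWN is an assumption.

```typescript
type MonitorStatus = "UP" | "DOWN";

// 2xx and 3xx count as UP; 4xx, 5xx, and 0 (timeout or network error) as DOWN.
function classifyStatus(statusCode: number): MonitorStatus {
  return statusCode >= 200 && statusCode < 400 ? "UP" : "DOWN";
}
```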
Viewing Uptime Data
Monitor Dashboard
View all monitors:
- Current status (UP/DOWN)
- Last check time
- Uptime percentage (24h, 7d, 30d)
- Average response time
- Recent incidents
Monitor Details
Drill down into individual monitors:
- Uptime graph: Visual timeline of status
- Response time chart: TTFB and total time trends
- Incident history: Downtime events with duration
- SSL status: Certificate expiration countdown
- Geographic performance: Response times by region
Metrics Over Time
Analyze trends:
- Hourly: Last 24-48 hours
- Daily: Last 30-90 days
- Weekly: Last 6-12 months
Content Change Detection
Monitor for unexpected page changes:
- Content hash: SHA-256 hash of response body
- Change detection: Alerts when hash changes
- False positive filtering: Ignore dynamic timestamps, ads
Use cases:
- Detect defacement
- Monitor for unauthorized changes
- Track deployment success
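Hash-based change detection can be sketched with Node's built-in `crypto` module (also available in Bun). The helper names are hypothetical, and the sketch hashes the raw body as-is; any filtering of dynamic content would have to happen before hashing and is not shown.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the response body, hex-encoded.
function contentHash(body: string): string {
  return createHash("sha256").update(body, "utf8").digest("hex");
}

// A change is reported only when a previous hash exists and differs.
function hasChanged(previousHash: string | null, body: string): boolean {
  return previousHash !== null && previousHash !== contentHash(body);
}
```

Storing only the hash lets two checks be compared without retaining full page bodies.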
JSON Response Parsing
Monitor API responses with custom validation:
JSON Parsing Config
Extract specific JSON fields and validate them, for example response.data.status === "healthy".
Use Cases
- API health endpoints
- Microservice status pages
- Custom validation logic
- Extract metrics from responses
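To make the field-extraction idea concrete, the sketch below walks a dot-separated path through a parsed JSON body and compares the result with an expected value. `getPath` and `isHealthy` are hypothetical helpers, not part of Databuddy's configuration format, which the source does not show.

```typescript
// Walk a dot-separated path such as "data.status" through parsed JSON.
function getPath(value: unknown, path: string): unknown {
  let current: unknown = value;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// True only when the body is valid JSON and data.status equals "healthy".
function isHealthy(responseBody: string): boolean {
  try {
    return getPath(JSON.parse(responseBody), "data.status") === "healthy";
  } catch {
    return false; // body was not valid JSON
  }
}
```

A check like this can fail a monitor even when the endpoint returns 200, which is the point of going beyond status codes.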
Failure Handling
Retry Logic
Failed checks are retried:
- Initial check fails
- Wait 30 seconds
- Retry (up to 3 times)
- Mark as DOWN if all retries fail
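The retry steps above can be sketched as a small loop. The function name and the injected `check` and `sleep` callbacks are hypothetical, and the sketch assumes the 30-second wait precedes every retry, which the source does not spell out.

```typescript
type CheckStatus = "UP" | "DOWN";

async function checkWithRetries(
  check: () => Promise<boolean>, // resolves to true when the probe succeeds
  sleep: (ms: number) => Promise<void>,
  maxRetries = 3,
  retryDelayMs = 30_000,
): Promise<CheckStatus> {
  if (await check()) return "UP";
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    await sleep(retryDelayMs);
    if (await check()) return "UP";
  }
  // The initial check and every retry failed.
  return "DOWN";
}
```

Injecting `sleep` keeps the delay out of tests; in production it could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`.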
Failure Streaks
Track consecutive failures:
- Streak count: Number of consecutive DOWN checks
- Alert threshold: Notify after 2-3 consecutive failures
- Recovery: Streak resets on first UP check
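Streak tracking amounts to a counter that increments on DOWN and resets on UP. The helper names below are hypothetical, and the choice to alert exactly once, when the streak first reaches the threshold, is an assumption rather than documented behavior.

```typescript
// Increment on DOWN, reset to zero on the first UP.
function nextStreak(streak: number, status: "UP" | "DOWN"): number {
  return status === "UP" ? 0 : streak + 1;
}

// Fire once, at the moment the streak reaches the threshold.
function shouldAlert(streak: number, threshold = 2): boolean {
  return streak === threshold;
}
```

Comparing with `===` rather than `>=` avoids re-sending the same alert on every subsequent failed check.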
Infrastructure
Databuddy’s uptime service architecture:
- Runtime: Bun + Elysia
- Scheduler: Upstash QStash (distributed cron)
- Database: PostgreSQL (monitor configs)
- Analytics: Kafka + ClickHouse (check results)
- Observability: OpenTelemetry → Axiom
Request Flow
Security
Webhook Verification
QStash webhooks are cryptographically verified.
Request Headers
Uptime checks use realistic browser headers.
Compression Support
Automatic response decompression:
- Tries gzip, deflate, br first
- Falls back to gzip, deflate on encoding errors
- Handles corrupt compression gracefully
Best Practices
Monitor Critical Pages
Prioritize monitoring:
- Homepage
- Login/signup pages
- Checkout/payment flows
- API endpoints
- Critical content pages
Choose Appropriate Intervals
- Mission-critical: 1-minute checks
- Production websites: 5-minute checks
- Staging/dev: 15-minute checks
- Backup monitors: 30-minute checks
Set Realistic Timeouts
- Fast sites: 5-10 seconds
- Average sites: 30 seconds (default)
- Slow backends: 60 seconds
Monitor SSL Expiration
For HTTPS sites:
- Enable certificate monitoring
- Set up 30-day expiration alerts
- Use Let’s Encrypt auto-renewal where possible
Use JSON Parsing for APIs
Go beyond status codes:
- Check response structure
- Validate specific fields
- Monitor API contract compliance
Alerts & Notifications
Alert Channels
Configure notifications:
- Slack
- Discord
- Webhook (custom integrations)
- SMS (via Twilio)
Alert Conditions
- Site goes DOWN (after retry logic)
- Site recovers (returns UP)
- SSL certificate expires in less than 30 days
- Response time exceeds threshold
- Content hash changes unexpectedly
Alert configuration is available in the monitor settings page.
Uptime SLA Calculation
Calculate uptime percentage:
Uptime (%) = (Successful checks / Total checks) × 100
Example
- Total checks: 288 (24h × 12 checks/hour)
- Failed checks: 3
- Uptime: (285 / 288) × 100 = 98.96%
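The same arithmetic as a small function, reproducing the example above. The `uptimePercent` name is hypothetical.

```typescript
// Share of successful checks, as a percentage.
function uptimePercent(totalChecks: number, failedChecks: number): number {
  if (totalChecks <= 0) throw new RangeError("totalChecks must be positive");
  return ((totalChecks - failedChecks) / totalChecks) * 100;
}

// 288 checks in 24 hours at a 5-minute interval, 3 of them failed.
const dailyUptime = uptimePercent(288, 3).toFixed(2); // "98.96"
```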
Industry Standards
- 99.9% (“three nines”): 43 minutes downtime/month
- 99.95%: 22 minutes downtime/month
- 99.99% (“four nines”): 4 minutes downtime/month
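The downtime figures above follow from the SLA percentage and the length of the period. A minimal sketch, assuming a 30-day month; `allowedDowntimeMinutes` is a hypothetical name.

```typescript
// Minutes of downtime permitted by an SLA over a period (30-day month by default).
function allowedDowntimeMinutes(slaPercent: number, periodDays = 30): number {
  const totalMinutes = periodDays * 24 * 60;
  return totalMinutes * (1 - slaPercent / 100);
}
```

For a 30-day month this gives roughly 43.2, 21.6, and 4.3 minutes for 99.9%, 99.95%, and 99.99%, which round to the values listed.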
Exporting Uptime Data
Export monitor data for analysis:
- Open monitor details
- Select date range
- Click Export
- Choose format (CSV or JSON)
- Download data
Next Steps
- Analytics: Analyze uptime trends and patterns
- AI Insights: Ask the AI about uptime performance
- Alerts Setup: Configure uptime notifications
- API Reference: Programmatic monitor management