Overview
Monitoring is critical for maintaining a healthy production API. This guide covers health checks, logging, error tracking, and performance monitoring.
Health Check Endpoint
Overview
The health check endpoint provides real-time status of the API (src/server.ts:112):
```typescript
app.get('/health', (req, res) => {
  res.status(200).json({
    status: 'ok',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    environment: process.env.NODE_ENV || 'development',
  });
});
```
Usage

```bash
curl https://api.tresacontafy.com/health
```

```json
{
  "status": "ok",
  "timestamp": "2026-03-07T12:00:00.000Z",
  "uptime": 3600.5,
  "environment": "production"
}
```
- `status`: ok if the server is running
- `timestamp`: current server time (ISO 8601)
- `uptime`: seconds since the last restart
- `environment`: development, production, or test
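The endpoint above only confirms the process is up. If readiness should also reflect downstream dependencies (database, cache), the individual checks can be aggregated into one payload. A minimal sketch, assuming hypothetical check functions and a `runChecks` helper that are not part of the actual API:

```typescript
// Aggregate async dependency checks into a single readiness payload.
// Each check resolves if the dependency is healthy and rejects if not.
type Check = () => Promise<void>;

async function runChecks(
  checks: Record<string, Check>
): Promise<{ status: 'ok' | 'degraded'; details: Record<string, string> }> {
  const entries = await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      try {
        await check();
        return [name, 'ok'] as const;
      } catch {
        return [name, 'failed'] as const;
      }
    })
  );
  return {
    status: entries.every(([, s]) => s === 'ok') ? 'ok' : 'degraded',
    details: Object.fromEntries(entries),
  };
}
```

Wired to a separate route (for example `/ready`), this can return 503 when the status is degraded, so a load balancer stops routing traffic without killing the process.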
Integration

Uptime Monitoring

Use services like UptimeRobot, Pingdom, or Better Uptime:
- Add health check URL: https://api.yourdomain.com/health
- Set interval: 5 minutes (recommended)
- Alert on: Status code ≠ 200
- Alert channels: Email, Slack, SMS

Load Balancer

Configure health checks in your load balancer.

AWS ALB:

```yaml
HealthCheck:
  Path: /health
  Protocol: HTTP
  Port: 3001
  HealthyThresholdCount: 2
  UnhealthyThresholdCount: 3
  TimeoutSeconds: 5
  IntervalSeconds: 30
```

Nginx:

```nginx
upstream api {
  server api1.example.com:3001 max_fails=3 fail_timeout=30s;
  server api2.example.com:3001 max_fails=3 fail_timeout=30s;
}

location /health {
  proxy_pass http://api/health;
  # Note: the active health_check directive requires NGINX Plus;
  # open-source nginx relies on the passive max_fails/fail_timeout
  # settings in the upstream block above.
  health_check interval=10s fails=3 passes=2;
}
```

Kubernetes

Use the health check for liveness and readiness probes:

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: tresa-api
      image: tresa-api:latest
      livenessProbe:
        httpGet:
          path: /health
          port: 3001
        initialDelaySeconds: 30
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /health
          port: 3001
        initialDelaySeconds: 5
        periodSeconds: 5
```
Logging
Pino Logger
Tresa Contafy uses Pino for high-performance logging (src/utils/logger.util.ts):
```typescript
import pino from 'pino';

const isProduction = process.env.NODE_ENV === 'production';

const logger = pino({
  level: process.env.LOG_LEVEL || (isProduction ? 'info' : 'debug'),
  ...(isProduction
    ? {
        formatters: {
          level: (label) => ({ level: label }),
        },
        timestamp: pino.stdTimeFunctions.isoTime,
        serializers: {
          err: pino.stdSerializers.err,
          req: pino.stdSerializers.req,
          res: pino.stdSerializers.res,
        },
      }
    : {
        transport: {
          target: 'pino-pretty',
          options: {
            colorize: true,
            translateTime: 'HH:MM:ss Z',
            ignore: 'pid,hostname',
          },
        },
      }),
});
```
Log Levels
- `fatal`: Application crash, unrecoverable errors. Example: `logger.fatal({ err: error }, 'Database connection failed'); process.exit(1);`
- `error`: Error conditions that need attention. Example: `logger.error({ err, userId }, 'Failed to process invoice');`
- `warn`: Warning conditions, potential issues. Example: `logger.warn('FRONTEND_URL not configured in production');`
- `info`: Informational messages (default in production). Example: `logger.info({ port, environment }, 'Server started');`
- `debug`: Debug information (default in development). Example: `logger.debug({ profileId }, 'Fetching profile data');`
- `trace`: Very detailed trace information. Example: `logger.trace({ query }, 'Database query executed');`
Production (JSON)

Structured JSON for log aggregation tools:

```json
{
  "level": "info",
  "time": "2026-03-07T12:00:00.000Z",
  "msg": "Server started",
  "port": 3001,
  "environment": "production"
}
```

Development (Pretty)

Human-readable colored output:

```
[12:00:00] INFO: Server started
    port: 3001
    environment: "development"
```
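Both formats carry the same record; the production output is simply one JSON object per line. As a stdlib-only illustration of the shape pino emits (not pino itself), a record can be serialized like this:

```typescript
// Serialize a log record as a single JSON line, mirroring the
// production output shown above (level, time, msg, extra fields).
function logLine(
  level: string,
  msg: string,
  fields: Record<string, unknown> = {}
): string {
  return JSON.stringify({
    level,
    time: new Date().toISOString(),
    msg,
    ...fields,
  });
}
```

One object per line is what makes the output trivially parseable by log shippers and aggregation tools.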
Setting Log Level
Control verbosity via environment variable:
```bash
# Production: only info and above
LOG_LEVEL=info

# Development: debug and above
LOG_LEVEL=debug

# Troubleshooting: everything
LOG_LEVEL=trace
```
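The level acts as a numeric threshold: pino maps trace=10 through fatal=60, and a message is emitted only when its level is at or above the configured one. A sketch of that rule:

```typescript
// Pino's default numeric level values.
const levels = {
  trace: 10,
  debug: 20,
  info: 30,
  warn: 40,
  error: 50,
  fatal: 60,
} as const;
type Level = keyof typeof levels;

// A message at msgLevel is emitted when the logger is set to threshold.
const isEmitted = (msgLevel: Level, threshold: Level): boolean =>
  levels[msgLevel] >= levels[threshold];
```

So with `LOG_LEVEL=info`, `debug` messages are dropped while `warn` and above always pass.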
HTTP Request Logging
Morgan middleware logs all HTTP requests (src/server.ts:58):
```typescript
app.use(morgan(process.env.NODE_ENV === 'production' ? 'combined' : 'dev'));
```

Production format (`combined`):

```
::1 - - [07/Mar/2026:12:00:00 +0000] "GET /api/profiles HTTP/1.1" 200 1234 "-" "Mozilla/5.0..."
```

Development format (`dev`):

```
GET /api/profiles 200 123.456 ms - 1234
```
Error Logging
Centralized error handler logs all errors (src/server.ts:144):
```typescript
app.use((err, req, res, next) => {
  logger.error(
    {
      err,
      method: req.method,
      path: req.path,
      ip: req.ip,
      userAgent: req.get('user-agent'),
    },
    'Unhandled error'
  );
  // Send response
});
```
Error log example:
```json
{
  "level": "error",
  "time": "2026-03-07T12:00:00.000Z",
  "msg": "Unhandled error",
  "err": {
    "type": "Error",
    "message": "Database connection failed",
    "stack": "Error: Database connection failed\n    at ..."
  },
  "method": "GET",
  "path": "/api/profiles",
  "ip": "192.168.1.1"
}
```
Error Tracking
Process Error Handlers
Global error handlers catch unhandled errors (src/index.ts):
```typescript
process.on('unhandledRejection', (reason, promise) => {
  logger.error({ reason, promise }, 'Unhandled Rejection');
  if (process.env.NODE_ENV === 'production') {
    // Optional: exit and let the process manager restart
    // process.exit(1);
  }
});

process.on('uncaughtException', (error) => {
  logger.fatal({ err: error }, 'Uncaught Exception');
  if (process.env.NODE_ENV === 'production') {
    process.exit(1);
  }
});
```
Graceful Shutdown
Handle shutdown signals gracefully (src/index.ts:28):
```typescript
const gracefulShutdown = (signal) => {
  logger.info({ signal }, 'Graceful shutdown initiated');
  process.exit(0);
};

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));
```
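The handler above exits immediately; in practice you usually want to stop accepting new connections and drain in-flight requests first, with a timeout as a safety net. A sketch under those assumptions (the `server` handle and the timeout value are illustrative, not taken from the actual codebase):

```typescript
import http from 'http';

// Close the HTTP server, then report an exit code: 0 if it drained
// cleanly, 1 if the drain timed out or close() failed.
function shutdownServer(
  server: http.Server,
  done: (code: number) => void,
  timeoutMs = 10_000
): void {
  const timer = setTimeout(() => done(1), timeoutMs);
  timer.unref(); // do not keep the process alive just for this timer
  server.close((err) => {
    clearTimeout(timer);
    done(err ? 1 : 0);
  });
}
```

Inside `gracefulShutdown`, this would be called as `shutdownServer(server, (code) => process.exit(code))`, after any database connections have been closed.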
External Error Tracking
Integrate with an error tracking service.

Sentry:

```typescript
import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV,
  tracesSampleRate: 1.0,
});

app.use(Sentry.Handlers.requestHandler());
app.use(Sentry.Handlers.errorHandler());
```

Rollbar:

```typescript
import Rollbar from 'rollbar';

const rollbar = new Rollbar({
  accessToken: process.env.ROLLBAR_ACCESS_TOKEN,
  environment: process.env.NODE_ENV,
});

app.use(rollbar.errorHandler());
```

Bugsnag:

```bash
pnpm add @bugsnag/js @bugsnag/plugin-express
```

```typescript
import Bugsnag from '@bugsnag/js';
import BugsnagPluginExpress from '@bugsnag/plugin-express';

Bugsnag.start({
  apiKey: process.env.BUGSNAG_API_KEY,
  plugins: [BugsnagPluginExpress],
});

const middleware = Bugsnag.getPlugin('express');
app.use(middleware.requestHandler);
app.use(middleware.errorHandler);
```
Log Aggregation
Centralized Logging
Forward logs to a centralized system.

Railway

Railway automatically collects logs:
- Go to your service
- Click "Deployments" → "View Logs"
- Filter by level, time, or search text

Datadog

Install the Datadog agent:

```bash
DD_API_KEY=<your-key> DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script.sh)"
```

Logs are automatically collected and indexed.

Logtail

Stream logs to Logtail (Better Stack):

```bash
pnpm add @logtail/node @logtail/pino
```

```typescript
import pino from 'pino';

const logger = pino({
  transport: {
    target: '@logtail/pino',
    options: { sourceToken: process.env.LOGTAIL_SOURCE_TOKEN },
  },
});
```

CloudWatch

For AWS deployments:

```bash
pnpm add pino-cloudwatch
```

```typescript
import pino from 'pino';

const logger = pino({
  transport: {
    target: 'pino-cloudwatch',
    options: {
      logGroupName: '/tresa-api/production',
      logStreamName: 'api-logs',
    },
  },
});
```
Response Time Monitoring
Morgan logs include the response time:

```
GET /api/profiles 200 123.456 ms - 1234
```

Watch for slow requests (>1000 ms) and optimize them.
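A fixed 1000 ms cutoff is a reasonable start, but tail percentiles are a better signal than the average. A sketch for computing a percentile over a window of response times (collecting the sampling window itself is left out):

```typescript
// Nearest-rank percentile: p in (0, 100], samples in milliseconds.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}
```

Alerting when p95 creeps toward the 1000 ms threshold catches regressions long before the average moves.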
Enable Sequelize query logging in development:

```typescript
const sequelize = new Sequelize(databaseUrl, {
  benchmark: true, // required for the timing argument below
  logging: (sql, timing) => {
    logger.debug({ sql, timing }, 'Database query');
  },
});
```

New Relic: create `newrelic.js`:

```javascript
exports.config = {
  app_name: ['Tresa Contafy API'],
  license_key: process.env.NEW_RELIC_LICENSE_KEY,
  logging: {
    level: 'info',
  },
};
```

Require it at app start:

```typescript
import 'newrelic';
import app from './server';
```

AppSignal:

```bash
pnpm add @appsignal/nodejs @appsignal/express
```

```typescript
import { Appsignal } from '@appsignal/nodejs';
import { expressMiddleware } from '@appsignal/express';

const appsignal = new Appsignal({
  active: true,
  name: 'Tresa Contafy API',
  pushApiKey: process.env.APPSIGNAL_PUSH_API_KEY,
});

app.use(expressMiddleware(appsignal));
```
Metrics
Custom Metrics Endpoint
Create a metrics endpoint for monitoring:
```typescript
app.get('/metrics', (req, res) => {
  res.json({
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    cpu: process.cpuUsage(),
    timestamp: new Date().toISOString(),
  });
});
```
Response:

```json
{
  "uptime": 3600.5,
  "memory": {
    "rss": 123456789,
    "heapTotal": 98765432,
    "heapUsed": 87654321,
    "external": 1234567
  },
  "cpu": {
    "user": 123456,
    "system": 78901
  },
  "timestamp": "2026-03-07T12:00:00.000Z"
}
```
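The raw byte counts are awkward to eyeball; a small helper can summarize `process.memoryUsage()` in megabytes for the payload (the `summarizeMemory` name and the `*Mb` field names are illustrative):

```typescript
// Convert a memory snapshot to MB, rounded to one decimal place.
function summarizeMemory(m: NodeJS.MemoryUsage) {
  const toMb = (bytes: number) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    rssMb: toMb(m.rss),
    heapTotalMb: toMb(m.heapTotal),
    heapUsedMb: toMb(m.heapUsed),
    externalMb: toMb(m.external),
  };
}
```

The heapUsed/heapTotal ratio from this snapshot is also the natural input for the high-memory alert threshold described below.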
Prometheus Integration
For Prometheus-compatible metrics, use prom-client (replacing the custom endpoint above):

```bash
pnpm add prom-client
```

```typescript
import client from 'prom-client';

const register = new client.Registry();
client.collectDefaultMetrics({ register });

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});
```
Alerting
Alert Conditions
Set up alerts for:
- Health check failures: 3+ consecutive failures
- High error rate: >1% of requests fail
- Slow response time: average >1000 ms
- High memory usage: >80% of available memory
- Database connection errors: any connection failure
- Rate limit exceeded: unusual spike in rate limiting
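The >1% error-rate rule is straightforward to evaluate over a rolling window of request outcomes. A sketch (window management omitted; the 1% threshold comes from the list above):

```typescript
// Decide whether the error-rate alert should fire for a window of
// requests: strictly more than 1% of requests failed.
function errorRateAlert(
  errors: number,
  total: number,
  threshold = 0.01
): boolean {
  if (total === 0) return false;
  return errors / total > threshold;
}
```

Guarding against `total === 0` matters: an idle service should read as healthy, not as a division-by-zero alert.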
Alert Channels
- Email: critical alerts to the on-call team
- Slack: real-time notifications to the dev channel
- SMS: high-priority alerts for immediate response
- PagerDuty: incident management and escalation
Monitoring Checklist
✓ Health check endpoint configured
✓ Uptime monitoring active (UptimeRobot, Pingdom)
✓ Structured logging with Pino
✓ Log level set appropriately (info in production)
✓ Error tracking service integrated (Sentry, Rollbar)
✓ Centralized log aggregation (Datadog, CloudWatch)
✓ Performance monitoring (APM)
✓ Alerts configured for critical conditions
✓ Graceful shutdown handlers implemented
✓ Database query performance monitored
Next Steps
- Security: review security configuration and best practices
- Production Deployment: the complete production deployment guide