Scans code and config files for 24 production readiness items and produces a scorecard with PASS / WARN / FAIL status, suggested fixes, and an overall readiness percentage. Use when you need to audit a codebase before deploying to production, review operational concerns, or assess deployment readiness.

Invocation

/prod-readiness
Run from the project root. The skill detects the stack automatically by scanning config files and then evaluates all 24 checks against the full project tree.

Workflow

1. Detect project stack

Identify the language, framework, package manager, and container tooling by scanning project root files. Detection covers: Python (Flask, FastAPI, Django), Node.js (Express, Fastify, NestJS), Java/Kotlin (Spring Boot, Micronaut via Gradle or Maven), Go, Rust, and C#/.NET. Container tooling is detected from Dockerfile, docker-compose.yml, and Kubernetes manifests.
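The marker-file approach behind this step can be sketched in a few lines. The file-to-stack mapping below is illustrative, not the skill's actual detection table:

```python
from pathlib import Path

# Illustrative marker-file table: root files that identify a stack.
STACK_MARKERS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
    "pom.xml": "Java/Kotlin (Maven)",
    "build.gradle": "Java/Kotlin (Gradle)",
    "build.gradle.kts": "Java/Kotlin (Gradle)",
    "Dockerfile": "container",
    "docker-compose.yml": "container",
}

def detect_stack(root: str) -> set[str]:
    """Return every stack whose marker file exists in the project root."""
    return {stack for marker, stack in STACK_MARKERS.items()
            if Path(root, marker).exists()}
```

A project can legitimately match several entries (for example, a Node.js service with a Dockerfile), which is why the result is a set rather than a single label.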
2. Load check definitions

Load all 24 production readiness checks, organized across 4 categories: Reliability, Observability, Security, and Operations. Each check defines what to look for, the severity level, and the pass/warn/fail criteria.
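One plausible shape for a check definition — field names here are an assumption, not the skill's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Check:
    """One production readiness check (field names are illustrative)."""
    id: int
    name: str
    category: str            # Reliability | Observability | Security | Operations
    severity: str            # Critical | High | Medium | Low
    primary_patterns: list[str] = field(default_factory=list)    # tried first
    framework_patterns: list[str] = field(default_factory=list)  # stack-specific
    fallback_patterns: list[str] = field(default_factory=list)   # broad last resort

health = Check(1, "Health Check Endpoints", "Reliability", "Critical",
               primary_patterns=[r"/health(z)?", r"/ready"])
```

The three pattern tiers mirror the search order used in the scan step.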
3. Scan the codebase

For each of the 24 checks, use Glob and Grep with primary, framework-specific, and broad fallback patterns to find evidence across the full project tree — including src/, config files, Dockerfile, CI pipelines, and infrastructure files. Every match is recorded with its file path and line number.

Each check gets three search attempts before it can be marked FAIL. Never mark FAIL after a single search miss.
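The three-attempt rule amounts to tiered pattern matching: try the next tier only when the previous one found nothing. A minimal sketch over in-memory file contents (the real skill runs Glob and Grep against the tree):

```python
import re

def search_evidence(files: dict[str, str], pattern_tiers: list[list[str]]):
    """Try each tier of regex patterns in order; return (path, line_no, line)
    matches from the first tier that finds anything, else an empty list.
    `files` maps path -> file contents (stands in for Glob + Grep)."""
    for tier in pattern_tiers:
        hits = []
        for path, text in files.items():
            for no, line in enumerate(text.splitlines(), start=1):
                if any(re.search(p, line) for p in tier):
                    hits.append((path, no, line.strip()))
        if hits:
            return hits  # evidence found; no need to fall back
    return []  # all tiers missed: only now may the check be marked FAIL
```

Returning file path and line number with every hit is what later allows PASS items to cite file:line evidence.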
4. Score each check

Assign PASS, WARN, FAIL, or N/A to each check using the scoring rules:

| Score | Points | When to use |
|-------|--------|-------------|
| PASS | 100 | At least one file:line reference confirms implementation |
| WARN | 50 | Implementation exists but is incomplete or has gaps |
| FAIL | 0 | All three search attempts returned nothing |
| N/A | excluded | Check does not apply to this project type |

Overall percentage formula:

score = (PASS_count × 100 + WARN_count × 50) / (applicable_checks × 100) × 100

| Score | Grade | Meaning |
|-------|-------|---------|
| 90–100% | A | Production ready |
| 75–89% | B | Near ready — address FAIL items before deploy |
| 60–74% | C | Significant gaps — not safe for production |
| 40–59% | D | Major gaps — substantial work needed |
| 0–39% | F | Not production ready |
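The formula and grade bands translate directly into code. A sketch assuming the PASS/WARN counts and the number of applicable (non-N/A) checks are already tallied:

```python
def readiness_score(pass_count: int, warn_count: int, applicable: int) -> float:
    """Overall readiness percentage: PASS = 100 pts, WARN = 50, FAIL = 0.
    N/A checks are excluded by passing only the applicable count."""
    return (pass_count * 100 + warn_count * 50) / (applicable * 100) * 100

def grade(score: float) -> str:
    """Map a percentage to the letter grade bands above."""
    for floor, letter in ((90, "A"), (75, "B"), (60, "C"), (40, "D")):
        if score >= floor:
            return letter
    return "F"
```

For example, 18 PASS and 4 WARN out of 24 applicable checks scores (1800 + 200) / 2400 × 100 ≈ 83.3%, grade B.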
5. Generate scorecard

Output the markdown scorecard table organized by category, fix suggestions for every FAIL and WARN item, and the overall readiness percentage. Every PASS item lists at least one file:line as evidence. Every FAIL item includes a framework-specific fix suggestion with a code or config snippet of 5–15 lines.
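One way to render a scorecard row — the column layout here is an assumption for illustration, not the skill's fixed format:

```python
def scorecard_row(check_id: int, name: str, status: str,
                  evidence: str = "", fix: str = "") -> str:
    """Render one markdown table row: PASS rows carry file:line evidence,
    FAIL/WARN rows carry the fix suggestion (column layout is illustrative)."""
    detail = evidence if status == "PASS" else fix
    return f"| {check_id} | {name} | {status} | {detail} |"
```

So a passing health check might render as `| 1 | Health Check Endpoints | PASS | src/routes/health.ts:14 |`.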
6. Self-review

Verify the scorecard before delivering:
  • All 24 checks appear in the scorecard (none skipped)
  • Every FAIL item has a specific, actionable fix referencing the detected stack
  • Every PASS item lists at least one file path as evidence
  • The overall percentage matches the formula
  • No false FAILs: alternative patterns were tried before marking FAIL
  • Category groupings (Reliability, Observability, Security, Operations) are present

The 24 checks

Reliability

| # | Check | Severity | Description |
|---|-------|----------|-------------|
| 1 | Health Check Endpoints | Critical | HTTP /health, /healthz, /ready, or /actuator/health endpoint exists for load balancer and orchestrator probes |
| 2 | Graceful Shutdown | Critical | Process handles SIGTERM/SIGINT by draining in-flight requests before exiting |
| 3 | Circuit Breakers | High | Stops calling a failing downstream after repeated failures to prevent cascade failures |
| 4 | Retry Policies | High | Automatic retries for transient failures with exponential backoff and jitter |
| 5 | Timeouts | Critical | Explicit timeouts on all external calls (HTTP, database, cache, message queue) |
| 6 | Connection Pooling | Medium | Reuse of connections to databases, caches, and external services via pools |
PASS criteria examples:
  • Check 1: Route defined for /healthz in src/routes/health.ts:14
  • Check 2: SIGTERM handler calls server.close() before process.exit()
  • Check 5: HTTP client configured with connectTimeout and readTimeout
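Check 4's pattern — retries with exponential backoff and jitter — can be sketched in a few lines. The function names and delays below are illustrative; the sleep function is injectable so the behavior is testable without waiting:

```python
import random
import time

def retry(call, attempts: int = 3, base_delay: float = 0.05,
          jitter: float = 0.02, sleep=time.sleep):
    """Retry `call` on exception, backing off exponentially with jitter
    between attempts. Re-raises once the attempt budget is exhausted."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # transient-failure budget exhausted
            # delay doubles each attempt; jitter avoids thundering herds
            sleep(base_delay * 2 ** attempt + random.uniform(0, jitter))
```

In production code the except clause should be narrowed to the transient error types of the client in use, so that non-retryable failures surface immediately.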
Observability

| # | Check | Severity | Description |
|---|-------|----------|-------------|
| 7 | Structured Logging | High | Log output in JSON format with consistent fields: timestamp, level, message, correlation_id |
| 8 | Metrics / Instrumentation | High | Application exports counters, histograms, and gauges (request count, latency, error rate) |
| 9 | Distributed Tracing | Medium | Request flows traced across service boundaries with trace IDs and spans |
| 10 | Monitoring / Alerting Configuration | High | Alert rules defined in code for key indicators (error rate, P99 latency, disk usage) |
| 11 | Error Tracking | Medium | Unhandled exceptions captured and reported to an error tracking service (Sentry, Bugsnag, Rollbar) |
PASS criteria examples:
  • Check 7: structlog or zap or winston configured with JSON format
  • Check 8: prometheus-client with custom Counter and Histogram definitions
  • Check 10: Prometheus rules.yml or Datadog monitor config present in repo
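What check 7 looks for can be shown with a minimal stdlib-only JSON formatter; in practice libraries like structlog, zap, or winston do this. The correlation_id handling here is an assumption about how request context would attach to the record:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per record with the fields check 7 looks for."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            # correlation_id would normally be attached by a logging filter
            # from request context; read it off the record if present.
            "correlation_id": getattr(record, "correlation_id", None),
        })
```

Attaching this formatter to a handler makes every log line machine-parseable, which is what distinguishes a PASS from the WARN case of unstructured plain-text logging.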
Security

| # | Check | Severity | Description |
|---|-------|----------|-------------|
| 12 | Input Validation | Critical | All external input validated before processing using a schema library (Zod, Pydantic, Bean Validation) |
| 13 | Secrets Management | Critical | Secrets loaded from environment variables or vault; no hardcoded credentials in source files |
| 14 | Security Headers | Medium | HTTP responses include HSTS, CSP, X-Content-Type-Options, and X-Frame-Options |
| 15 | CORS Configuration | Medium | CORS explicitly configured with allowed origins list — not wildcard * |
| 16 | TLS / SSL | High | All external communication uses TLS; HTTP redirected to HTTPS |
| 17 | Dependency Vulnerability Scanning | High | Dependencies scanned for known vulnerabilities in CI (npm audit, Snyk, Trivy, Dependabot) |
PASS criteria examples:
  • Check 12: zod schema applied to all Express route handlers
  • Check 13: All secrets use process.env; .env listed in .gitignore
  • Check 17: .github/dependabot.yml present and npm audit step in CI workflow
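The hardcoded-credential scan behind check 13 boils down to pattern matching over source lines. The two regexes below are a deliberately small illustration; real secret scanners ship far larger rule sets:

```python
import re

# Illustrative patterns only; real scanners use much larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"""(password|passwd|secret|api_key|token)\s*=\s*['"][^'"]+['"]""",
               re.IGNORECASE),
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
]

def find_hardcoded_secrets(source: str):
    """Return (line_no, line) pairs that look like hardcoded credentials."""
    return [(no, line.strip())
            for no, line in enumerate(source.splitlines(), start=1)
            if any(p.search(line) for p in SECRET_PATTERNS)]
```

Note that `key = os.environ["API_KEY"]` does not match: loading from the environment is exactly the PASS pattern, while a quoted literal on the right-hand side is the FAIL signal.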
Operations

| # | Check | Severity | Description |
|---|-------|----------|-------------|
| 18 | Container Health Probes | High | livenessProbe and readinessProbe defined in Kubernetes deployment YAML or HEALTHCHECK in Dockerfile |
| 19 | Resource Limits | High | CPU and memory limits set in Kubernetes deployment YAML, Docker Compose, or JVM flags |
| 20 | Database Migrations | Medium | Schema changes managed through versioned migration files (Flyway, Alembic, Prisma migrate, goose) |
| 21 | Graceful Degradation / Fallbacks | Medium | System returns a degraded response when a dependency fails, instead of a hard error |
| 22 | API Versioning | Medium | API endpoints versioned (/v1/, /v2/) to allow non-breaking evolution |
| 23 | Rate Limiting | Medium | Incoming requests rate-limited at application or API gateway level |
| 24 | Documentation | Low | API spec (OpenAPI/Swagger), generated docs, or README with endpoint descriptions |
N/A cases:
  • Check 20 (database migrations): marked N/A for stateless services with no database access
  • Check 22 (API versioning): marked N/A for CLI tools, libraries, and non-API services
  • Check 24 (documentation): marked N/A for non-API services
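Application-level rate limiting (check 23) is commonly a token bucket. A minimal sketch with an injectable clock so the refill behavior is deterministic under test:

```python
class TokenBucket:
    """Allow at most `capacity` requests per `refill_period` seconds.
    The clock is injectable so behavior is deterministic in tests."""
    def __init__(self, capacity: int, refill_period: float, clock):
        self.capacity = capacity
        self.refill_rate = capacity / refill_period  # tokens per second
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond 429 Too Many Requests
```

In production this would wrap `time.monotonic` and hold one bucket per client key; a gateway-level limiter (the check's alternative) moves the same logic out of the application entirely.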

Self-review checklist

Before delivering the scorecard, verify all of the following:
  • All 24 checks from the check definitions appear in the scorecard (none skipped)
  • Every FAIL item has a specific, actionable fix suggestion referencing the project’s stack
  • Every PASS item lists at least one file path as evidence
  • The overall percentage matches the formula: (PASS_count × 100 + WARN_count × 50) / (applicable_checks × 100) × 100
  • No false FAILs: re-scanned with alternative patterns before marking FAIL
  • Category groupings (Reliability, Observability, Security, Operations) are present
  • Fix suggestions reference the detected framework, not generic advice

Golden rules

Never mark FAIL after a single Grep. Use the primary pattern, then the framework-specific pattern, then a broad fallback pattern. Only mark FAIL after all three miss.
Never mark PASS without recording at least one file:line reference. “I saw it somewhere” is not evidence.
Never suggest “add a health check endpoint.” Always suggest “add a Spring Boot Actuator /health endpoint” or “add an Express /healthz route” based on the detected stack.
Mark WARN only when evidence exists but is incomplete — for example, logging exists but is not structured, or timeouts exist on some clients but not all. Never use WARN as “I’m not sure.”
Always search the full project tree. Never limit to src/ alone — config, Docker, CI, and infra files contain critical production readiness signals.
If a check category does not apply to the project type, mark it N/A with a one-line reason and exclude it from the percentage calculation.