Deployment Overview

Gumroad uses automated deployments, with Buildkite running the pipeline and Nomad handling orchestration. There are four kinds of deployment:
  • Staging - Automatically deployed from main branch
  • Production - Automatically deployed from main after staging succeeds
  • Branch Apps - Preview deployments for feature branches
  • Hotfixes - Emergency deployments outside the normal flow

Automated Deployment

The standard deployment flow is fully automated:
  1. Merge to main - Pull requests merged to main trigger the deployment pipeline.
  2. Run tests - Buildkite runs the full test suite on the main branch.
  3. Deploy to staging - If tests pass, code is automatically deployed to staging.
  4. Deploy to production - After staging validation, code is automatically deployed to production.

No manual intervention is needed for standard deployments. The CI/CD pipeline handles everything automatically.
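
The gating in the flow above (each stage runs only if the previous one succeeded) can be pictured with a small shell sketch. This is an illustration only: `run_stages` and the three stage functions are hypothetical stand-ins, not code from the Buildkite pipeline.

```bash
#!/usr/bin/env bash
# Sketch of stage gating: each stage runs only if every earlier stage succeeded.
# All names below are hypothetical placeholders, not the real pipeline.

run_stages() {
  local stage
  for stage in "$@"; do
    echo "running: $stage"
    if ! "$stage"; then
      echo "stopped: $stage failed" >&2
      return 1
    fi
  done
  echo "all stages succeeded"
}

# Placeholder stages; a real pipeline would run the test suite and deploy jobs.
run_tests()         { true; }
deploy_staging()    { true; }
deploy_production() { true; }
```

Calling `run_stages run_tests deploy_staging deploy_production` walks the stages in order and stops at the first one that returns a non-zero status, so a failed test run never reaches either deploy stage.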

Prerequisites

For manual deployments or hotfixes, you’ll need:

Required Setup

  1. Environment variables file - nomad/.env with deployment credentials
  2. AWS CLI - Installed and configured with IAM credentials
  3. Nomad documentation - Read nomad/README.md for architecture understanding
Deployment credentials are stored in 1Password (team members only).

Manual Deployment

Automatic Script Method

For simple deployments of verified commits:
bin/deploy
The script will:
  • Detect your current branch (it can be run from any branch)
  • Show commits to be deployed
  • Prompt for confirmation
  • Handle the deployment process

Manual Method

For more control over the deployment process:
1. Rebase production onto staging

git checkout production
git rebase staging

2. Deploy to production

cd nomad/production
dotenv -f ../.env ./deploy_unattended.sh

3. Verify deployment

Check the Nomad UI to ensure your Docker image production-<git-sha> was deployed successfully.

Deployment Architecture

Docker Images

Each deployment creates a Docker image tagged with the git commit SHA:
production-d6b4605    # Example production image
staging-a1b2c3d       # Example staging image
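
When scripting around these tags, the environment and SHA can be split apart with plain shell parameter expansion. The two helper names below are hypothetical, and the sketch assumes the `<environment>-<sha>` shape shown in the examples above.

```bash
# Hypothetical helpers: split an image tag of the form <environment>-<sha>.
# Assumes the environment name itself contains no hyphen.
image_environment() { printf '%s\n' "${1%%-*}"; }  # text before the first hyphen
image_sha()         { printf '%s\n' "${1#*-}"; }   # text after the first hyphen
```

For example, `image_environment production-d6b4605` prints `production` and `image_sha production-d6b4605` prints `d6b4605`.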

Nomad Jobs

Nomad orchestrates several job types:
  • database_migration - Runs database migrations
  • web - Rails web servers
  • sidekiq_worker - Background job workers
  • rpush - Push notification service
  • post_deployment - Post-deployment tasks

Deployment Sequence

The deployment script follows this sequence:
# 1. Check deployment lock
check_for_deployment_lock

# 2. Run database migrations
run_job database_migration
wait_for_db_migrate

# 3. Start background workers
run_job rpush
run_job sidekiq_worker

# 4. Deploy web servers
if (production_deployment); then
  scale_up_web_server_clusters
  deploy_to_web_servers
else
  deploy_to_web_servers
fi

# 5. Post-deployment tasks
run_job post_deployment
create_release_tag

Deployment Locks

Deployments are automatically locked during active deployments.

Manual Lock Management

cd nomad/production
./lock_deployment.sh
Deployment locks prevent concurrent deployments that could cause conflicts. Only unlock manually (with ./unlock_deployment.sh in the same directory) if you’re certain no deployment is in progress.
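
To illustrate why a lock prevents two deployments from running at once, here is a generic lock-guard sketch. It is not how `lock_deployment.sh` is implemented (its internals are not covered here); `LOCK_DIR`, `acquire_lock`, and `release_lock` are hypothetical names used only for this example.

```bash
# Generic lock-guard sketch; LOCK_DIR and both function names are hypothetical
# and unrelated to how lock_deployment.sh is actually implemented.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/example-deploy.lock}"

acquire_lock() {
  # mkdir either creates the directory or fails because it already exists,
  # so only one caller can succeed at a time.
  mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
  # Remove the lock directory so the next caller can acquire it.
  rmdir "$LOCK_DIR" 2>/dev/null
}
```

A script would typically guard its work with something like `acquire_lock || { echo "deployment is locked" >&2; exit 1; }` and call `release_lock` when it finishes.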

Branch Apps

Create preview deployments for testing features:
1. Create branch with deploy- prefix

git checkout -b deploy-bundle-share

2. Push to remote

git push origin deploy-bundle-share

3. Wait for Buildkite

Buildkite automatically builds and deploys the branch app.

Branch apps are automatically deleted when the branch is deleted from the repository.
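
Because the branch app is only built when the branch name carries the deploy- prefix, a quick local check before pushing can save a round trip. The function name below is hypothetical; the sketch only tests the prefix and knows nothing else about how Buildkite selects branches.

```bash
# Hypothetical pre-push check: succeeds only for names like deploy-<something>.
is_branch_app_branch() {
  case "$1" in
    deploy-?*) return 0 ;;  # prefix followed by at least one character
    *)         return 1 ;;
  esac
}
```

One possible use is `is_branch_app_branch "$(git rev-parse --abbrev-ref HEAD)" || echo "branch name needs a deploy- prefix"`.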

Hotfix Deployments

Hotfixes allow deploying critical fixes outside the normal flow.

When to Use Hotfixes

Use hotfixes for:
  • Critical production bugs
  • Security vulnerabilities
  • Data integrity issues
Hotfixes bypass the normal staging validation. Use only for emergencies.

Hotfix Process

1. Create hotfix branch

Branch from the last production tag:
# Find the last deployed tag from #releases Slack channel
# Example: production-d6b4605/2018-05-02-13-16-03

git fetch --tags
git checkout -b comp-assets-critical-fix production-d6b4605/2018-05-02-13-16-03

The branch name must start with comp-assets- to trigger the Docker image build.

2. Make changes and push

# Make your changes
git add .
git commit -m "Fix critical issue"
git push origin comp-assets-critical-fix

3. Wait for Docker build

Wait for Buildkite to finish the docker_asset_compile job.

4. Deploy hotfix

# Get the commit SHA
git rev-parse --short=12 HEAD
# Example output: 491255bb0a4d

# Announce in Slack #releases
# "Deploying hotfix for critical bug: production-491255bb0a4d"

# Deploy
export DEPLOY_TAG="production-491255bb0a4d"
cd nomad/production
dotenv -f ../.env ./deploy_unattended.sh
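
To avoid typos when copying the SHA into DEPLOY_TAG by hand, the tag can be derived and sanity-checked in one step. This is a sketch under stated assumptions: `hotfix_deploy_tag` is a hypothetical helper, and the 12-character minimum simply mirrors the `--short=12` used above.

```bash
# Hypothetical helper: build a production deploy tag from a commit SHA.
# Rejects anything that is not 12 to 40 lowercase hex characters.
hotfix_deploy_tag() {
  local sha="$1"
  if [[ ! "$sha" =~ ^[0-9a-f]{12,40}$ ]]; then
    echo "expected a 12-40 character hex SHA, got: '$sha'" >&2
    return 1
  fi
  printf 'production-%s\n' "$sha"
}
```

With that in place, `export DEPLOY_TAG="$(hotfix_deploy_tag "$(git rev-parse --short=12 HEAD)")"` sets the same value as the manual export in the last step.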

Hotfix Alternative Tag Sources

If the latest deployment was itself a hotfix (and so is not reflected in the production-release tag):
  1. Find the last deployed tag from Slack #releases channel
  2. Check the tags page on GitHub
  3. Create branch from that specific tag
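
If the tags are available locally, the most recent one can also be picked out of the tag list. The sketch below assumes every release tag follows the `production-<sha>/<YYYY-MM-DD-HH-MM-SS>` shape of the example above, so that the timestamp part sorts chronologically as plain text; `latest_production_tag` is a hypothetical helper name.

```bash
# Hypothetical helper: read tag names on stdin and print the newest
# production tag, judged by the timestamp after the slash.
latest_production_tag() {
  grep -E '^production-[0-9a-f]+/[0-9]{4}(-[0-9]{2}){5}$' \
    | sort -t/ -k2,2 \
    | tail -n 1
}
```

After `git fetch --tags`, running `git tag --list | latest_production_tag` prints the newest matching tag. Treat the result as a cross-check against the #releases channel rather than a replacement for it.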

Hotfixing Workers Only

To deploy only Sidekiq workers without affecting web servers:

Method 1: Modify Deployment Script

diff --git a/nomad/common.sh b/nomad/common.sh
index bb60a7a598..d128504bca 100644
--- a/nomad/common.sh
+++ b/nomad/common.sh
@@ -138,26 +138,26 @@ function create_release_tag() {
 }
 
 function gr_deploy() {
-  check_for_deployment_lock
+  # check_for_deployment_lock
 
-  run_job database_migration
+  # run_job database_migration
 
-  logger "Waiting for db:migrate to complete"
+  # logger "Waiting for db:migrate to complete"
 
-  wait_for_db_migrate
+  # wait_for_db_migrate
 
-  run_job rpush
+  # run_job rpush
   run_job sidekiq_worker
 
-  if (production_deployment); then
-    scale_up_web_server_clusters
-    deploy_to_web_servers
-  else
-    deploy_to_web_servers
-  fi
+  # if (production_deployment); then
+  #   scale_up_web_server_clusters
+  #   deploy_to_web_servers
+  # else
+  #   deploy_to_web_servers
+  # fi
 
-  run_job post_deployment
+  # run_job post_deployment
 
   create_release_tag
 }

Method 2: Direct Nomad Command

export DEPLOY_TAG=production-abc123
cd nomad
source nomad_proxy_functions.sh
cd production
alias nomad=nomad_insecure_wrapper
dotenv -f ../.env erb sidekiq_worker.nomad.erb > sidekiq_worker.nomad
nomad run sidekiq_worker.nomad

Rollback Procedures

When Deployment Goes Bad

1. Stop running containers

Visit the Nomad UI at http://localhost:8080 and kill problematic containers in the Allocations page.

If localhost doesn’t work:
cd nomad
source nomad_proxy_functions.sh
proxy_on production

2. Find previous revision

Check the #releases Slack channel for the last successful deployment tag.

Example: production-d6b4605

3. Deploy previous version

export DEPLOY_TAG=production-d6b4605
cd nomad/production
dotenv -f ../.env ./deploy_unattended.sh
Always announce rollbacks in the #releases Slack channel so the team is aware.

Monitoring Deployments

Logs

All production logs are available in Kibana. Search for your deployment SHA there to track migration progress and catch errors early.

Slack Notifications

Deployment events are posted to the #releases channel:
  • Deployment starts
  • Deployment completes
  • Deployment failures
  • Rollbacks

Nomad UI

Access the Nomad UI to:
  • View running jobs and allocations
  • Check resource utilization
  • Monitor job health
  • View logs for specific allocations

Deployment Checklist

  • All tests passing on CI
  • Code reviewed and approved
  • Database migrations are backwards compatible
  • Feature flags configured if needed
  • Staging deployment successful and validated
  • Team notified in #releases (for manual deployments)
  • Monitor deployment progress in Nomad UI
  • Watch logs in Kibana for errors
  • Verify database migrations complete successfully
  • Check that new workers start properly
  • Confirm web servers are healthy
  • Smoke test critical functionality
  • Monitor error rates in Bugsnag
  • Check performance metrics
  • Verify Sidekiq queues are processing
  • Confirm in #releases channel

Environment-Specific Configuration

Staging Environment

  • Branch: staging
  • URL: https://staging.gumroad.com
  • Auto-deploy: Enabled on main branch
  • Database: Staging database (separate from production)

Production Environment

  • Branch: production (tags)
  • URL: https://gumroad.com
  • Auto-deploy: Enabled after staging succeeds
  • Database: Production database (replicated)

Troubleshooting

Problem: Deployment fails with a “deployment is locked” message.
Solution: Check if another deployment is in progress. If not, manually unlock:
cd nomad/production
./unlock_deployment.sh

Problem: Deployment fails because the Docker image doesn’t exist.
Solution: Ensure Buildkite completed the docker_asset_compile job. Check the Buildkite UI for the commit.

Problem: Migration fails during deployment.
Solution:
  1. Check migration logs in Kibana
  2. If migration is invalid, rollback to previous version
  3. Fix migration and redeploy
  4. Never deploy migrations that aren’t backwards compatible

Best Practices

Test Migrations

Always test database migrations on staging before production. Ensure they’re reversible.

Use Feature Flags

Deploy code behind feature flags for gradual rollouts and easy rollbacks.

Monitor After Deploy

Watch error rates and performance for 30 minutes after each production deployment.

Document Hotfixes

Always document hotfix deployments in #releases with reason and impact.

Next Steps

Architecture

Understand the deployment architecture

Testing

Ensure tests pass before deploying

Contributing

Follow contribution guidelines
