Google Cloud Run provides a fully managed serverless platform for running containerized applications. MABQ Agent is optimized for Cloud Run deployment with automatic scaling, built-in load balancing, and pay-per-use pricing.

Why Cloud Run?

Auto-Scaling

Automatically scales from 0 to N instances based on traffic

Serverless

No infrastructure management required

Pay-Per-Use

Only pay for actual request processing time

Integrated Security

Built-in support for service accounts and IAM

Prerequisites

Before deploying to Cloud Run, ensure you have:
  • Google Cloud project with billing enabled
  • gcloud CLI installed and authenticated
  • Docker images built and pushed to Artifact Registry (Container Registry, the older gcr.io service, is deprecated)
  • Service account with appropriate BigQuery permissions

Backend Deployment

Service Configuration

The backend runs on port 8080 with proxy headers enabled to correctly handle Cloud Run’s reverse proxy:
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080", "--proxy-headers"]
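The --proxy-headers flag tells Uvicorn to use the X-Forwarded-Proto and X-Forwarded-For headers set by Cloud Run’s load balancer when determining the request scheme and client address. Uvicorn only trusts these headers from addresses listed in --forwarded-allow-ips (127.0.0.1 by default), so that option may also need to be set if the forwarded values are not being picked up.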
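To make the effect of proxy headers concrete, the sketch below mimics in plain Python what a proxy-aware server does with them. It is an illustration only, not Uvicorn's actual implementation; the function name `resolve_client` and its parameters are hypothetical.

```python
def resolve_client(peer_ip, scheme, headers, trusted=frozenset({"127.0.0.1"})):
    """Return (scheme, client_ip) after applying X-Forwarded-* headers.

    Hypothetical helper: header names are expected in lower case, and the
    headers are honored only when the directly connected peer is trusted.
    """
    if "*" not in trusted and peer_ip not in trusted:
        return scheme, peer_ip  # untrusted peer: ignore forwarded headers

    scheme = headers.get("x-forwarded-proto", scheme).strip()
    forwarded_for = headers.get("x-forwarded-for", "")
    if forwarded_for.strip():
        # Simplification: use the last entry, the one added by the nearest proxy.
        peer_ip = forwarded_for.split(",")[-1].strip()
    return scheme, peer_ip
```

With the default trusted set, a request arriving from any address other than 127.0.0.1 keeps its original scheme and peer address, which is why an application behind a proxy can still see http instead of https until the proxy is trusted.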

Deploy Backend

1. Build and Push Container

Build your Docker image and push to Artifact Registry:
# Set your project and region
export PROJECT_ID=your-project-id
export REGION=us-east4

# Configure Docker for Artifact Registry
gcloud auth configure-docker ${REGION}-docker.pkg.dev

# Build and push
cd POC_ADK
docker build -t ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest .
docker push ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest
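The image names above follow the Artifact Registry pattern REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. As a small sketch, the helper below assembles that name so that build, push, and deploy steps can share one value. The function `image_uri` is hypothetical, and it assumes a Docker repository named mabq already exists in Artifact Registry.

```python
def image_uri(region, project_id, repository, image, tag="latest"):
    """Build an Artifact Registry image name for the given components."""
    for name, value in [("region", region), ("project_id", project_id),
                        ("repository", repository), ("image", image), ("tag", tag)]:
        if not value:
            raise ValueError(f"{name} must not be empty")
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


if __name__ == "__main__":
    # Placeholder values taken from the commands above.
    print(image_uri("us-east4", "your-project-id", "mabq", "backend"))
```

Using an immutable tag (for example a commit hash) instead of latest makes it easier to tell which build a given revision is running.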
2. Deploy to Cloud Run

Deploy the backend service with required environment variables:
gcloud run deploy mabq-backend \
  --image ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest \
  --region ${REGION} \
  --platform managed \
  --allow-unauthenticated \
  --port 8080 \
  --set-env-vars PROJECT_ID=${PROJECT_ID} \
  --set-env-vars BIGQUERY_DATASET=STG_ACTIVOS \
  --set-env-vars GOOGLE_CLOUD_LOCATION=${REGION} \
  --set-env-vars ANALYTICS_AGENT_MODEL=gemini-2.5-pro \
  --set-env-vars AZURE_TENANT_ID=your-tenant-id \
  --set-env-vars AZURE_CLIENT_ID=your-client-id \
  --set-env-vars FRONTEND_URL=https://your-frontend-url.run.app \
  --service-account=mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com
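When the deploy step is scripted, keeping the environment variables in one mapping avoids repeating flags by hand. The sketch below builds the argument list for the command above, joining all variables into a single comma-separated --set-env-vars value. The function `deploy_command` is a hypothetical helper, and the values in the example are placeholders.

```python
import subprocess


def deploy_command(service, image, region, env, service_account=None, port=8080):
    """Return the argv list for `gcloud run deploy` with the given settings."""
    args = ["gcloud", "run", "deploy", service,
            "--image", image, "--region", region, "--port", str(port)]
    if env:
        for key, value in env.items():
            # A comma would be read as a separator between variables.
            if "," in str(value):
                raise ValueError(f"value for {key} contains a comma")
        args += ["--set-env-vars", ",".join(f"{k}={v}" for k, v in env.items())]
    if service_account:
        args += ["--service-account", service_account]
    return args


if __name__ == "__main__":
    cmd = deploy_command(
        "mabq-backend",
        "us-east4-docker.pkg.dev/your-project-id/mabq/backend:latest",
        "us-east4",
        {"PROJECT_ID": "your-project-id", "BIGQUERY_DATASET": "STG_ACTIVOS"},
        service_account="mabq-service-account@your-project-id.iam.gserviceaccount.com",
    )
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually deploy
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a value contains spaces.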
3. Verify Deployment

Check the deployment status:
gcloud run services describe mabq-backend --region ${REGION}
Your backend will be available at a URL like:
https://mabq-backend-1093163678323.us-east4.run.app
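The --allow-unauthenticated flag makes the backend publicly invocable, so access control relies entirely on the custom Azure AD authentication implemented at the application level. Consider also restricting invocation with Cloud Run’s built-in IAM authentication as an additional security layer.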
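The verification step can also be automated by reading the JSON form of the describe output. The sketch below extracts the service URL and the Ready condition from that structure. The helper names are hypothetical, and the sample dictionary in the usage example is illustrative rather than real output.

```python
import json
import subprocess


def summarize_service(service):
    """Return (url, ready) from a parsed `services describe` JSON document."""
    status = service.get("status", {})
    ready = any(
        cond.get("type") == "Ready" and cond.get("status") == "True"
        for cond in status.get("conditions", [])
    )
    return status.get("url"), ready


def describe_service(name, region):
    """Call gcloud and parse its JSON output (requires an authenticated CLI)."""
    result = subprocess.run(
        ["gcloud", "run", "services", "describe", name,
         "--region", region, "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Illustrative structure only; real output contains many more fields.
    sample = {"status": {"url": "https://example.run.app",
                         "conditions": [{"type": "Ready", "status": "True"}]}}
    print(summarize_service(sample))
```

A deployment script could call `summarize_service(describe_service("mabq-backend", region))` and stop when the service is not ready.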

Frontend Deployment

Deploy Frontend

1. Build and Push Container

cd frontend-agente
docker build -t ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest .
docker push ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest
2. Deploy to Cloud Run

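Deploy the frontend with the backend API URL. If the frontend is a Next.js app, note that NEXT_PUBLIC_* variables are inlined into the browser bundle at build time, so the URL must also be available when the image is built for client-side code to use it: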
gcloud run deploy mabq-frontend \
  --image ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest \
  --region ${REGION} \
  --platform managed \
  --allow-unauthenticated \
  --port 3000 \
  --set-env-vars NEXT_PUBLIC_API_URL=https://mabq-backend-1093163678323.us-east4.run.app
3. Update Backend CORS

After deploying the frontend, update the backend’s FRONTEND_URL to allow CORS:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --update-env-vars FRONTEND_URL=https://mabq-frontend-1093163678323.us-east4.run.app

CORS Configuration

The backend implements strict CORS based on the FRONTEND_URL environment variable:
FRONTEND_URL = os.environ.get("FRONTEND_URL", "https://mabq-frontend-1093163678323.us-east4.run.app")

app.add_middleware(
    CORSMiddleware,
    allow_origins=[FRONTEND_URL], 
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
The CORS configuration only allows requests from the specified frontend URL. Update FRONTEND_URL after deploying the frontend to ensure proper communication.

CORS Flow

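  1. The browser sends a request from the frontend with an Origin header of https://mabq-frontend-*.run.app
  2. The backend compares the Origin header with FRONTEND_URL
  3. If they match, CORS headers are added to the response; otherwise the browser blocks the frontend code from reading the response
  4. Because allow_credentials is enabled, the browser may send credentials (cookies, auth headers) with matched requests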
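The flow above can be sketched as a single decision function. This is a simplified model of what a CORS middleware does for one allowed origin, not the actual Starlette implementation; the function name `cors_response_headers` is hypothetical.

```python
def cors_response_headers(origin, allowed_origin, allow_credentials=True):
    """Return the CORS headers to attach to a response, or {} if not allowed.

    `origin` is the value of the request's Origin header (None if absent).
    The comparison is an exact string match, as in steps 2 and 3 above.
    """
    if origin is None or origin != allowed_origin:
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,
        "Vary": "Origin",  # caches must not reuse the response for other origins
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


if __name__ == "__main__":
    frontend = "https://mabq-frontend-1093163678323.us-east4.run.app"
    print(cors_response_headers(frontend, frontend))
    print(cors_response_headers("https://other.example", frontend))
```

Note that the backend never rejects the request itself in this model. It only omits the headers, and the browser then refuses to hand the response to the calling script.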

Service Account Permissions

The backend service account needs the following IAM roles:
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member=serviceAccount:mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com \
  --role=roles/bigquery.dataViewer

gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member=serviceAccount:mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com \
  --role=roles/bigquery.jobUser
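The two roles serve different purposes: roles/bigquery.dataViewer allows reading table data and metadata, while roles/bigquery.jobUser allows running jobs such as queries in the project. As a sketch, the bindings can be generated from a list so that adding a role later is a one-line change. The helper `binding_commands` is hypothetical.

```python
REQUIRED_ROLES = ("roles/bigquery.dataViewer", "roles/bigquery.jobUser")


def binding_commands(project_id, account_name="mabq-service-account",
                     roles=REQUIRED_ROLES):
    """Return one `gcloud projects add-iam-policy-binding` argv per role."""
    member = f"serviceAccount:{account_name}@{project_id}.iam.gserviceaccount.com"
    return [
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         f"--member={member}", f"--role={role}"]
        for role in roles
    ]


if __name__ == "__main__":
    for cmd in binding_commands("your-project-id"):
        print(" ".join(cmd))
```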

Production Configuration

Resource Limits

Configure CPU and memory based on your workload:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --cpu 2 \
  --memory 4Gi \
  --max-instances 10 \
  --min-instances 1
  • cpu (string, default "1"): Number of CPU cores (1, 2, 4, 6, 8)
  • memory (string, default "512Mi"): Memory allocation (256Mi, 512Mi, 1Gi, 2Gi, 4Gi, 8Gi, 16Gi, 32Gi)
  • max-instances (number, default 100): Maximum number of container instances
  • min-instances (number, default 0): Minimum number of container instances (set to 1+ to reduce cold starts)
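Because limits apply per instance, the peak footprint of a service is the per-instance allocation multiplied by max-instances. The sketch below works this out for the settings in the command above. The helper names are hypothetical, and only the Mi and Gi suffixes are handled.

```python
_MIB_PER_UNIT = {"Mi": 1, "Gi": 1024}


def memory_to_mib(value):
    """Convert a memory string such as '512Mi' or '4Gi' to MiB."""
    for suffix, factor in _MIB_PER_UNIT.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory value: {value!r}")


def peak_footprint(cpu, memory, max_instances):
    """Return total vCPUs and memory (GiB) when every instance is running."""
    return {
        "vcpus": cpu * max_instances,
        "memory_gib": memory_to_mib(memory) * max_instances / 1024,
    }


if __name__ == "__main__":
    # Settings from the command above: 2 CPUs, 4Gi, up to 10 instances.
    print(peak_footprint(2, "4Gi", 10))
```

For the example settings this gives 20 vCPUs and 40 GiB at full scale, which is a useful upper bound when estimating cost or checking project quotas.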

Request Timeout

For longer-running queries, increase the request timeout:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --timeout 300
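Cloud Run allows request timeouts of up to 3600 seconds (60 minutes). The default is 300 seconds (5 minutes), so the command above only makes the default explicit; use a larger value for queries that run longer.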

Concurrency

Adjust the number of concurrent requests per instance:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --concurrency 80
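A value of 80 matches Cloud Run’s default concurrency. Higher concurrency can improve throughput but may increase latency, since more requests share each instance’s CPU and memory. Test with your expected workload.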
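A rough way to relate concurrency to instance count is Little's law: the average number of in-flight requests equals the arrival rate multiplied by the average latency. The sketch below uses it to estimate how many instances a workload needs. It is a back-of-the-envelope estimate with hypothetical function names, not a description of how Cloud Run's autoscaler decides.

```python
import math


def instances_needed(requests_per_second, avg_latency_seconds, concurrency):
    """Estimate instances required to hold the average in-flight requests."""
    in_flight = requests_per_second * avg_latency_seconds  # Little's law
    return math.ceil(in_flight / concurrency)


def max_in_flight(max_instances, concurrency):
    """Upper bound on simultaneous requests the service can accept."""
    return max_instances * concurrency


if __name__ == "__main__":
    # Example workload (assumed numbers): 200 requests/s at 2 s each.
    print(instances_needed(200, 2.0, 80))
    print(max_in_flight(10, 80))
```

With --max-instances 10 and --concurrency 80, the service can hold at most 800 requests at once. A workload of 200 requests per second at 2 seconds each averages 400 in flight, or about 5 instances.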

Monitoring and Logs

View Logs

gcloud run services logs read mabq-backend \
  --region ${REGION} \
  --limit 50

Stream Logs

gcloud beta run services logs tail mabq-backend \
  --region ${REGION}

Metrics Dashboard

Access Cloud Run metrics in the Google Cloud Console:
https://console.cloud.google.com/run/detail/${REGION}/mabq-backend/metrics

Health Checks

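By default, Cloud Run checks a new instance with a TCP startup probe, which only verifies that the container is listening on its port; HTTP startup and liveness probes must be configured explicitly. The backend’s security middleware lets requests to /health through without authentication so that such probes can reach it:
from fastapi import Request

@app.middleware("http")
async def strict_security_middleware(request: Request, call_next):
    # CORS preflight requests and public paths skip authentication
    if request.method == "OPTIONS" or request.url.path in ["/docs", "/openapi.json", "/health"]:
        return await call_next(request)
    # Authentication for all other requests follows here (omitted in this excerpt)
The /health endpoint bypasses authentication so that Cloud Run health checks can reach it.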
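The bypass condition in the middleware can be isolated as a pure function, which makes it easy to unit test without starting the application. The names `PUBLIC_PATHS` and `bypasses_auth` below are hypothetical; the paths are the ones listed in the excerpt.

```python
PUBLIC_PATHS = frozenset({"/docs", "/openapi.json", "/health"})


def bypasses_auth(method, path):
    """Return True when a request should skip authentication.

    Mirrors the middleware condition: CORS preflight (OPTIONS) requests and
    exact matches against the public paths are let through.
    """
    return method == "OPTIONS" or path in PUBLIC_PATHS


if __name__ == "__main__":
    for method, path in [("GET", "/health"), ("OPTIONS", "/query"),
                         ("POST", "/query"), ("GET", "/health/")]:
        print(method, path, bypasses_auth(method, path))
```

Because the match is exact, a probe configured with a trailing slash (/health/) would not be exempt and would go through authentication.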

Serverless Architecture Benefits

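1. Zero Scaling

When idle and --min-instances is 0, Cloud Run scales to zero instances, so no compute is billed. Setting --min-instances to 1 or more, as in the production configuration above, keeps instances running and billed while idle.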
2. Automatic Scaling

As requests increase, Cloud Run automatically provisions more instances to handle the load.
3. Built-in Load Balancing

Traffic is automatically distributed across available instances.
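4. Global CDN

Cloud Run does not include a CDN on its own, but static content can be served through Google’s global CDN (Cloud CDN) by placing the service behind an external Application Load Balancer.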

Troubleshooting

Cold Start Latency

If cold starts are an issue, set --min-instances 1 to keep at least one instance warm:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --min-instances 1

CORS Errors

Verify the FRONTEND_URL environment variable matches your frontend’s actual URL:
gcloud run services describe mabq-backend \
  --region ${REGION} \
  --format 'value(spec.template.spec.containers[0].env)'
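A common cause of CORS mismatches is a FRONTEND_URL that includes a trailing slash or path, because the browser's Origin header contains only scheme, host, and port. The sketch below reduces a URL to its origin so the two can be compared. The helper names are hypothetical.

```python
from urllib.parse import urlsplit


def to_origin(url):
    """Reduce a URL to scheme://host[:port], the form used in Origin headers."""
    parts = urlsplit(url.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}".lower()


def origin_matches(configured_url, origin_header):
    """Return True when the configured URL equals the Origin header exactly."""
    return configured_url == origin_header


if __name__ == "__main__":
    configured = "https://mabq-frontend-1093163678323.us-east4.run.app/"
    origin = "https://mabq-frontend-1093163678323.us-east4.run.app"
    print(origin_matches(configured, origin))             # trailing slash breaks the match
    print(origin_matches(to_origin(configured), origin))  # normalized value matches
```

If the first comparison fails and the second succeeds, set FRONTEND_URL to the normalized value.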

Authentication Issues

Check service account permissions and ensure the Azure AD configuration is correct:
gcloud run services describe mabq-backend \
  --region ${REGION} \
  --format 'value(spec.template.spec.serviceAccountName)'

Next Steps

Environment Variables

Configure all required environment variables

Security Configuration

Learn about Azure AD authentication
