Google Cloud Run provides a fully managed serverless platform for running containerized applications. MABQ Agent is optimized for Cloud Run deployment with automatic scaling, built-in load balancing, and pay-per-use pricing.
Why Cloud Run?
Auto-Scaling: Automatically scales from 0 to N instances based on traffic
Serverless: No infrastructure management required
Pay-Per-Use: Only pay for actual request processing time
Integrated Security: Built-in support for service accounts and IAM
Prerequisites
Before deploying to Cloud Run, ensure you have:
Google Cloud project with billing enabled
gcloud CLI installed and authenticated
Docker images built and pushed to Artifact Registry (the older Google Container Registry, GCR, is deprecated in favor of Artifact Registry)
Service account with appropriate BigQuery permissions
Backend Deployment
Service Configuration
The backend runs on port 8080 with proxy headers enabled to correctly handle Cloud Run’s reverse proxy:
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080", "--proxy-headers"]
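The --proxy-headers flag tells Uvicorn to read the X-Forwarded-Proto and X-Forwarded-For headers set by Cloud Run’s load balancer, so the application sees the original scheme and client address. Uvicorn only trusts these headers from addresses listed in --forwarded-allow-ips (127.0.0.1 by default), so that option may also need to be set for the flag to take effect.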
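Cloud Run terminates TLS before the request reaches the container, so the container itself receives plain HTTP. The sketch below is a simplified model of how a server can decide which scheme to report to the application. It is not Uvicorn's implementation; the function name `effective_scheme` and the peer address used in the example are hypothetical.

```python
def effective_scheme(peer_ip, headers, trusted_ips=("127.0.0.1",)):
    """Return the scheme the application should see for a request.

    peer_ip     -- address of the direct TCP peer (the proxy)
    headers     -- request headers with lower-case names
    trusted_ips -- peers allowed to set forwarded headers; "*" trusts all
    """
    trusted = "*" in trusted_ips or peer_ip in trusted_ips
    forwarded = headers.get("x-forwarded-proto", "").strip().lower()
    # Honor the header only when it comes from a trusted proxy.
    if trusted and forwarded in ("http", "https"):
        return forwarded
    # Otherwise fall back to the scheme of the actual connection.
    return "http"


request_headers = {"x-forwarded-proto": "https"}
print(effective_scheme("10.0.0.5", request_headers))          # http
print(effective_scheme("10.0.0.5", request_headers, ("*",)))  # https
```

In this model the forwarded scheme is ignored unless the proxy address is trusted, which is why redirect URLs can come out as http:// when the trust list does not cover the proxy.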
Deploy Backend
Build and Push Container
Build your Docker image and push to Artifact Registry:
# Set your project and region
export PROJECT_ID=your-project-id
export REGION=us-east4
# Configure Docker for Artifact Registry
gcloud auth configure-docker ${REGION}-docker.pkg.dev
# Build and push
cd POC_ADK
docker build -t ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest .
docker push ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest
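The image reference used in the build and push commands follows the pattern REGION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG. As a small sketch for scripting, the hypothetical helper `image_uri` below assembles that reference from its parts, assuming the repository is named mabq as in the commands above.

```python
def image_uri(region, project_id, repository, image, tag="latest"):
    """Build an Artifact Registry image reference from its parts."""
    parts = {
        "region": region,
        "project_id": project_id,
        "repository": repository,
        "image": image,
        "tag": tag,
    }
    for name, value in parts.items():
        # Stray spaces are an easy way to end up with an invalid reference.
        if not value or " " in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


print(image_uri("us-east4", "your-project-id", "mabq", "backend"))
# us-east4-docker.pkg.dev/your-project-id/mabq/backend:latest
```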
Deploy to Cloud Run
Deploy the backend service with required environment variables:
gcloud run deploy mabq-backend \
  --image ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/backend:latest \
  --region ${REGION} \
  --platform managed \
  --allow-unauthenticated \
  --port 8080 \
  --set-env-vars PROJECT_ID=${PROJECT_ID} \
  --set-env-vars BIGQUERY_DATASET=STG_ACTIVOS \
  --set-env-vars GOOGLE_CLOUD_LOCATION=${REGION} \
  --set-env-vars ANALYTICS_AGENT_MODEL=gemini-2.5-pro \
  --set-env-vars AZURE_TENANT_ID=your-tenant-id \
  --set-env-vars AZURE_CLIENT_ID=your-client-id \
  --set-env-vars FRONTEND_URL=https://your-frontend-url.run.app \
  --service-account=mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com
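When the deploy step is scripted, it can help to keep the environment variables in one mapping and generate the command from it. The sketch below only builds and prints the argument list; it does not run gcloud. The helper `deploy_command` is hypothetical, and it passes all variables in a single comma-separated --set-env-vars value, which is an alternative to repeating the flag.

```python
import shlex


def deploy_command(service, image, region, port, env, service_account=None,
                   allow_unauthenticated=False):
    """Return the argument list for `gcloud run deploy` (hypothetical helper)."""
    for key, value in env.items():
        # A comma would be read as a separator between KEY=VALUE pairs.
        if "," in value:
            raise ValueError(f"{key} contains a comma; set it separately")
    env_arg = ",".join(f"{key}={value}" for key, value in env.items())
    args = [
        "gcloud", "run", "deploy", service,
        "--image", image,
        "--region", region,
        "--platform", "managed",
        "--port", str(port),
        "--set-env-vars", env_arg,
    ]
    if allow_unauthenticated:
        args.append("--allow-unauthenticated")
    if service_account:
        args.append(f"--service-account={service_account}")
    return args


command = deploy_command(
    service="mabq-backend",
    image="us-east4-docker.pkg.dev/your-project-id/mabq/backend:latest",
    region="us-east4",
    port=8080,
    env={"PROJECT_ID": "your-project-id", "BIGQUERY_DATASET": "STG_ACTIVOS"},
    service_account="mabq-service-account@your-project-id.iam.gserviceaccount.com",
    allow_unauthenticated=True,
)
# Print a copy-pasteable command line; nothing is executed here.
print(shlex.join(command))
```

Returning a list rather than a string keeps each argument intact if the command is later passed to subprocess.run.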
Verify Deployment
Check the deployment status:
gcloud run services describe mabq-backend --region ${REGION}
Your backend will be available at a URL like: https://mabq-backend-1093163678323.us-east4.run.app
The backend is deployed with --allow-unauthenticated, so Cloud Run itself accepts requests from any caller; authentication is enforced inside the application through Azure AD. Consider adding Cloud Run’s built-in IAM authentication as an additional security layer.
Frontend Deployment
Deploy Frontend
Build and Push Container
cd frontend-agente
docker build -t ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest .
docker push ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest
Deploy to Cloud Run
Deploy the frontend with the backend API URL:
gcloud run deploy mabq-frontend \
  --image ${REGION}-docker.pkg.dev/${PROJECT_ID}/mabq/frontend:latest \
  --region ${REGION} \
  --platform managed \
  --allow-unauthenticated \
  --port 3000 \
  --set-env-vars NEXT_PUBLIC_API_URL=https://mabq-backend-1093163678323.us-east4.run.app
Update Backend CORS
After deploying the frontend, update the backend’s FRONTEND_URL to allow CORS:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --update-env-vars FRONTEND_URL=https://mabq-frontend-1093163678323.us-east4.run.app
CORS Configuration
The backend implements strict CORS based on the FRONTEND_URL environment variable:
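import os
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

# app is the application instance that uvicorn serves as main:app
app = FastAPI()
FRONTEND_URL = os.environ.get("FRONTEND_URL", "https://mabq-frontend-1093163678323.us-east4.run.app")

app.add_middleware(
    CORSMiddleware,
    allow_origins=[FRONTEND_URL],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)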
The CORS configuration only allows requests from the specified frontend URL. Update FRONTEND_URL after deploying the frontend to ensure proper communication.
CORS Flow
Frontend makes request from https://mabq-frontend-*.run.app
Backend validates the Origin header matches FRONTEND_URL
If matched, CORS headers are added to the response
Credentials (cookies, auth headers) are allowed through
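The flow above can be sketched as a single decision function. This is a simplified illustration of the matching logic, not the code of CORSMiddleware; the function name `cors_response_headers` is hypothetical.

```python
def cors_response_headers(origin, allowed_origins, allow_credentials=True):
    """Return the CORS headers to add for a request with the given Origin.

    An empty dict means no CORS headers are added, so the browser blocks
    the cross-origin response.
    """
    # Steps 1-2: the Origin header must exactly match an allowed origin.
    if origin is None or origin not in allowed_origins:
        return {}
    # Step 3: echo the matched origin back to the browser.
    headers = {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # Step 4: allow cookies and auth headers to be sent.
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


allowed = ["https://mabq-frontend-1093163678323.us-east4.run.app"]
print(cors_response_headers(allowed[0], allowed))
print(cors_response_headers("https://other-site.example", allowed))  # {}
```

The comparison is an exact string match, so a difference in scheme, host, or a trailing slash is enough for the origin to be rejected.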
Service Account Permissions
The backend service account needs the following IAM roles:
BigQuery Access
Vertex AI Access
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member=serviceAccount:mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com \
  --role=roles/bigquery.dataViewer
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
  --member=serviceAccount:mabq-service-account@${PROJECT_ID}.iam.gserviceaccount.com \
  --role=roles/bigquery.jobUser
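Each role needs its own binding command, so a list of roles maps naturally onto a loop. The sketch below generates the commands shown above without executing them. The helper `iam_binding_commands` is hypothetical, and the role list contains only the two BigQuery roles named in this section.

```python
def iam_binding_commands(project_id, service_account_name, roles):
    """Return one `gcloud projects add-iam-policy-binding` command per role."""
    member = (
        f"serviceAccount:{service_account_name}"
        f"@{project_id}.iam.gserviceaccount.com"
    )
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        for role in roles
    ]


BACKEND_ROLES = ["roles/bigquery.dataViewer", "roles/bigquery.jobUser"]

for command in iam_binding_commands(
    "your-project-id", "mabq-service-account", BACKEND_ROLES
):
    # Printed for review; nothing is executed here.
    print(" ".join(command))
```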
Production Configuration
Resource Limits
Configure CPU and memory based on your workload:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --cpu 2 \
  --memory 4Gi \
  --max-instances 10 \
  --min-instances 1
--cpu: Number of CPU cores (1, 2, 4, 6, 8)
--memory: Memory allocation (256Mi, 512Mi, 1Gi, 2Gi, 4Gi, 8Gi, 16Gi, 32Gi)
--max-instances: Maximum number of container instances
--min-instances: Minimum number of container instances (set to 1 or more to reduce cold starts)
Request Timeout
For longer-running queries, increase the request timeout:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --timeout 300
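The --timeout value is given in seconds. Cloud Run allows up to 3600 seconds (60 minutes) per request, and the default is 300 seconds (5 minutes), so choose a value above 300 to actually extend the limit.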
Concurrency
Adjust the number of concurrent requests per instance:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --concurrency 80
Higher concurrency can improve throughput but may increase latency. Test with your expected workload.
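Concurrency and the instance limit together bound how many requests the service can handle at once. With --max-instances 10 and --concurrency 80, that ceiling is 10 × 80 = 800 in-flight requests. The sketch below works through this arithmetic and estimates how many instances a given load needs. The traffic figures (200 requests per second, 2 seconds average latency) are hypothetical, and the estimate assumes requests spread evenly across instances.

```python
import math


def max_in_flight(max_instances, concurrency):
    """Upper bound on simultaneous requests across all instances."""
    return max_instances * concurrency


def instances_needed(requests_per_second, avg_latency_seconds, concurrency):
    """Estimate instances required to serve a steady load.

    In-flight requests are approximated as arrival rate times average
    latency (Little's law), then divided by per-instance concurrency.
    """
    in_flight = requests_per_second * avg_latency_seconds
    return max(1, math.ceil(in_flight / concurrency))


print(max_in_flight(max_instances=10, concurrency=80))  # 800
print(instances_needed(requests_per_second=200,
                       avg_latency_seconds=2.0,
                       concurrency=80))                 # 5
```

If the estimate exceeds --max-instances, requests will queue or be rejected under that load, which suggests raising the limit or the concurrency setting.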
Monitoring and Logs
View Logs
gcloud run services logs read mabq-backend \
  --region ${REGION} \
  --limit 50
Stream Logs
gcloud run services logs tail mabq-backend \
  --region ${REGION}
Metrics Dashboard
Access Cloud Run metrics in the Google Cloud Console:
https://console.cloud.google.com/run/detail/${REGION}/mabq-backend/metrics
Health Checks
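By default, Cloud Run checks that the container starts listening on its configured port, and HTTP startup and liveness probes can be configured to call an endpoint such as /health. The backend’s security middleware lets requests to /health, /docs, and /openapi.json, as well as CORS preflight (OPTIONS) requests, skip authentication:
from fastapi import Request

@app.middleware("http")
async def strict_security_middleware(request: Request, call_next):
    # Let CORS preflight requests and public paths through without authentication
    if request.method == "OPTIONS" or request.url.path in ["/docs", "/openapi.json", "/health"]:
        return await call_next(request)
    # Authentication checks for all other requests follow here (not shown)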
The /health endpoint bypasses authentication for Cloud Run health checks.
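Because /health needs no credentials, it is also a convenient target for a post-deploy smoke check. The sketch below assumes the endpoint answers with HTTP 200 when the service is up, which this page does not state. The function name `is_healthy` is hypothetical, and the `opener` parameter exists only so the check can be exercised without network access.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, opener=urllib.request.urlopen, timeout=10.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, TimeoutError):
        # Covers connection failures, timeouts, and 4xx/5xx responses
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False


# Example call against the backend URL shown earlier on this page:
# is_healthy("https://mabq-backend-1093163678323.us-east4.run.app")
```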
Serverless Architecture Benefits
Zero Scaling
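When idle, Cloud Run scales to zero instances, reducing compute costs to zero. This applies only when --min-instances is 0; setting it to 1 or more keeps instances running, and billed, while idle.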
Automatic Scaling
As requests increase, Cloud Run automatically provisions more instances to handle the load.
Built-in Load Balancing
Traffic is automatically distributed across available instances.
Global CDN
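Static content can be served through Cloud CDN for faster response times when the service is placed behind an external Application Load Balancer. Cloud Run does not enable a CDN by default.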
Troubleshooting
Cold Start Latency
If cold starts are an issue, set --min-instances 1 to keep at least one instance warm:
gcloud run services update mabq-backend \
  --region ${REGION} \
  --min-instances 1
CORS Errors
Verify the FRONTEND_URL environment variable matches your frontend’s actual URL:
gcloud run services describe mabq-backend \
  --region ${REGION} \
  --format 'value(spec.template.spec.containers[0].env)'
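A frequent cause of a mismatch is a FRONTEND_URL that is a valid URL but not a bare origin, for example one with a trailing slash or a path. Browsers send the Origin header as scheme, host, and optional port only, and the backend compares it to FRONTEND_URL as an exact string. The sketch below reduces a URL to the origin form a browser would send; the function names `to_origin` and `frontend_url_problem` are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def to_origin(url):
    """Reduce a URL to scheme://host[:port], as sent in an Origin header."""
    parts = urlsplit(url.strip())
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {url!r}")
    host = parts.hostname  # already lower-cased by urlsplit
    # Browsers omit the port when it is the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS[parts.scheme]:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://{host}"


def frontend_url_problem(configured):
    """Return a hint if the configured value is not a bare origin, else None."""
    expected = to_origin(configured)
    if configured != expected:
        return f"FRONTEND_URL should be {expected!r}, not {configured!r}"
    return None


print(frontend_url_problem("https://mabq-frontend-1093163678323.us-east4.run.app/"))
print(frontend_url_problem("https://mabq-frontend-1093163678323.us-east4.run.app"))  # None
```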
Authentication Issues
Check service account permissions and ensure the Azure AD configuration is correct:
gcloud run services describe mabq-backend \
  --region ${REGION} \
  --format 'value(spec.template.spec.serviceAccountName)'
Next Steps
Environment Variables: Configure all required environment variables
Security Configuration: Learn about Azure AD authentication