Cold Starts Kill Webhooks: Scheduling Cloud Run Min-Instances

by Marc

We lost ~25 webhook events in 7 minutes because our Cloud Run service was scaling to zero. Here is how I used Cloud Scheduler to toggle min-instances during business hours — and the workaround for Cloud Scheduler not supporting PATCH requests.

The Problem: 504s During Cold Starts

We run a custom CRM-to-BigQuery sync on Cloud Run that receives webhook events. The service was deployed with the default min-instances=0 — meaning Cloud Run scales the service to zero containers when idle and spins up a new one when a request arrives.

This works fine for most HTTP services, but webhooks are unforgiving. When the CRM fires a webhook and the container is cold, Cloud Run needs to pull the image, start the container, and initialize the application. Our Python service with BigQuery client libraries takes about 8-12 seconds to cold start. The CRM’s webhook timeout is 10 seconds.

The result: during a burst of CRM activity after a quiet period (typically first thing Monday morning), the first webhook hits a cold container, times out with a 504, and the CRM marks the delivery as failed. The CRM retries with exponential backoff, but by then more webhooks are queuing up. We lost approximately 25 events in a 7-minute window on one Monday morning.

We caught this because our weekly full reconciliation job (which re-syncs all records from the CRM API) detected mismatches. But that job runs on Sundays, so those 25 records stayed stale for most of a week.

The Obvious Fix and Why It’s Not Enough

The obvious fix is min-instances=1 — always keep at least one container warm. But Cloud Run charges for idle containers at a reduced rate, and we have multiple Cloud Run services. Setting min-instances=1 on all of them 24/7 adds up:

# Cost estimate: min-instances=1, 24/7
# Cloud Run idle pricing (as of 2026):
# - vCPU: $0.00000250 / vCPU-second (idle)
# - Memory: $0.00000025 / GiB-second (idle)
#
# For a 1 vCPU, 512MB instance:
# - vCPU:  0.00000250 * 86400 * 30 = $6.48/month
# - Memory: 0.00000025 * 0.5 * 86400 * 30 = $0.32/month
# Total: ~$6.80/month per service
#
# With 3 services: ~$20/month — not huge, but unnecessary for nights/weekends
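The arithmetic above is easy to redo for other instance sizes with a small shell function. This is only a sketch: the function name idle_cost_per_month is made up for illustration, the rates are the idle prices quoted above, and a 30-day month is assumed.

```shell
# Idle cost in USD per month for one always-warm instance.
# Rates are the idle prices quoted in the estimate above; 30-day month assumed.
idle_cost_per_month() {
  vcpus="$1"     # number of vCPUs
  mem_gib="$2"   # memory in GiB
  awk -v c="$vcpus" -v m="$mem_gib" 'BEGIN {
    seconds   = 86400 * 30
    vcpu_rate = 0.00000250   # USD per vCPU-second (idle)
    mem_rate  = 0.00000025   # USD per GiB-second (idle)
    printf "%.2f\n", (c * vcpu_rate + m * mem_rate) * seconds
  }'
}

idle_cost_per_month 1 0.5   # 1 vCPU, 512 MiB: prints 6.80
```

Multiply the result by the number of services to get the fleet-wide figure.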

More importantly, webhooks from this CRM only arrive during business hours (weekdays, roughly 7am-6pm CET). There is no reason to keep a container warm at 3am on a Saturday.

The Solution: Scheduled Min-Instances Toggle

The plan: use Cloud Scheduler to call the Cloud Run Admin API twice a day on weekdays — once to set min-instances=1 in the morning, and once to set it back to 0 in the evening.

The Cloud Run Admin API Call

The Cloud Run Admin API endpoint to update a service’s scaling configuration is:

# updateMask limits the change to the scaling.minInstanceCount field
PATCH https://run.googleapis.com/v2/projects/{PROJECT}/locations/{REGION}/services/{SERVICE}?updateMask=scaling.minInstanceCount
Content-Type: application/json

{
  "scaling": {
    "minInstanceCount": 1
  }
}

Simple enough. But then I hit the first snag.

Cloud Scheduler Does Not Support PATCH

Cloud Scheduler’s HTTP target only supports GET, POST, PUT, DELETE, and HEAD. No PATCH. The Cloud Run Admin API requires PATCH for partial updates.

The workaround: use POST with the X-HTTP-Method-Override: PATCH header. The Cloud Run Admin API honours this header, as Google APIs generally do, and treats the POST as a PATCH.
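Before handing the request to Cloud Scheduler, it can be worth sending the same POST by hand with curl to confirm the override is accepted. The following is a minimal sketch, not the exact command used here: set_min_instances is a hypothetical helper name, and it assumes the account active in gcloud is allowed to update the service.

```shell
# Send a POST that asks the API to treat it as a PATCH.
# Arguments: project, region, service, desired min-instances.
# set_min_instances is a made-up helper name for this sketch.
set_min_instances() {
  project="$1"; region="$2"; service="$3"; count="$4"
  url="https://run.googleapis.com/v2/projects/${project}/locations/${region}/services/${service}?updateMask=scaling.minInstanceCount"
  # Assumes the active gcloud account may update the service.
  token="$(gcloud auth print-access-token)"
  curl --silent --show-error --request POST "$url" \
    --header "Authorization: Bearer ${token}" \
    --header "Content-Type: application/json" \
    --header "X-HTTP-Method-Override: PATCH" \
    --data "{\"scaling\":{\"minInstanceCount\":${count}}}"
}

# Example call (placeholder values):
# set_min_instances my-gcp-project europe-west3 crm-sync 1
```

If the response is a long-running operation object rather than an error, the same URL, headers, and body should work from the scheduler job.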

Creating the Scheduler Jobs

Here is the full setup for both the “wake” and “sleep” jobs:

# Variables
PROJECT_ID="my-gcp-project"
REGION="europe-west3"
SERVICE_NAME="crm-sync"
SA_EMAIL="SCHEDULER_SA_EMAIL"  # placeholder: the service account Cloud Scheduler authenticates as

SERVICE_URL="https://run.googleapis.com/v2/projects/${PROJECT_ID}/locations/${REGION}/services/${SERVICE_NAME}?updateMask=scaling.minInstanceCount"

# Wake job: Mon-Fri 7:00 CET, set min-instances=1
gcloud scheduler jobs create http "${SERVICE_NAME}-wake" \
  --project="${PROJECT_ID}" \
  --location="europe-west3" \
  --schedule="0 7 * * 1-5" \
  --time-zone="Europe/Amsterdam" \
  --uri="${SERVICE_URL}" \
  --http-method="POST" \
  --headers="Content-Type=application/json,X-HTTP-Method-Override=PATCH" \
  --message-body='{"scaling":{"minInstanceCount":1}}' \
  --oauth-service-account-email="${SA_EMAIL}" \
  --oauth-token-scope="https://www.googleapis.com/auth/cloud-platform"

# Sleep job: Mon-Fri 18:00 CET, set min-instances=0
gcloud scheduler jobs create http "${SERVICE_NAME}-sleep" \
  --project="${PROJECT_ID}" \
  --location="europe-west3" \
  --schedule="0 18 * * 1-5" \
  --time-zone="Europe/Amsterdam" \
  --uri="${SERVICE_URL}" \
  --http-method="POST" \
  --headers="Content-Type=application/json,X-HTTP-Method-Override=PATCH" \
  --message-body='{"scaling":{"minInstanceCount":0}}' \
  --oauth-service-account-email="${SA_EMAIL}" \
  --oauth-token-scope="https://www.googleapis.com/auth/cloud-platform"

IAM: The Service Account Needs run.services.update

The scheduler’s service account needs permission to update Cloud Run services. I granted the roles/run.admin role, though a custom role with just run.services.update and run.services.get would be more precise:

# Grant the scheduler SA permission to update Cloud Run services
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/run.admin"
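For the narrower alternative, a custom role could look roughly like this. Treat it as a sketch under assumptions: the role ID runScalingToggler and the helper name grant_scaling_role are invented for illustration, and whether these two permissions are sufficient on their own may depend on your setup, so run the wake job once before removing the broader role.

```shell
# Create a narrow custom role and bind it instead of roles/run.admin.
# "runScalingToggler" is a made-up role ID for this sketch.
# Arguments: project ID, scheduler service account email.
grant_scaling_role() {
  project="$1"; sa_email="$2"
  gcloud iam roles create runScalingToggler \
    --project="$project" \
    --title="Cloud Run scaling toggler" \
    --permissions="run.services.get,run.services.update" \
    --stage="GA" &&
  gcloud projects add-iam-policy-binding "$project" \
    --member="serviceAccount:${sa_email}" \
    --role="projects/${project}/roles/runScalingToggler"
}

# Example call:
# grant_scaling_role "${PROJECT_ID}" "${SA_EMAIL}"
```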

Testing

To test without waiting for the cron schedule:

# Manually trigger the wake job
gcloud scheduler jobs run "${SERVICE_NAME}-wake" \
  --project="${PROJECT_ID}" \
  --location="europe-west3"

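# Verify it worked
# (gcloud prints services in the v1 Knative-style shape; if this field path
# comes back empty, check the full describe output for the
# run.googleapis.com/minScale annotation instead)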
gcloud run services describe "${SERVICE_NAME}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --format="value(spec.template.spec.scaling.minInstanceCount)"
# Output: 1

# Trigger sleep
gcloud scheduler jobs run "${SERVICE_NAME}-sleep" \
  --project="${PROJECT_ID}" \
  --location="europe-west3"

# Verify
gcloud run services describe "${SERVICE_NAME}" \
  --project="${PROJECT_ID}" \
  --region="${REGION}" \
  --format="value(spec.template.spec.scaling.minInstanceCount)"
# Output: 0
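Another way to check the result is to read the service back from the same v2 API the scheduler job writes to, which shows the scaling block exactly as the PATCH set it. This is a sketch with a hypothetical helper name, get_service_scaling; it relies on the standard fields query parameter to trim the response and assumes the active gcloud account can read the service. A value of 0 may simply be omitted from the JSON.

```shell
# Fetch only the service-level scaling block from the Cloud Run Admin API v2.
# Arguments: project, region, service.
# get_service_scaling is a made-up helper name for this sketch.
get_service_scaling() {
  project="$1"; region="$2"; service="$3"
  curl --silent --show-error \
    --header "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://run.googleapis.com/v2/projects/${project}/locations/${region}/services/${service}?fields=scaling"
}

# Example call:
# get_service_scaling "${PROJECT_ID}" "${REGION}" "${SERVICE_NAME}"
```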

Cost Analysis

With the scheduled approach, we only pay for an idle container during business hours:

# Scheduled min-instances: weekdays 7am-6pm (11 hours, 5 days/week)
# Weekly idle hours: 11 * 5 = 55 hours
# Monthly idle hours: 55 * 4.33 = ~238 hours
# vs 24/7: 720 hours/month
#
# Savings: (720 - 238) / 720 = 67% reduction in idle costs
#
# For 1 vCPU, 512MB:
# - 24/7:      ~$6.80/month
# - Scheduled: ~$2.25/month
# - Savings:   ~$4.55/month per service
#
# Cloud Scheduler cost: $0.10/month per job * 2 jobs = $0.20/month
# Net savings: ~$4.35/month per service
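To rerun these numbers for a different warm window, the calculation fits in a few lines of shell. The function name scheduled_savings is hypothetical, and it reuses the same simplifications as the estimate above (4.33 weeks per month, 720 hours per month).

```shell
# Warm hours per month for a weekly window, plus cost and savings versus 24/7.
# Arguments: hours per day, days per week, monthly cost at 24/7 (USD).
# Uses 4.33 weeks/month and 720 hours/month, as in the estimate above.
scheduled_savings() {
  hours_per_day="$1"; days_per_week="$2"; full_cost="$3"
  awk -v h="$hours_per_day" -v d="$days_per_week" -v full="$full_cost" 'BEGIN {
    warm_hours = h * d * 4.33
    fraction   = warm_hours / 720
    printf "warm_hours=%.0f scheduled_cost=%.2f savings_pct=%.0f\n",
           warm_hours, full * fraction, (1 - fraction) * 100
  }'
}

scheduled_savings 11 5 6.80
# prints: warm_hours=238 scheduled_cost=2.25 savings_pct=67
```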

The savings per service are modest, but we applied this pattern to three Cloud Run services that receive webhooks. More importantly, the savings scale linearly as we add services or move to larger containers.

Edge Cases and Gotchas

What about webhooks outside business hours?

For this particular CRM, webhook events outside business hours are extremely rare (the CRM is used by office workers in a single timezone). For services that receive webhooks 24/7, we keep min-instances=1 permanently and skip the scheduler pattern.

What if the scheduler job fails?

If the “wake” job fails, the service stays at min-instances=0 and we risk cold-start 504s. Cloud Scheduler has built-in retry logic (configurable), and I also set up a simple monitoring alert on the scheduler job’s error rate. In practice, over three months of operation, neither job has ever failed.
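Retries are set per job. As a rough sketch, the wake job could be given a few attempts with backoff like this; the retry values are illustrative rather than the ones used in this setup, and configure_retries is a hypothetical helper name.

```shell
# Give a scheduler job a few retries with exponential backoff.
# Arguments: job name, project, scheduler location.
# The retry values below are illustrative assumptions.
configure_retries() {
  job="$1"; project="$2"; location="$3"
  gcloud scheduler jobs update http "$job" \
    --project="$project" \
    --location="$location" \
    --max-retry-attempts=3 \
    --min-backoff="30s" \
    --max-backoff="300s"
}

# Example call:
# configure_retries "${SERVICE_NAME}-wake" "${PROJECT_ID}" "europe-west3"
```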

Deployments reset min-instances

This is the most important gotcha: when you deploy a new revision of a Cloud Run service with gcloud run deploy, the deployment sets min-instances based on your deploy command or service YAML — which is usually 0. This overwrites whatever the scheduler set.

I added a post-deploy step to our deployment script that triggers the wake job if it is currently within business hours:

#!/bin/bash
# deploy.sh (excerpt)

# ... build and deploy ...
gcloud run deploy "${SERVICE_NAME}" \
  --image="${IMAGE}" \
  --region="${REGION}" \
  --project="${PROJECT_ID}"

# Re-apply min-instances if within business hours (Mon-Fri 7-18 CET)
# Read hour and weekday in the same timezone so they agree around midnight
HOUR=$(TZ="Europe/Amsterdam" date +%H)
DOW=$(TZ="Europe/Amsterdam" date +%u)  # 1=Monday, 7=Sunday
if [ "$DOW" -le 5 ] && [ "$HOUR" -ge 7 ] && [ "$HOUR" -lt 18 ]; then
  echo "Within business hours — re-applying min-instances=1"
  gcloud scheduler jobs run "${SERVICE_NAME}-wake" \
    --project="${PROJECT_ID}" \
    --location="europe-west3"
fi
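Pulling the time check into its own function makes the boundaries (07:00 inclusive, 18:00 exclusive, weekends off) easy to test without waiting for the clock. A minimal sketch, with in_business_hours as a hypothetical name:

```shell
# Succeeds (exit 0) only for Mon-Fri, 07:00 up to but not including 18:00.
# dow: 1=Monday ... 7=Sunday. hour: 0-23, zero-padded values like "08" are fine.
in_business_hours() {
  dow="$1"
  hour="${2#0}"        # drop one leading zero: "08" -> "8", "0" -> ""
  hour="${hour:-0}"    # an empty result means midnight
  [ "$dow" -le 5 ] && [ "$hour" -ge 7 ] && [ "$hour" -lt 18 ]
}

# Example: evaluate the current time in the CRM's timezone.
if in_business_hours "$(TZ="Europe/Amsterdam" date +%u)" "$(TZ="Europe/Amsterdam" date +%H)"; then
  echo "within business hours"
else
  echo "outside business hours"
fi
```

The deploy script can then call the function in its if statement instead of repeating the three comparisons.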

Results

Since deploying the scheduled min-instances pattern three months ago:

  • Zero webhook 504s during business hours — the warm container responds in ~200ms
  • Zero lost webhook events — down from ~25 per incident
  • 67% reduction in idle Cloud Run costs compared to 24/7 min-instances
  • Weekly reconciliation diffs dropped to zero — confirming no events are being missed

The X-HTTP-Method-Override trick for Cloud Scheduler is not well documented, and I spent a frustrating hour trying to figure out why my PATCH requests were being rejected before finding it. Hopefully this saves someone else that hour.
