Kubernetes Resource Request Limit Rightsizer

Optimize Kubernetes CPU and memory requests/limits based on actual usage for cost savings. Enter values for instant results with step-by-step formulas.

December 2025

Worked Examples

Example 1: Web Application - Over-Provisioned

Problem: 3-replica web app. Current: 1000m CPU request, 2Gi memory request. Actual usage: 200m avg / 600m peak CPU, 400Mi avg / 800Mi peak memory.

Solution:

Analysis:
- CPU utilization: 200m / 1000m = 20% (wasteful)
- Memory utilization: 400Mi / 2048Mi = 20% (wasteful)

Recommended (1.3x buffer for web):
- CPU request: 200 × 1.3 = 260m
- CPU limit: 600 × 1.2 = 720m
- Memory request: 400 × 1.3 = 520Mi
- Memory limit: 800 × 1.2 = 960Mi

Resource reduction:
- CPU: 1000m → 260m (74% reduction)
- Memory: 2Gi → 520Mi (75% reduction)

Cost impact (assuming $30/core/mo, $5/GB/mo):
- Current: 3 × ($30 + $10) = $120/mo
- Optimized: 3 × ($7.80 + $2.54) = $31/mo
- Savings: $89/mo ($1,068/year)

Result: CPU: 260m/720m | Memory: 520Mi/960Mi | Savings: $89/month
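The arithmetic in Example 1 can be reproduced with a short Python sketch. The function name `rightsize` and its parameters are illustrative assumptions, not part of the calculator; the 1.3x request buffer and 1.2x limit buffer are the values used in the example.

```python
def rightsize(avg, peak, request_buffer, limit_buffer=1.2):
    """Return (request, limit) from average and peak usage.

    Hypothetical helper: inputs and outputs share one unit
    (millicores for CPU, Mi for memory).
    """
    request = round(avg * request_buffer)   # request tracks typical usage
    limit = round(peak * limit_buffer)      # limit tracks peak usage
    return request, limit


# Example 1 inputs: 200m avg / 600m peak CPU, 400Mi avg / 800Mi peak memory
cpu_request, cpu_limit = rightsize(avg=200, peak=600, request_buffer=1.3)
mem_request, mem_limit = rightsize(avg=400, peak=800, request_buffer=1.3)

print(f"CPU: {cpu_request}m/{cpu_limit}m | Memory: {mem_request}Mi/{mem_limit}Mi")
```

Rounding to whole millicores and Mi is a choice made here for readability; the page does not state how the calculator rounds.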

Example 2: ML Inference - Bursty Workload

Problem: ML model serving with high variance. 2 replicas. Current: 2000m CPU, 4Gi memory. Usage: 500m avg / 1800m peak CPU, 2Gi avg / 3.5Gi peak memory.

Solution:

Analysis (2.0x CPU buffer for ML, 1.3x memory buffer):
- CPU is bursty: 500m base, 1800m peak (3.6x variance)
- Memory spikes during batch inference

Recommended:
- CPU request: 500 × 2.0 = 1000m (handle variance)
- CPU limit: 1800 × 1.2 = 2160m (allow bursts)
- Memory request: 2048 × 1.3 = 2662Mi
- Memory limit: 3584 × 1.2 = 4301Mi

For ML workloads, slightly over-provision to avoid latency spikes.

Cost comparison (same rates as Example 1: $30/core/mo, $5/GB/mo):
- Current: 2 × ($60 + $20) = $160/mo
- Optimized: 2 × ($30 + $13) = $86/mo
- Savings: $74/mo while maintaining performance headroom

Result: CPU: 1000m/2160m | Memory: 2662Mi/4301Mi | Savings: $74/month
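The cost comparison in Example 2 can be checked with a small Python sketch. It assumes the rates from Example 1 ($30 per core and $5 per GB per month, with 1024Mi counted as 1 GB, as the examples do); `monthly_cost` is a hypothetical helper name.

```python
CPU_PRICE_PER_CORE = 30.0  # $/core/month (assumed rate from Example 1)
MEM_PRICE_PER_GB = 5.0     # $/GB/month (assumed rate; 1024Mi counted as 1 GB)


def monthly_cost(replicas, cpu_millicores, memory_mib):
    """Monthly cost of the requested resources across all replicas."""
    cpu_cost = cpu_millicores / 1000 * CPU_PRICE_PER_CORE
    mem_cost = memory_mib / 1024 * MEM_PRICE_PER_GB
    return replicas * (cpu_cost + mem_cost)


# Example 2: 2 replicas, 2000m / 4Gi today vs. 1000m / 2662Mi recommended
current = monthly_cost(replicas=2, cpu_millicores=2000, memory_mib=4096)
optimized = monthly_cost(replicas=2, cpu_millicores=1000, memory_mib=2662)
savings = current - optimized

print(f"Current ${current:.0f}/mo, optimized ${optimized:.0f}/mo, savings ${savings:.0f}/mo")
```

Only requests are priced here, matching the examples; limits do not enter the cost figure.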

Example 3: Database - Memory-Critical

Problem: PostgreSQL pod. 1 replica. Current: 500m CPU, 8Gi memory. Usage: 100m avg / 300m peak CPU, 6Gi avg / 7.2Gi peak memory.

Solution:

Analysis (1.4x CPU buffer for database, 1.1x memory buffer):
- CPU is over-provisioned (100m vs 500m)
- Memory is appropriately sized (high utilization is expected)

Recommended:
- CPU request: 100 × 1.4 = 140m
- CPU limit: 300 × 1.2 = 360m
- Memory request: 6144 × 1.1 = 6758Mi (databases benefit from stable memory)
- Memory limit: 7372.8 × 1.2 ≈ 8847Mi

CAUTION: For databases, memory OOM is catastrophic.
Keep memory limit generous. Memory savings are less important than stability.

Cost (same rates as Example 1: $30/core/mo, $5/GB/mo):
- Current: $15 + $40 = $55/mo
- Optimized: $4.20 + $33 = $37.20/mo
- Savings: $17.80/mo (primarily from CPU)

Result: CPU: 140m/360m | Memory: 6758Mi/8847Mi | Savings: $17.80/month (conservative)
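The analysis step in Example 3 compares average usage with the current request. A minimal Python sketch of that check follows; the 50% threshold and the names `utilization` and `classify` are assumptions made for illustration, since the page does not state a cutoff.

```python
OVERPROVISIONED_BELOW = 0.5  # hypothetical cutoff: average use under 50% of the request


def utilization(avg_usage, request):
    """Fraction of the request that is used on average."""
    return avg_usage / request


def classify(avg_usage, request, threshold=OVERPROVISIONED_BELOW):
    """Label a resource by comparing its utilization with the threshold."""
    if utilization(avg_usage, request) < threshold:
        return "over-provisioned"
    return "appropriately sized"


# Example 3: 100m of 500m CPU (20%), 6144Mi of 8192Mi memory (75%)
cpu_status = classify(100, 500)
mem_status = classify(6144, 8192)

print(f"CPU: {cpu_status}, memory: {mem_status}")
```

A single threshold is a simplification; for a database, high memory utilization is expected, so a different cutoff per resource or workload type may be more appropriate.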

Frequently Asked Questions

What's the difference between requests and limits in Kubernetes?

Requests are guaranteed resources the container needs to run—used by the scheduler to place pods on nodes. Limits are maximum resources a container can use—exceeding CPU causes throttling; exceeding memory causes OOMKill. Set requests based on typical usage; set limits based on peak usage plus buffer.
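To make the distinction concrete, the sketch below builds the `resources` section of a container spec as a Python dict, using the values from Example 1. The `requests` and `limits` keys and the `m` / `Mi` quantity suffixes are standard Kubernetes fields; the helper name `resources_stanza` is a hypothetical choice for this illustration.

```python
import json


def resources_stanza(cpu_request_m, cpu_limit_m, mem_request_mi, mem_limit_mi):
    """Build the `resources` section of a container spec as a plain dict."""
    return {
        # Used by the scheduler to place the pod on a node
        "requests": {"cpu": f"{cpu_request_m}m", "memory": f"{mem_request_mi}Mi"},
        # Enforced at runtime: CPU is throttled, memory overuse is OOMKilled
        "limits": {"cpu": f"{cpu_limit_m}m", "memory": f"{mem_limit_mi}Mi"},
    }


stanza = resources_stanza(260, 720, 520, 960)
print(json.dumps({"resources": stanza}, indent=2))
```

The printed JSON can be pasted under a container entry in a manifest, since Kubernetes accepts JSON as well as YAML.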

How do I determine the right CPU request?

Analyze actual CPU usage over 1-2 weeks using metrics (Prometheus, Datadog). Set the request to average usage × a buffer of 1.2-1.5. The buffer accounts for variance, so the container keeps enough guaranteed CPU when the node is busy; throttling itself is enforced by the CPU limit, not the request. For bursty workloads, use higher buffers (1.5-2x).

How do I determine the right memory request and limit?

Memory is less elastic than CPU: a container that exceeds its memory limit is OOMKilled rather than throttled. Set the request to average usage × 1.3-1.5 and the limit to at least peak usage × 1.2. For memory, it's better to over-provision slightly than to risk an OOMKill. Monitor OOMKilled events in your cluster.
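The two rules above (request from average usage, limit from peak usage) can be applied to exported metric samples. The sketch below assumes the samples have already been pulled from a metrics system into plain lists; the sample values and the function name `recommend_from_samples` are hypothetical.

```python
from statistics import mean


def recommend_from_samples(samples, request_buffer, limit_buffer=1.2):
    """Return (request, limit): average × request buffer, peak × limit buffer."""
    avg = mean(samples)
    peak = max(samples)
    return round(avg * request_buffer), round(peak * limit_buffer)


# Hypothetical usage samples collected over the observation window
cpu_samples_m = [150, 180, 220, 600, 200, 170, 190, 210, 160, 120]  # millicores
mem_samples_mi = [380, 400, 420, 800, 390, 410, 400, 400]           # Mi

cpu = recommend_from_samples(cpu_samples_m, request_buffer=1.3)
mem = recommend_from_samples(mem_samples_mi, request_buffer=1.3)

print(f"CPU request/limit: {cpu[0]}m/{cpu[1]}m")
print(f"Memory request/limit: {mem[0]}Mi/{mem[1]}Mi")
```

Using the raw maximum as the peak makes the limit sensitive to a single outlier; a high percentile may be a reasonable substitute, though the page itself speaks only of peak usage.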

How do cloud costs relate to Kubernetes resource settings?

Cloud providers charge for provisioned node capacity, not pod requests. However, requests determine how many pods fit per node. Over-provisioned requests mean fewer pods per node, requiring more nodes. Right-sizing requests directly reduces node count and costs.
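The link between requests and node count can be sketched in Python. The node size (4000m CPU, 16384Mi memory allocatable) and the pod count of 30 are hypothetical, and the model ignores system reservations, DaemonSets, and scheduling constraints; the per-pod requests are the before and after values from Example 1.

```python
import math


def pods_per_node(node_cpu_m, node_mem_mi, cpu_request_m, mem_request_mi):
    """Pods that fit on one node; the scarcer resource decides."""
    return min(node_cpu_m // cpu_request_m, node_mem_mi // mem_request_mi)


def nodes_needed(pod_count, node_cpu_m, node_mem_mi, cpu_request_m, mem_request_mi):
    """Nodes required to schedule pod_count identical pods."""
    per_node = pods_per_node(node_cpu_m, node_mem_mi, cpu_request_m, mem_request_mi)
    if per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(pod_count / per_node)


NODE_CPU_M = 4000     # hypothetical allocatable CPU per node (millicores)
NODE_MEM_MI = 16384   # hypothetical allocatable memory per node (Mi)
POD_COUNT = 30        # hypothetical number of identical pods

before = nodes_needed(POD_COUNT, NODE_CPU_M, NODE_MEM_MI, 1000, 2048)
after = nodes_needed(POD_COUNT, NODE_CPU_M, NODE_MEM_MI, 260, 520)

print(f"Nodes before: {before}, after: {after}")
```

Under these assumptions the same 30 pods drop from 8 nodes to 2, which is where the savings on the cloud bill would come from.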