
SLA Error Budget Burn Rate Calculator

Calculate SLA error budgets, track burn rate, and project exhaustion timeline for reliability management.

Worked Examples

Example 1: SaaS Platform Monthly Budget

Problem: SaaS with 99.95% SLO, 50M requests/month. Day 15: 15,000 failed requests. Assess burn rate and project exhaustion.

Solution:

Error Budget Calculation:
- SLO: 99.95% → Error budget: 0.05%
- Total budget: 50M × 0.0005 = 25,000 errors
- Used: 15,000 errors (60%)
- Remaining: 10,000 errors (40%)

Burn Rate Analysis:
- Ideal burn at day 15: 50%
- Actual burn: 60%
- Burn rate ratio: 60%/50% = 1.2x

Projection:
- Daily burn: 15,000/15 = 1,000 errors/day
- Days until exhaustion: 10,000/1,000 = 10 days
- Projected exhaustion: Day 25 (5 days early)

Reliability: measured against the full month's 50M requests, errors so far amount to 0.03% (99.97%), still inside the 0.05% budget. Measured against the roughly 25M requests served by day 15 (assuming even traffic), reliability to date is 99.94%, already below the 99.95% target.

Action: Burn rate 1.2x is concerning but not critical. Monitor closely; implement quick reliability wins.

Result: 60% used | 1.2x burn rate | Exhausts day 25 | Status: CAUTION
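
The arithmetic in Example 1 can be sketched in Python. This is a minimal illustration, not the calculator's actual implementation: `budget_status` and the `BudgetStatus` fields are hypothetical names, and the projection assumes errors keep arriving at the average daily rate observed so far.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    total_budget: float    # errors allowed over the whole period
    used_fraction: float   # share of the budget already consumed
    burn_rate: float       # used_fraction divided by elapsed share of the period
    exhaustion_day: float  # projected day on which the budget reaches zero


def budget_status(slo, period_requests, failed, day, period_days=30):
    """Summarize budget status, assuming a constant average daily error rate."""
    total_budget = period_requests * (1 - slo)
    used_fraction = failed / total_budget
    burn_rate = used_fraction / (day / period_days)
    daily_errors = failed / day  # assumes failed > 0
    remaining = total_budget - failed
    exhaustion_day = day + remaining / daily_errors
    return BudgetStatus(total_budget, used_fraction, burn_rate, exhaustion_day)


# Inputs from Example 1: 99.95% SLO, 50M requests/month, 15,000 failures by day 15.
status = budget_status(slo=0.9995, period_requests=50_000_000, failed=15_000, day=15)
print(f"budget={status.total_budget:,.0f} used={status.used_fraction:.0%} "
      f"burn={status.burn_rate:.1f}x exhausts=day {status.exhaustion_day:.0f}")
```

With the Example 1 inputs this reproduces the 25,000-error budget, 60% used, 1.2x burn rate, and day-25 exhaustion. The sketch does not guard against `failed=0`, where no exhaustion date exists.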

Example 2: Post-Incident Budget Impact

Problem: API service with 99.9% SLO. Day 20: Major outage (2 hours, 100% error rate during incident). 10M requests/day average. Previous: 5,000 errors in 20 days.

Solution:

Pre-Incident State:
- Monthly budget: 300M × 0.001 = 300,000 errors
- Used before incident: 5,000 (1.7%)
- Burn rate: 1.7% used / 66.7% of period elapsed = 0.025x (excellent)

Incident Impact:
- 2 hours = 833,333 requests (10M/24 × 2)
- 100% error = 833,333 errors
- Single incident consumed 278% of total budget!

Post-Incident State:
- Total errors: 5,000 + 833,333 = 838,333
- Budget consumed: 279%
- SLO breached: at best 99.72% for the month (838,333 errors out of 300M requests) vs 99.9% target

Recovery Timeline:
- Rolling 30-day window: the budget stays exhausted until the incident ages out of the window, so 30+ days of near-zero errors are needed
- Fixed period: the budget resets at the period boundary

Action Required:
- Declare SLO breach
- Implement error budget policy consequences
- RCA and prevention measures mandatory

Result: Single incident: 278% budget | SLO BREACHED | Mandatory reliability focus
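
The incident arithmetic in Example 2 can be checked with a short Python sketch. The function name `incident_budget_impact` is hypothetical, and the calculation assumes traffic is spread evenly across the day, as the example does.

```python
def incident_budget_impact(slo, daily_requests, incident_hours, error_rate, period_days=30):
    """Return (incident_errors, share_of_period_budget), assuming uniform traffic."""
    period_budget = daily_requests * period_days * (1 - slo)
    incident_errors = daily_requests / 24 * incident_hours * error_rate
    return incident_errors, incident_errors / period_budget


# Inputs from Example 2: 99.9% SLO, 10M requests/day, 2-hour outage at 100% errors.
errors, share = incident_budget_impact(slo=0.999, daily_requests=10_000_000,
                                       incident_hours=2, error_rate=1.0)
print(f"{errors:,.0f} errors = {share:.0%} of the monthly budget")
```

Because the budget share scales linearly with both duration and error rate, the same function can be used to compare a short total outage against a longer partial degradation.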

Example 3: Multi-Tier SLO Tracking

Problem: E-commerce: web (99.9%), API (99.95%), payments (99.99%). Track each tier's budget status mid-month.

Solution:

Tier Analysis (Day 15 of 30):

Web (99.9% SLO):
- Budget: 0.1% × 20M = 20,000 errors
- Used: 8,000 (40%)
- Burn rate: 0.8x (healthy)
- Status: ✅ GREEN

API (99.95% SLO):
- Budget: 0.05% × 100M = 50,000 errors
- Used: 35,000 (70%)
- Burn rate: 1.4x (concerning)
- Status: ⚠️ YELLOW

Payments (99.99% SLO):
- Budget: 0.01% × 5M = 500 errors
- Used: 450 (90%)
- Burn rate: 1.8x (critical)
- Status: 🔴 RED

Prioritization:
1. Payments: Only 50 errors remaining; freeze changes
2. API: Investigate elevated error rate
3. Web: Continue normal operation

Payments requires immediate attention: the 50 remaining errors spread over the last 15 days allow about 3.3 errors/day, against a historical rate of 30/day (450 errors in 15 days).

Result: Web: GREEN | API: YELLOW | Payments: RED (50 errors left)
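
The per-tier triage in Example 3 can be sketched as a small classifier. The function `tier_status` and the cut-offs `yellow_at=1.0` and `red_at=1.5` are illustrative assumptions chosen so that the example's three tiers land on the statuses shown above; the page does not state the thresholds the calculator actually uses.

```python
def tier_status(slo, period_requests, failed, day, period_days=30,
                yellow_at=1.0, red_at=1.5):
    """Classify a tier by burn rate. The yellow_at/red_at cut-offs are assumptions."""
    budget = period_requests * (1 - slo)
    burn_rate = (failed / budget) / (day / period_days)
    if burn_rate >= red_at:
        color = "RED"
    elif burn_rate > yellow_at:
        color = "YELLOW"
    else:
        color = "GREEN"
    return round(budget - failed), round(burn_rate, 2), color


# (SLO, monthly requests, failed requests so far) for each tier in Example 3.
tiers = {
    "web": (0.999, 20_000_000, 8_000),
    "api": (0.9995, 100_000_000, 35_000),
    "payments": (0.9999, 5_000_000, 450),
}
for name, (slo, requests, failed) in tiers.items():
    remaining, burn, color = tier_status(slo, requests, failed, day=15)
    print(f"{name}: {remaining} errors left, {burn}x burn, {color}")
```

Sorting tiers by burn rate, highest first, gives the same prioritization order as the example: payments, then API, then web.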

Frequently Asked Questions

How do I calculate error budget burn rate?

Burn rate = (Error budget consumed / Total error budget) / (Time elapsed / Period). A burn rate of 1.0 means you're on pace to use exactly the full budget by the end of the period. Above 1.0 means you're trending toward exhaustion before period end. Below 1.0 means you have slack.
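
Written out symbolically, the burn-rate formula also yields the projected exhaustion time. The symbols here are introduced for illustration: $E_{\text{used}}$ and $E_{\text{total}}$ are the consumed and total error budget, $t$ is the time elapsed, $T$ is the period length, and $B$ is the burn rate. The projection assumes the budget continues to be consumed at the average rate observed so far.

```latex
\[
B = \frac{E_{\text{used}} / E_{\text{total}}}{t / T},
\qquad
t_{\text{exhaust}} = \frac{t}{E_{\text{used}} / E_{\text{total}}} = \frac{T}{B}
\]
```

For Example 1, $B = 0.60 / 0.50 = 1.2$ and $T = 30$ days, so $t_{\text{exhaust}} = 30 / 1.2 = 25$ days, matching the day-25 projection.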

What happens when error budget is exhausted?

When error budget exhausts, you've breached your SLO commitment. Teams should: (1) Freeze feature deployments, (2) Prioritize reliability work, (3) Investigate root causes, (4) Implement safeguards. Some orgs have formal error budget policies requiring these actions.

What is the relationship between SLO and SLA?

SLO (Service Level Objective) is an internal target (99.9% uptime). SLA (Service Level Agreement) is an external commitment with consequences (refunds, credits). SLOs should be stricter than SLAs; e.g., target 99.95% internally when SLA promises 99.9%. Error budgets derive from SLOs.
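
To make the gap between the two targets concrete, the sketch below converts each into minutes of allowed downtime per 30-day period. It assumes a time-based availability measure (rather than request-based), and `allowed_downtime_minutes` is a hypothetical helper name.

```python
def allowed_downtime_minutes(target, period_days=30):
    """Minutes of full downtime a time-based availability target permits per period."""
    return (1 - target) * period_days * 24 * 60


sla_minutes = allowed_downtime_minutes(0.999)   # external commitment
slo_minutes = allowed_downtime_minutes(0.9995)  # stricter internal target
print(f"SLA allows {sla_minutes:.1f} min, SLO allows {slo_minutes:.1f} min, "
      f"headroom {sla_minutes - slo_minutes:.1f} min")
```

The headroom is the margin available after the internal budget is spent and before the external commitment is broken.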

How long should an error budget period be?

Typically 30 days (monthly) or 90 days (quarterly). Shorter periods (weekly) create noise and stress. Longer periods delay feedback. Monthly aligns with most business cycles. Some use rolling windows instead of fixed periods to avoid edge effects.
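
A rolling window can be sketched with a fixed-length queue of daily counts. This is one possible approach under simplifying assumptions (one record per day, request-based budget); the `RollingBudget` class is hypothetical, and the traffic figures are illustrative.

```python
from collections import deque


class RollingBudget:
    """Track error-budget use over the most recent `window_days` daily records."""

    def __init__(self, slo, window_days=30):
        self.slo = slo
        self.days = deque(maxlen=window_days)  # oldest day drops out automatically

    def record_day(self, requests, failed):
        self.days.append((requests, failed))

    def used_fraction(self):
        requests = sum(r for r, _ in self.days)
        failed = sum(f for _, f in self.days)
        budget = requests * (1 - self.slo)
        return failed / budget if budget else 0.0


rolling = RollingBudget(slo=0.999, window_days=30)
rolling.record_day(requests=10_000_000, failed=838_333)  # day with a major incident
for _ in range(29):
    rolling.record_day(requests=10_000_000, failed=0)
in_window = rolling.used_fraction()    # incident still inside the 30-day window
rolling.record_day(requests=10_000_000, failed=0)
aged_out = rolling.used_fraction()     # incident day has dropped out
print(f"with incident: {in_window:.0%}, after it ages out: {aged_out:.0%}")
```

Unlike a fixed period, the budget here never resets at a calendar boundary: an incident keeps counting until its day leaves the window.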

What is a fast-burn vs slow-burn alert?

Fast-burn alerts trigger when budget consumption rate threatens exhaustion within hours (e.g., 2% budget burned in 1 hour). Slow-burn alerts trigger when trending toward exhaustion by period end. Fast-burn pages engineers; slow-burn creates tickets. Both are essential for proactive management.
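
The "2% of budget in 1 hour" condition can be converted into a burn-rate threshold. The sketch below assumes a 30-day budget period; the function names are hypothetical.

```python
def burn_rate_for(budget_fraction, window_hours, period_days=30):
    """Burn rate that consumes `budget_fraction` of the period budget in `window_hours`."""
    return budget_fraction * (period_days * 24) / window_hours


def hours_to_exhaustion(burn_rate, period_days=30):
    """Hours until a full budget is gone if the given burn rate is sustained."""
    return period_days * 24 / burn_rate


fast = burn_rate_for(budget_fraction=0.02, window_hours=1)
print(f"fast-burn threshold: {fast:.1f}x, "
      f"full budget gone in {hours_to_exhaustion(fast):.0f} h if sustained")
```

By the same formula, a slow-burn alert corresponds to a burn rate only slightly above 1.0 sustained over a window of days rather than hours.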

Should error budget include planned maintenance?

Philosophically, users experience downtime regardless of cause. Practically, many organizations exclude planned maintenance from error budget to enable necessary work. Document your policy clearly. Consider: if maintenance hurts users, maybe it should count.

Background & Theory

The following business-finance formulas provide general background. Note that the financial burn rate described here (net cash spent per month) is a different measure from the error-budget burn rate used in the worked examples above.

Break-even analysis identifies the sales volume at which total revenue equals total costs, producing neither profit nor loss. The formula divides total fixed costs by the contribution margin per unit, where contribution margin equals selling price minus variable cost per unit. If a software product has $50,000 in monthly fixed costs and each licence generates $20 above its variable cost, break-even requires 2,500 unit sales per month. Above that threshold, each additional unit contributes directly to profit.

Gross margin expresses the percentage of revenue remaining after direct cost of goods sold: gross margin equals revenue minus COGS, divided by revenue. A SaaS company with 80 percent gross margins retains $0.80 of every revenue dollar to cover operating expenses, while a manufacturer with 30 percent gross margins faces much tighter operating leverage.

Customer acquisition cost (CAC) divides total sales and marketing expenditure in a period by the number of new customers acquired in that same period. Customer lifetime value (LTV) estimates the total profit attributable to a customer relationship. The standard formula multiplies average revenue per user (ARPU) by gross margin and divides by the monthly churn rate. A business with $50 ARPU, 75 percent gross margin, and 2 percent monthly churn has an LTV of $1,875. The LTV:CAC ratio benchmarks unit economics health; a ratio above 3:1 is generally considered sustainable, while ratios below 1:1 indicate the business is acquiring customers at a loss.

Burn rate measures monthly cash expenditure net of revenue. Cash runway equals current cash reserves divided by net monthly burn. A company with $1.2 million in the bank burning $100,000 per month has twelve months of runway.

The Rule of 40 is a benchmark for SaaS health: the sum of annual revenue growth rate (as a percentage) and profit margin (as a percentage) should equal or exceed 40. High-growth companies burning cash can still pass this rule if their growth rate compensates.
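
The formulas above can be checked against the figures quoted in the text with a few one-line functions. The function names are hypothetical, and the Rule of 40 inputs (60% growth, -15% margin) are made-up values for illustration only.

```python
def break_even_units(fixed_costs, contribution_margin_per_unit):
    """Units needed for total revenue to equal total costs."""
    return fixed_costs / contribution_margin_per_unit


def lifetime_value(arpu, gross_margin, monthly_churn):
    """LTV as ARPU times gross margin, divided by monthly churn."""
    return arpu * gross_margin / monthly_churn


def runway_months(cash, net_monthly_burn):
    """Months of cash left at the current net burn."""
    return cash / net_monthly_burn


def passes_rule_of_40(growth_pct, margin_pct):
    """True when growth rate plus profit margin is at least 40."""
    return growth_pct + margin_pct >= 40


print(break_even_units(50_000, 20))         # figures from the break-even example
print(round(lifetime_value(50, 0.75, 0.02)))  # figures from the LTV example
print(runway_months(1_200_000, 100_000))    # figures from the runway example
print(passes_rule_of_40(60, -15))           # hypothetical high-growth, cash-burning company
```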

History

The history behind the SLA Error Budget Burn Rate Calculator traces back through the following developments. Early economic thought centred on mercantilism, the 16th and 17th century doctrine that national wealth derived from accumulating precious metals through export surpluses and colonial extraction. Adam Smith's "Wealth of Nations" in 1776 dismantled this framework, arguing that genuine prosperity arose from specialisation, division of labour, and freely operating markets. David Ricardo extended Smith's work with the theory of comparative advantage in 1817, demonstrating mathematically that mutually beneficial trade was possible even when one country was less productive in every industry. Alfred Marshall's "Principles of Economics" published in 1890 provided the modern framework of supply and demand curves, consumer surplus, price elasticity, and marginal analysis, establishing neoclassical economics as the dominant academic paradigm for decades. The Great Depression exposed the limits of laissez-faire assumptions, and John Maynard Keynes's "General Theory of Employment, Interest and Money" in 1936 argued that private-sector aggregate demand failures required countercyclical government fiscal intervention to restore full employment, shifting the policy consensus toward active macroeconomic management. The post-World War II decades constructed mixed-economy models combining market allocation with expanded welfare states and Keynesian demand management. Milton Friedman and the Chicago School challenged this consensus from the 1960s onward, championing monetarism and arguing that stable money supply growth was superior to discretionary fiscal policy. Their influence shaped the deregulatory and privatisation policies of the Reagan and Thatcher eras in the 1980s. 
Behavioural economics emerged through the work of Daniel Kahneman and Amos Tversky in the 1970s and Richard Thaler in the 1980s, using psychology to demonstrate that real human decision-making deviates systematically from rational-actor models through heuristics and biases. The rise of the internet and mobile platforms in the 2000s and 2010s created a new category of platform economics, where network effects, near-zero marginal cost of digital goods, and two-sided market dynamics generated winner-take-most competitive outcomes requiring new analytical frameworks for business valuation.

References