Test Flakiness Retry Budget
Plan a test retry budget based on flakiness and CI costs. Enter values for instant results with step-by-step formulas.
Worked Examples
Example 1: Small Test Suite Retry Planning
Problem: A team has 100 tests, of which 5% are flaky with a 10% failure rate. CI runs 30 times/day at $0.01/min, and the average test takes 20 seconds.
Solution:
Configuration:
Total tests: 100
Flaky tests: 5 (5%)
Flake rate: 10%
Max retries: 2

Probability analysis (per flaky test):
Pass on first try: 90%
Pass with 1 retry: 1 - 0.10² = 99%
Pass with 2 retries: 1 - 0.10³ = 99.9%

Expected retries per flaky test:
0.10 + 0.01 = 0.11 retries
Total expected retries: 5 × 0.11 = 0.55

Time impact:
Base run: 100 × 20s = 2000s = 33.3 min
Retry time: 0.55 × 20s = 11s = 0.18 min
Overhead: 0.5%

Cost:
Base: 33.3 × $0.01 × 30 = $10/day
Retries: 0.18 × $0.01 × 30 = ~$0.05/day
Monthly (22 working days): ~$222
Result: 2 retries optimal | 0.5% overhead | $222/month | 99.9% confidence
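The probability and expected-retry steps in Example 1 can be reproduced with a short Python sketch. It assumes each attempt of a flaky test fails independently with the same flake rate; the function names are illustrative and not part of the calculator.

```python
def pass_probability(flake_rate: float, max_retries: int) -> float:
    """Chance a flaky test passes within 1 + max_retries attempts."""
    return 1.0 - flake_rate ** (max_retries + 1)


def expected_retries(flake_rate: float, max_retries: int) -> float:
    """Expected retries per flaky test.

    Retry k only runs if the first k attempts all failed,
    which happens with probability flake_rate ** k.
    """
    return sum(flake_rate ** k for k in range(1, max_retries + 1))


# Example 1 inputs: 5 flaky tests, 10% flake rate, 2 retries allowed
flaky_tests = 5
per_test = expected_retries(0.10, 2)
total = flaky_tests * per_test

print(f"pass probability: {pass_probability(0.10, 2):.3f}")
print(f"expected retries per flaky test: {per_test:.2f}, total: {total:.2f}")
```

The independence assumption is a simplification: a test that flakes because of a slow shared resource may fail again on an immediate retry more often than this model predicts.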
Example 2: High-Flakiness CI Pipeline
Problem: An e-commerce platform has 2000 tests, of which 12% are flaky with a 25% average failure rate. CI runs 100 times/day at $0.005/min, and the average test takes 45 seconds.
Solution:
Configuration:
Total tests: 2000
Flaky tests: 240 (12%)
Flake rate: 25%
Max retries: 3

Probability (per flaky test):
Pass on first try: 75%
Pass with 3 retries: 1 - 0.25⁴ = 99.6%

Expected retries per flaky test:
0.25 + 0.0625 + 0.0156 = 0.33
Total: 240 × 0.33 = 79 retries

Time impact:
Base: 2000 × 45s = 90,000s = 1500 min
Retries: 79 × 45s = 3555s = 59 min
Overhead: 4%

Cost:
Base: 1500 × $0.005 × 100 = $750/day
Retries: 59 × $0.005 × 100 = $30/day
Monthly (22 working days): $780 × 22 = $17,160

⚠️ 12% flakiness is high!
Recommendation: Fix the top 50 flakiest tests
Result: ⚠️ High flakiness | 4% overhead | $17K/month | Prioritize fixing tests
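The time and cost steps in Example 2 follow the same pattern each time, so they can be wrapped in one function. The sketch below is a hedged illustration: `retry_cost` and its parameter names are hypothetical, attempts are assumed independent, and the default of 22 billing days per month is an assumption inferred from the monthly totals in the examples.

```python
def retry_cost(total_tests, flaky_tests, flake_rate, max_retries,
               avg_test_seconds, cost_per_minute, runs_per_day,
               days_per_month=22):  # assumption: 22 working days per month
    """Estimate retry overhead and CI cost for one retry budget."""
    # Expected retries per run across all flaky tests
    retries_per_run = flaky_tests * sum(
        flake_rate ** k for k in range(1, max_retries + 1)
    )
    base_minutes = total_tests * avg_test_seconds / 60
    retry_minutes = retries_per_run * avg_test_seconds / 60

    base_daily = base_minutes * cost_per_minute * runs_per_day
    retry_daily = retry_minutes * cost_per_minute * runs_per_day
    return {
        "overhead": retry_minutes / base_minutes,
        "base_daily": base_daily,
        "retry_daily": retry_daily,
        "monthly": (base_daily + retry_daily) * days_per_month,
    }


# Example 2 inputs
result = retry_cost(total_tests=2000, flaky_tests=240, flake_rate=0.25,
                    max_retries=3, avg_test_seconds=45,
                    cost_per_minute=0.005, runs_per_day=100)
for name, value in result.items():
    print(f"{name}: {value:,.4f}")
```

Because the function does not round intermediate values, its monthly figure comes out slightly below the $17,160 in the worked solution, which rounds the retry count and retry minutes along the way.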
Example 3: Optimizing Retry Strategy
Problem: Compare retry strategies of 0, 1, 2, and 3 retries for 500 tests, of which 8% are flaky with a 15% failure rate.
Solution:
Scenario analysis at 50 runs/day:

0 retries:
- Suite pass rate: 52% (many false failures)
- Cost: $550/month
- Developer time wasted: High

1 retry:
- Suite pass rate: 93%
- Overhead: 1.2%
- Cost: $556/month
- Improvement: 41 percentage points over no retries

2 retries:
- Suite pass rate: 98.9%
- Overhead: 1.4%
- Cost: $558/month
- Improvement: 47 percentage points over no retries

3 retries:
- Suite pass rate: 99.8%
- Overhead: 1.5%
- Cost: $559/month
- Marginal improvement: 0.9 percentage points over 2 retries

Optimal: 2 retries
Reason: 98.9% confidence at minimal extra cost; a 3rd retry adds little value for its cost.
Result: 2 retries optimal | 98.9% confidence | $558/month | Diminishing returns at 3+
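The diminishing returns in Example 3 can be seen by tabulating each retry budget side by side. This sketch reports the per-test pass probability (not the suite pass rate) and the retry overhead as a fraction of test executions. It assumes independent attempts and equal test durations, and `compare_budgets` is an illustrative name. Its unrounded overhead values may differ slightly from the rounded figures in the example.

```python
def compare_budgets(total_tests, flaky_tests, flake_rate, max_budget):
    """Return (retries, per-test pass probability, overhead) per budget."""
    rows = []
    for retries in range(max_budget + 1):
        pass_prob = 1.0 - flake_rate ** (retries + 1)
        expected = sum(flake_rate ** k for k in range(1, retries + 1))
        # Extra executions relative to one clean run of the whole suite
        overhead = flaky_tests * expected / total_tests
        rows.append((retries, pass_prob, overhead))
    return rows


# Example 3 inputs: 500 tests, 8% flaky (40 tests), 15% flake rate
rows = compare_budgets(total_tests=500, flaky_tests=40,
                       flake_rate=0.15, max_budget=3)
for retries, pass_prob, overhead in rows:
    print(f"{retries} retries: per-test pass {pass_prob:.2%}, "
          f"overhead {overhead:.2%}")
```

Each additional retry multiplies the remaining failure probability by the flake rate, while the overhead grows by a smaller increment each time, which is why the gain from the third retry is small.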
Frequently Asked Questions
What is test flakiness?
Test flakiness refers to tests that sometimes pass and sometimes fail without any code changes. Causes include timing issues, race conditions, test order dependencies, shared state, network variability, and environmental differences between runs.
What's a good retry budget?
A good starting point is 2-3 retries for tests known to be flaky. The optimal budget balances confidence (fewer false failures) against cost (CI time and compute). Most teams find 2 retries sufficient for flake rates below 15%.
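One way to turn this guidance into a number is to pick the smallest retry count that reaches a target per-test pass probability. The sketch below assumes independent attempts. The 99.5% target, the cap of 10 retries, and the name `min_retries` are all illustrative choices, not recommendations from the calculator.

```python
from typing import Optional


def min_retries(flake_rate: float, target: float, cap: int = 10) -> Optional[int]:
    """Smallest retry count whose pass probability meets the target.

    Returns None if the target is not reached within `cap` retries.
    """
    for retries in range(cap + 1):
        if 1.0 - flake_rate ** (retries + 1) >= target:
            return retries
    return None


# Hypothetical target: each flaky test should pass 99.5% of the time
for rate in (0.10, 0.15, 0.25):
    print(f"flake rate {rate:.0%}: {min_retries(rate, 0.995)} retries")
```

With this target, flake rates of 10% and 15% need 2 retries and a 25% flake rate needs 3, which lines up with the worked examples.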
Should I retry all tests or only flaky ones?
Best practice is to only retry known flaky tests. Retrying all tests wastes resources on stable tests and can mask real failures. Use historical data to identify and tag flaky tests for selective retries.
How do I measure test flakiness?
Track pass/fail results across multiple runs of the same code. A test that both passes and fails on the same unchanged code is flaky. Calculate the flake rate as inconsistent runs divided by total runs. Many CI systems provide built-in flakiness detection.
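The measurement can be sketched over a list of historical results. This example uses one possible reading of "inconsistent runs": failures of a test on a commit where the same test also passed. The record layout `(test_name, commit, passed)` and the sample data are hypothetical.

```python
from collections import defaultdict


def flake_rates(records):
    """records: iterable of (test_name, commit, passed) tuples."""
    by_key = defaultdict(list)
    for test, commit, passed in records:
        by_key[(test, commit)].append(passed)

    inconsistent = defaultdict(int)
    total = defaultdict(int)
    for (test, _commit), outcomes in by_key.items():
        total[test] += len(outcomes)
        # Mixed outcomes on one commit: count the failures as inconsistent runs
        if any(outcomes) and not all(outcomes):
            inconsistent[test] += outcomes.count(False)
    return {test: inconsistent[test] / total[test] for test in total}


# Hypothetical history: four runs of one test, two of another, same commit
history = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),
    ("test_login", "abc123", True),
    ("test_login", "abc123", True),
    ("test_cart", "abc123", False),
    ("test_cart", "abc123", False),
]
print(flake_rates(history))
# → {'test_login': 0.25, 'test_cart': 0.0}
```

Under this definition `test_cart` is not flaky: it fails on every run, so it is consistently broken rather than non-deterministic.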
What causes test flakiness?
Common causes include: timing/async issues (race conditions, timeouts), test pollution (shared state, order dependency), external dependencies (network, databases, APIs), environment differences (timezone, locale), and resource constraints (memory, CPU).