Abtest Significance Calculator

Test whether your A/B experiment results are statistically significant. Enter control and variation conversion rates and sample sizes to get p-value and


A/B Test Significance Calculator — Statistical Significance for Experiments

Calculate statistical significance, conversion lift, p-value, and confidence level for your A/B test experiments. Free A/B testing significance calculator.

Last updated: December 2025 | Reviewed by NovaCalculator Mathematics Team

Calculator

Control conversion rate: 3.00%

Variant conversion rate: 3.80%

Variant Wins!
Confidence: 95-99% | p-value = 0.0273
Control Rate: 3.00% (SE: 0.241%)
Variant Rate: 3.80% (SE: 0.270%)
Relative Lift: +26.67% (absolute difference: 0.80pp)

Statistical Details

Z-Score: 2.2072
p-value (two-tailed): 0.0273
95% CI for Difference: [0.090%, 1.510%]
Statistical Power: 59.8%
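
The page does not state how the confidence interval or the power figure is computed. The sketch below shows one common approach that appears to reproduce the displayed values: an unpooled standard error for the interval and a simple post-hoc power approximation. The group size of 5,000 visitors is an assumption inferred from the displayed standard errors, and all variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Assumed inputs: 5,000 visitors per group is inferred from the displayed
# standard errors; it is not stated on the page.
n_control, n_variant = 5_000, 5_000
p_control, p_variant = 0.030, 0.038

norm = NormalDist()
diff = p_variant - p_control

# Unpooled standard error of the difference (used for the interval).
se_control = sqrt(p_control * (1 - p_control) / n_control)
se_variant = sqrt(p_variant * (1 - p_variant) / n_variant)
se_diff = sqrt(se_control**2 + se_variant**2)

z_crit = norm.inv_cdf(0.975)  # about 1.96 for a 95% interval
ci_low = diff - z_crit * se_diff
ci_high = diff + z_crit * se_diff

# Pooled z-score, then a simple post-hoc ("observed") power approximation.
p_pooled = (p_control * n_control + p_variant * n_variant) / (n_control + n_variant)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_variant))
z = diff / se_pooled
observed_power = norm.cdf(abs(z) - z_crit)

print(f"95% CI: [{ci_low:.3%}, {ci_high:.3%}]")
print(f"z = {z:.4f}, observed power = {observed_power:.1%}")
```

Power computed this way is derived from the observed effect, so it mostly restates the p-value; for planning, compute power before the test from the effect size you want to detect.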

Confidence Thresholds

90% confidence (p < 0.10)
95% confidence (p < 0.05)
99% confidence (p < 0.01)
Disclaimer: This calculator uses a two-proportion z-test with a two-tailed hypothesis. Results assume independent, randomly assigned visitors. Do not peek at results repeatedly (peeking inflates false positive rates). For sequential testing, use group sequential methods or always-valid p-values. This tool is for educational and planning purposes.
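
To illustrate why peeking inflates false positives, here is a small simulation sketch. It runs A/A tests (both groups share the same true conversion rate, so any significant result is a false positive) and compares stopping at the first significant look against testing once at the end. The batch size, number of looks, true rate, and seed are arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(conv_a, conv_b, n, alpha=0.05):
    """Pooled two-proportion z-test with equal group sizes n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return False
    z = (conv_b / n - conv_a / n) / se
    return 2 * (1 - NormalDist().cdf(abs(z))) < alpha


def simulate(experiments=400, looks=10, batch=200, rate=0.10, seed=1):
    """A/A tests: both groups have the same true rate, so every 'win' is false."""
    rng = random.Random(seed)
    peeking_hits = final_hits = 0
    for _experiment in range(experiments):
        conv_a = conv_b = n = 0
        any_significant = False
        for _look in range(looks):
            # Each look adds one batch of visitors to both groups.
            conv_a += sum(rng.random() < rate for _ in range(batch))
            conv_b += sum(rng.random() < rate for _ in range(batch))
            n += batch
            significant = is_significant(conv_a, conv_b, n)
            any_significant = any_significant or significant
        peeking_hits += any_significant
        final_hits += significant  # result of the last look only
    return peeking_hits / experiments, final_hits / experiments


peeking_rate, final_rate = simulate()
print(f"false positives when stopping at any significant look: {peeking_rate:.1%}")
print(f"false positives when testing once at the end:          {final_rate:.1%}")
```

The single end-of-test rate should sit near the nominal 5%, while the any-look rate is typically several times higher.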
Your Result
Control: 3.00% | Variant: 3.80% | Lift: 26.67% | p = 0.0273 | Significant
Understand the Math

Formula

z = (p1 - p2) / sqrt(p_pooled × (1-p_pooled) × (1/n1 + 1/n2))

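Uses a two-proportion z-test. Here p1 and p2 are the observed conversion rates of the two groups, n1 and n2 are their sample sizes, and p_pooled = (conversions1 + conversions2) / (n1 + n2) is the conversion rate of both groups combined. The z-score measures how many standard errors the observed difference is from zero; its sign depends only on which group is subtracted from which and does not affect the two-tailed p-value. The p-value is derived from the standard normal distribution (two-tailed). Statistical significance is declared at p < 0.05.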
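
As a minimal sketch, the formula can be written in Python using only the standard library. The function name and argument names are illustrative, not taken from the calculator's own code, and the z-score is oriented as variant minus control so that a better-performing variant gives a positive value.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-tailed p-value)."""
    p_a = conv_a / n_a  # control conversion rate
    p_b = conv_b / n_b  # variant conversion rate
    p_pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se  # positive when the variant converts better
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Example 1 from this page: 300/10,000 (control) vs 360/10,000 (variant).
z, p = two_proportion_z_test(300, 10_000, 360, 10_000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The printed values can be compared against the worked examples on this page; small differences in the last digit come from rounding the standard error before dividing.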

Worked Examples

Example 1: Landing Page A/B Test

Control: 10,000 visitors, 300 conversions (3.0%). Variant: 10,000 visitors, 360 conversions (3.6%). Is the variant significantly better?
Solution:
Lift: (3.6-3.0)/3.0 = 20.0%
Pooled rate: 660/20,000 = 3.3%
SE = sqrt(0.033 × 0.967 × 2/10,000) = 0.00253
z = 0.006/0.00253 = 2.37
p = 0.018 < 0.05
Result: Significant at 95% confidence | 20% lift | p = 0.018

Example 2: Button Color Test (Not Significant)

Control: 2,000 visitors, 50 conversions (2.5%). Variant: 2,000 visitors, 55 conversions (2.75%). Is this significant?
Solution:
Lift: (2.75-2.5)/2.5 = 10.0%
Pooled rate: 105/4,000 = 2.625%
SE = sqrt(0.02625 × 0.97375 × 2/2,000) = 0.00506
z = 0.0025/0.00506 = 0.494
p = 0.62 > 0.05
Result: Not significant (p = 0.62) — need more data
Expert Insights

Background & Theory

This calculator applies the following established principles and formulas. Statistics and probability provide the mathematical framework for drawing conclusions from data under uncertainty.

Measures of central tendency describe where data cluster. The mean is the arithmetic average, computed as the sum of all values divided by the count. The median is the middle value of an ordered dataset and is robust to extreme outliers. The mode is the most frequent value. Spread is quantified by the variance, the average squared deviation from the mean, and by its square root, the standard deviation. For a sample, the variance uses n minus one in the denominator to correct for bias in estimation.

The normal distribution, defined by its mean and standard deviation, is the cornerstone of parametric statistics. Its bell-shaped probability density follows the formula f(x) = (1 / (sigma * sqrt(2*pi))) * exp(-0.5 * ((x - mu) / sigma)^2). The empirical rule states that approximately 68 percent of observations fall within one standard deviation of the mean, 95 percent within two, and 99.7 percent within three. A z-score standardizes a data point by subtracting the mean and dividing by the standard deviation, expressing how many standard deviations an observation lies from the mean.

In hypothesis testing, the p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. A confidence interval is a range computed from the sample by a procedure that, across repeated samples, captures the true population parameter a specified fraction of the time, typically 95 percent.

Correlation measures linear association between two variables, with Pearson's r ranging from negative one to positive one. Correlation does not imply causation. Linear regression fits a line of the form y = a + bx to minimize the sum of squared residuals.

Bayes' theorem relates conditional probabilities: P(A|B) = P(B|A) * P(A) / P(B), allowing prior beliefs to be updated on new evidence. The law of large numbers guarantees that the sample mean converges to the population mean as sample size grows. The central limit theorem states that the distribution of sample means approaches normality regardless of the shape of the population distribution, provided the population has finite variance and the sample size is sufficiently large; 30 or more is a common rule of thumb.
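
The empirical rule and the link between z-scores and the two-tailed 5% threshold can be checked numerically. This is a small sketch using Python's standard library; the helper name is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability mass within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")

# z-score at which a two-tailed test reaches p = 0.05
critical_z = standard_normal.inv_cdf(0.975)
print(f"two-tailed 5% critical value: {critical_z:.3f}")
```

The critical value of about 1.96 is why a z-score beyond roughly plus or minus 2 corresponds to significance at the 95% level.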

History

The methods behind this calculator trace back through the following developments. The mathematical study of probability emerged in the 17th century from correspondence between Blaise Pascal and Pierre de Fermat in 1654. Their exchange, prompted by a gambling problem posed by the Chevalier de Méré, established the foundations of probability theory by calculating expected outcomes through systematic enumeration of cases.

Jacob Bernoulli formalized the law of large numbers in his posthumously published Ars Conjectandi of 1713, proving rigorously that empirical frequencies converge to theoretical probabilities with increasing observations. His work laid the groundwork for inferential statistics by connecting mathematical probability to observed data. Carl Friedrich Gauss, by his own account, developed the method of least squares around 1795 while adjusting astronomical observations, and he recognized the bell-shaped error distribution that now bears his name. Pierre-Simon Laplace independently worked on the normal distribution and proved an early version of the central limit theorem around 1810, demonstrating why errors in measurement tend toward normality.

The late 19th century saw statistics emerge as a distinct scientific discipline. Francis Galton introduced regression and correlation in the 1880s while studying heredity. Karl Pearson formalized these concepts, developed the chi-squared test, and co-founded the journal Biometrika in 1901, establishing statistics as a rigorous academic field.

Ronald Fisher transformed statistical practice in the early 20th century. His 1925 book Statistical Methods for Research Workers popularized significance testing, analysis of variance, and the use of the p-value with a conventional 0.05 threshold, establishing the framework still used in scientific research. Fisher and Jerzy Neyman engaged in a prolonged methodological dispute over the interpretation of hypothesis tests.

The Bayesian approach, rooted in the 18th century work of Thomas Bayes and Laplace, was largely eclipsed by frequentist methods through much of the 20th century but experienced a revival after World War II and accelerated with computational advances. The late 20th and early 21st centuries brought statistics into every domain through big data, machine learning, and the routine availability of software capable of processing millions of observations.

Frequently Asked Questions

Sample size depends on your baseline conversion rate, the minimum detectable effect (MDE) you care about, the significance level, and the desired statistical power (typically 80%). For a baseline of 3% conversion and a 20% relative lift (3% to 3.6%), the standard normal-approximation formula gives roughly 14,000 visitors per variation at 80% power and a two-sided 5% significance level, or about 18,600 at 90% power. Smaller effects require larger samples. Use a power calculator before starting your test to avoid underpowered experiments.
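
The sample-size reasoning above can be sketched with the standard normal-approximation formula for comparing two proportions. The function name and defaults are illustrative assumptions; dedicated power calculators may use slightly different variants and give slightly different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors per variation for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Baseline 3%, 20% relative lift (3% to 3.6%).
n_80 = sample_size_per_group(0.03, 0.20)
n_90 = sample_size_per_group(0.03, 0.20, power=0.90)
print(f"80% power: {n_80:,} per variation")
print(f"90% power: {n_90:,} per variation")
```

Because the required sample scales with one over the squared difference, halving the detectable lift roughly quadruples the visitors needed.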
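Use a z-test when the population standard deviation is known, or when the sample is large enough that estimating it adds little uncertainty (a common rule of thumb is n of 30 or more). Use a t-test when the population SD is unknown and you estimate it from the sample. For small samples (n < 30), the t-distribution accounts for the extra uncertainty in estimating SD. Conversion rates in A/B tests are proportions measured on large samples, which is why this calculator uses a z-test.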
The chi-square test compares observed frequencies to expected frequencies in categorical data. A goodness-of-fit test checks if data follows an expected distribution. A test of independence checks if two categorical variables are related. The test statistic increases as observed and expected frequencies diverge.
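
For a 2x2 table of conversions, the chi-square test of independence (without continuity correction) and the pooled two-proportion z-test are equivalent: the chi-square statistic equals the square of the z-score. The sketch below checks this on Example 1 from this page; the variable names are illustrative.

```python
from math import sqrt

# Example 1 from this page, arranged as a 2x2 table:
# rows are groups, columns are (converted, did not convert).
observed = [[300, 9_700],
            [360, 9_640]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square statistic without continuity correction.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (count - expected) ** 2 / expected

# Pooled two-proportion z-score for the same data.
p_pooled = col_totals[0] / total
se = sqrt(p_pooled * (1 - p_pooled) * (1 / row_totals[0] + 1 / row_totals[1]))
z = (observed[1][0] / row_totals[1] - observed[0][0] / row_totals[0]) / se

print(f"chi-square = {chi_square:.4f}, z squared = {z * z:.4f}")
```

Tools that apply a continuity correction or Fisher's exact test will report slightly different p-values for the same data.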
You may use the results for reference and educational purposes. For professional reports, academic papers, or critical decisions, we recommend verifying outputs against peer-reviewed sources or consulting a qualified expert in the relevant field.
All calculations use established mathematical formulas and are performed with high-precision arithmetic. Results are accurate to the precision shown. For critical decisions in finance, medicine, or engineering, always verify results with a qualified professional.
Educational Note: This calculator is provided for educational and informational purposes. Results are based on the formulas and inputs provided. Always verify important calculations independently. NovaCalculator processes calculator inputs client-side; optional analytics follow visitor consent settings.
Reviewed by: NovaCalculator Mathematics Team. Verified against standard mathematical and scientific references. Last reviewed: December 2025. © 2024–2026 NovaCalculator.

Reviewed by Daniel Agrici, Founder & Lead Developer

Abtest Significance Calculator — Frequently Asked Questions

What is A/B test statistical significance?

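Statistical significance in A/B testing means the observed difference between control and variant would be unlikely if there were truly no difference between them. The standard threshold is p < 0.05 (95% confidence): if control and variant actually performed the same, a difference at least this large would appear less than 5% of the time through random variation alone. The p-value is not the probability that the result is a fluke. A significant result does not mean the effect is large or practically important; it only indicates that a true difference of zero is hard to reconcile with the data.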

Does Abtest Significance Calculator work offline?

Once the page is loaded, the calculation logic runs entirely in your browser. If you have already opened the page, most calculators will continue to work even if your internet connection is lost, since no server requests are needed for computation.

What inputs do I need to use Abtest Significance Calculator accurately?

You need two values for each group: the sample size (number of visitors) and the conversion rate, or the number of conversions from which the rate follows. Enter them for both the control and the variation. No physical units are involved; rates are percentages and sample sizes are counts. The Formula section on this page shows how these values enter the calculation.

Why might my result differ from another tool or reference?

Differences typically arise from rounding conventions or from the specific version of the test: for example, a pooled versus unpooled standard error, a one-tailed versus two-tailed p-value, or a different test altogether, such as a chi-square test with continuity correction or Fisher's exact test. Check that both tools use the same test variant and the same inputs (visitors and conversions per group). The References section links to the authoritative source behind the formula used here.

How do I interpret the result?

The result shows a verdict (for example, "Variant Wins!"), the p-value, the confidence band it falls in (90%, 95%, or 99%), the relative lift, and the absolute difference in percentage points. A p-value below 0.05 is treated as statistically significant at 95% confidence. Refer to the worked examples section on this page for real-world context.

How do I verify Abtest Significance Calculator's result independently?

The Formula section on this page shows the equation used. You can reproduce the calculation manually or in a spreadsheet using those steps. Compare your answer against the worked examples in the Examples section, which use known reference values so you can confirm the calculator is behaving as expected.

References