Geometric Distribution Calculator
Free Geometric Distribution Calculator. Enter values to get step-by-step solutions with formulas and graphs.
Calculate geometric distribution probabilities, expected values, variance, CDF, and survival functions. Determine the number of trials until first success with detailed probability tables.
Last updated: December 2025
Reviewed by NovaCalculator Mathematics Team
Last reviewed: December 2025
Background & Theory
The Geometric Distribution Calculator applies the following established principles and formulas. Statistics and probability provide the mathematical framework for drawing conclusions from data under uncertainty.
Measures of central tendency describe where data cluster. The mean is the arithmetic average, computed as the sum of all values divided by the count. The median is the middle value of an ordered dataset and is robust to extreme outliers. The mode is the most frequent value. Spread is quantified by variance, the average squared deviation from the mean, and by its square root, the standard deviation. For a sample, variance uses n minus one in the denominator to correct for bias in estimation.
The normal distribution, defined by its mean and standard deviation, is the cornerstone of parametric statistics. Its bell-shaped probability density follows the formula f(x) = (1 / (sigma * sqrt(2*pi))) * exp(-0.5 * ((x - mu) / sigma)^2). The empirical rule states that approximately 68 percent of observations fall within one standard deviation of the mean, 95 percent within two, and 99.7 percent within three. A z-score standardizes a data point by subtracting the mean and dividing by the standard deviation, expressing how many standard deviations an observation lies from the mean.
In hypothesis testing, the p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. A confidence interval is a range computed from the sample by a procedure that, over repeated sampling, covers the true population parameter a specified proportion of the time, typically 95 percent. Correlation measures linear association between two variables, with Pearson's r ranging from negative one to positive one. Correlation does not imply causation. Linear regression fits a line of the form y = a + bx to minimize the sum of squared residuals. Bayes' theorem relates conditional probabilities: P(A|B) = P(B|A) * P(A) / P(B), allowing prior beliefs to be updated on new evidence.
The law of large numbers guarantees that the sample mean of independent, identically distributed observations converges to the population mean as sample size grows. The central limit theorem states that the distribution of sample means approaches normality whatever the shape of the population distribution, provided the population has finite variance and the sample size is sufficiently large; 30 or more is a common rule of thumb.
History
The history behind the Geometric Distribution Calculator traces back through the following developments. The mathematical study of probability emerged in the 17th century from correspondence between Blaise Pascal and Pierre de Fermat in 1654. Their exchange, prompted by a gambling problem posed by the Chevalier de Méré, established the foundations of probability theory by calculating expected outcomes through systematic enumeration of cases. Jacob Bernoulli formalized the law of large numbers in his posthumously published Ars Conjectandi of 1713, proving rigorously that empirical frequencies converge to theoretical probabilities with increasing observations. His work laid the groundwork for inferential statistics by connecting mathematical probability to observed data.
Carl Friedrich Gauss developed the method of least squares around 1795 while adjusting astronomical observations, and he recognized the bell-shaped error distribution that now bears his name. Pierre-Simon Laplace independently worked on the normal distribution and proved an early version of the central limit theorem around 1810, demonstrating why errors in measurement tend toward normality.
The late 19th century saw statistics emerge as a distinct scientific discipline. Francis Galton introduced regression and correlation in the 1880s while studying heredity. Karl Pearson formalized these concepts, developed the chi-squared test, and founded the journal Biometrika in 1901, establishing statistics as a rigorous academic field. Ronald Fisher transformed statistical practice in the early 20th century. His 1925 book Statistical Methods for Research Workers popularized significance testing, analysis of variance, and the use of the p-value with conventional significance thresholds, establishing the framework still used in scientific research. Fisher and Jerzy Neyman engaged in a prolonged methodological dispute over the interpretation of hypothesis tests.
The Bayesian approach, rooted in the 18th century work of Thomas Bayes and Laplace, was largely eclipsed by frequentist methods through much of the 20th century but experienced a revival after World War II that accelerated with computational advances. The late 20th and early 21st centuries brought statistics into every domain through big data, machine learning, and the routine availability of software capable of processing millions of observations.
Formula
P(X = k) = p * (1 - p)^(k-1)
Where p is the probability of success on each trial, k is the trial number on which the first success occurs, and (1-p) is the probability of failure. The mean is 1/p and the variance is (1-p)/p^2.
Worked Examples
Example 1: Quality Control Inspection
Problem: A factory has a 5% defect rate. What is the probability of finding the first defective item on the 10th inspection?
Solution:
P(X = 10) = 0.05 * (1 - 0.05)^(10-1)
P(X = 10) = 0.05 * 0.95^9
P(X = 10) = 0.05 * 0.6302
P(X = 10) = 0.03151
Expected inspections until first defect: 1/0.05 = 20
P(finding defect within 10 inspections) = 1 - 0.95^10 = 0.4013
Result: P(X=10) = 3.15% | Expected trials = 20 | P(X<=10) = 40.13%
Example 2: Sales Conversion Rate
Problem: A salesperson converts 20% of leads. What is the probability of making the first sale within 5 calls?
Solution:
P(X <= 5) = 1 - (1 - 0.20)^5
P(X <= 5) = 1 - 0.80^5
P(X <= 5) = 1 - 0.32768
P(X <= 5) = 0.67232
Expected calls until first sale: 1/0.20 = 5
P(exactly on 5th call) = 0.20 * 0.80^4 = 0.08192
Result: P(within 5 calls) = 67.23% | P(exactly 5th) = 8.19% | Expected = 5 calls
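The two worked examples can be reproduced with a few lines of Python. This is a minimal sketch: the function names geometric_pmf and geometric_cdf are illustrative choices and not part of the calculator itself.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    if k < 1 or not 0 < p <= 1:
        raise ValueError("need k >= 1 and 0 < p <= 1")
    return p * (1 - p) ** (k - 1)


def geometric_cdf(k: int, p: float) -> float:
    """P(X <= k): the first success occurs within the first k trials."""
    if k < 1 or not 0 < p <= 1:
        raise ValueError("need k >= 1 and 0 < p <= 1")
    return 1 - (1 - p) ** k


# Example 1: 5% defect rate
print(round(geometric_pmf(10, 0.05), 5))  # 0.03151
print(round(geometric_cdf(10, 0.05), 4))  # 0.4013

# Example 2: 20% conversion rate
print(round(geometric_cdf(5, 0.20), 5))   # 0.67232
print(round(geometric_pmf(5, 0.20), 5))   # 0.08192
```

Rounding is applied only when printing, so the functions themselves return full floating-point precision.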
Frequently Asked Questions
What is a geometric distribution and when is it used?
A geometric distribution models the number of independent Bernoulli trials needed to achieve the first success, where each trial has the same probability of success p. It is the discrete analog of the exponential distribution. Common applications include modeling the number of coin flips until the first heads, the number of products inspected until finding a defective one, the number of sales calls until the first sale, and the number of attempts to establish a network connection. The distribution assumes each trial is independent and the probability of success remains constant. It is the only memoryless discrete probability distribution on the positive integers, meaning past failures do not affect the probability of future success.
What is the probability mass function (PMF) of the geometric distribution?
The PMF gives the probability that the first success occurs on exactly the kth trial: P(X = k) = p * (1 - p)^(k-1), where p is the success probability and k is the trial number (k = 1, 2, 3, ...). This formula makes intuitive sense: you need (k-1) failures each with probability (1-p), followed by one success with probability p. For example, with p = 0.3, the probability of first success on trial 4 is 0.3 * 0.7^3 = 0.3 * 0.343 = 0.1029 or about 10.29%. The PMF decreases geometrically (by a constant ratio of 1-p), which is where the distribution gets its name.
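The constant-ratio behavior described above can be seen by building a small probability table. The sketch below is illustrative only; geometric_table is a hypothetical helper name, and it computes each PMF value from the previous one by multiplying by (1 - p).

```python
def geometric_table(p: float, max_k: int) -> list[tuple[int, float, float]]:
    """Return rows of (k, P(X = k), P(X <= k)) for k = 1..max_k."""
    if not 0 < p <= 1 or max_k < 1:
        raise ValueError("need 0 < p <= 1 and max_k >= 1")
    rows = []
    pmf = p      # P(X = 1) = p
    cdf = 0.0
    for k in range(1, max_k + 1):
        cdf += pmf
        rows.append((k, pmf, cdf))
        pmf *= 1 - p  # next term is the current one times (1 - p)
    return rows


for k, pmf, cdf in geometric_table(0.3, 5):
    print(f"{k:>2}  {pmf:.4f}  {cdf:.4f}")
```

With p = 0.3, the row for k = 4 shows the same 0.1029 computed in the text.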
How do you calculate the cumulative distribution function (CDF) of the geometric distribution?
The CDF gives the probability of achieving the first success within k trials: P(X <= k) = 1 - (1 - p)^k. This formula is derived by noting that not succeeding within k trials means failing k times in a row with probability (1-p)^k, so the complement gives the CDF. For example, with p = 0.2, the probability of success within 5 trials is 1 - 0.8^5 = 1 - 0.32768 = 0.67232 or about 67.2%. The CDF is useful for answering questions like how many trials are needed for a given confidence level. To find the number of trials needed for 95% certainty, solve 1 - (1-p)^k >= 0.95 for k.
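Solving 1 - (1-p)^k >= c for k gives k >= ln(1 - c) / ln(1 - p), rounded up to the next whole number. A minimal Python sketch of that calculation follows; the function name trials_for_confidence is an assumption for illustration, and the two loops only guard against floating-point rounding at exact boundaries.

```python
import math


def trials_for_confidence(p: float, confidence: float) -> int:
    """Smallest k such that P(X <= k) = 1 - (1 - p)**k >= confidence."""
    if not 0 < p < 1 or not 0 < confidence < 1:
        raise ValueError("need 0 < p < 1 and 0 < confidence < 1")
    # Closed-form estimate from solving the inequality for k
    k = max(1, math.ceil(math.log(1 - confidence) / math.log(1 - p)))
    # Step down if the estimate overshot because of rounding
    while k > 1 and 1 - (1 - p) ** (k - 1) >= confidence:
        k -= 1
    # Step up if the estimate fell short because of rounding
    while 1 - (1 - p) ** k < confidence:
        k += 1
    return k


print(trials_for_confidence(0.2, 0.95))  # 14
```

For p = 0.2, 13 trials give about 94.5% and 14 trials give about 95.6%, so 14 is the first value that reaches 95%.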
What are the mean and variance of the geometric distribution?
The mean (expected value) of the geometric distribution is E[X] = 1/p, and the variance is Var(X) = (1-p)/p^2. The standard deviation is sqrt((1-p)/p^2) = sqrt(1-p)/p. For example, if p = 0.25, the expected number of trials until the first success is 1/0.25 = 4, with variance = 0.75/0.0625 = 12 and standard deviation = sqrt(12) = 3.46. The mean being 1/p is intuitive: if you succeed 25% of the time, you expect to need 4 tries on average. As p approaches 1, both mean and variance decrease toward 1 and 0 respectively, since success becomes nearly certain on the first trial.
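For readers who want to see where these formulas come from, one standard derivation differentiates the geometric series. The shorthand q = 1 - p is introduced here for convenience and is not notation used elsewhere on this page.

```latex
\begin{align*}
E[X] &= \sum_{k=1}^{\infty} k\,p\,q^{k-1}
      = p\,\frac{d}{dq}\sum_{k=0}^{\infty} q^{k}
      = p\,\frac{d}{dq}\,\frac{1}{1-q}
      = \frac{p}{(1-q)^{2}}
      = \frac{1}{p}, \\
E[X(X-1)] &= \sum_{k=2}^{\infty} k(k-1)\,p\,q^{k-1}
      = p\,q\,\frac{d^{2}}{dq^{2}}\,\frac{1}{1-q}
      = \frac{2pq}{(1-q)^{3}}
      = \frac{2q}{p^{2}}, \\
\operatorname{Var}(X) &= E[X(X-1)] + E[X] - E[X]^{2}
      = \frac{2q}{p^{2}} + \frac{1}{p} - \frac{1}{p^{2}}
      = \frac{2q + p - 1}{p^{2}}
      = \frac{q}{p^{2}}
      = \frac{1-p}{p^{2}}.
\end{align*}
```

The last step uses p - 1 = -q, so 2q + p - 1 = q.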
What is the memoryless property of the geometric distribution?
The memoryless property states that P(X > s + t | X > s) = P(X > t) for any positive integers s and t. In practical terms, if you have already failed s times, the probability of needing more than t additional trials is the same as if you were starting fresh. For example, if you are flipping a coin waiting for heads and have already gotten 10 tails, the expected number of additional flips is still the same as when you started. This property is unique to the geometric distribution among discrete distributions (and to the exponential distribution among continuous ones). It mathematically captures the idea that independent trials have no memory of past outcomes.
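The property follows directly from the survival function P(X > k) = (1 - p)^k, which is the probability that the first k trials all fail. A short derivation:

```latex
\begin{align*}
P(X > s + t \mid X > s)
  &= \frac{P(X > s + t)}{P(X > s)}
   = \frac{(1-p)^{s+t}}{(1-p)^{s}}
   = (1-p)^{t}
   = P(X > t).
\end{align*}
```

The first equality holds because the event X > s + t is contained in the event X > s.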
How does the geometric distribution relate to the negative binomial distribution?
The geometric distribution is a special case of the negative binomial distribution with r = 1, where r represents the number of successes desired. While the geometric distribution counts trials until the first success, the negative binomial counts trials until the rth success. If X1, X2, ..., Xr are independent geometric random variables with the same parameter p, their sum follows a negative binomial distribution with parameters r and p. This relationship is useful for extending geometric distribution problems: for example, if you want to know how many sales calls are needed to make 5 sales (not just the first), you use the negative binomial distribution with r = 5.
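The special-case relationship can be checked numerically. The sketch below assumes the trials-counting form of the negative binomial, P(Y = n) = C(n-1, r-1) * p^r * (1-p)^(n-r) for n >= r, whose mean is r/p; the function name negative_binomial_pmf is illustrative.

```python
import math


def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs on trial n), for n >= r >= 1."""
    if r < 1 or n < r or not 0 < p <= 1:
        raise ValueError("need n >= r >= 1 and 0 < p <= 1")
    return math.comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# With r = 1 this reduces to the geometric PMF p * (1 - p)^(n - 1)
print(negative_binomial_pmf(4, 1, 0.3))

# Expected number of calls to make 5 sales at a 20% conversion rate: r / p
print(5 / 0.20)  # 25.0
```

The first printed value matches 0.3 * 0.7^3, the geometric probability of a first success on trial 4.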
Reviewed by Manoj Kumar, Mathematics Educator