
Hypergeometric Distribution Calculator

Our free hypergeometric distribution calculator computes exact probabilities for sampling without replacement. Get worked examples, visual aids, and downloadable results.


Formula

P(X=k) = C(K,k) * C(N-K, n-k) / C(N,n)

Where N is the population size, K is the number of success states in the population, n is the number of draws (sample size), and k is the desired number of observed successes. C(a,b) is the binomial coefficient 'a choose b'.
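As a rough illustration, the formula can be evaluated directly with `math.comb` from Python's standard library. The function name `hypergeom_pmf` and its argument order are assumptions made for this sketch, not part of the calculator itself.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) for population size N, K success states, and n draws."""
    # Outside the feasible range of k the probability is zero.
    if k < max(0, n + K - N) or k > min(K, n):
        return 0.0
    # Favorable outcomes divided by all possible samples of size n.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Illustrative call: exactly 2 hearts in a 5-card hand from a 52-card deck.
print(round(hypergeom_pmf(2, 52, 13, 5), 4))  # 0.2743
```

Because `math.comb` works on exact integers, the only rounding happens in the final division.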

Worked Examples

Example 1: Drawing Hearts from a Deck

Problem: What is the probability of drawing exactly 2 hearts in a 5-card hand from a standard 52-card deck (13 hearts)?

Solution: N = 52 (population), K = 13 (hearts), n = 5 (draw), k = 2 (desired hearts)
P(X=2) = C(13,2) * C(39,3) / C(52,5)
C(13,2) = 78
C(39,3) = 9,139
C(52,5) = 2,598,960
P(X=2) = 78 * 9,139 / 2,598,960 = 712,842 / 2,598,960

Result: P(X=2) = 0.2743 or 27.43%

Example 2: Quality Control Inspection

Problem: A lot of 200 items contains 15 defective items. If 10 items are inspected, what is the probability of finding exactly 1 defective?

Solution: N = 200, K = 15, n = 10, k = 1
P(X=1) = C(15,1) * C(185,9) / C(200,10) = 15 * C(185,9) / C(200,10)
Sanity check: the expected number of defectives in the sample is Mean = 10 * 15/200 = 0.75

Result: P(X=1) = 0.3835 or 38.35%
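The binomial coefficients in Example 2 are too large to expand comfortably by hand, so a short script is one way to evaluate them. The sketch below assumes Python's standard library only; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Values from Example 2: lot size, defectives in lot, items inspected, target count.
N, K, n, k = 200, 15, 10, 1

# Python integers have arbitrary precision, so C(200,10) is computed exactly
# and the probability is held as an exact rational number.
p_exact = Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))
print(float(p_exact))

# Expected number of defectives in the sample, n * K / N.
print(n * K / N)  # 0.75
```

Keeping the result as a `Fraction` until the last step avoids any intermediate floating-point rounding.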

Frequently Asked Questions

What is the hypergeometric distribution?

The hypergeometric distribution models the probability of drawing a specific number of successes from a finite population without replacement. Unlike the binomial distribution, which assumes sampling with replacement (or an effectively infinite population), the hypergeometric distribution accounts for the probability of success changing as items are drawn, because each draw changes the composition of the remaining population. A classic example is drawing cards from a deck: what is the probability of getting exactly 2 hearts in a 5-card hand from a standard 52-card deck? The distribution is defined by three parameters: the population size N, the number of success states K in the population, and the number of draws n.

How is the hypergeometric distribution different from the binomial distribution?

The key difference is sampling with versus without replacement. The binomial distribution assumes each trial is independent with a constant probability of success, which applies when sampling with replacement or from an effectively infinite population. The hypergeometric distribution accounts for the fact that each draw changes the remaining population composition. For example, after drawing a heart from a deck, the probability of the next card being a heart changes from 13/52 to 12/51. When the population is very large relative to the sample size, the binomial distribution closely approximates the hypergeometric distribution because removing one item barely changes the probabilities.
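The large-population behavior can be checked numerically. The sketch below holds the success fraction at 25% (as in the deck of cards) and scales the population up; the population sizes and the helper names `hypergeom_pmf` and `binom_pmf` are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    # Sampling without replacement from a finite population.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Independent trials with constant success probability p.
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k, p = 5, 2, 0.25
target = binom_pmf(k, n, p)

gaps = []
for N in (52, 520, 5200, 52000):
    K = N // 4  # keep K/N fixed at 0.25
    gap = abs(hypergeom_pmf(k, N, K, n) - target)
    gaps.append(gap)
    print(N, gap)
```

The printed gap between the two probabilities shrinks as N grows while the sample size stays at 5.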

What is the formula for the hypergeometric probability?

The probability mass function is P(X = k) = C(K,k) * C(N-K, n-k) / C(N,n), where C(a,b) is the binomial coefficient (a choose b). Here N is the total population, K is the number of success items, n is the sample size, and k is the desired number of successes. The numerator counts the favorable outcomes: C(K,k) ways to choose k successes from K success items, times C(N-K, n-k) ways to choose the remaining n-k items from the N-K non-success items. The denominator C(N,n) counts all possible ways to draw n items from N. This ratio gives the exact probability.

What are common applications of the hypergeometric distribution?

The hypergeometric distribution appears in quality control when inspecting a batch of products without replacement, such as testing 10 items from a batch of 100 to check for defects. It is used in ecology for capture-recapture methods to estimate animal populations. In card games, it calculates the probability of specific hands. In genetics, it models the likelihood of observing a certain number of genes of interest in a random sample. Statistical tests like Fisher's exact test use the hypergeometric distribution for analyzing contingency tables, especially with small sample sizes where chi-squared approximations are unreliable.

What are the mean and variance of the hypergeometric distribution?

The mean (expected value) of the hypergeometric distribution is E(X) = nK/N, which is intuitive: if 25% of the population are successes and you draw 10 items, you expect 2.5 successes on average. The variance is Var(X) = n*K*(N-K)*(N-n) / (N^2*(N-1)). The factor (N-n)/(N-1) is called the finite population correction factor, and it makes the variance smaller than the corresponding binomial variance. As the population grows much larger than the sample, this correction factor approaches 1 and the variance approaches the binomial variance of n*p*(1-p) where p = K/N.
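One way to sanity-check the closed-form mean and variance is to compare them with a direct sum over the probability mass function. The following is a minimal sketch using the deck-of-cards parameters; the function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_mean_var(N, K, n):
    # Closed-form expressions for E(X) and Var(X).
    mean = n * K / N
    var = n * K * (N - K) * (N - n) / (N**2 * (N - 1))
    return mean, var

N, K, n = 52, 13, 5
mean, var = hypergeom_mean_var(N, K, n)

# Brute-force check: sum over every feasible number of successes.
support = range(max(0, n + K - N), min(K, n) + 1)
mean_sum = sum(x * hypergeom_pmf(x, N, K, n) for x in support)
var_sum = sum((x - mean_sum) ** 2 * hypergeom_pmf(x, N, K, n) for x in support)

# Binomial variance with p = K/N, for comparison.
p = K / N
binom_var = n * p * (1 - p)

print(mean, mean_sum)
print(var, var_sum, binom_var)
```

For these parameters the hypergeometric variance comes out smaller than the binomial variance, reflecting the finite population correction factor (N-n)/(N-1) = 47/51.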

How do you calculate cumulative hypergeometric probabilities?

Cumulative probabilities are computed by summing individual probabilities. P(X <= k) sums P(X = x) for all valid x from the minimum possible successes to k. The minimum is max(0, n+K-N), which accounts for cases where you must draw some successes because there are not enough non-successes to fill the sample. The maximum possible successes is min(K, n). For P(X >= k), compute 1 - P(X <= k-1). These cumulative values answer practical questions like: what is the probability of getting at least 3 defective items when inspecting 10 items from a lot where 20 out of 200 are defective?
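The summation described above can be sketched in a few lines of Python. The helper names `hypergeom_cdf` and `hypergeom_at_least` are assumptions for this example, and the parameters are those of the defective-items question.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_cdf(k, N, K, n):
    """P(X <= k), summed over the feasible range of successes."""
    lo = max(0, n + K - N)  # forced successes when non-successes run out
    hi = min(K, n)          # cannot exceed successes available or draws made
    return sum(hypergeom_pmf(x, N, K, n) for x in range(lo, min(k, hi) + 1))

def hypergeom_at_least(k, N, K, n):
    """P(X >= k) computed as 1 - P(X <= k-1)."""
    return 1.0 - hypergeom_cdf(k - 1, N, K, n)

# At least 3 defectives when inspecting 10 items from a lot of 200 with 20 defective.
print(hypergeom_at_least(3, 200, 20, 10))
```

Clamping the upper summation limit to min(K, n) keeps the function well defined even when k is larger than the maximum possible number of successes.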

References