Question 1

What is statistical power and why does it matter?

Accepted Answer

Statistical power is the probability that a test correctly rejects a false null hypothesis, i.e., detects a real effect when one exists. It equals 1 - beta, where beta is the Type II error rate. A power of 0.80 means there is an 80% chance of detecting the effect if it truly exists, and a 20% chance of a Type II error (missing the effect). In biological research, underpowered studies waste resources and may fail to detect important effects such as drug efficacy or genetic associations. A power of 0.80 is the conventional minimum expected by many journals, funders, and regulatory agencies, and 0.90 is often recommended for critical studies.
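To make the definition concrete, here is a minimal sketch that estimates power for a two-group comparison of means. It assumes a two-sided test, equal group sizes, and a normal approximation to the t-test; the function name `approx_power` is made up for illustration and is not taken from any calculator or library.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Assumption: normal approximation, equal group sizes, and the tiny
    chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value for two-sided alpha
    expected_stat = d * sqrt(n_per_group / 2)  # mean of the test statistic if the effect is real
    return z.cdf(expected_stat - z_crit)


power = approx_power(n_per_group=64, d=0.5)
beta = 1 - power  # Type II error rate
print(f"power = {power:.2f}, Type II error rate = {beta:.2f}")
```

With 64 participants per group and a medium effect, the estimate lands close to the conventional 0.80 target.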

Question 2

How do I choose an appropriate effect size?

Accepted Answer

Effect size quantifies the magnitude of the difference between groups relative to variability. For a comparison of two group means, the usual measure is Cohen's d: the difference in means divided by the standard deviation. Cohen suggested benchmarks: small (d=0.2), medium (d=0.5), and large (d=0.8). However, it is better to base your effect size on prior research, pilot studies, or the minimum clinically meaningful difference. For example, if a drug must reduce blood pressure by at least 5 mmHg (SD=10) to be clinically relevant, d = 5/10 = 0.5. Using standardized benchmarks without domain knowledge can lead to studies that are too small to detect the effect that matters, or larger than they need to be.
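The blood pressure arithmetic above can be sketched in a few lines. The second helper shows one common way to estimate d from pilot data using a pooled standard deviation. Both function names are hypothetical, and the pilot measurements are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d_from_difference(mean_difference, sd):
    """Cohen's d from a target mean difference and an assumed common SD."""
    return mean_difference / sd


def cohens_d_from_samples(group_a, group_b):
    """Cohen's d from two samples, using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Clinically meaningful difference from the example: 5 mmHg with SD = 10.
d_clinical = cohens_d_from_difference(5, 10)

# Hypothetical pilot data (systolic pressure in mmHg), made up for illustration.
pilot_control = [142, 150, 138, 147, 155, 144]
pilot_treated = [139, 141, 135, 146, 148, 137]
d_pilot = cohens_d_from_samples(pilot_control, pilot_treated)

print(f"d from clinical threshold: {d_clinical}")
print(f"d from pilot data: {d_pilot:.2f}")
```

Estimates from small pilot samples are noisy, so treat a pilot-based d as a rough guide rather than a precise input.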

Question 3

What is the relationship between sample size, power, and effect size?

Accepted Answer

These three quantities are mathematically linked: at a fixed significance level, choosing any two determines the third. Larger sample sizes increase power for a given effect size. Larger effect sizes require smaller samples for the same power. As a guide, for a two-sample t-test at two-sided alpha = 0.05 and 80% power: detecting a small effect (d=0.2) requires about 394 per group; a medium effect (d=0.5) requires about 64 per group; a large effect (d=0.8) requires only about 26 per group. Because the required sample size scales with 1/d^2, halving the effect size roughly quadruples the sample needed. Doubling sample size does not double power; the relationship follows a curve that flattens as power approaches 1.0.
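The trade-off can be sketched with the standard normal-approximation formula n = 2 * (z_alpha + z_beta)^2 / d^2 per group, where z_alpha is the critical value for a two-sided test and z_beta is the quantile matching the target power. This is an assumption of this sketch rather than the exact t-test calculation, so it comes out about one participant per group below the figures quoted above. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Assumption: normal approximation with equal group sizes; an exact
    t-test calculation gives slightly larger values.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile for the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


sizes = {d: n_per_group(d) for d in (0.2, 0.5, 0.8)}
for d, n in sizes.items():
    print(f"d = {d}: about {n} per group")
```

For the final numbers in a study protocol, a dedicated power routine based on the noncentral t distribution is the safer choice.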

Question 4

How does the significance level (alpha) affect sample size?

Accepted Answer

A smaller alpha (e.g., 0.01 vs 0.05) sets a stricter criterion for significance, so a larger sample is needed to achieve the same power. At alpha = 0.05 and power 0.80 for a medium effect (d=0.5), you need about 64 per group. At alpha = 0.01, this increases to about 95 per group. In genomics and other multiple-testing scenarios, researchers often use Bonferroni-corrected alpha values (e.g., 0.05/1000 = 0.00005), which dramatically increase required sample sizes. This is one reason genome-wide association studies need thousands of participants.
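A short sketch of how the required sample grows as alpha shrinks, for a medium effect (d=0.5) at 80% power. It assumes the same normal approximation to a two-sided, two-sample test, so the results run slightly below t-test-based figures such as those quoted above. The function name and the dictionary labels are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha, power=0.80):
    """Per-group sample size (normal approximation, two-sided, equal groups)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # grows as alpha shrinks
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


alphas = {
    "conventional": 0.05,
    "stricter": 0.01,
    "Bonferroni, 1000 tests": 0.05 / 1000,  # corrected threshold from the example
}
required = {label: n_per_group(0.5, alpha) for label, alpha in alphas.items()}

for label, n in required.items():
    print(f"{label} (alpha = {alphas[label]:g}): about {n} per group")
```

The Bonferroni row shows why studies that test many hypotheses at once need far larger samples than a single comparison.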
