Sample Size for Continuous Outcomes
Comprehensive power analysis for clinical trials with continuous endpoints (e.g., blood pressure, cholesterol, weight).
1. When to Use This Method
Use this methodology when:
- Your primary endpoint is measured on a continuous scale (interval or ratio data)
- You are comparing means between two or more groups
- You need to power a superiority, non-inferiority, or equivalence trial
Common Applications
Do NOT Use When
- Your outcome is binary (use proportions method)
- Your outcome is time-to-event (use survival analysis method)
- Your outcome is a count or rate (use Poisson method)
2. Mathematical Formulation
2.1 Two-Sample Parallel Design (Superiority)
For a randomized trial comparing treatment ($\mu_T$) to control ($\mu_C$), the required sample size per group to detect a clinically meaningful difference $\Delta$ is:

$$n = \frac{2\sigma^2 \left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\Delta^2}$$

| Symbol | Description |
|---|---|
| $\sigma^2$ | Common variance (assumed equal across groups) |
| $z_{1-\alpha/2}$ | Critical value for Type I error (1.96 for α = 0.05, two-sided) |
| $z_{1-\beta}$ | Critical value for power (0.84 for 80%; 1.28 for 90%) |
| $\Delta$ | Minimum clinically important difference (MCID) |
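As a rough illustration, the normal-approximation formula $n = 2\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2 / \Delta^2$ can be evaluated directly in base R. The function name `n_per_group` is hypothetical and not part of any package; the sketch assumes a two-sided test with equal allocation.

```r
# Normal-approximation sample size per group (hypothetical helper)
n_per_group <- function(delta, sd, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # critical value for two-sided Type I error
  z_beta  <- qnorm(power)          # critical value for power
  2 * sd^2 * (z_alpha + z_beta)^2 / delta^2
}

n_raw <- n_per_group(delta = 5, sd = 10)  # about 62.8 before rounding
n_z   <- ceiling(n_raw)                   # 63 per group
```

A t-based calculation such as `power.t.test()` usually returns about one more subject per group than this z-based figure, because it accounts for estimating the variance from the data.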
2.2 Unequal Allocation
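For allocation ratio $r = n_2 / n_1$:

$$n_1 = \left(1 + \frac{1}{r}\right)\frac{\sigma^2 \left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\Delta^2}, \qquad n_2 = r\,n_1$$

Note: 1:1 allocation is most efficient. At fixed power, total N scales with $(1 + r)^2 / (4r)$, so a 2:1 ratio increases total N by about 12.5%.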
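The cost of unequal allocation can be checked numerically. The sketch below assumes the normal-approximation form $n_1 = (1 + 1/r)\,\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2 / \Delta^2$ with $n_2 = r\,n_1$; the helper `n_unequal` is a hypothetical name used only for illustration.

```r
# Per-arm sizes under allocation ratio r = n2 / n1 (hypothetical helper)
n_unequal <- function(delta, sd, r = 1, alpha = 0.05, power = 0.80) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(power)
  n1 <- (1 + 1 / r) * sd^2 * z_sum^2 / delta^2
  c(n1 = n1, n2 = r * n1)
}

equal   <- n_unequal(delta = 5, sd = 10, r = 1)
two_one <- n_unequal(delta = 5, sd = 10, r = 2)

# Relative increase in total N for 2:1 versus 1:1, before rounding
inflation <- sum(two_one) / sum(equal)      # 9/8 = 1.125

# Round the smaller arm up, then keep the ratio exact
n1 <- ceiling(two_one[["n1"]])              # 48
n2 <- 2 * n1                                # 96
```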
2.3 Clustered Designs
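When observations are nested within clusters (e.g., patients within clinics), apply the variance inflation factor (design effect):

$$\text{DEFF} = 1 + (m - 1)\rho, \qquad n_{\text{cluster}} = n \times \text{DEFF}$$

| Symbol | Description |
|---|---|
| $m$ | Average cluster size |
| $\rho$ | Intraclass correlation coefficient (ICC) |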
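Because the ICC is rarely known precisely, it can help to tabulate the design effect $1 + (m - 1)\rho$ over a plausible range. The following is a minimal sketch; the ICC grid and the starting per-group size of 64 are illustrative assumptions, not recommendations.

```r
# Design effect and clusters per arm across assumed ICC values
n_individual <- 64                 # per-group n under individual randomization
m   <- 20                          # average cluster size
icc <- c(0.01, 0.05, 0.10)         # assumed range of ICC values

deff       <- 1 + (m - 1) * icc    # variance inflation factor
n_inflated <- ceiling(n_individual * deff)
clusters   <- ceiling(n_inflated / m)   # whole clusters needed per arm

data.frame(icc, deff, n_inflated, clusters)
```

Rounding up to whole clusters means the enrolled number per arm (`clusters * m`) will usually exceed `n_inflated`.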
2.4 Repeated Measures / Longitudinal
For $k$ measurements per subject with compound-symmetric correlation $\rho$, the per-arm sample size relative to a single-measurement comparison is:

$$\frac{n_{\text{repeated}}}{n_{\text{single}}} = \frac{1 + (k - 1)\rho}{k}$$

The factor $[1 + (k - 1)\rho]/k$ is the information-time ratio for an unweighted average of repeated measurements under compound symmetry. It collapses to $1/k$ when $\rho = 0$ (independent measurements) and to $1$ when $\rho = 1$ (no benefit from repeated measures).
What the calculator actually uses. The textbook factor above is illustrative. The Sample Size Calculator selects an exact effective-variance formula based on the analysis target you choose:
- Slope (CS): the exact variance of the OLS slope under compound-symmetric errors with equally-spaced times.
- Endpoint, ANCOVA-adjusted: $1 - \rho^2$, the CUPED variance-reduction factor.
- Change from baseline (CS): $2(1 - \rho)$.
- An AR(1) variant is also available for slope and change-from-baseline targets.
See the Longitudinal Studies guide for the full derivations and the choice between targets.
Note: Benefits plateau quickly; increasing beyond 4-5 measurements yields diminishing returns.
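The plateau can be seen by tabulating the textbook factor $[1 + (k - 1)\rho]/k$ for an increasing number of measurements $k$. This sketch assumes compound symmetry and an illustrative within-subject correlation of 0.5; `rm_factor` is a hypothetical helper name.

```r
# Relative per-arm sample size for the mean of k repeated measures
# under compound symmetry: (1 + (k - 1) * rho) / k
rm_factor <- function(k, rho) (1 + (k - 1) * rho) / k

k   <- 1:8
rho <- 0.5                          # assumed within-subject correlation
data.frame(k, factor = round(rm_factor(k, rho), 3))
```

With $\rho = 0.5$ the factor drops from 1 to 0.625 by $k = 4$, and it can never fall below $\rho$ itself however many measurements are added.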
2.5 Dropout Adjustment
Inflate sample size to account for anticipated dropout. For ordinary two-arm trials where each enrolled subject has probability $d$ of not contributing a valid analysis observation:

$$n_{\text{adj}} = \frac{n}{1 - d}$$

where $d$ is the expected dropout rate.

When the squared form applies. A few designs require $(1 - d)^2$ in the denominator:
- Paired or change-from-baseline analyses requiring both a baseline and an outcome measurement; losing either one drops the subject.
- Crossover trials where missing either period invalidates the within-subject comparison.
Inflating by $1/(1 - d)$ does not fix informative missingness; that requires an analysis-stage strategy (multiple imputation, tipping-point sensitivity, pattern mixture). Dropout inflation and missing-data handling are separate decisions.
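The two inflation rules can be compared side by side. This is a minimal sketch assuming a dropout rate $d$ and the forms $n/(1 - d)$ and $n/(1 - d)^2$; `inflate_dropout` is a hypothetical helper, not a function from any package.

```r
# Inflate a per-group sample size for anticipated dropout (hypothetical helper).
# squared = TRUE applies the (1 - d)^2 form for paired or crossover designs.
inflate_dropout <- function(n, d, squared = FALSE) {
  stopifnot(d >= 0, d < 1)
  retained <- if (squared) (1 - d)^2 else (1 - d)
  ceiling(n / retained)
}

n_simple  <- inflate_dropout(64, d = 0.15)                  # 76
n_squared <- inflate_dropout(64, d = 0.15, squared = TRUE)  # 89
```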
3. Assumptions
3.1 Core Assumptions
| Assumption | Testable Criterion | Violation Consequence |
|---|---|---|
| Normality | Shapiro-Wilk p > 0.05; Q-Q plot linearity | Moderate: CLT protects with n > 30/group |
| Equal variances | Levene's test p > 0.05; ratio of SDs < 2 | Use Welch's t-test or Satterthwaite df |
| Independence | Study design ensures no clustering | Severe: inflated Type I error if ignored |
| MCID validity | Literature/clinical consensus supports Δ | Underpowered if Δ too optimistic |
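The testable criteria in the table can be checked on pilot data with base R. The sketch below uses simulated values as a stand-in for real pilot data (an assumption for illustration only). Levene's test is not in base R (the `car` package provides `leveneTest()`), so the SD-ratio criterion is used here instead.

```r
set.seed(1)
# Simulated pilot data, not real measurements
pilot_control   <- rnorm(40, mean = 120, sd = 10)
pilot_treatment <- rnorm(40, mean = 125, sd = 11)

# Normality: Shapiro-Wilk p-value per group (p > 0.05 is the table's criterion)
shapiro_p <- c(
  control   = shapiro.test(pilot_control)$p.value,
  treatment = shapiro.test(pilot_treatment)$p.value
)

# Equal variances: ratio of larger to smaller SD (criterion: below 2)
sds      <- c(sd(pilot_control), sd(pilot_treatment))
sd_ratio <- max(sds) / min(sds)

# Fallback if variances look unequal: Welch's t-test with Satterthwaite df
welch <- t.test(pilot_treatment, pilot_control, var.equal = FALSE)
```

Visual checks such as `qqnorm()` followed by `qqline()` complement the formal tests, which have little power in small pilots.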
3.2 Parameter Estimates
Variance ($\sigma^2$)
Should come from prior studies, pilot data, or published literature in similar populations. If uncertain, conduct a sensitivity analysis across a plausible range.
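One way to run such a sensitivity analysis is to recompute the per-group sample size over a grid of standard deviations. The grid below (8, 10, 12) is an illustrative assumption; in practice it would span the range supported by prior data.

```r
# Per-group sample size across a plausible range of SDs
sd_grid <- c(8, 10, 12)   # assumed plausible range around sd = 10

n_by_sd <- sapply(sd_grid, function(s) {
  ceiling(power.t.test(delta = 5, sd = s, sig.level = 0.05, power = 0.80)$n)
})

data.frame(sd = sd_grid, n_per_group = n_by_sd)
```

Because the required sample size scales with $\sigma^2$, a 20% underestimate of the SD translates into a substantially underpowered trial.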
Effect size ($\Delta$)
Must be clinically meaningful, not just statistically detectable. Overly optimistic effect sizes are a leading cause of underpowered trials.
4. Regulatory Guidance
FDA
ICH E9 (Statistical Principles for Clinical Trials)
"The number of subjects...should always be large enough to provide a reliable answer to the questions addressed." Requires justification of effect size and variance assumptions.
ICH E20 Adaptive Designs for Clinical Trials (2025)
Permits sample size re-estimation based on interim variance, but effect size must remain blinded.
EMA
CHMP Points to Consider on Adjustment for Baseline Covariates
Recommends ANCOVA for continuous outcomes, which can reduce variance and required sample size.
EMA Guideline on Missing Data (2010)
Requires sensitivity analyses for missing data; dropout adjustment should be pre-specified.
Key Citations
- ICH E9: Statistical Principles for Clinical Trials (1998)
- ICH E20: Adaptive Designs for Clinical Trials (2025)
- EMA: Guideline on Adjustment for Baseline Covariates in Clinical Trials (2015)
5. Validation Against Industry Standards
| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra | Status |
|---|---|---|---|---|---|
| Two-sample t-test | α=0.05, power=0.80, Δ=5, σ=10 | 64/group | 64/group | 64/group | ✓ Match |
| Two-sample t-test | α=0.05, power=0.90, Δ=5, σ=10 | 86/group | 86/group | 86/group | ✓ Match |
| Unequal allocation (2:1) | α=0.05, power=0.80, Δ=5, σ=10 | 48/96 | 48/96 | 48/96 | ✓ Match |
| Cluster RCT | ICC=0.05, m=20, Δ=5, σ=10 | 127/group | 127/group | 127/group | ✓ Match |
Minor variations (±1 subject) may occur due to rounding conventions.
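Independently of any software package, a tabulated sample size can be sanity-checked by simulation. The sketch below estimates the power of the first scenario (64 per group, Δ = 5, σ = 10) by Monte Carlo; the seed and number of replicates are arbitrary choices.

```r
set.seed(2024)
n_sims <- 2000
n      <- 64    # per group, from the first validation row
delta  <- 5
sigma  <- 10

# Simulate trials under the alternative and record each t-test p-value
p_values <- replicate(n_sims, {
  control   <- rnorm(n, mean = 0,     sd = sigma)
  treatment <- rnorm(n, mean = delta, sd = sigma)
  t.test(treatment, control, var.equal = TRUE)$p.value
})

# Proportion of simulated trials rejecting at the two-sided 0.05 level
empirical_power <- mean(p_values < 0.05)
empirical_power   # should land close to 0.80, up to Monte Carlo error
```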
6. Example SAP Language
Sample Size Justification
The primary endpoint is change from baseline in [outcome] at Week [X]. Based on prior studies (Author et al., Year), we assume a standard deviation of [σ] units. A difference of [Δ] units is considered the minimum clinically important difference based on [justification].
Using a two-sample t-test with a two-sided significance level of 0.05 and 80% power, [n] subjects per group are required. To account for an anticipated dropout rate of [X]%, we will enroll [N*] subjects per group ([total] subjects total).
Calculations were performed using [Zetyra / gsDesign / PASS] and validated against [reference software].
7. R Code
```r
# Two-sample t-test sample size
power.t.test(
  delta = 5,                 # MCID
  sd = 10,                   # Standard deviation
  sig.level = 0.05,          # Alpha (two-sided)
  power = 0.80,              # 1 - beta
  type = "two.sample",
  alternative = "two.sided"
)
# Result: n = 64 per group

# With unequal allocation (2:1)
# Using pwr package. pwr.t2n.test() cannot solve for n1 and n2 together,
# so fix n2 = k * n1 and search for the n1 that reaches the target power.
library(pwr)
k <- 2                       # allocation ratio n2 / n1
power_gap <- function(n1) {
  pwr.t2n.test(
    n1 = n1,
    n2 = k * n1,
    d = 5/10,                # Cohen's d = delta/sd
    sig.level = 0.05,
    alternative = "two.sided"
  )$power - 0.80
}
n1 <- ceiling(uniroot(power_gap, interval = c(5, 500))$root)
n2 <- k * n1

# Cluster RCT adjustment
n_simple <- 64
m <- 20                      # cluster size
icc <- 0.05                  # intraclass correlation
deff <- 1 + (m - 1) * icc    # design effect = 1.95
n_cluster <- ceiling(n_simple * deff)
# Result: n = 125 per group (rounded up)

# Dropout adjustment (ordinary independent-subject dropout)
dropout_rate <- 0.15
n_adjusted <- ceiling(n_cluster / (1 - dropout_rate))
# Result: n = 148 per group
# Use (1 - dropout_rate)^2 only for paired/change-from-baseline designs
# where losing either measurement drops the subject.
```
8. References
- Chow SC, Shao J, Wang H, Lokhnygina Y. Sample Size Calculations in Clinical Research. 3rd ed. CRC Press; 2017.
- Julious SA. Sample Sizes for Clinical Trials. CRC Press; 2010.
- Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.
- Vickers AJ, Altman DG. Analysing controlled trials with baseline and follow up measurements. BMJ. 2001;323(7321):1123-1124.
- Diggle PJ, Heagerty P, Liang KY, Zeger SL. Analysis of Longitudinal Data. 2nd ed. Oxford University Press; 2002.
- Donner A, Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. Arnold; 2000.
- International Council for Harmonisation (ICH). E9 Statistical Principles for Clinical Trials. February 1998.
Last updated: May 2026