
Sample Size for Binary Outcomes

Author: Zetyra

Comprehensive power analysis for clinical trials with dichotomous endpoints (e.g., response rates, event incidence, success/failure).

1. When to Use This Method

Use this methodology when:

• Your primary endpoint is a binary outcome (yes/no, success/failure, event/no event)
• You are comparing proportions between two or more groups
• You need to power a superiority, non-inferiority, or equivalence trial

Common Applications

• Response rates (responder vs. non-responder)
• Mortality or adverse event incidence
• Cure rates (cured vs. not cured)
• Conversion rates (A/B testing)
• Disease recurrence (yes/no)

Do NOT Use When

• Your outcome is continuous (use means comparison method)
• Your outcome is time-to-event with censoring (use survival analysis method)
• Your outcome is a count with no upper bound (use Poisson method)
• You have paired/matched binary data (use McNemar's test method)

2. Mathematical Formulation

2.1 Two-Sample Parallel Design (Superiority)

For a randomized trial comparing an intervention proportion $p_I$ to a control proportion $p_C$, the required sample size per group is:

n = \frac{[z_{1-\alpha/2}\sqrt{2\bar{p}(1-\bar{p})} + z_{1-\beta}\sqrt{p_I(1-p_I) + p_C(1-p_C)}]^2}{(p_I - p_C)^2}

Symbol	Description
$\bar{p}$	Pooled proportion under H₀: $(p_I + p_C)/2$
$z_{1-\alpha/2}$	Critical value for Type I error (1.96 for α = 0.05, two-sided)
$z_{1-\beta}$	Critical value for power (0.84 for 80%; 1.28 for 90%)

Simplified approximation (equal groups):

n = \frac{2\bar{p}(1-\bar{p})(z_{1-\alpha/2} + z_{1-\beta})^2}{(p_I - p_C)^2}
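
The full Section 2.1 formula can be sketched in base R as follows. This is a minimal illustration, not the calculator's implementation; the function name `n_two_prop` and its argument names are hypothetical.

```r
# Per-group sample size for a two-sided, two-proportion superiority test,
# using the normal-approximation formula from Section 2.1 (equal allocation).
# n_two_prop is an illustrative name, not part of any package.
n_two_prop <- function(p_i, p_c, alpha = 0.05, power = 0.80) {
  p_bar   <- (p_i + p_c) / 2            # pooled proportion under H0
  z_alpha <- qnorm(1 - alpha / 2)       # two-sided critical value
  z_beta  <- qnorm(power)               # critical value for power
  numerator <- (z_alpha * sqrt(2 * p_bar * (1 - p_bar)) +
                z_beta  * sqrt(p_i * (1 - p_i) + p_c * (1 - p_c)))^2
  ceiling(numerator / (p_i - p_c)^2)    # round up to whole subjects
}

n_two_prop(0.30, 0.20)                  # 294
n_two_prop(0.30, 0.20, power = 0.90)    # 392
```

Rounding up with `ceiling()` is the usual convention, since rounding down would leave the design slightly underpowered.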

2.2 Unequal Allocation

For allocation ratio $k = n_C/n_I$ :

n_I = \frac{[z_{1-\alpha/2}\sqrt{(1+1/k)\bar{p}(1-\bar{p})} + z_{1-\beta}\sqrt{p_I(1-p_I) + p_C(1-p_C)/k}]^2}{(p_I - p_C)^2}

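With $n_C = k \times n_I$. For unequal allocation, take $\bar{p}$ as the allocation-weighted proportion $(p_I + k\,p_C)/(1 + k)$, which reduces to $(p_I + p_C)/2$ when $k = 1$.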
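
A hedged sketch of the unequal-allocation formula in base R is shown below. It assumes the pooled proportion is weighted by the allocation ratio, $(p_I + k\,p_C)/(1 + k)$; the function name `n_unequal` is hypothetical.

```r
# Sample sizes for allocation ratio k = n_C / n_I (Section 2.2 formula).
# Assumption: p_bar is weighted by allocation, (p_i + k * p_c) / (1 + k).
n_unequal <- function(p_i, p_c, k = 1, alpha = 0.05, power = 0.80) {
  p_bar   <- (p_i + k * p_c) / (1 + k)
  z_alpha <- qnorm(1 - alpha / 2)
  z_beta  <- qnorm(power)
  numerator <- (z_alpha * sqrt((1 + 1 / k) * p_bar * (1 - p_bar)) +
                z_beta  * sqrt(p_i * (1 - p_i) + p_c * (1 - p_c) / k))^2
  n_i <- ceiling(numerator / (p_i - p_c)^2)
  c(n_i = n_i, n_c = ceiling(k * n_i))  # control arm is k times larger
}

n_unequal(0.30, 0.20, k = 1)   # reduces to the equal-allocation result
n_unequal(0.30, 0.20, k = 2)   # 1:2 allocation; total N exceeds the 1:1 total
```

With $k = 1$ the function reproduces the equal-allocation result, which is a quick sanity check on any implementation.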

2.3 Non-Inferiority Design

For testing whether the new treatment is no worse than control by margin $\delta$ :

n = \frac{[z_{1-\alpha} + z_{1-\beta}]^2 [p_I(1-p_I) + p_C(1-p_C)]}{(p_I - p_C + \delta)^2}

Note: One-sided α (typically 0.025) is standard for non-inferiority. Non-inferiority margins are typically small, resulting in substantially larger sample sizes than superiority trials.
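
The non-inferiority formula above can be wrapped in a small base R helper. This is a sketch under the formula's own assumptions (higher proportions are better, one-sided α); the name `n_noninferiority` is hypothetical.

```r
# Per-group sample size for a non-inferiority test on proportions
# (Section 2.3 formula; one-sided alpha, margin delta > 0).
n_noninferiority <- function(p_i, p_c, delta, alpha = 0.025, power = 0.80) {
  z_alpha <- qnorm(1 - alpha)           # one-sided critical value
  z_beta  <- qnorm(power)
  var_sum <- p_i * (1 - p_i) + p_c * (1 - p_c)
  ceiling((z_alpha + z_beta)^2 * var_sum / (p_i - p_c + delta)^2)
}

n_noninferiority(0.20, 0.20, delta = 0.10)   # 252
n_noninferiority(0.20, 0.20, delta = 0.05)   # halving the margin roughly quadruples n
```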

2.4 Equivalence Design

For testing whether treatments differ by no more than $\pm\delta$ :

n = \frac{[z_{1-\alpha} + z_{1-\beta/2}]^2 [p_I(1-p_I) + p_C(1-p_C)]}{(\delta - |p_I - p_C|)^2}
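
As a rough illustration, the equivalence formula can be coded in base R as below. The function name `n_equivalence` is hypothetical, and the α passed in is the level of each one-sided test.

```r
# Per-group sample size for an equivalence test on proportions
# (Section 2.4 formula). The expected difference must lie inside the margin.
n_equivalence <- function(p_i, p_c, delta, alpha, power = 0.80) {
  stopifnot(abs(p_i - p_c) < delta)
  beta    <- 1 - power
  z_alpha <- qnorm(1 - alpha)
  z_beta2 <- qnorm(1 - beta / 2)        # beta is split across the two margins
  var_sum <- p_i * (1 - p_i) + p_c * (1 - p_c)
  ceiling((z_alpha + z_beta2)^2 * var_sum / (delta - abs(p_i - p_c))^2)
}

n_equivalence(0.20, 0.20, delta = 0.10, alpha = 0.025)   # 337
```

With the same inputs, the equivalence design needs more subjects than the non-inferiority design because both margins must be excluded.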

2.5 Clustered Designs

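When observations are nested within clusters, multiply the individually randomized sample size by the variance inflation factor (design effect), where $m$ is the cluster size and $\rho$ is the intraclass correlation coefficient (ICC):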

n_{\text{clustered}} = n_{\text{simple}} \times [1 + (m-1)\rho]
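
A minimal base R sketch of the equal-cluster-size design effect follows, taking $m$ as the cluster size and $\rho$ as the intraclass correlation (the same meanings used in the R code section). The function names are hypothetical.

```r
# Design effect for equal cluster sizes: 1 + (m - 1) * rho
design_effect <- function(m, rho) 1 + (m - 1) * rho

# Inflate an individually randomized sample size and convert to clusters.
inflate_for_clustering <- function(n_simple, m, rho) {
  n_subjects <- ceiling(n_simple * design_effect(m, rho))
  c(n_subjects = n_subjects,
    n_clusters = ceiling(n_subjects / m))   # clusters needed per arm
}

design_effect(20, 0.05)                      # 1.95
inflate_for_clustering(294, m = 20, rho = 0.05)
# n_subjects = 574, n_clusters = 29
```

Reporting the number of clusters alongside the number of subjects is useful because clusters, not individuals, are the unit of randomization.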

For unequal cluster sizes, use the mean cluster size $\bar{m}$ and the coefficient of variation (CV) of cluster size inside the design effect:

n_{\text{clustered}} = n_{\text{simple}} \times [1 + ((1 + CV^2)\bar{m} - 1)\rho]

2.6 Continuity Correction

For small samples or proportions near 0 or 1, apply a continuity correction to the uncorrected per-group sample size $n$:

n_{\text{corrected}} = \frac{n}{4}\left(1 + \sqrt{1 + \frac{4}{n|p_I - p_C|}}\right)^2
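
The correction can be applied to an already computed $n$ with a one-line helper. The sketch below uses base R only; the name `n_continuity` is hypothetical.

```r
# Continuity-corrected per-group sample size (Section 2.6 formula),
# applied to an uncorrected n.
n_continuity <- function(n, p_i, p_c) {
  ceiling(n / 4 * (1 + sqrt(1 + 4 / (n * abs(p_i - p_c))))^2)
}

n_continuity(294, 0.30, 0.20)   # 314
```

The corrected value is always larger than the uncorrected one, and the relative increase grows as the difference between the proportions shrinks.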

2.7 Dropout Adjustment

Inflate sample size to account for anticipated dropout. For ordinary two-arm trials where each enrolled subject has probability $d$ of not contributing a valid analysis observation:

N^* = \frac{N}{1 - d}

Here $N$ is the number of evaluable subjects required by the power calculation and $d$ is the expected dropout rate.

The squared form $N/(1-d)^2$ applies only to paired or change-from-baseline designs that require both a baseline and an outcome measurement, and to crossover trials where missing either period invalidates the within-subject comparison.

Inflating $N$ does not fix informative missingness; that requires an analysis-stage strategy (e.g., multiple imputation or a tipping-point sensitivity analysis).

3. Assumptions

3.1 Core Assumptions

Assumption	Testable Criterion	Violation Consequence
Independence	Study design ensures no clustering	Severe: inflated Type I error if ignored
Fixed proportions	Event rates stable over enrollment period	Moderate: time-varying rates may require stratification
Large sample	$n \times p \geq 5$ and $n \times (1-p) \geq 5$	Use exact methods (Fisher's) if violated
No confounding	Randomization successful	Bias in effect estimate
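
The large-sample criterion in the table is easy to check programmatically. The helper below is a sketch with a hypothetical name; the threshold of 5 is taken from the table.

```r
# Check the large-sample rule: n * p >= 5 and n * (1 - p) >= 5
# for every group proportion supplied in p.
large_sample_ok <- function(n, p, threshold = 5) {
  all(n * p >= threshold, n * (1 - p) >= threshold)
}

large_sample_ok(294, c(0.30, 0.20))   # TRUE
large_sample_ok(50, 0.05)             # FALSE: 50 * 0.05 = 2.5 expected events
```

When the check fails, the table's recommendation applies: plan the analysis and the sample size around an exact method instead.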

3.2 Parameter Estimates

Control rate ( $p_C$ )

Should come from prior studies, pilot data, or published literature. Consider secular trends—rates may have changed since historical studies.

Treatment effect

Can be specified as absolute difference ( $p_I - p_C$ ), relative risk ( $p_I/p_C$ ), or odds ratio. Ensure clinical relevance, not just statistical detectability.
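
Because the sample size formulas take $p_I$ and $p_C$ directly, an effect stated as a relative risk or odds ratio must first be converted to an intervention proportion. A minimal base R sketch, with hypothetical helper names:

```r
# Intervention proportion implied by a relative risk: p_I = RR * p_C
p_from_rr <- function(p_c, rr) rr * p_c

# Intervention proportion implied by an odds ratio:
# odds_I = OR * odds_C, then convert the odds back to a proportion.
p_from_or <- function(p_c, or) {
  odds_i <- or * p_c / (1 - p_c)
  odds_i / (1 + odds_i)
}

p_from_rr(0.40, 0.75)   # 0.30 (a 25% relative reduction from 40%)
p_from_or(0.20, 2)      # 0.3333 (odds of 0.25 doubled to 0.50)
```

The same relative risk implies very different absolute differences at different control rates, which is why the control rate drives the sample size so strongly.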

Event Rate Impact on Sample Size

Control Rate	25% Relative Reduction	Required n/group (80% power, two-sided α = 0.05)
40%	40% → 30%	356
20%	20% → 15%	906
10%	10% → 7.5%	2,005
5%	5% → 3.75%	4,202
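
The pattern in this table can be explored with base R's `power.prop.test`, which uses the same normal approximation as Section 2.1. The sketch below assumes a two-sided α of 0.05; tabulated values computed with a different approximation (for example, the arcsine transformation) can differ slightly from its output.

```r
# Required n per group for a 25% relative reduction at several control rates.
control_rates      <- c(0.40, 0.20, 0.10, 0.05)
relative_reduction <- 0.25

n_per_group <- sapply(control_rates, function(p_c) {
  p_i <- p_c * (1 - relative_reduction)
  ceiling(power.prop.test(p1 = p_i, p2 = p_c,
                          sig.level = 0.05, power = 0.80)$n)
})

data.frame(
  control_rate      = control_rates,
  intervention_rate = control_rates * (1 - relative_reduction),
  n_per_group       = n_per_group
)
```

The required sample size rises steeply as the control rate falls, because the absolute difference shrinks while the relative reduction stays fixed.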

4. Regulatory Guidance

FDA

ICH E9 (Statistical Principles for Clinical Trials)

Requires prospective sample size justification with clearly stated assumptions for event rates and effect sizes.

FDA Guidance on Non-Inferiority Trials (2016)

Non-inferiority margin must preserve a clinically meaningful fraction of the active control effect. Describes the fixed-margin approach (also called the 95-95 method) and the synthesis approach.

FDA Guidance on Multiple Endpoints (2022)

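When a trial can succeed on any one of several primary binary endpoints, apply a multiplicity adjustment (e.g., Bonferroni: test each of k endpoints at α/k), which increases the required sample size. Co-primary endpoints, where every endpoint must reach significance, need no α adjustment but reduce overall power, which also increases the required sample size.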
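
The cost of a Bonferroni split can be sketched with base R's `power.prop.test`. The number of endpoints (2) and the proportions are illustrative assumptions, not values from the guidance.

```r
# Effect of a Bonferroni-adjusted alpha on the per-group sample size.
n_endpoints <- 2   # illustrative number of endpoints sharing alpha

n_unadjusted <- ceiling(power.prop.test(p1 = 0.30, p2 = 0.20,
                                        sig.level = 0.05,
                                        power = 0.80)$n)
n_bonferroni <- ceiling(power.prop.test(p1 = 0.30, p2 = 0.20,
                                        sig.level = 0.05 / n_endpoints,
                                        power = 0.80)$n)

c(unadjusted = n_unadjusted, bonferroni = n_bonferroni)
```

Comparing the two values before finalizing the protocol makes the price of each additional endpoint explicit.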

EMA

CHMP Guideline on Non-Inferiority (2005)

Margin selection must be justified based on historical evidence of active control efficacy vs. placebo.

EMA Points to Consider on Switching between Superiority and Non-Inferiority (2000)

Pre-specification required for switching between superiority and non-inferiority; cannot switch post-hoc based on results.

Key Citations

ICH E9: Statistical Principles for Clinical Trials (1998)
FDA Guidance: Non-Inferiority Clinical Trials to Establish Effectiveness (2016)
FDA Guidance: Multiple Endpoints in Clinical Trials (2022)
CHMP: Guideline on the Choice of the Non-Inferiority Margin (2005)

5. Validation Against Industry Standards

Scenario	Parameters	PASS 2024	nQuery 9.5	Zetyra	Status
Two-proportion (superiority)	p₁=0.30, p₂=0.20, α=0.05, power=0.80	294/group	294/group	294/group	✓ Match
Two-proportion (superiority)	p₁=0.30, p₂=0.20, α=0.05, power=0.90	392/group	393/group	392/group	✓ Match
Non-inferiority	p₁=p₂=0.20, δ=0.10, α=0.025, power=0.80	252/group	252/group	252/group	✓ Match
Cluster RCT	p=0.25, ICC=0.05, m=20	582/group	583/group	582/group	✓ Match

Minor variations (±1 subject) may occur due to rounding conventions and continuity correction options.

6. Example SAP Language

Superiority Trial

The primary endpoint is the proportion of subjects achieving [response criterion] at Week [X]. Based on prior studies (Author et al., Year), the expected response rate in the control group is [p_C]%. We hypothesize that the intervention will achieve a response rate of [p_I]%, representing an absolute improvement of [difference]%.

Using a two-sided chi-square test with α = 0.05 and 80% power, [n] subjects per group are required. To account for an anticipated dropout rate of [X]%, we will enroll [N*] subjects per group ([total] subjects total).

Calculations were performed using [Zetyra / PASS / nQuery] and validated against published formulas (Fleiss et al., 2003).

Non-Inferiority Trial

The primary endpoint is the proportion of subjects achieving [outcome] at Week [X]. This is a non-inferiority trial comparing [new treatment] to [active control].

Based on historical trials (Author et al., Year), the active control achieves a response rate of approximately [p_C]%. We assume the new treatment will have a similar response rate. The non-inferiority margin is set at [δ]%, which preserves at least [X]% of the historical treatment effect over placebo, consistent with FDA guidance.

Using a one-sided test with α = 0.025 and 80% power, [n] subjects per group are required. To account for an anticipated dropout rate of [X]%, we will enroll [N*] subjects per group.

7. R Code

# Two-proportion superiority test
library(pwr)

# Method 1: Using pwr package (effect size h)
p1 <- 0.30  # Intervention proportion
p2 <- 0.20  # Control proportion
h <- ES.h(p1, p2)  # Cohen's h effect size

pwr.2p.test(
  h = h,
  sig.level = 0.05,
  power = 0.80,
  alternative = "two.sided"
)
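# Result: n ≈ 292 per group (the arcsine transformation behind Cohen's h
# gives a slightly smaller n than the 294 from the normal-approximation formula)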

# Method 2: Using power.prop.test (base R)
power.prop.test(
  p1 = 0.30,
  p2 = 0.20,
  sig.level = 0.05,
  power = 0.80,
  alternative = "two.sided"
)
# Result: n = 294 per group

# Non-inferiority test (base R only; the Section 2.3 formula is coded by hand below)

p_control <- 0.20
p_treatment <- 0.20  # Assume equal under H1
delta <- 0.10        # Non-inferiority margin
alpha <- 0.025       # One-sided

# Manual calculation
z_alpha <- qnorm(1 - alpha)
z_beta <- qnorm(0.80)
var_sum <- p_treatment*(1-p_treatment) + p_control*(1-p_control)

n_ni <- ((z_alpha + z_beta)^2 * var_sum) / (p_treatment - p_control + delta)^2
ceiling(n_ni)
# Result: n = 252 per group

# Cluster RCT adjustment
n_simple <- 294
m <- 20          # cluster size
icc <- 0.05      # intraclass correlation
deff <- 1 + (m - 1) * icc  # design effect = 1.95
n_cluster <- ceiling(n_simple * deff)
# Result: n = 574 per group

# Dropout adjustment (ordinary independent-subject dropout)
dropout_rate <- 0.15
n_adjusted <- ceiling(n_cluster / (1 - dropout_rate))
# Result: n = 676 per group
# Use (1 - dropout_rate)^2 only for paired/change-from-baseline designs
# where losing either measurement drops the subject.
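
A closed-form sample size can be sanity-checked by simulation. The sketch below is a base R illustration with a hypothetical function name and an arbitrary seed; it estimates the power of a pooled two-sided z-test at n = 294 per group, which should land close to the 80% target.

```r
# Monte Carlo check of power for a two-proportion z-test (pooled variance).
simulate_power <- function(n, p_i, p_c, alpha = 0.05, n_sim = 5000) {
  x_i <- rbinom(n_sim, n, p_i)              # responders, intervention arm
  x_c <- rbinom(n_sim, n, p_c)              # responders, control arm
  p_pool <- (x_i + x_c) / (2 * n)           # pooled proportion under H0
  z <- (x_i / n - x_c / n) / sqrt(2 * p_pool * (1 - p_pool) / n)
  mean(abs(z) > qnorm(1 - alpha / 2))       # share of trials rejecting H0
}

set.seed(2024)                              # arbitrary seed for reproducibility
simulate_power(294, 0.30, 0.20)
```

The estimate varies by about one percentage point between seeds at 5,000 replicates, so small departures from 0.80 are expected.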

8. References

Fleiss JL, Levin B, Paik MC. Statistical Methods for Rates and Proportions. 3rd ed. Wiley; 2003.
Chow SC, Shao J, Wang H, Lokhnygina Y. Sample Size Calculations in Clinical Research. 3rd ed. CRC Press; 2017.
Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.
Yates F. Contingency tables involving small numbers and the χ² test. JRSS Supplement. 1934;1:217-235.
Donner A, Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. Arnold; 2000.
U.S. Food and Drug Administration. Non-Inferiority Clinical Trials to Establish Effectiveness: Guidance for Industry. November 2016.
International Council for Harmonisation (ICH). E9 Statistical Principles for Clinical Trials. February 1998.

Last updated: May 2026
