A Complete Guide to Bayesian Predictive Power

Bayesian predictive power (PP), also called predictive probability of success (PPoS) and sometimes described as posterior-weighted conditional power, is a statistical method used at interim analyses to estimate the probability that an ongoing clinical trial will ultimately meet its primary success criterion, given the data observed so far.

Analogy: GPS for a Clinical Trial

Predictive power is like a GPS for a clinical trial. Traditional power is your estimated arrival time before you leave the driveway. Conditional power is a recalculation assuming you will drive at exactly one fixed speed for the rest of the trip. Bayesian predictive power uses how fast you have actually been driving so far (the data) and what you know about your driving habits (the prior) to give the most realistic probability that you will still arrive on time.

1. Conceptual Framework

In the Bayesian framework, the unknown treatment effect (denoted $\theta$ or $\mu$) is treated as a random variable rather than a fixed constant.

Predictive power is defined as:

  1. A weighted average of conditional power: Instead of assuming a single future effect size, predictive power averages conditional power over many plausible values of $\theta$.
  2. Posterior-weighted: The weights come from the posterior distribution $\pi(\theta \mid D_k)$, which updates the prior belief using the interim data $D_k$.
  3. Decision-oriented: The goal is not hypothesis testing per se, but answering a pragmatic question: Given what we know now, is it still worth continuing this trial?
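
To make the posterior weighting concrete, here is the update in one common special case. This is a sketch under assumptions that are not part of the guide: the interim estimate $\hat\theta_k$ is treated as normally distributed with known statistical information $I_k$ (the inverse of its squared standard error), and the prior is normal with mean $m_0$ and standard deviation $s_0$. Other models need a different update.

```latex
\begin{aligned}
\hat\theta_k \mid \theta &\sim N\!\left(\theta,\; I_k^{-1}\right),
\qquad \theta \sim N\!\left(m_0,\; s_0^2\right), \\
\pi(\theta \mid D_k) &= N\!\left(m_k,\; v_k\right), \\
v_k &= \left(s_0^{-2} + I_k\right)^{-1},
\qquad m_k = v_k \left(s_0^{-2}\, m_0 + I_k\, \hat\theta_k\right).
\end{aligned}
```

The posterior mean $m_k$ is a precision-weighted average of the prior mean and the interim estimate, so the data take over from the prior as $I_k$ grows.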

2. Mathematical Definition

At interim analysis $k$, Bayesian predictive power is defined as:

$$PP_k = \int CP_k(\theta) \, \pi(\theta \mid D_k) \, d\theta$$

where:

  • $CP_k(\theta)$ is the conditional power assuming a specific true treatment effect $\theta$
  • $\pi(\theta \mid D_k)$ is the posterior distribution of $\theta$ given the interim data $D_k$

Because this expression integrates over a full distribution, predictive power is typically computed using numerical integration or Monte Carlo simulation.
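
The integral above can be approximated by drawing values of $\theta$ from the posterior and averaging the conditional power at each draw. The sketch below is one minimal way to do that, assuming a normally distributed effect estimate with known information, a normal prior, and a one-sided final z-test at $\alpha = 0.025$. The function names, the interim numbers, and the information scale (information = n per arm / 2 when the outcome SD is 1) are illustrative assumptions, not part of the guide.

```python
import math
import random
from statistics import NormalDist

NORM = NormalDist()

def conditional_power(theta, theta_hat, info_k, info_final, alpha=0.025):
    """CP_k(theta): chance the final z-test succeeds if the true effect is theta."""
    z_crit = NORM.inv_cdf(1 - alpha)
    info_rem = info_final - info_k  # information still to be collected
    num = info_k * theta_hat + info_rem * theta - z_crit * math.sqrt(info_final)
    return NORM.cdf(num / math.sqrt(info_rem))

def predictive_power_mc(theta_hat, info_k, info_final, prior_mean, prior_sd,
                        alpha=0.025, n_draws=20_000, seed=1):
    """Monte Carlo estimate of PP_k: average CP_k(theta) over posterior draws."""
    rng = random.Random(seed)
    # Conjugate normal posterior pi(theta | D_k) (assumed model)
    post_var = 1.0 / (1.0 / prior_sd**2 + info_k)
    post_mean = post_var * (prior_mean / prior_sd**2 + info_k * theta_hat)
    post_sd = math.sqrt(post_var)
    total = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(post_mean, post_sd)
        total += conditional_power(theta, theta_hat, info_k, info_final, alpha)
    return total / n_draws

# Hypothetical interim look: 50 of 100 patients per arm, outcome SD = 1
pp = predictive_power_mc(theta_hat=0.3, info_k=25.0, info_final=50.0,
                         prior_mean=0.0, prior_sd=10.0)
print(f"Predictive power: {pp:.3f}")
```

In this conjugate setting the integral also has a closed form, so simulation is not strictly needed. The Monte Carlo version is shown because the same pattern carries over to models where no closed form exists.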

A Simple Worked Illustration

| Prior belief | Interim trend | Resulting PP |
| --- | --- | --- |
| Skeptical (centered near null) | Modest benefit | Low–moderate PP |
| Neutral / weakly informative | Clear benefit | Moderate–high PP |
| Enthusiastic (optimistic mean) | Flat or negative trend | Very low PP |

Key feature: Strong interim data can overwhelm optimistic priors, while weak data cannot be rescued by belief alone.
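
The three rows of the illustration can be given rough numbers. The sketch below assumes a normal effect estimate with known information and a normal prior, for which predictive power has a closed form. Every number in it (prior means and SDs, interim estimates, information levels) is hypothetical and chosen only to mirror the qualitative pattern in the table.

```python
import math
from statistics import NormalDist

NORM = NormalDist()

def predictive_power(theta_hat, info_k, info_final, prior_mean, prior_sd, alpha=0.025):
    """Closed-form PP for a normal estimate with a normal prior (assumed model)."""
    z_crit = NORM.inv_cdf(1 - alpha)
    info_rem = info_final - info_k
    post_var = 1.0 / (1.0 / prior_sd**2 + info_k)
    post_mean = post_var * (prior_mean / prior_sd**2 + info_k * theta_hat)
    num = info_k * theta_hat + info_rem * post_mean - z_crit * math.sqrt(info_final)
    # Predictive variance = sampling noise + remaining uncertainty about theta
    den = math.sqrt(info_rem + info_rem**2 * post_var)
    return NORM.cdf(num / den)

# (label, prior mean, prior SD, interim estimate); all values hypothetical
SCENARIOS = [
    ("Skeptical prior, modest benefit", 0.0, 0.2, 0.20),
    ("Weak prior, clear benefit", 0.0, 10.0, 0.45),
    ("Enthusiastic prior, negative trend", 0.45, 0.2, -0.10),
]

results = {}
for label, m0, s0, theta_hat in SCENARIOS:
    # Interim at half of the planned information (25 of 50)
    results[label] = predictive_power(theta_hat, 25.0, 50.0, m0, s0)
    print(f"{label}: PP = {results[label]:.3f}")
```

Under these assumed inputs the enthusiastic prior does not rescue the negative trend, which is the behavior the key feature describes.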

3. Role in Trial Monitoring and Stopping Decisions

Predictive power is most commonly used for stochastic curtailment—stopping a trial early when future outcomes are highly predictable.

Stopping for Futility

If predictive power drops below a pre-specified threshold (often 10–20%), the trial is unlikely to succeed even if continued to completion.

Stopping for Efficacy

If predictive power exceeds a prespecified high threshold (values close to 1, for example 0.95 or above, depending on the design), the accumulating evidence may justify early termination for benefit.

Clarifying Null Trends

When interim effects are near zero, predictive power helps determine whether continuing will meaningfully narrow uncertainty or merely consume resources.

Important: Predictive power focuses on future success, not on whether a boundary has already been crossed.
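
The stopping logic in this section amounts to comparing predictive power against two prespecified thresholds. A minimal sketch follows; the function name, the default thresholds of 0.10 and 0.95, and the returned labels are illustrative assumptions, and a real rule would come from the trial's statistical analysis plan.

```python
def interim_decision(pp, futility_threshold=0.10, efficacy_threshold=0.95):
    """Map a predictive power value to a monitoring recommendation."""
    if not 0.0 <= pp <= 1.0:
        raise ValueError("predictive power must lie in [0, 1]")
    if futility_threshold >= efficacy_threshold:
        raise ValueError("futility threshold must be below efficacy threshold")
    if pp < futility_threshold:
        return "stop for futility"
    if pp > efficacy_threshold:
        return "consider stopping for efficacy"
    return "continue"

for value in (0.05, 0.50, 0.99):
    print(value, "->", interim_decision(value))
```

The output is a recommendation rather than a verdict: a monitoring committee would normally weigh it alongside safety data and secondary endpoints.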

4. Comparison with Conditional Power

Traditional frequentist conditional power requires specifying a single assumed future effect size—commonly the design alternative or the observed interim estimate.

Bayesian predictive power differs in several important ways:

No single-point assumption: It averages over uncertainty in the effect size rather than fixing it.

Explicit uncertainty accounting: Prior beliefs and interim variability are formally incorporated.

Improved interpretability: Results are expressed as a probability of eventual success, which is often more intuitive for decision-makers.

Operational flexibility: Predictive power can accommodate enrollment overruns and additional looks if the monitoring rule is prespecified or the operating characteristics are revalidated by simulation under the revised schedule.

This framing avoids the common situation where different conditional power assumptions lead to dramatically different—and equally fragile—conclusions.
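
The contrast can be seen by computing all three quantities on the same interim data. The sketch below assumes a normal effect estimate with known information and a one-sided final z-test at $\alpha = 0.025$. The interim estimate, the information levels, the 90% design power, and the weak prior are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

NORM = NormalDist()
Z_CRIT = NORM.inv_cdf(0.975)  # one-sided alpha = 0.025

def conditional_power(theta, theta_hat, info_k, info_final):
    """Probability of final success if the true effect equals theta."""
    info_rem = info_final - info_k
    num = info_k * theta_hat + info_rem * theta - Z_CRIT * math.sqrt(info_final)
    return NORM.cdf(num / math.sqrt(info_rem))

def predictive_power(theta_hat, info_k, info_final, prior_mean, prior_sd):
    """Closed-form posterior average of conditional power (normal-normal model)."""
    info_rem = info_final - info_k
    post_var = 1.0 / (1.0 / prior_sd**2 + info_k)
    post_mean = post_var * (prior_mean / prior_sd**2 + info_k * theta_hat)
    num = info_k * theta_hat + info_rem * post_mean - Z_CRIT * math.sqrt(info_final)
    return NORM.cdf(num / math.sqrt(info_rem + info_rem**2 * post_var))

theta_hat, info_k, info_final = 0.2, 25.0, 50.0
# Effect size that would give 90% power at the final analysis (assumed design)
design_effect = (Z_CRIT + NORM.inv_cdf(0.90)) / math.sqrt(info_final)

cp_design = conditional_power(design_effect, theta_hat, info_k, info_final)
cp_trend = conditional_power(theta_hat, theta_hat, info_k, info_final)
pp = predictive_power(theta_hat, info_k, info_final, prior_mean=0.0, prior_sd=10.0)

print(f"CP under design alternative: {cp_design:.3f}")
print(f"CP under current trend:      {cp_trend:.3f}")
print(f"Predictive power (weak prior): {pp:.3f}")
```

With these inputs the two conditional power values differ widely, while predictive power gives a single figure that reflects how uncertain the interim estimate still is.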

5. Practical Implementation and Choice of Priors

The behavior of predictive power depends strongly on the prior distribution.

Noninformative or weakly informative priors

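Predictive power is centered on the observed trend, so it behaves like a “current trend” conditional power calculation that is pulled toward 50% by the remaining uncertainty in the interim estimate.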

Skeptical priors

Guard against overreaction to early positive noise.

Enthusiastic priors

Reflect strong historical or mechanistic belief, but can still be overturned by unfavorable data.

Case Example: VEST Trial

In the Vesnarinone in Heart Failure Trial (VEST), predictive power was computed using an enthusiastic prior derived from an earlier phase II trial that had suggested a substantial mortality reduction. Despite this favorable prior, the predictive probability of a beneficial outcome at the interim analysis was low, which supported early termination of the larger phase III trial. The case is often cited as an illustration of predictive power's usefulness for futility assessment even when the prior is optimistic; see Berry et al. (2010) and Spiegelhalter et al. (2004) for detailed treatments.

Best Practice: Always run sensitivity analyses showing how predictive power changes under different prior assumptions. If the decision is the same across a reasonable range of priors, the conclusion is being driven by the data. If the decision flips, the prior choice is driving the conclusion and needs explicit justification.
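
A sensitivity analysis of this kind can be as simple as recomputing predictive power for the same interim data under several priors and checking whether the stop/continue decision changes. The sketch below assumes the normal-normal model with a closed-form predictive power. The three priors, the interim estimate, and the thresholds are hypothetical.

```python
import math
from statistics import NormalDist

NORM = NormalDist()

def predictive_power(theta_hat, info_k, info_final, prior_mean, prior_sd, alpha=0.025):
    """Closed-form PP for a normal estimate with a normal prior (assumed model)."""
    z_crit = NORM.inv_cdf(1 - alpha)
    info_rem = info_final - info_k
    post_var = 1.0 / (1.0 / prior_sd**2 + info_k)
    post_mean = post_var * (prior_mean / prior_sd**2 + info_k * theta_hat)
    num = info_k * theta_hat + info_rem * post_mean - z_crit * math.sqrt(info_final)
    return NORM.cdf(num / math.sqrt(info_rem + info_rem**2 * post_var))

# Hypothetical priors as (mean, SD)
PRIORS = {
    "skeptical": (0.0, 0.2),
    "weakly informative": (0.0, 10.0),
    "enthusiastic": (0.45, 0.2),
}

def sensitivity(theta_hat, info_k, info_final, priors, futility_threshold):
    """Return PP per prior and whether every prior leads to the same decision."""
    pps = {label: predictive_power(theta_hat, info_k, info_final, m, s)
           for label, (m, s) in priors.items()}
    decisions = {pp < futility_threshold for pp in pps.values()}
    return pps, len(decisions) == 1

for threshold in (0.10, 0.20):
    pps, robust = sensitivity(0.2, 25.0, 50.0, PRIORS, threshold)
    summary = ", ".join(f"{k}: {v:.3f}" for k, v in pps.items())
    print(f"threshold {threshold:.2f} | {summary} | robust: {robust}")
```

In this example the decision is stable at one threshold and prior-dependent at the other, which is exactly the situation a sensitivity table is meant to expose.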

Estimand vs. decision rule: two different questions

Predictive power answers a model-based probability question about the trial outcome given the prior, the interim data, and the planned final analysis. The posterior probability you compute is an estimand of the design: what your model says about success.

The decision rule — the threshold you stop at, the timing of looks, the prior you use — is a separate object that has to pass the regulator's criteria. FDA's January 2026 draft Bayesian guidance treats these as distinct: the posterior probability is a legitimate analysis output, but regulatory acceptability hinges on (1) prespecification of the rule in the SAP, (2) justification of the prior with sensitivity analysis, and (3) simulated frequentist operating characteristics (Type I error, power, expected N) under the full rule.

Reporting PPoS without the accompanying operating-characteristics (OC) validation is the most common gap in early FDA feedback. The Zetyra calculator outputs the simulated OCs alongside the headline PPoS for exactly this reason.
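
Simulated operating characteristics for a predictive power rule can be sketched as follows. This is not the Zetyra calculator's method or any regulator's required procedure. It is a minimal illustration that assumes a normal effect estimate with known information, one interim look with a futility-only rule, and a one-sided final z-test at $\alpha = 0.025$. All names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist

NORM = NormalDist()
Z_CRIT = NORM.inv_cdf(0.975)

def predictive_power(theta_hat, info_k, info_final, prior_mean, prior_sd):
    """Closed-form PP for a normal estimate with a normal prior (assumed model)."""
    info_rem = info_final - info_k
    post_var = 1.0 / (1.0 / prior_sd**2 + info_k)
    post_mean = post_var * (prior_mean / prior_sd**2 + info_k * theta_hat)
    num = info_k * theta_hat + info_rem * post_mean - Z_CRIT * math.sqrt(info_final)
    return NORM.cdf(num / math.sqrt(info_rem + info_rem**2 * post_var))

def simulate_ocs(true_theta, info_k=25.0, info_final=50.0, prior_mean=0.0,
                 prior_sd=10.0, futility_threshold=0.10, n_sims=20_000, seed=7):
    """Simulate the full rule: futility stop at the interim, z-test at the end."""
    rng = random.Random(seed)
    info_rem = info_final - info_k
    wins = stops = 0
    for _ in range(n_sims):
        est_k = rng.gauss(true_theta, 1.0 / math.sqrt(info_k))
        if predictive_power(est_k, info_k, info_final, prior_mean, prior_sd) < futility_threshold:
            stops += 1
            continue
        est_rem = rng.gauss(true_theta, 1.0 / math.sqrt(info_rem))
        est_final = (info_k * est_k + info_rem * est_rem) / info_final
        wins += est_final * math.sqrt(info_final) > Z_CRIT
    p_stop = stops / n_sims
    return {
        "success_rate": wins / n_sims,
        "futility_stop_rate": p_stop,
        "expected_info_fraction": (p_stop * info_k + (1 - p_stop) * info_final) / info_final,
    }

null_ocs = simulate_ocs(true_theta=0.0)  # success rate here is the Type I error
alt_ocs = simulate_ocs(true_theta=(Z_CRIT + NORM.inv_cdf(0.90)) / math.sqrt(50.0))
print("Under the null:       ", null_ocs)
print("Under the alternative:", alt_ocs)
```

Running the rule under both a null and an alternative effect gives the Type I error, power, and expected trial size for the rule as a whole, which is what the headline predictive power alone cannot show.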

6. When Not to Use Predictive Power

Predictive power is powerful—but not universal. Avoid or interpret with caution when:

Priors are poorly specified or politically motivated

Results can be manipulated by choosing priors to achieve a desired outcome.

Interim looks occur extremely early with sparse data

With minimal information, predictive power is dominated by the prior and provides little actionable guidance.

Endpoints are noisy, unstable, or poorly measured

High measurement error inflates uncertainty and makes predictions unreliable.

Operational bias could influence interim estimates

Unblinded analyses or selective outcome measurement can distort predictions.

Caution: In these settings, predictive power can give a false sense of inevitability.
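
The point about very early looks can be quantified in a simple case. The sketch below assumes a two-arm trial with a normally distributed outcome of known SD and a normal prior on the mean difference, and it reports the share of posterior precision that comes from the prior at different interim sample sizes. The function names and all numbers are illustrative assumptions.

```python
def prior_weight(n_per_arm, sigma, prior_sd):
    """Share of posterior precision contributed by the prior (normal-normal model)."""
    info_k = n_per_arm / (2.0 * sigma**2)  # information in the interim estimate
    prior_info = 1.0 / prior_sd**2
    return prior_info / (prior_info + info_k)

def prior_equivalent_n(sigma, prior_sd):
    """Patients per arm whose data would carry the same information as the prior."""
    return 2.0 * sigma**2 / prior_sd**2

sigma, prior_sd = 1.0, 0.2  # hypothetical outcome SD and prior SD
print(f"Prior is worth about {prior_equivalent_n(sigma, prior_sd):.0f} patients per arm")
for n in (5, 50, 200):
    print(f"n per arm = {n:>3}: prior weight = {prior_weight(n, sigma, prior_sd):.2f}")
```

When the prior weight is close to 1, the predictive power mostly restates the prior, so a look that early says little about what the trial itself has shown.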

7. Summary

Bayesian predictive power reframes interim monitoring around a single, decision-relevant question:

“Given what we know now, how likely is this trial to succeed if we continue as planned?”

By combining current evidence with uncertainty-aware modeling, predictive power provides a disciplined way to decide when a trial has learned enough—and when continuing is unlikely to change the answer.

8. References

  1. Spiegelhalter DJ, Freedman LS, Blackburn PR. Monitoring clinical trials: conditional or predictive power? Controlled Clinical Trials. 1986;7(1):8-17.
  2. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Wiley; 2004.
  3. Berry SM, Carlin BP, Lee JJ, Müller P. Bayesian Adaptive Methods for Clinical Trials. CRC Press; 2010.
  4. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. CRC Press; 2013.
  5. Morita S, Thall PF, Müller P. Determining the effective sample size of a parametric prior. Biometrics. 2008;64(2):595-602.
  6. U.S. Food and Drug Administration. Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products: Draft Guidance for Industry. January 12, 2026.
  7. U.S. Food and Drug Administration. Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials. February 2010.

Last updated: May 2026
