
Bayesian Toolkit Overview

FDA-aligned Bayesian trial design in six integrated steps. The Bayesian Toolkit provides a complete workflow for designing clinical trials with Bayesian methodology, aligned with the FDA's January 2026 draft guidance on the use of Bayesian methods in drug and biologic trials.

Composing adaptive mechanisms? Validate the pipeline, not the parts.

When two or more components from this toolkit (e.g. historical borrowing + sequential monitoring + sample-size re-estimation + response-adaptive randomization) update based on overlapping interim information sets, operating characteristics of the composed system are not the sum of its parts. Each component can pass its individual validation check while the composed pipeline inflates Type I error.

In a Phase II oncology composition that combines a meta-analytic-predictive (MAP) prior, Bayesian monitoring, sample-size re-estimation (SSR), and response-adaptive randomization (RAR), a 2-percentage-point prior-data drift produces a pipeline-level Type I error (T1E) of 0.0771, which is 208% above the nominal α = 0.025. A separate mechanism-isolation analysis, run under a more adversarial scenario, shows that prior-data conflict and time trends interact super-additively: their joint effect (0.135) exceeds the sum of their independent contributions (0.107) by 0.028. A protocol that validates each mechanism in isolation can therefore underestimate the composed pipeline's T1E (Qian 2026, JSM).

Practical guidance: simulate operating characteristics for the full composed pipeline under realistic prior-data conflict (±2 pp, ±5 pp) and time-trend (linear drift δ ≈ 0.05) scenarios; pre-specify a maximum acceptable T1E in the Statistical Analysis Plan (SAP); and consider robust mixture priors, dynamic borrowing, or pipeline-level recalibration of the decision threshold γ* as mitigation. The FDA's January 2026 draft guidance specifically calls for this kind of composed evaluation.
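As a minimal sketch of what pipeline-level simulation can look like, the Python below composes only two of the mechanisms named above (fixed-discount borrowing on the control arm and one interim efficacy look) and estimates Type I error by Monte Carlo under prior-data drift. Everything specific in it is an illustrative assumption, not the toolkit's implementation: the function names, the historical control data (60/200, discounted by 0.5 onto a Beta(1, 1) base), the look schedule, the unadjusted 0.975 threshold, and the normal approximation to the posterior probability. SSR, RAR, and time trends are deliberately left out.

```python
import math
import random


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def beta_mean_var(a: float, b: float) -> tuple[float, float]:
    s = a + b
    return a / s, a * b / (s * s * (s + 1.0))


def post_prob_trt_better(yt, nt, yc, nc, ctrl_prior):
    """Normal approximation (an assumption) to P(p_t > p_c | data)."""
    mt, vt = beta_mean_var(1.0 + yt, 1.0 + nt - yt)      # Beta(1, 1) treatment prior
    a0, b0 = ctrl_prior                                  # borrowed control prior
    mc, vc = beta_mean_var(a0 + yc, b0 + nc - yc)
    return normal_cdf((mt - mc) / math.sqrt(vt + vc))


def simulate_t1e(drift, n_sims=4000, looks=(50, 100), gamma=0.975,
                 hist_rate=0.30, ctrl_prior=(31.0, 71.0), seed=1):
    """Rejection rate when both arms share the rate hist_rate + drift.

    ctrl_prior = Beta(1 + 0.5 * 60, 1 + 0.5 * 140) from hypothetical
    historical controls (60/200); looks are per-arm sample sizes.
    """
    rng = random.Random(seed)
    p_null = hist_rate + drift          # the null is true: no treatment effect
    rejections = 0
    for _ in range(n_sims):
        yt = yc = n_prev = 0
        for n in looks:
            step = n - n_prev
            yt += sum(rng.random() < p_null for _ in range(step))
            yc += sum(rng.random() < p_null for _ in range(step))
            n_prev = n
            if post_prob_trt_better(yt, n, yc, n, ctrl_prior) >= gamma:
                rejections += 1         # false positive at this look
                break
    return rejections / n_sims


if __name__ == "__main__":
    for drift in (0.0, 0.02, 0.05):
        print(f"drift = {drift:+.2f}  estimated T1E = {simulate_t1e(drift):.4f}")
```

With upward drift the borrowed prior pulls the control estimate below the current control rate, so the false-positive rate rises even though the decision rule is unchanged. A real validation would add the remaining mechanisms, negative drift, time trends, and far more replicates.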

1. When to Use Bayesian Methods

Bayesian methods are particularly valuable in these scenarios:

You have relevant historical data

Phase II results, published trials, or registry data can be formally incorporated via informative priors—potentially reducing sample size by 20–40% while maintaining statistical rigor.

Sample sizes are constrained

Rare diseases, pediatric populations, and oncology basket trials often can't achieve traditional frequentist power. Bayesian borrowing provides a principled way to augment limited data.

Interim decisions matter

Predictive probability of success (PPoS) answers the question stakeholders actually care about: “Given what we've seen so far, how likely is this trial to succeed?”

Regulators expect it

The FDA's January 2026 draft guidance explicitly endorses Bayesian methods for pivotal drug and biologic trials, not just devices. It cites REBYOTA as a successful example.

Bayesian vs. Frequentist: A Practical View

Aspect	Frequentist	Bayesian
Prior information	Ignored (or informal)	Formally incorporated
Interim interpretation	Conditional power at assumed effect	PPoS across posterior uncertainty
Result statement	“p < 0.05” or “95% CI excludes null”	“92% probability treatment is effective”
Regulatory status	Standard	Accepted with documentation (FDA 2026)

Hybrid Designs

The toolkit supports hybrid designs—Bayesian priors and interim monitoring with frequentist final analysis—which is the FDA's recommended approach for most pivotal trials.

2. The 6-Step Workflow

The Bayesian Toolkit follows a sequential workflow. Each step produces outputs that feed into subsequent calculators.

[Workflow diagram: PRIOR → BORROW (optional) → SAMPLE SIZE (Steps 3/4) → SEQUENTIAL (optional) → POWER (PPoS)]

Step 1: Prior Elicitation

What it does: Translates clinical knowledge into Beta distribution parameters (α, β) for binary endpoints.

Three methods: Quantile matching, ESS-based, Historical data

Output: Beta(α, β) prior with documented ESS and justification

Prior Elicitation Documentation
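To make the ESS-based method concrete, here is a small sketch under the common convention that a Beta(α, β) prior has mean α / (α + β) and ESS = α + β. The function names are hypothetical and the calculator itself may parameterize things differently.

```python
import math


def beta_from_mean_ess(mean: float, ess: float) -> tuple[float, float]:
    """Return (alpha, beta) for a Beta prior with the given mean and ESS."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if ess <= 0.0:
        raise ValueError("ess must be positive")
    return mean * ess, (1.0 - mean) * ess


def beta_summary(alpha: float, beta: float) -> dict[str, float]:
    """Mean, standard deviation, and ESS (alpha + beta) of a Beta prior."""
    total = alpha + beta
    mean = alpha / total
    sd = math.sqrt(alpha * beta / (total ** 2 * (total + 1.0)))
    return {"mean": mean, "sd": sd, "ess": total}


if __name__ == "__main__":
    # Hypothetical elicitation: expected response rate 25%, worth about 40 patients.
    a, b = beta_from_mean_ess(0.25, 40.0)
    print(a, b, beta_summary(a, b))
```

Stating the prior as a mean plus an ESS keeps the justification readable: the ESS says how many patients' worth of information the prior contributes.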

Step 2: Bayesian Borrowing (Optional)

What it does: Formally incorporates external control data with appropriate discounting.

Three methods: Power prior (static δ), Commensurate prior, MAP prior

Output: Effective prior, prior-data conflict diagnostics, sample size comparison

Bayesian Borrowing Documentation
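For a binary endpoint, a static power prior has a closed form: historical counts are multiplied by δ and added to the initial Beta parameters. The sketch below shows that update, followed by one possible prior-data conflict diagnostic, a prior-predictive tail probability based on the beta-binomial distribution. The function names, the Beta(1, 1) initial prior, and the choice of diagnostic are assumptions for illustration; the calculator's own diagnostics may differ.

```python
from math import comb, exp, lgamma


def power_prior(y0: int, n0: int, delta: float, a0: float = 1.0, b0: float = 1.0):
    """Beta parameters after borrowing y0/n0 historical responders at weight delta."""
    return a0 + delta * y0, b0 + delta * (n0 - y0)


def posterior(prior, y: int, n: int):
    """Conjugate update with y responders out of n current patients."""
    a, b = prior
    return a + y, b + n - y


def log_beta(a: float, b: float) -> float:
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binom_pmf(k: int, n: int, a: float, b: float) -> float:
    return comb(n, k) * exp(log_beta(a + k, b + n - k) - log_beta(a, b))


def prior_predictive_tail(prior, y: int, n: int) -> float:
    """min(P(Y <= y), P(Y >= y)) under the prior predictive; small values flag conflict."""
    a, b = prior
    pmf = [beta_binom_pmf(k, n, a, b) for k in range(n + 1)]
    return min(sum(pmf[: y + 1]), sum(pmf[y:]))


if __name__ == "__main__":
    prior = power_prior(24, 200, delta=0.5)      # gives (13.0, 89.0), ESS 102
    print("effective prior:", prior)
    print("posterior after 10/50:", posterior(prior, 10, 50))
    print("tail probability for 25/50:", prior_predictive_tail(prior, 25, 50))
```

With 24/200 historical responders and δ = 0.5 this reproduces the Beta(13, 89) prior used in Pattern A below. A current result far from the historical rate yields a very small tail probability, which is a cue to revisit the amount of borrowing.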

Step 3: Single-Arm Sample Size

What it does: Determines sample size for single-arm Bayesian trials with operating characteristics.

Key outputs: Recommended N, Type I error rate, power curve, sensitivity analysis

Bayesian Sample Size Documentation
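The operating characteristics of a single-arm design can be computed exactly by enumerating every possible responder count. The sketch below assumes a decision rule of the form "declare success if P(p > p0 | data) ≥ γ", a Beta prior with integer parameters (so the posterior tail probability reduces to a binomial sum), and hypothetical function names. It is one way to obtain Type I error and power, not necessarily the calculator's algorithm, which the regulatory table below describes as simulation-based.

```python
from math import comb


def post_prob_gt(a: int, b: int, p0: float) -> float:
    """P(p > p0) for p ~ Beta(a, b) with integer a, b >= 1.

    Uses 1 - I_{p0}(a, b) = P(X <= a - 1) with X ~ Binomial(a + b - 1, p0).
    """
    n = a + b - 1
    return sum(comb(n, j) * p0 ** j * (1 - p0) ** (n - j) for j in range(a))


def binom_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def operating_characteristics(n, p0, p1, gamma, a=1, b=1):
    """Return (minimum winning count, Type I error at p0, power at p1)."""
    wins = [y for y in range(n + 1)
            if post_prob_gt(a + y, b + n - y, p0) >= gamma]
    type1 = sum(binom_pmf(y, n, p0) for y in wins)
    power = sum(binom_pmf(y, n, p1) for y in wins)
    return (min(wins) if wins else None), type1, power


def smallest_n(p0, p1, gamma=0.975, target_power=0.80, n_max=200):
    """First n reaching the target power (power is not monotone in n)."""
    for n in range(10, n_max + 1):
        cutoff, type1, power = operating_characteristics(n, p0, p1, gamma)
        if cutoff is not None and power >= target_power:
            return n, cutoff, type1, power
    return None


if __name__ == "__main__":
    # Null rate 12%, target rate 25%, flat Beta(1, 1) prior (all assumptions).
    print(smallest_n(0.12, 0.25))
```

Because power moves in a sawtooth pattern as n grows, it is worth inspecting the power curve around the returned n rather than relying on the first value that crosses the target.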

Step 4: Two-Arm Design

What it does: Sizes randomized two-arm Bayesian trials (superiority or non-inferiority).

Design options: Superiority, Non-inferiority, Allocation ratios (1:2, 1:1, 2:1)

Two-Arm Bayesian Design Documentation
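For two-arm designs the central quantity is the posterior probability that the treatment rate exceeds the control rate, or, for non-inferiority, that it is no worse than control by more than a margin. A minimal Monte Carlo sketch follows, assuming independent Beta(1, 1) priors on each arm; the function name, priors, and example counts are illustrative only.

```python
import random


def prob_treatment_better(yt: int, nt: int, yc: int, nc: int,
                          margin: float = 0.0,
                          prior: tuple[float, float] = (1.0, 1.0),
                          draws: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(p_t - p_c > -margin | data).

    margin = 0 gives the superiority probability; margin > 0 gives a
    non-inferiority probability. Arm sizes nt and nc may differ, which
    covers unequal allocation ratios.
    """
    rng = random.Random(seed)
    a, b = prior
    hits = 0
    for _ in range(draws):
        pt = rng.betavariate(a + yt, b + nt - yt)   # draw from treatment posterior
        pc = rng.betavariate(a + yc, b + nc - yc)   # draw from control posterior
        hits += (pt - pc) > -margin
    return hits / draws


if __name__ == "__main__":
    # Hypothetical 2:1 allocation: 40/100 responders on treatment, 12/50 on control.
    print("superiority:", prob_treatment_better(40, 100, 12, 50))
    print("non-inferiority (margin 0.10):",
          prob_treatment_better(40, 100, 12, 50, margin=0.10))
```

Sizing a trial then amounts to repeating this calculation over simulated datasets and counting how often the probability clears the pre-specified threshold.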

Step 5: Sequential Monitoring (Optional)

What it does: Designs Bayesian interim stopping rules using posterior probability thresholds for efficacy and futility.

Key outputs: Stopping boundaries, operating characteristics (Type I error, power, expected N), power/ASN curves

When to use: When the trial design includes planned interim analyses and you want Bayesian stopping rules rather than frequentist alpha-spending

Sequential Monitoring Documentation
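Posterior probability thresholds translate directly into responder-count boundaries at each look. The sketch below does this for a single-arm binary endpoint, assuming an efficacy rule P(p > p0 | data) ≥ 0.975, a futility rule P(p > p0 | data) ≤ 0.05, and an integer-parameter Beta prior. The thresholds, look schedule, and function names are hypothetical.

```python
from math import comb


def post_prob_gt(a: int, b: int, p0: float) -> float:
    """P(p > p0) for p ~ Beta(a, b), integer a, b >= 1 (binomial-sum identity)."""
    n = a + b - 1
    return sum(comb(n, j) * p0 ** j * (1 - p0) ** (n - j) for j in range(a))


def stopping_boundaries(looks, p0, eff=0.975, fut=0.05, a=1, b=1):
    """For each look size n, return (n, futility_max, efficacy_min).

    Stop for futility if responders <= futility_max; stop for efficacy if
    responders >= efficacy_min. None means the rule cannot trigger at that look.
    """
    rows = []
    for n in looks:
        probs = [post_prob_gt(a + y, b + n - y, p0) for y in range(n + 1)]
        efficacy_min = next((y for y, pr in enumerate(probs) if pr >= eff), None)
        futility_max = max((y for y, pr in enumerate(probs) if pr <= fut),
                           default=None)
        rows.append((n, futility_max, efficacy_min))
    return rows


if __name__ == "__main__":
    for n, f, e in stopping_boundaries([20, 40, 60], p0=0.12):
        print(f"n = {n}: futility if <= {f}, efficacy if >= {e}")
```

Boundaries alone do not give the Type I error, power, or expected N of the sequential design; those come from simulating trials that pass through every look.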

Step 6: Predictive Power (Interim PPoS)

What it does: Calculates probability of trial success given interim data.

Key outputs: PPoS with decision gauge, sensitivity analysis, posterior visualization

Bayesian Predictive Power Documentation

See It in Action

Walk through the complete 6-step workflow using REBYOTA, the BOIN design, and other examples with FDA precedent as case studies, from prior elicitation to regulatory documentation.

End-to-end tutorial: From Prior to Approval

PPoS Decision Framework

PPoS	Recommendation
≥ 90%	Predicted Success — Verify current posterior meets significance
20–90%	Continue — Insufficient evidence for early stopping
< 20%	Stop for Futility — Very low probability of success
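PPoS for a single-arm binary endpoint can be sketched as a beta-binomial sum: weight each possible number of future responders by its predictive probability given the interim posterior, and add up the weights of the outcomes that would meet the final success rule. The code assumes success means P(p > p0 | all data) ≥ γ, a Beta(1, 1) prior, and hypothetical function names. The recommendation cut-offs are taken from the table above.

```python
from math import comb, exp, lgamma


def post_prob_gt(a: int, b: int, p0: float) -> float:
    """P(p > p0) for p ~ Beta(a, b), integer a, b >= 1 (binomial-sum identity)."""
    n = a + b - 1
    return sum(comb(n, j) * p0 ** j * (1 - p0) ** (n - j) for j in range(a))


def log_beta(a: float, b: float) -> float:
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binom_pmf(k: int, m: int, a: float, b: float) -> float:
    return comb(m, k) * exp(log_beta(a + k, b + m - k) - log_beta(a, b))


def ppos(y: int, n: int, n_max: int, p0: float, gamma: float,
         a: int = 1, b: int = 1) -> float:
    """Predictive probability that the final analysis at n_max succeeds."""
    a_n, b_n = a + y, b + n - y          # interim posterior
    m = n_max - n                        # patients still to enroll
    total = 0.0
    for k in range(m + 1):               # k = future responders
        if post_prob_gt(a_n + k, b_n + m - k, p0) >= gamma:
            total += beta_binom_pmf(k, m, a_n, b_n)
    return total


def recommend(p: float) -> str:
    if p >= 0.90:
        return "Predicted Success"
    if p < 0.20:
        return "Stop for Futility"
    return "Continue"


if __name__ == "__main__":
    # Hypothetical interim: 8 responders in 40 patients, 85 planned, p0 = 12%.
    value = ppos(8, 40, 85, p0=0.12, gamma=0.975)
    print(f"PPoS = {value:.3f} -> {recommend(value)}")
```

Unlike conditional power at a single assumed effect, this sum averages over the whole interim posterior, which is the distinction drawn in the Bayesian vs. Frequentist table.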

3. Which Calculators Do I Need?

Single-Arm Trial

Do you have historical data to incorporate?

YES → 1. Prior → 2. Borrowing → 3. Sample Size → 6. PPoS

NO → 1. Prior → 3. Sample Size → 6. PPoS

Two-Arm Randomized Trial

Do you have historical control data?

YES → 1. Prior → 2. Borrowing → 4. Two-Arm → 5. Sequential → 6. PPoS

NO → 1. Prior → 4. Two-Arm → 5. Sequential → 6. PPoS

Two-Arm (Fixed Sample, No Interim)

No planned interim analyses?

1. Prior → 4. Two-Arm Design → 6. PPoS (only if an unplanned interim look is needed)

Interim Monitoring Only

Trial already designed, need interim decision support?

1. Prior → 6. PPoS (with interim data)

4. Common Design Patterns

Pattern A: Phase II Oncology (Single-Arm with Historical Borrowing)

Single-arm Phase II evaluating ORR against historical control rate.

Step	Calculator	Purpose
1	Prior Elicitation	Convert Phase I/II data to Beta prior with δ = 0.5 discount
2	Bayesian Borrowing	Quantify sample size reduction from borrowing
3	Sample Size	Size trial for 80% power, verify Type I ≤ 0.05
6	PPoS	Monitor at 50% enrollment for futility

Example: Historical ORR = 12%, target ORR = 25%, Phase II data: 24/200 responders

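• Prior: Beta(13, 89) with ESS = 102 after 50% discount (0.5 × 24 = 12 responders and 0.5 × 176 = 88 non-responders, consistent with a Beta(1, 1) starting prior)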
• Sample size: ~85 patients (vs. ~110 without borrowing)
• Interim PPoS threshold: < 20% → stop for futility

Pattern B: Confirmatory RCT (Two-Arm with Sequential Monitoring)

Phase III RCT with expert-elicited prior, Bayesian sequential stopping rules.

Step	Calculator	Purpose
1	Prior Elicitation	Quantile matching from expert opinion (skeptical)
4	Two-Arm Design	Size for superiority with 1:1 allocation
5	Sequential Monitoring	Design Bayesian stopping rules for 3 interim analyses
6	PPoS	Ad-hoc interim futility assessment

Fully Bayesian approach: Use Sequential Monitoring for pre-specified stopping rules (efficacy + futility), and PPoS for ad-hoc interim decision support between planned analyses.

Pattern C: Rare Disease (Maximal Borrowing)

Small population, strong historical data from natural history study.

Step	Calculator	Purpose
1	Prior Elicitation	Historical data method with minimal discount (δ = 0.8)
2	Bayesian Borrowing	MAP prior from multiple natural history cohorts
3	Sample Size	Aggressive sample size reduction justified by ESS
6	PPoS	Continuous monitoring given small N

Regulatory context: FDA Section IV.B.2 addresses non-calibrated designs for rare diseases where traditional operating characteristics aren't feasible.

5. Regulatory Alignment

All six calculators are designed to address the expectations set out in the FDA's January 2026 draft guidance:

Requirement	How the Toolkit Addresses It
Prior justification (Section V.D)	Prior Elicitation documents source, method, and ESS
Discounting rationale (Section V.D.4)	Bayesian Borrowing provides power prior δ with justification
Operating characteristics (Section IV.A)	Sample Size calculators report Type I error and power via simulation
Sensitivity analysis	All calculators support prior sensitivity across optimistic/skeptical scenarios
Decision thresholds (Section IV.A)	PPoS calculator implements FDA's three approaches to success criteria

SAP Documentation Checklist

Each calculator generates outputs for your Statistical Analysis Plan:

Prior specification with α, β (or μ, σ²), source, and ESS
Discounting method and justification (if borrowing)
Decision rule: threshold γ, comparison metric, success criterion
Sample size with operating characteristics
Sensitivity analysis plan across prior specifications
Interim monitoring rules with PPoS thresholds

6. Quick Navigation

Calculator Documentation

Step	Calculator	Documentation
1	Prior Elicitation	Technical Reference
2	Bayesian Borrowing	Technical Reference
3	Single-Arm Sample Size	Technical Reference
4	Two-Arm Design	Technical Reference
5	Sequential Monitoring	Technical Reference
6	Predictive Power	Technical Reference ・ Conceptual Guide

7. References

Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Wiley; 2004.
Berry SM, Carlin BP, Lee JJ, Müller P. Bayesian Adaptive Methods for Clinical Trials. CRC Press; 2010.
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. CRC Press; 2013.
Morita S, Thall PF, Müller P. Determining the effective sample size of a parametric prior. Biometrics. 2008;64(2):595-602.
Schmidli H, et al. Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics. 2014;70(4):1023-1032.
Zhou T, Ji Y. On Bayesian sequential clinical trial designs. New England Journal of Statistics in Data Science. 2024;2(1).
U.S. Food and Drug Administration. Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products: Draft Guidance for Industry. January 12, 2026.
U.S. Food and Drug Administration. Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials. February 2010.

Last updated: May 2026

Related Documentation

Bayesian Workflow Tutorial

Complete Guide to Bayesian PPoS

CUPED vs. GSD vs. Bayesian
