Bayesian Toolkit Overview
FDA-aligned Bayesian trial design in six integrated steps.
The Bayesian Toolkit provides a complete workflow for designing clinical trials using Bayesian methodology, aligned with the FDA's January 2026 draft guidance on Bayesian methodology.
Composing adaptive mechanisms? Validate the pipeline, not the parts.
When two or more components from this toolkit (e.g. historical borrowing + sequential monitoring + sample-size re-estimation + response-adaptive randomization) update based on overlapping interim information sets, operating characteristics of the composed system are not the sum of its parts. Each component can pass its individual validation check while the composed pipeline inflates Type I error.
In a Phase II oncology composition (MAP prior + Bayesian monitoring + SSR + RAR), a 2-percentage-point prior-data drift raises the pipeline-level Type I error (T1E) to 0.0771, which is 208% above the nominal α = 0.025. A separate mechanism-isolation analysis, run under a more adversarial scenario, shows that prior-data conflict and time trends interact super-additively: their joint effect (0.135) exceeds the sum of their independent contributions (0.107) by 0.028. A protocol that validates each mechanism in isolation can therefore underestimate the composed pipeline's T1E (Qian 2026, JSM).
Practical guidance: simulate operating characteristics at the full composed pipeline under realistic prior-data conflict (±2pp, ±5pp) and time-trend (linear drift δ ≈ 0.05) scenarios; pre-specify a maximum acceptable T1E in the SAP; consider robust mixture priors, dynamic borrowing, or pipeline-level recalibration of γ* as mitigation. The FDA January 2026 draft guidance specifically requires this kind of composed evaluation.
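As a starting point for the conflict scenarios above, the sketch below computes the Type I error of a two-arm design that borrows historical control data, with the true response rate shifted away from the historical rate by ±2pp and ±5pp. It is a simplified illustration and not the composed pipeline from the cited analysis: it covers the borrowing component only, uses a normal approximation to the posterior probability of superiority, and the sample size, priors, and threshold are hypothetical values chosen for the example.

```python
from math import comb, erf, sqrt


def binom_pmf(x, n, p):
    """Binomial probability of x responders among n patients."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    return mean, mean * (1 - mean) / (a + b + 1)


def approx_prob_superior(post_t, post_c):
    """Normal approximation to P(p_t > p_c) for two Beta posteriors (assumption)."""
    mt, vt = beta_mean_var(*post_t)
    mc, vc = beta_mean_var(*post_c)
    z = (mt - mc) / sqrt(vt + vc)
    return 0.5 * (1 + erf(z / sqrt(2)))


def type1_error(p_true, n, prior_c, prior_t=(1, 1), gamma=0.975):
    """Exact enumeration over both arms; both arms share p_true (null is true)."""
    pmf = [binom_pmf(x, n, p_true) for x in range(n + 1)]
    total = 0.0
    for xc in range(n + 1):
        post_c = (prior_c[0] + xc, prior_c[1] + n - xc)
        for xt in range(n + 1):
            post_t = (prior_t[0] + xt, prior_t[1] + n - xt)
            if approx_prob_superior(post_t, post_c) >= gamma:
                total += pmf[xc] * pmf[xt]  # false-positive outcome
    return total


# Hypothetical setup: control prior Beta(13, 89) built from 24/200 historical
# responders at a 50% discount, 50 patients per arm, historical rate 0.12.
HIST_RATE = 0.12
CONTROL_PRIOR = (13, 89)
for drift in (-0.05, -0.02, 0.0, 0.02, 0.05):
    t1e = type1_error(HIST_RATE + drift, 50, CONTROL_PRIOR)
    print(f"drift {drift:+.2f}: Type I error = {t1e:.4f}")
```

When the true control rate sits above the historical rate, the borrowed prior pulls the control estimate down and the false-positive rate rises. A full pipeline-level check would add the monitoring, re-estimation, and randomization components to the same simulation loop.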
1. When to Use Bayesian Methods
Bayesian methods are particularly valuable in these scenarios:
You have relevant historical data
Phase II results, published trials, or registry data can be formally incorporated via informative priors—potentially reducing sample size by 20–40% while maintaining statistical rigor.
Sample sizes are constrained
Rare diseases, pediatric populations, and oncology basket trials often can't achieve traditional frequentist power. Bayesian borrowing provides a principled way to augment limited data.
Interim decisions matter
Predictive probability of success (PPoS) answers the question stakeholders actually care about: “Given what we've seen so far, how likely is this trial to succeed?”
Regulators expect it
The FDA's January 2026 guidance explicitly endorses Bayesian methods for pivotal drug and biologic trials—not just devices. The guidance cites REBYOTA as a successful example.
Bayesian vs. Frequentist: A Practical View
| Aspect | Frequentist | Bayesian |
|---|---|---|
| Prior information | Ignored (or informal) | Formally incorporated |
| Interim interpretation | Conditional power at assumed effect | PPoS across posterior uncertainty |
| Result statement | “p < 0.05” or “95% CI excludes null” | “92% probability treatment is effective” |
| Regulatory status | Standard | Accepted with documentation (FDA 2026) |
Hybrid Designs
The toolkit supports hybrid designs—Bayesian priors and interim monitoring with frequentist final analysis—which is the FDA's recommended approach for most pivotal trials.
2. The 6-Step Workflow
The Bayesian Toolkit follows a sequential workflow. Each step produces outputs that feed into subsequent calculators.
Step 1: Prior Elicitation
What it does: Translates clinical knowledge into Beta distribution parameters (α, β) for binary endpoints.
Three methods: Quantile matching, ESS-based, Historical data
Output: Beta(α, β) prior with documented ESS and justification
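The ESS-based and historical-data methods reduce to short arithmetic, sketched below. The function names are hypothetical, and the Beta(1, 1) base prior is an assumption inferred from the Pattern A example on this page (24/200 responders at a 50% discount giving Beta(13, 89)); the toolkit's own defaults may differ.

```python
def beta_from_historical(responders, n, discount=1.0, base=(1.0, 1.0)):
    """Add discounted historical counts to a base Beta prior (assumed Beta(1, 1))."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be in [0, 1]")
    if not 0 <= responders <= n:
        raise ValueError("responders must be between 0 and n")
    alpha = base[0] + discount * responders
    beta = base[1] + discount * (n - responders)
    return alpha, beta


def beta_from_mean_ess(mean, ess):
    """ESS-based method: match a prior mean and an effective sample size."""
    if not 0.0 < mean < 1.0 or ess <= 0:
        raise ValueError("need 0 < mean < 1 and ess > 0")
    return mean * ess, (1.0 - mean) * ess


def beta_summary(alpha, beta):
    """Mean, standard deviation, and ESS (alpha + beta) of a Beta prior."""
    ess = alpha + beta
    mean = alpha / ess
    sd = (mean * (1.0 - mean) / (ess + 1.0)) ** 0.5
    return {"mean": mean, "sd": sd, "ess": ess}


alpha, beta = beta_from_historical(24, 200, discount=0.5)
print(alpha, beta, beta_summary(alpha, beta)["ess"])  # 13.0 89.0 102.0
```

Quantile matching is omitted here because it needs a numerical solver for the Beta quantile function.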
Prior Elicitation Documentation
Step 2: Bayesian Borrowing (Optional)
What it does: Formally incorporates external control data with appropriate discounting.
Three methods: Power prior (static δ), Commensurate prior, MAP prior
Output: Effective prior, prior-data conflict diagnostics, sample size comparison
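For a binary endpoint, a static power prior and a simple prior-data conflict check can be sketched as follows. The conflict diagnostic shown is a two-sided prior-predictive tail probability under the beta-binomial distribution, which is one common choice and an assumption here; this page does not say which diagnostic the calculator reports. All function names and the trial counts are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """Prior-predictive probability of x responders in n patients under Beta(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(a + x, b + n - x) - log_beta(a, b))


def power_prior(x0, n0, delta, base=(1.0, 1.0)):
    """Static power prior: historical likelihood raised to delta, base Beta assumed."""
    return base[0] + delta * x0, base[1] + delta * (n0 - x0)


def conflict_p_value(x, n, a, b):
    """Two-sided prior-predictive tail probability; small values flag conflict."""
    pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
    lower = sum(pmf[: x + 1])
    upper = sum(pmf[x:])
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical current-trial data: 9 responders in 40 patients.
a, b = power_prior(24, 200, delta=0.5)
x, n = 9, 40
print("effective prior:", (a, b))
print("posterior:", (a + x, b + n - x))
print("conflict p-value:", round(conflict_p_value(x, n, a, b), 3))
```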
Bayesian Borrowing Documentation
Step 3: Single-Arm Sample Size
What it does: Determines sample size for single-arm Bayesian trials with operating characteristics.
Key outputs: Recommended N, Type I error rate, power curve, sensitivity analysis
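The operating characteristics of a single-arm design can be computed exactly by enumerating every possible responder count, as in the sketch below. It assumes the success rule "posterior P(p > p0) ≥ γ" and a Beta prior with integer parameters, so that the posterior tail probability has a closed binomial-sum form. The function names, the Beta(1, 1) prior, and γ = 0.975 are illustrative assumptions; the rates 0.12 and 0.25 are borrowed from the Pattern A example on this page.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability of x responders among n patients."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def post_prob_greater(a, b, p0):
    """P(p > p0) under Beta(a, b); exact identity for integer a, b >= 1."""
    m = a + b - 1
    return sum(binom_pmf(j, m, p0) for j in range(a))


def success_cutoff(n, prior, p0, gamma):
    """Smallest responder count that meets the success rule, or None."""
    a, b = prior
    for x in range(n + 1):
        if post_prob_greater(a + x, b + n - x, p0) >= gamma:
            return x
    return None


def rejection_probability(n, prior, p0, gamma, p_true):
    """Probability of declaring success when the true rate is p_true."""
    cut = success_cutoff(n, prior, p0, gamma)
    if cut is None:
        return 0.0
    return sum(binom_pmf(x, n, p_true) for x in range(cut, n + 1))


def smallest_n(prior, p0, p1, gamma, target_power, n_max=200):
    """First n whose power at p1 reaches the target (power is not monotone in n)."""
    for n in range(5, n_max + 1):
        if rejection_probability(n, prior, p0, gamma, p1) >= target_power:
            return n
    return None


prior, p0, p1, gamma = (1, 1), 0.12, 0.25, 0.975
n = smallest_n(prior, p0, p1, gamma, target_power=0.80)
print("recommended N:", n)
print("Type I error:", round(rejection_probability(n, prior, p0, gamma, p0), 4))
print("power:", round(rejection_probability(n, prior, p0, gamma, p1), 4))
```

Evaluating `rejection_probability` over a grid of `p_true` values gives the power curve, and repeating the calculation under alternative priors gives a basic sensitivity analysis.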
Bayesian Sample Size Documentation
Step 4: Two-Arm Design
What it does: Sizes randomized two-arm Bayesian trials (superiority or non-inferiority).
Design options: Superiority, Non-inferiority, Allocation ratios (1:2, 1:1, 2:1)
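The decision quantity for both design options is a posterior probability about the difference in response rates, which can be estimated by Monte Carlo as sketched below. Superiority uses a margin of zero and non-inferiority a positive margin. The function names are hypothetical, the independent Beta(1, 1) priors are an assumption, and reading the allocation ratio as treatment:control is also an assumption, since the page does not state the order.

```python
import random


def split_allocation(total_n, ratio_treatment, ratio_control):
    """Split a total sample size by an allocation ratio (treatment:control assumed)."""
    n_t = round(total_n * ratio_treatment / (ratio_treatment + ratio_control))
    return n_t, total_n - n_t


def prob_superior(x_t, n_t, x_c, n_c, prior_t=(1, 1), prior_c=(1, 1),
                  margin=0.0, draws=20000, seed=1):
    """Monte Carlo estimate of P(p_t - p_c > -margin | data).

    margin = 0 gives the superiority probability; margin > 0 gives the
    non-inferiority probability for a higher-is-better endpoint.
    """
    rng = random.Random(seed)
    a_t, b_t = prior_t[0] + x_t, prior_t[1] + n_t - x_t
    a_c, b_c = prior_c[0] + x_c, prior_c[1] + n_c - x_c
    wins = 0
    for _ in range(draws):
        diff = rng.betavariate(a_t, b_t) - rng.betavariate(a_c, b_c)
        if diff > -margin:
            wins += 1
    return wins / draws


# Hypothetical data: 120 patients at 2:1, 32/80 vs 10/40 responders.
n_t, n_c = split_allocation(120, 2, 1)
print("arm sizes:", n_t, n_c)
print("P(superior):", prob_superior(32, n_t, 10, n_c))
print("P(non-inferior, margin 0.10):", prob_superior(32, n_t, 10, n_c, margin=0.10))
```

Sizing the trial then amounts to simulating many datasets at each candidate sample size and recording how often this probability clears the success threshold.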
Two-Arm Bayesian Design Documentation
Step 5: Sequential Monitoring (Optional)
What it does: Designs Bayesian interim stopping rules using posterior probability thresholds for efficacy and futility.
Key outputs: Stopping boundaries, operating characteristics (Type I error, power, expected N), power/ASN curves
When to use: When the trial design includes planned interim analyses and you want Bayesian stopping rules rather than frequentist alpha-spending
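A minimal simulation of posterior-probability stopping rules is sketched below for a single-arm binary endpoint, chosen to keep the example short. The look schedule, the thresholds (efficacy at 0.975, futility at 0.05), the Beta(1, 1) prior, and the function names are all hypothetical; the calculator's actual boundaries and defaults are not described on this page.

```python
import random
from math import comb


def post_prob_greater(a, b, p0):
    """P(p > p0) under Beta(a, b); exact identity for integer a, b >= 1."""
    m = a + b - 1
    return sum(comb(m, j) * p0**j * (1 - p0) ** (m - j) for j in range(a))


def simulate_design(p_true, p0, looks, eff=0.975, fut=0.05,
                    prior=(1, 1), n_sims=2000, seed=7):
    """Simulate trials with interim looks at the cumulative sizes in `looks`."""
    rng = random.Random(seed)
    a0, b0 = prior
    # Posterior probability for every (look, responder count) pair, computed once.
    table = {n: [post_prob_greater(a0 + x, b0 + n - x, p0) for x in range(n + 1)]
             for n in looks}
    successes = 0
    total_enrolled = 0
    for _ in range(n_sims):
        responders = 0
        enrolled = 0
        for n in looks:
            responders += sum(rng.random() < p_true for _ in range(n - enrolled))
            enrolled = n
            prob = table[n][responders]
            if prob >= eff:      # efficacy boundary crossed
                successes += 1
                break
            if prob <= fut:      # futility boundary crossed
                break
        total_enrolled += enrolled
    return {"success_rate": successes / n_sims,
            "expected_n": total_enrolled / n_sims}


looks = [20, 40, 60]
print("under the null:", simulate_design(0.12, 0.12, looks))
print("under the alternative:", simulate_design(0.30, 0.12, looks))
```

The success rate under the null estimates the Type I error of the whole monitoring plan, and `expected_n` is the average sample number. Because every look offers another chance to cross the efficacy boundary, the thresholds usually need tuning by simulation to hold the overall error rate at its target.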
Sequential Monitoring Documentation
Step 6: Predictive Power (Interim PPoS)
What it does: Calculates probability of trial success given interim data.
Key outputs: PPoS with decision gauge, sensitivity analysis, posterior visualization
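For a single-arm binary endpoint, PPoS can be computed exactly: weight each possible future outcome by its posterior-predictive (beta-binomial) probability and add up the outcomes that would meet the final success rule. The sketch below assumes the success rule "posterior P(p > p0) ≥ γ" at the final analysis and a Beta prior with integer parameters; the function names and the interim data are hypothetical.

```python
from math import comb, exp, lgamma


def post_prob_greater(a, b, p0):
    """P(p > p0) under Beta(a, b); exact identity for integer a, b >= 1."""
    m = a + b - 1
    return sum(comb(m, j) * p0**j * (1 - p0) ** (m - j) for j in range(a))


def beta_binomial_pmf(y, m, a, b):
    """Posterior-predictive probability of y responders among m future patients."""
    log_choose = lgamma(m + 1) - lgamma(y + 1) - lgamma(m - y + 1)
    log_num = lgamma(a + y) + lgamma(b + m - y) - lgamma(a + b + m)
    log_den = lgamma(a) + lgamma(b) - lgamma(a + b)
    return exp(log_choose + log_num - log_den)


def ppos(x, n, n_max, p0, gamma, prior=(1, 1)):
    """Predictive probability that the final analysis meets the success rule."""
    a, b = prior[0] + x, prior[1] + n - x   # posterior at the interim
    m = n_max - n                           # patients still to enroll
    total = 0.0
    for y in range(m + 1):
        final_prob = post_prob_greater(a + y, b + m - y, p0)
        if final_prob >= gamma:
            total += beta_binomial_pmf(y, m, a, b)
    return total


# Hypothetical interim: 7 responders in the first 30 of 60 patients.
print("PPoS:", round(ppos(7, 30, 60, p0=0.12, gamma=0.975), 3))
```

Re-running `ppos` under optimistic and skeptical priors gives the sensitivity analysis listed among the key outputs.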
Bayesian Predictive Power Documentation
See It in Action
Walk through the complete 6-step workflow using REBYOTA, BOIN, and other FDA-approved trials as case studies — from prior elicitation to regulatory documentation.
End-to-end tutorial: From Prior to Approval
PPoS Decision Framework
| PPoS | Recommendation |
|---|---|
| ≥ 90% | Predicted Success — Verify current posterior meets significance |
| 20–90% | Continue — Insufficient evidence for early stopping |
| < 20% | Stop for Futility — Very low probability of success |
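The table above maps directly onto a small lookup, sketched here with a hypothetical function name. The cut points are the ones in the table, with PPoS expressed as a proportion between 0 and 1.

```python
def ppos_recommendation(ppos):
    """Map a PPoS value (as a proportion) to the recommendation in the table."""
    if not 0.0 <= ppos <= 1.0:
        raise ValueError("ppos must be between 0 and 1")
    if ppos >= 0.90:
        return "Predicted Success"
    if ppos < 0.20:
        return "Stop for Futility"
    return "Continue"


for value in (0.95, 0.55, 0.10):
    print(value, "->", ppos_recommendation(value))
```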
3. Which Calculators Do I Need?
Single-Arm Trial
Do you have historical data to incorporate?
YES → 1. Prior → 2. Borrowing → 3. Sample Size → 6. PPoS
NO → 1. Prior → 3. Sample Size → 6. PPoS
Two-Arm Randomized Trial
Do you have historical control data?
YES → 1. Prior → 2. Borrowing → 4. Two-Arm → 5. Sequential → 6. PPoS
NO → 1. Prior → 4. Two-Arm → 5. Sequential → 6. PPoS
Two-Arm (Fixed Sample, No Interim)
No planned interim analyses?
1. Prior → 4. Two-Arm Design → 6. PPoS (at interim if needed)
Interim Monitoring Only
Trial already designed, need interim decision support?
1. Prior → 6. PPoS (with interim data)
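The decision paths above can be written as a small routing function, sketched below. The function and argument names are hypothetical. One extension beyond the listed paths is labeled in the code: a fixed-sample two-arm trial with historical data also passes through Borrowing, which this page does not state explicitly.

```python
STEP_NAMES = {
    1: "Prior Elicitation",
    2: "Bayesian Borrowing",
    3: "Single-Arm Sample Size",
    4: "Two-Arm Design",
    5: "Sequential Monitoring",
    6: "Predictive Power (PPoS)",
}


def calculator_path(design, historical_data=False, planned_interims=True):
    """Return the ordered step numbers for a design type.

    design: "single_arm", "two_arm", or "interim_only".
    """
    if design == "interim_only":
        return [1, 6]
    if design not in ("single_arm", "two_arm"):
        raise ValueError(f"unknown design: {design!r}")
    steps = [1]
    if historical_data:
        # Assumption: borrowing also applies to fixed-sample two-arm designs.
        steps.append(2)
    if design == "single_arm":
        steps.append(3)
    else:
        steps.append(4)
        if planned_interims:
            steps.append(5)
    steps.append(6)
    return steps


path = calculator_path("two_arm", historical_data=True)
print(" → ".join(f"{s}. {STEP_NAMES[s]}" for s in path))
```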
4. Common Design Patterns
Pattern A: Phase II Oncology (Single-Arm with Historical Borrowing)
Single-arm Phase II evaluating ORR against historical control rate.
| Step | Calculator | Purpose |
|---|---|---|
| 1 | Prior Elicitation | Convert Phase I/II data to Beta prior with δ = 0.5 discount |
| 2 | Bayesian Borrowing | Quantify sample size reduction from borrowing |
| 3 | Sample Size | Size trial for 80% power, verify Type I ≤ 0.05 |
| 6 | PPoS | Monitor at 50% enrollment for futility |
Example: Historical ORR = 12%, target ORR = 25%, Phase II data: 24/200 responders
- Prior: Beta(13, 89) with ESS = 102 after 50% discount
- Sample size: ~85 patients (vs. ~110 without borrowing)
- Interim PPoS threshold: < 20% → stop for futility
Pattern B: Confirmatory RCT (Two-Arm with Sequential Monitoring)
Phase III RCT with expert-elicited prior, Bayesian sequential stopping rules.
| Step | Calculator | Purpose |
|---|---|---|
| 1 | Prior Elicitation | Quantile matching from expert opinion (skeptical) |
| 4 | Two-Arm Design | Size for superiority with 1:1 allocation |
| 5 | Sequential Monitoring | Design Bayesian stopping rules for 3 interim analyses |
| 6 | PPoS | Ad-hoc interim futility assessment |
Fully Bayesian approach: Use Sequential Monitoring for pre-specified stopping rules (efficacy + futility), and PPoS for ad-hoc interim decision support between planned analyses.
Pattern C: Rare Disease (Maximal Borrowing)
Small population, strong historical data from natural history study.
| Step | Calculator | Purpose |
|---|---|---|
| 1 | Prior Elicitation | Historical data method with minimal discount (δ = 0.8) |
| 2 | Bayesian Borrowing | MAP prior from multiple natural history cohorts |
| 3 | Sample Size | Aggressive sample size reduction justified by ESS |
| 6 | PPoS | Continuous monitoring given small N |
Regulatory context: Section IV.B.2 of the FDA draft guidance addresses non-calibrated designs for rare diseases where traditional operating characteristics aren't feasible.
5. Regulatory Alignment
All six calculators are designed to satisfy FDA January 2026 guidance requirements:
| Requirement | How the Toolkit Addresses It |
|---|---|
| Prior justification (Section V.D) | Prior Elicitation documents source, method, and ESS |
| Discounting rationale (Section V.D.4) | Bayesian Borrowing provides power prior δ with justification |
| Operating characteristics (Section IV.A) | Sample Size calculators report Type I error and power via simulation |
| Sensitivity analysis | All calculators support prior sensitivity across optimistic/skeptical scenarios |
| Decision thresholds (Section IV.A) | PPoS calculator implements FDA's three approaches to success criteria |
SAP Documentation Checklist
Each calculator generates outputs for your Statistical Analysis Plan:
- Prior specification with α, β (or μ, σ²), source, and ESS
- Discounting method and justification (if borrowing)
- Decision rule: threshold γ, comparison metric, success criterion
- Sample size with operating characteristics
- Sensitivity analysis plan across prior specifications
- Interim monitoring rules with PPoS thresholds
6. References
- Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Wiley; 2004.
- Berry SM, Carlin BP, Lee JJ, Muller P. Bayesian Adaptive Methods for Clinical Trials. CRC Press; 2010.
- Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. CRC Press; 2013.
- Morita S, Thall PF, Müller P. Determining the effective sample size of a parametric prior. Biometrics. 2008;64(2):595-602.
- Schmidli H, et al. Robust meta-analytic-predictive priors in clinical trials with historical control information. Biometrics. 2014;70(4):1023-1032.
- Zhou T, Ji Y. On Bayesian sequential clinical trial designs. New England Journal of Statistics in Data Science. 2024;2(1).
- U.S. Food and Drug Administration. Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products: Draft Guidance for Industry. January 12, 2026.
- U.S. Food and Drug Administration. Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials. February 2010.
Last updated: May 2026
Related Documentation
Bayesian Workflow Tutorial
End-to-end walkthrough of the six-step workflow with a worked example from prior elicitation through interim monitoring.
Complete Guide to Bayesian PPoS
GPS analogy, mathematical foundations, VEST case study, and a comparison with conditional power.
CUPED vs. GSD vs. Bayesian
Side-by-side comparison of design-stage variance reduction, sequential monitoring, and Bayesian interim assessment.