Adaptive Population Enrichment (APE)

Analysis

Statistical methods for monitoring and analysis

For monitoring during an APE trial, population selection is performed in the same way as the treatment selection described for MAMS trials. The only difference is that researchers use an appropriate statistical model (e.g., linear or logistic regression) to estimate treatment effects in the subpopulations of interest and in the full population at an interim analysis. Prespecified population selection rules (see “how decision rules are defined?”) are then applied to these interim results to decide which trial population should continue to be enrolled.
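As an illustration only, the interim step can be sketched in Python. The rule below is hypothetical and not taken from any cited design: continue with the full population if its z-statistic reaches a threshold, otherwise continue with the subpopulation if its z-statistic does, otherwise stop. A simple difference in means stands in for the regression model, and the names `effect_estimate`, `select_population` and `z_threshold` are assumptions made for this sketch.

```python
from math import sqrt
from statistics import mean, variance


def effect_estimate(treated, control):
    """Return the difference in means (treated - control) and its standard error."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se


def select_population(sub_t, sub_c, comp_t, comp_c, z_threshold=1.0):
    """Apply a hypothetical prespecified selection rule to interim data.

    sub_*  : outcomes in the subpopulation of interest (treated / control)
    comp_* : outcomes in the complementary subpopulation (treated / control)
    """
    # The full population is the union of the subpopulation and its complement.
    d_full, se_full = effect_estimate(sub_t + comp_t, sub_c + comp_c)
    d_sub, se_sub = effect_estimate(sub_t, sub_c)
    z_full, z_sub = d_full / se_full, d_sub / se_sub

    if z_full >= z_threshold:
        decision = "full"            # keep enrolling everyone
    elif z_sub >= z_threshold:
        decision = "subpopulation"   # enrich: enrol the subpopulation only
    else:
        decision = "stop"            # neither population looks promising
    return decision, {"z_full": z_full, "z_sub": z_sub}
```

In a real trial the threshold and the form of the rule would be fixed in the protocol before any interim data are seen, and the effect estimates would come from the prespecified analysis model.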

It is equally important to obtain reliable estimates of the treatment effects and related quantities (e.g., confidence intervals, CIs) following an APE trial. Traditional naïve (maximum likelihood, ML) estimators may yield biased treatment effects, and the bias can be substantial depending on the population selection rules 1, 2, 3, 4, 5. That is, on average, estimates of treatment effects in selected subpopulations tend to be larger, and those in dropped subpopulations smaller, than they should be. Such biases could lead to incorrect conclusions about the effect of treatment in subpopulations and the full population. For instance, a treatment could be approved for use in a population in which it should not be, or an effective treatment could be deemed ineffective.
 
Selection bias in APE trials is similar to that which can arise in MAMS trials 6 and is a consequence of population selection. Because subpopulations are carried forward precisely because their interim results look promising, they are likely to demonstrate efficacy at the final analysis even if the treatment works equally well across all subpopulations. Notably, this bias depends on several elements of the design, such as the population selection rule, any additional adaptations used (e.g., an increase in sample size to enrich certain subpopulations, or early stopping for futility), the timing of the interim analysis for population selection, and the size of the underlying true treatment effects. Researchers face a dilemma: early population selection often reduces bias, but at the expense of greater uncertainty (larger standard errors) in interim treatment effects, which lowers the chances of selecting the correct subpopulations. Conversely, later population selection improves the selection decision at the expense of increased selection bias 6. Since the whole point of adaptive designs is to make good decisions at the interim stage, it is logical to prioritise decision making (i.e., having more data at an interim analysis) and then address any bias in treatment effects at the end of the trial through statistical adjustment.
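The mechanism can be seen in a small simulation. This is a minimal sketch under simplifying assumptions that are not part of any cited method: two disjoint subpopulations with the same true effect, normally distributed effect estimates with known standard errors, a "pick the larger interim estimate" rule, and a naïve final estimate that pools both stages by inverse-variance weighting. The function name and default values are illustrative.

```python
import random
from statistics import mean


def simulate_selection_bias(true_effect=0.3, se_stage1=0.2, se_stage2=0.2,
                            n_trials=20000, seed=1):
    """Estimate the bias of the naive pooled estimate in the selected subpopulation."""
    rng = random.Random(seed)
    # Inverse-variance weight given to the stage-1 estimate when pooling.
    w1 = (1 / se_stage1**2) / (1 / se_stage1**2 + 1 / se_stage2**2)

    naive, stage2_only = [], []
    for _ in range(n_trials):
        # Interim (stage-1) estimates for two subpopulations with EQUAL true effects.
        stage1 = [rng.gauss(true_effect, se_stage1) for _ in range(2)]
        best = max(stage1)  # carry forward the more promising subpopulation
        # Independent stage-2 estimate in the selected subpopulation.
        stage2 = rng.gauss(true_effect, se_stage2)
        naive.append(w1 * best + (1 - w1) * stage2)
        stage2_only.append(stage2)

    return {
        "naive_bias": mean(naive) - true_effect,
        "stage2_only_bias": mean(stage2_only) - true_effect,
    }
```

Under these assumptions the naïve pooled estimate overstates the effect on average, while the estimate based on stage-2 data alone is unbiased because those data played no part in the selection. Increasing `se_stage2` relative to `se_stage1` mimics a later interim analysis: stage-1 data then carry more weight in the final estimate and the bias grows.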

In APE trials, the eventual goal is to obtain reliable point estimates of treatment effects, together with interval estimates (e.g., confidence intervals) that have correct coverage, in the selected and dropped subpopulations and in the full population at the end of the trial, depending on the research goals. Several methods (e.g., 1, 3, 4, 5, 7) have been proposed for normally distributed outcomes and compared in different settings, mainly to reduce the bias of the ML estimator, and their precision has been assessed. These include mean unbiased, conditionally unbiased, conditional bias-adjusted, shrinkage, bootstrap bias-adjusted, double bootstrap bias-adjusted, empirical Bayes, and hybrid estimators.

Finding a mean unbiased estimator with lower mean square error than the ML estimator has been noted as challenging in this setting 1. The double bootstrap bias-adjusted estimator generally performed well in reducing bias and giving interval estimates with better coverage 1. Some estimators that balance unbiasedness against the loss in precision, depending on which subpopulations have been selected, have been recommended 3, 4. For example, the naïve ML estimator has been suggested when the full population is selected 3, while others concluded that the hybrid and conditional bias-adjusted estimators control bias in most scenarios but at the expense of reduced precision 4. It is difficult to provide recommendations that apply in all situations because the performance of estimators is influenced by several factors (e.g., the population selection rules, the selection decisions made, and the number of interim analyses). Researchers must also balance the opposing demands of controlling bias and increasing precision. As such, one may choose an estimator that controls bias quite well while compromising on precision around the estimated effect (e.g., double bootstrap 1, hybrid or conditional 4, or mean unbiased estimator 5). Some authors proposed a hybrid estimator 5 that allows different estimators to be used depending on the population selection decisions made, which seems a reasonable approach. These methods have been extended to cover time-to-event outcomes 8, including methods to compute other estimators (e.g., the single-iteration bias-adjusted estimator) 11. Methods for point estimation of treatment effects following an adaptive trial are summarised and discussed in detail, with examples of their application, in 9.
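To give a flavour of how a bootstrap bias adjustment works, here is a minimal single (not double) parametric bootstrap sketch. It is not the estimator of any cited paper. It assumes normally distributed effect estimates with known standard errors, a "pick the largest interim estimate" rule, and inverse-variance pooling of the two stages; the function name and arguments are hypothetical.

```python
import random
from statistics import mean


def bootstrap_bias_adjusted(stage1, stage2_selected, se1, se2,
                            n_boot=10000, seed=1):
    """Return (naive, bias_adjusted) estimates for the selected subpopulation.

    stage1          : interim effect estimates, one per subpopulation
    stage2_selected : stage-2 effect estimate in the selected subpopulation
    se1, se2        : known standard errors of stage-1 and stage-2 estimates
    """
    rng = random.Random(seed)
    w1 = (1 / se1**2) / (1 / se1**2 + 1 / se2**2)  # inverse-variance weight

    def pooled(s1, s2):
        return w1 * s1 + (1 - w1) * s2

    selected = max(range(len(stage1)), key=lambda k: stage1[k])
    naive = pooled(stage1[selected], stage2_selected)

    # Bootstrap "truth": the naive estimate for the selected subpopulation
    # and the stage-1 estimates for the dropped ones.
    truth = list(stage1)
    truth[selected] = naive

    errors = []
    for _ in range(n_boot):
        # Re-run the two-stage trial, including the selection step.
        b1 = [rng.gauss(t, se1) for t in truth]
        b_sel = max(range(len(b1)), key=lambda k: b1[k])
        b2 = rng.gauss(truth[b_sel], se2)
        errors.append(pooled(b1[b_sel], b2) - truth[b_sel])

    bias_hat = mean(errors)  # estimated bias of the naive estimator
    return naive, naive - bias_hat
```

When the interim estimates are close together the adjustment shrinks the naïve estimate noticeably; when one subpopulation is clearly ahead, selection is nearly certain and the adjustment is close to zero. A double bootstrap would add a second level of resampling to correct the bias estimate itself.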

In summary, researchers should clearly state the method used for bias correction (if any) 10. Regardless of the bias-adjustment method used, it is more informative to report bias-adjusted treatment effect estimates (if any were obtained) alongside the naïve ML estimates to aid interpretation. This is, however, an additional reporting burden that comes with the complexity of more advanced adaptive trials.

References

1. Magnusson et al. Group sequential enrichment design incorporating subgroup selection. Stat Med. 2013;32(16):2695-2714.
2. Chiu et al. Design and estimation in clinical trials with subpopulation selection. Stat Med. 2018;37(29):4335-4352.
3. Kimani et al. Estimation after subpopulation selection in adaptive seamless trials. Stat Med. 2015;34(18):2581-2601.
4. Kunzmann et al. Point estimation in adaptive enrichment designs. Stat Med. 2017;36(25):3935-3947. doi:10.1002/sim.7412
5. Kimani et al. Point estimation following two-stage adaptive threshold enrichment clinical trials. Stat Med. 2018;37(22):3179-3196.
6. Bauer et al. Selection and bias - two hostile brothers. Stat Med. 2010;29(1):1-13.
7. Wassmer et al. Designing issues in confirmatory adaptive population enrichment trials. J Biopharm Stat. 2015;25(4):651-669.
8. Kimani et al. Point and interval estimation in two-stage adaptive designs with time to event data and biomarker-driven subpopulation selection. Stat Med. 2020;39(19):2568-2586.
9. Robertson et al. Point estimation for adaptive trial designs. In peer review. 2021
10. Dimairo et al. The Adaptive designs CONSORT Extension (ACE) statement: a checklist with explanation and elaboration guideline for reporting randomised trials that use an adaptive design. BMJ. 2020;369:m115.
11. Di Stefano et al. A comparison of estimation methods adjusting for selection bias in adaptive enrichment designs with time-to-event endpoints. Stat Med. 2022;41(10):1767-1779.