The goal in every trial is to conduct robust inference that produces reliable results about the benefits and harms of new treatments. Specifically, the focus is on obtaining the best (i.e., reliable) estimate of the treatment effect and its uncertainty, together with other relevant measures of evidence for testing hypotheses of interest (e.g., p-values) ^{1}. The analysis of group sequential trials focuses on two elements ^{1, 2, 3}:

- measures of interim results that should be reported at each stage,
- overall final results that should be reported once the trial has been stopped.

Stopping rules and repeated significance testing introduce problems during analysis. First, researchers stop trials early after observing overwhelming or extreme results ^{17}. As a consequence, early interim results tend to exaggerate the true effects, and these results tend to have large variance because less information has accrued (see the RATPAC example in Figure 5, or Figure 1 here). Second, the sampling distribution of the point estimate, the calculation of treatment-related quantities, and their properties are altered. For example, traditional confidence intervals designed for fixed trial designs tend to have coverage exceeding the desired level ^{4, 5}, and the calculation of p-values depends on how the sample space is ordered to classify which results are more extreme than the one observed ^{6}. Below is a summary of specialised methods that have been developed to address these issues.
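The exaggeration of effects in trials stopped early can be illustrated with a short simulation. This is a minimal sketch, not taken from any trial cited here: the effect size, per-stage sample size, and efficacy boundary are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch: why estimates from early-stopped trials exaggerate the
# true effect. A hypothetical design stops for efficacy when the interim
# z-statistic crosses an illustrative boundary of 2.5.
rng = np.random.default_rng(42)
true_delta = 0.3          # true standardised treatment effect (assumed)
n_per_stage = 50          # observations available at the interim look
n_trials = 100_000        # number of simulated trials
boundary = 2.5            # illustrative efficacy boundary on the z-scale

# Interim effect estimate in each simulated trial
interim_mean = rng.normal(true_delta, 1.0, (n_trials, n_per_stage)).mean(axis=1)
z1 = interim_mean * np.sqrt(n_per_stage)

# Trials that stop early are selected precisely because they look extreme,
# so their effect estimates are biased upwards relative to the truth
early = interim_mean[z1 > boundary]
print(f"true effect:                 {true_delta:.3f}")
print(f"mean estimate (early stops): {early.mean():.3f}")  # noticeably larger
```

The selection effect, not any flaw in the estimator itself, drives the bias: conditioning on crossing the boundary truncates the sampling distribution from below.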

First, repeated confidence intervals, reported at each interim analysis, ensure that the overall coverage is as desired ^{7}. Second, several treatment effect estimators with different properties exist, differing chiefly in the magnitude of bias and variance: the bias-adjusted mean estimator ^{5}, the Rao-Blackwell adjusted estimator ^{8, 9}, and the median unbiased estimator ^{8}. The latter is based on a specific ordering of the sample space, such as stagewise, likelihood ratio or z-score, mean or maximum likelihood estimate, and score test ordering ^{2, 10, 11}. The same sample space ordering approach is also used to compute confidence intervals and p-values for final reporting when the trial is stopped ^{2, 6, 11, 12}. The literature discourages the use of score test ordering as it may produce results inconsistent with the stopping decision ^{6, 13}. Only p-values from stagewise ordering meet all of the following essential properties (other methods guarantee only the first) ^{11, 12}:

- are uniformly distributed;
- are consistent with stopping rules (e.g. the p-value must not exceed the planned nominal level when an efficacy boundary is crossed and the trial is stopped early);
- do not depend on the timing and/or frequency of future interim analyses;
- are the same as those obtained from a fixed trial design (without any interim analysis) if a trial is stopped at the first interim analysis.
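As a concrete illustration, a stagewise-ordered p-value for a hypothetical two-stage design can be sketched as follows. This is a minimal sketch under stated assumptions: a one-sided design with an efficacy boundary only (no futility stopping), an interim at half the information, and an illustrative O'Brien-Fleming-type interim boundary of 2.797.

```python
from scipy.stats import norm, multivariate_normal

def stagewise_p(stage, z_obs, c1, info_frac=0.5):
    """Stagewise-ordered p-value for a hypothetical two-stage design.

    stage:     1 if stopped at the interim, 2 if the trial ran to the end
    z_obs:     observed z-statistic at the stopping stage
    c1:        efficacy boundary at the interim look
    info_frac: information fraction at the interim (default 0.5)
    """
    if stage == 1:
        # Stopping earlier with a larger z is "more extreme": tail beyond z_obs.
        # This equals the fixed-design p-value, matching the last property above.
        return norm.sf(z_obs)
    # Under the null, corr(Z1, Z2) = sqrt(info fraction)
    rho = info_frac ** 0.5
    bvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    # "More extreme" = stopped at stage 1, OR continued and reached Z2 >= z_obs
    p_continue_extreme = norm.cdf(c1) - bvn.cdf([c1, z_obs])
    return norm.sf(c1) + p_continue_extreme

# A trial stopping exactly at the interim boundary gets p at the nominal level
print(stagewise_p(stage=1, z_obs=2.797, c1=2.797))  # ~0.0026
print(stagewise_p(stage=2, z_obs=2.0, c1=2.797))
```

Note that future looks never enter the calculation: the stage-1 p-value depends only on the stage-1 statistic, which is exactly why stagewise ordering combines well with flexible stopping rules.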

Stagewise ordering is widely preferred because it can be implemented together with frequently used flexible stopping rules, as it is not influenced by future results ^{11, 12}. Of note, some authors have also recommended likelihood ratio ordering, suggesting that it yields desirable confidence intervals ^{14} and captures the level of evidence more reliably than other methods ^{6}. All these methods except score ordering are implemented in the R package “*RCTdesign*” ^{15} via the modules “*seqMonitor()*” and “*seqInference()*”. The R package “*rpact*” ^{16} and commercial software such as *ADDPLAN* and *East* offer stagewise ordering to produce median unbiased estimates with related confidence intervals and p-values. Recent literature addresses the performance of these estimators ^{18}.
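The repeated confidence interval approach cited above ^{7} amounts to replacing the fixed-design critical value 1.96 with a wider boundary value at each look, so that all the intervals jointly achieve the desired coverage. A minimal sketch follows; the Pocock constant 2.413 for five equally spaced looks at overall two-sided α = 0.05 is a standard tabulated value, and the estimate and standard error are made-up numbers.

```python
# Minimal sketch of a repeated confidence interval at one interim look.
# Assumptions: five equally spaced looks, overall two-sided alpha = 0.05,
# Pocock boundary (constant critical value 2.413 at every look).
Z_FIXED = 1.96       # naive fixed-design critical value
Z_POCOCK_5 = 2.413   # tabulated Pocock constant for K = 5 looks

def repeated_ci(estimate, se, z_boundary=Z_POCOCK_5):
    """Return the (lower, upper) repeated confidence interval at one look."""
    return estimate - z_boundary * se, estimate + z_boundary * se

est, se = 0.30, 0.12  # made-up interim estimate and standard error
print("naive fixed-design CI:", repeated_ci(est, se, Z_FIXED))
print("repeated CI (wider):  ", repeated_ci(est, se))
```

The price of valid simultaneous coverage is a wider interval at every look, which is why repeated confidence intervals are reported alongside, not instead of, the final adjusted analysis.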

PANDA users should note that if a trial is stopped early at the first interim analysis, the final results to be reported based on any sample space ordering method will be the same as that obtained using traditional analysis that assumes a fixed trial design.

1. Todd *et al*. Interim analyses and sequential designs in phase III studies. *Br J Clin Pharmacol*. 2001;51(5):394–9.

2. Jennison *et al*. Analysis following a sequential test. In: Group sequential methods with applications to clinical trials. *Chapman & Hall/CRC*. 2000;171–87.

3. Whitehead. The analysis of a sequential trial. In: The design and analysis of sequential clinical trials. *John Wiley & Sons Ltd*. 1997;135–81.

4. Jennison *et al*. Repeated confidence intervals. In: Group sequential methods with applications to clinical trials. *Chapman & Hall/CRC*. 2000;89–204.

5. Whitehead. On the bias of maximum likelihood estimation following a sequential test. *Biometrika*. 1986;73(3):573–81.

6. Cook. P-value adjustment in sequential clinical trials. *Biometrics*. 2002;58(4):1005–11.

7. Jennison *et al*. Interim analyses: The repeated confidence interval approach. *J R Stat Soc Ser B*. 1989;51(3):305–61.

8. Emerson *et al*. Parameter estimation following group sequential hypothesis testing. *Biometrika*. 1990;77(4):875–92.

9. Emerson *et al*. A computationally simpler algorithm for the UMVUE of a normal mean following a group sequential trial. *Biometrics*. 1997;53(1):365–9.

10. Tsiatis *et al*. Exact confidence intervals following a group sequential test. *Biometrics*. 1984;40(3):797–803.

11. Proschan *et al*. Inference following a group sequential trial. In: Statistical monitoring of clinical trials - A unified approach. *Springer*. 2006;113–35.

12. Wassmer *et al*. Group sequential and confirmatory adaptive designs in clinical trials. *Springer*. 2016.

13. Chang *et al*. P-values for group sequential testing. *Biometrika*. 1995;82(3):650.

14. Rosner *et al*. Exact confidence intervals following a group sequential trial: A comparison of methods. *Biometrika*. 1988;75(4):723.

15. Gillen *et al*. Designing, monitoring, and analyzing group sequential clinical trials using the “RCTdesign” package for R. 2012.

16. Lakens *et al*. Group sequential designs: A tutorial. *Preprint*. 2021;1–13.

17. Zhang *et al*. Overestimation of the effect size in group sequential trials. *Clin Cancer Res*. 2012;18(18):4872–6.

18. Robertson *et al*. Point estimation for adaptive trial designs. In peer review. 2021.

©2024 The University of Sheffield

In collaboration with epiGenesys
