Sample Size Re-estimation (SSR)

Analysis

Statistical methods for monitoring and analysis

The implementation of SSR at the interim analysis is straightforward, as addressed above (see design concepts and underpinning statistical methods).

For the final analysis, both restricted and unrestricted SSR methods assume that a single comparative analysis is undertaken after outcome data have been gathered from all participants. Data gathered before and after the interim analysis are aggregated to estimate the treatment effect and to test the hypothesis at the end of the study, using traditional methods for fixed trial designs. This ignores the fact that interim data are used twice: to estimate nuisance parameters (e.g., the interim variance) that inform the SSR, and again in the comparative hypothesis test at the end (e.g., in the final variance estimate).
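As a minimal illustration, the sketch below mimics this final pooled analysis for a hypothetical continuous outcome: pre- and post-interim data are simply aggregated within each arm and analysed with a standard two-sample t-test, exactly as in a fixed design. All data and numbers are simulated, illustrative assumptions.

```python
# A minimal sketch of the final pooled analysis for a continuous outcome.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y_treat = rng.normal(loc=1.0, scale=4.0, size=180)    # hypothetical treatment arm
y_control = rng.normal(loc=0.0, scale=4.0, size=180)  # hypothetical control arm

# Standard two-sample t-test, as in a fixed design: data gathered before and
# after the interim SSR look are simply aggregated within each arm.
t_stat, p_value = stats.ttest_ind(y_treat, y_control, equal_var=True)
print(f"pooled treatment effect = {y_treat.mean() - y_control.mean():.2f}, "
      f"t = {t_stat:.2f}, p = {p_value:.4f}")
```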

A frequently asked question concerns the impact of SSR methods on control of the type I error rate, which is essential for frequentist methods. The literature focuses on the most common nuisance parameters: the standard deviation for continuous outcomes and the control event rate for binary outcomes. Several authors show only a negligible increase in the type I error rate (above the allocated level) when blinded SSR methods are used for both binary and continuous outcomes 1, 2, 3, 4. This is reinforced in regulatory guidance 5. The inflation of the type I error rate is driven by bias in the interim variance estimate; it is substantial only for very small interim sample sizes and diminishes to become immaterial as the interim sample size increases (see 3, 6).
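As an illustration of how a blinded SSR might proceed for a continuous outcome, the sketch below re-estimates the standard deviation from interim data pooled across arms (a lumped, one-sample estimator in the spirit of 1, 7) and recomputes the fixed-design sample size. All inputs, including the design effect and interim data, are illustrative assumptions.

```python
# A minimal sketch of blinded SSR for a continuous outcome.
import numpy as np
from scipy.stats import norm

def n_per_arm(sd, delta, alpha=0.05, power=0.90):
    """Standard fixed-design sample size per arm for a two-sided z-test."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return int(np.ceil(2 * (sd * (z_a + z_b) / delta) ** 2))

# Blinded interim data: outcomes pooled across arms, treatment labels unseen.
rng = np.random.default_rng(42)
interim_pooled = rng.normal(loc=0.5, scale=4.2, size=120)

# One-sample ("lumped") variance estimate from the blinded data; it slightly
# overstates the within-arm variance because it absorbs the treatment effect.
blinded_sd = interim_pooled.std(ddof=1)

print(f"planned n/arm (assumed sd=4.0): {n_per_arm(sd=4.0, delta=2.0)}")
print(f"re-estimated n/arm (blinded):   {n_per_arm(sd=blinded_sd, delta=2.0)}")
```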

For a continuous outcome, the pooled variance estimator obtained from unblinded data has also been shown to produce a small inflation of the type I error rate, which is substantial only for small interim sample sizes and diminishes as the interim sample size increases 4, 7. It is therefore possible to choose a reasonably large interim sample size that results in immaterial inflation of the type I error rate. Similar conclusions were drawn when a control event rate is estimated rather than the pooled event rate 3. In general, therefore, adjustment of the type I error rate is not required when non-comparative (blinded and unblinded) SSR methods are used, especially for reasonably large interim sample sizes 4, 5.
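A corresponding sketch for a binary outcome, where the control event rate is re-estimated at the interim (cf. 3), might look as follows; the design event rate, target risk reduction, and interim counts are illustrative assumptions.

```python
# A minimal sketch of SSR for a binary outcome via the control event rate.
import numpy as np
from scipy.stats import norm

def n_per_arm_binary(p_c, p_t, alpha=0.05, power=0.90):
    """Fixed-design sample size per arm for comparing two proportions
    (normal approximation with unpooled variances)."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    var = p_c * (1 - p_c) + p_t * (1 - p_t)
    return int(np.ceil(var * ((z_a + z_b) / (p_c - p_t)) ** 2))

risk_reduction = 0.10          # assumed absolute risk reduction (design effect)
p_c_interim = 28 / 80          # hypothetical observed control event rate at interim

print("planned n/arm:     ", n_per_arm_binary(p_c=0.30, p_t=0.30 - risk_reduction))
print("re-estimated n/arm:", n_per_arm_binary(p_c=p_c_interim,
                                              p_t=p_c_interim - risk_reduction))
```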

However, in situations where strict control of the type I error rate is necessary (e.g., as a regulatory requirement), particularly when unblinded SSR is used, methods are available to achieve this. One approach applies an adjustment that lowers the level at which statistical significance is declared (e.g., see 4, 6). A more flexible approach uses combination test methods 8, 9. In summary, these methods are applied as follows (a minimal sketch follows the list):

  • pre-specify how independent outcome data gathered before and after the SSR will be combined using a combination test function (e.g., the inverse normal 9) to produce an overall test statistic for the final analysis, and how these data are weighted when pooled together;
  • when the trial is complete, partition the outcome data into two independent groups according to whether they contributed to the SSR: stage 1 comprises outcome data of patients who contributed to the SSR, and stage 2 comprises outcome data accrued after the SSR;
  • estimate a test statistic for each independent stage using a pre-specified statistical model (e.g., a multiple regression model), giving stage 1 and stage 2 test statistics;
  • calculate the overall test statistic using the pre-specified combination test method, the stagewise test statistics, and the respective stagewise weights;
  • perform a hypothesis test by comparing the overall test statistic to the critical value at the nominal significance level.
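A minimal sketch of these steps, assuming the inverse normal combination function 9 with weights pre-specified from the planned stagewise sample sizes, is given below; the stagewise test statistics and sample sizes are illustrative assumptions.

```python
# A minimal sketch of the inverse normal combination test applied to
# independent stagewise z-statistics, as described in the list above.
import numpy as np
from scipy.stats import norm

def inverse_normal_combination(z1, z2, w1, w2):
    """Combine independent stagewise z-statistics with pre-specified
    weights satisfying w1**2 + w2**2 == 1."""
    assert abs(w1**2 + w2**2 - 1.0) < 1e-9
    return w1 * z1 + w2 * z2

# Pre-specified weights, e.g. proportional to planned stagewise sample sizes.
n1, n2 = 100, 150
w1, w2 = np.sqrt(n1 / (n1 + n2)), np.sqrt(n2 / (n1 + n2))

z1, z2 = 1.40, 1.75            # hypothetical stage 1 and stage 2 test statistics
z_comb = inverse_normal_combination(z1, z2, w1, w2)

alpha = 0.025                  # one-sided nominal significance level
z_crit = norm.ppf(1 - alpha)
print(f"combined z = {z_comb:.3f}, critical value = {z_crit:.3f}, "
      f"reject H0: {z_comb > z_crit}")
```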
The RATPAC trial below illustrates how this approach is implemented.

For continuous outcomes, the bias in the estimates of the treatment effect and its variance following a blinded SSR was shown to be substantial when interim sample sizes are very small, but it becomes negligible with increasing sample sizes 10. Thus, analysis methods for fixed trial designs are adequate so long as the SSR is performed with sufficient data. Estimation of treatment effects following comparative (unblinded) SSR, which is beyond the scope of this section, has been addressed elsewhere (see 11 for a summary of available methods).

References

1. Kieser et al. Simple procedures for blinded sample size adjustment that do not affect the type I error rate. Stat Med. 2003;22(23):3571–81.
2. Friede et al. Blinded continuous monitoring of nuisance parameters in clinical trials. J R Stat Soc Ser C. 2012;61(4):601–18.
3. Friede et al. Sample size recalculation for binary data in internal pilot study designs. Pharm Stat. 2004;3(4):269–79.
4. Friede et al. Sample size recalculation in internal pilot study designs: A review. Biometrical J. 2006;48(4):537–55.
5. FDA. Adaptive designs for clinical trials of drugs and biologics guidance for industry. 2019.
6. Kieser et al. Re-calculating the sample size in internal pilot study designs with control of the type I error rate. Stat Med. 2000;19(7):901–11.
7. Gould et al. Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance. Commun Stat - Theory Methods. 1992;21(10):2833–53.
8. Bauer et al. Combining different phases in the development of medical treatments within a single trial. Stat Med. 1999;18(14):1833–48.
9. Lehmacher et al. Adaptive sample size calculations in group sequential trials. Biometrics. 1999;55:1286–90.
10. Posch et al. Estimation after blinded sample size reassessment. Stat Methods Med Res. 2018;27(6):1830–46.
11. Pritchett et al. Sample size re-estimation designs in confirmatory clinical trials—current state, statistical considerations, and practical guidance. Stat Biopharm Res. 2015;7(4):309–21.