Cameron Bieganek, Constantin Aliferis, Sisi Ma
Clinical trials represent a critical milestone of translational and clinical sciences. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions all over the world. One way to reduce the cost incurred by insufficient enrollment is to initiate fewer of the trials that are most likely to fall short of their enrollment goals. Hence, the ability to predict, before a trial starts, which proposed trials will meet their enrollment goals is highly beneficial.
Clinical trials represent a critical milestone of translational and clinical science, with the most direct potential impact on healthcare-related outcomes. Patient recruitment is a necessary condition of success for clinical trials. In specific situations, such as the broad impact of COVID-19, rapid and high-volume enrollment for vaccine trials is pivotal to global public health. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions in the US and all over the world.
Our primary data source is ClinicalTrials.gov, a Web-based resource with information on publicly and privately supported clinical studies on a wide range of diseases and conditions, maintained by the National Library of Medicine (NLM). The information regarding a specific clinical study is provided and updated by the sponsor or principal investigator of the study and is available for download.
Outcome of interest.
The target variable of interest for this study was clinical trial enrollment rate. The rate is defined as the total enrollment divided by the study duration, where the total enrollment and study duration were extracted from the XML records.
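The definition above can be illustrated with a minimal sketch. It assumes the total enrollment and the start and end dates have already been extracted from a trial's record. The function name, the example values, and the choice of days as the time unit are illustrative assumptions, not details taken from the study.

```python
from datetime import date


def enrollment_rate(total_enrollment: int, start: date, end: date) -> float:
    """Return participants enrolled per day over the study duration.

    The per-day unit is an assumption made for this sketch.
    """
    duration_days = (end - start).days
    if duration_days <= 0:
        # A zero or negative duration would make the rate undefined.
        raise ValueError("study duration must be positive")
    return total_enrollment / duration_days


# Hypothetical trial: 120 participants over a 360-day study.
rate = enrollment_rate(120, date(2015, 1, 1), date(2015, 12, 27))
```

Records with a missing or non-positive duration would need to be excluded or handled separately before computing the rate.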
Prediction of clinical trials enrollment rate
In the current study, we quantified the predictive signal for clinical trial enrollment rate in trial characteristics available prior to their initiation. We adopted a nested time-series cross-validation design.
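One way to picture a nested time-series cross-validation is sketched below. It assumes trials are sorted by start date and uses expanding training windows, so every model is evaluated only on trials that begin after those it was trained on. The fold counts, the block sizes, and the helper names `expanding_splits` and `nested_time_series_cv` are hypothetical. The exact fold construction used in the study is not described here.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def expanding_splits(indices: List[int], n_folds: int) -> Iterator[Split]:
    """Yield (train, test) pairs in which train always precedes test."""
    block = len(indices) // (n_folds + 1)
    if block == 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(1, n_folds + 1):
        train = indices[: k * block]
        # The last fold absorbs any remainder so no sample is dropped.
        stop = (k + 1) * block if k < n_folds else len(indices)
        test = indices[k * block : stop]
        yield train, test


def nested_time_series_cv(
    n_samples: int, outer_folds: int, inner_folds: int
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield each outer split together with inner splits of its training part."""
    ordered = list(range(n_samples))  # assumed already sorted by start date
    for outer_train, outer_test in expanding_splits(ordered, outer_folds):
        # Inner splits tune hyperparameters using only the outer training data.
        inner = list(expanding_splits(outer_train, inner_folds))
        yield outer_train, outer_test, inner


folds = list(nested_time_series_cv(12, outer_folds=2, inner_folds=1))
```

In this layout the inner splits would be used for model selection, and the outer test block would give a performance estimate on trials later than any seen during training or tuning.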
Limitations and future work
We examined only structured fields from ClinicalTrials.gov as features for predicting enrollment rate. Unstructured free-text data summarizing the goal and procedures of each study are also available from ClinicalTrials.gov.
Acknowledgments: The authors thank Dr. David Haynes II for his advice regarding the census data. The authors thank the Minnesota Supercomputing Institute for providing high-performance computing resources.
Citation: Bieganek C, Aliferis C, Ma S (2022) Prediction of clinical trial enrollment rates. PLoS ONE 17(2): e0263193. https://doi.org/10.1371/journal.pone.0263193
Editor: Sathishkumar V E, Hanyang University, KOREA, REPUBLIC OF
Received: November 24, 2021; Accepted: January 13, 2022; Published: February 24, 2022.
Copyright: © 2022 Bieganek et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used in this study can be downloaded from the following URLs: https://clinicaltrials.gov/ct2/resources/download https://www2.census.gov/programs-surveys/popest/tables/2010-2018/state/totals/PEP_2018_PEPANNRES.zip https://www.natureindex.com/annual-tables/2018/institution/all/all/.
Funding: SM’s time on this work is partially supported by Grant UL1TR002494.
Competing interests: The authors have declared that no competing interests exist.