Turning an Ocean of Data Into Actionable Wisdom To Support Optimal Site Identification

Advancements in AI can help the pharmaceutical industry identify the best sites for its clinical trials by harnessing Big Data.
Travis Caudill at ICON
Despite the Big Data boom and ever-advancing technology developed to support clinical trials, one of the most critical challenges trials face today is still people. Across the industry, trials still struggle to meet patient enrolment goals, which directly impact the schedule and budget, with each day of delay carrying cripplingly high costs. Successful, timely patient recruitment is directly linked to effective site selection. Conversely, selecting the wrong site can negatively impact recruitment or result in total failure to enrol patients. In the face of these long-standing challenges, evolving and expanding data sets offer opportunities for technology to leverage key evidence to improve site selection and patient enrolment for clinical trials. This is a promising prospect for an industry where a multitude of historical complexities and long lead times have resulted in a diminishing return on investment. How, though, do we determine the key levers of success in an endlessly deepening ocean of data?

AI in the Evolving Data Landscape

Healthcare is increasingly committed to Big Data, which presents opportunities for novel applications along with unique challenges. These robust datasets can generate insights to improve site selection and patient recruitment. However, sponsors and CROs now face the challenge of integrating and analysing the breadth of data available. ICON has one of the largest clinical trial performance and patient connection datasets. Additionally, ICON’s Symphony Health is the most inclusive longitudinal healthcare database in the industry, with billions of data points and trillions of data connections covering medical, hospital, and prescription claims across 317 million patients, 1.9 million unique active practitioners, and 12,000 payers. We also gather clinical trial intelligence data from strategic partners in the industry, enriching our data with 371,000 additional studies across 183,000 institutions. In combining these data, we deepen our understanding of the clinical trial ecosystem and the patient journey to generate insights into how trial sites will perform across enrolment, speed, and quality markers.
Developments in machine learning (ML), a branch of AI, enable new capabilities for understanding, prioritising, and applying this growing data. ML does more than sort and organise data; it learns from it over time and refines its own ability. ICON has developed a proprietary ML and visualisation tool to facilitate higher-quality, more accurate insights. ICON’s system, One Search, ingests this data, learns from the patterns within it, and delivers insights in real time to more accurately forecast the performance of potential trial sites. However, AI doesn’t do this alone. ICON’s clinical and medical experts set the parameters for One Search’s ML algorithm, teaching it to prioritise the types of experience and capabilities most significant for understanding trial performance. One Search uses these parameters to uncover relevant connections in the performance data and then visualises those findings so that ICON’s expert teams can review them and improve site selection decisions.

Data-Driven Site Selection – Right Sites the First Time

Clinical trials are increasingly globalised, often conducted across multiple sites in multiple countries. Selecting effective sites is a key factor in patient enrolment performance and the timely completion of studies. Trial delays that cause milestones to slide, whether tied to study start-ups, patient enrolment issues, or other disruptions, can cost between $600,000 and $8 million daily. Robust data are available to support more effective site selection, and there has been an increased demand for agile technologies with the capacity to aggregate and interrogate those data to create valuable data points which guide decision-making. With the right tools, data-driven decisions can select the right sites the first time.
With parameters set by ICON’s feasibility and site identification specialists, One Search combines and cleans a wide range of clinical trial data to forecast site performance for a given study. One Search can weigh, rank, and propose the sites with the highest performance potential in as little as 48 hours. Through this data-enabled decision-making, ICON sites have demonstrated a 50% reduction in non-enrolling sites, improved investigator engagement, and up to a 40% reduction in the average time for site identification, as represented in our review of 41 studies discussed below.
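To make the idea of weighing and ranking sites concrete, the following is a minimal sketch of a weighted scoring approach. It is not One Search's algorithm, which is proprietary and not described in this article; the metric names, the weights, and the sample sites are all hypothetical, chosen only to illustrate how expert-set parameters could drive a ranked shortlist.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A candidate site with hypothetical metrics normalised to the 0-1 range."""
    name: str
    enrolment_rate: float  # historical enrolment performance (assumed metric)
    startup_speed: float   # speed of site activation (assumed metric)
    data_quality: float    # quality markers (assumed metric)


# Hypothetical weights standing in for parameters set by feasibility experts.
WEIGHTS = {"enrolment_rate": 0.5, "startup_speed": 0.3, "data_quality": 0.2}


def score(site, weights=WEIGHTS):
    """Return the weighted sum of a site's metrics."""
    return sum(weight * getattr(site, metric) for metric, weight in weights.items())


def rank_sites(sites, top_n=None, weights=WEIGHTS):
    """Sort sites from highest to lowest score, optionally keeping the top N."""
    ranked = sorted(sites, key=lambda s: score(s, weights), reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Illustrative data only; these are not real sites or real results.
sites = [
    Site("Site A", enrolment_rate=0.9, startup_speed=0.4, data_quality=0.7),
    Site("Site B", enrolment_rate=0.6, startup_speed=0.9, data_quality=0.8),
    Site("Site C", enrolment_rate=0.3, startup_speed=0.5, data_quality=0.9),
]

ranked_names = [site.name for site in rank_sites(sites)]
print(ranked_names)
```

In this toy example, changing the weights changes the order of the shortlist, which is the sense in which human experts steer what the ranking prioritises.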

Intelligent Investigator Selection Improves Enrolment

Trials rely on principal investigators to actively participate in patient recruitment. Choosing an ineffective investigator can cause costly delays or failure to meet recruitment goals, which can jeopardise the feasibility of a trial, leading to termination. Fortunately, data can offer valuable insights into investigator efficacy for those with the tools and capacity to evaluate them.
One Search leverages past performance and patient data to evaluate the investigator’s connectedness to the trial’s target populations and predict their effectiveness in patient recruitment. One Search makes these predictions in real time and generates a ranked list of investigators that the sponsor can approach to host a site. The analysis can be refined to provide country-specific investigator lists and assist feasibility analysts in running scenarios based on site-level intelligence.
Data-driven personnel selections help reduce site identification time, improve the time to activate sites, enhance investigator engagement, and increase the likelihood of reaching patient recruitment goals.

One Search: A Case Study for AI Improving Clinical Trials

ICON examined study start-up data from 41 full-service, non-vaccine studies. We compared site proposal and selection metrics from studies that were managed by ICON and used the One Search platform against those from studies not managed by ICON. The results were dramatic: in the studies managed by ICON, we fulfilled 100% of site selection within an average of 30 weeks, and these sites were 34% more likely to enrol a subject than sites that sponsors identified independently.

Exceeding Acceleration and Completion Goals

One Search-supported studies were consistently more efficient across major milestones in study start-up timelines. In ‘time to first site proposed’, One Search studies were 56% faster. Similarly, in ‘time to first site selected’, our sites were 25% faster, finishing selection in 4.6 months compared to 6.2 months.
‘First patient randomised’ complied with the baseline 75% of the time, compared to just 56% for non-ICON studies, and ‘last patient randomised’ complied 69% of the time, as opposed to just 48% for non-ICON studies. ICON proposed 100% of the contracted sites in an average of 6-7 weeks and selected 100% of sites at around 21 weeks. Comparatively, the non-ICON studies delivered site proposals at around week 17 and, on average, failed to select all the necessary sites. Based on non-performance data, ICON factors in an additional 25% of required sites when making proposals to ensure studies meet goals. Not only does proposing and selecting sites earlier accelerate start-up, but it also increases the probability of completing the first and last patient randomised milestones by roughly 20 percentage points.
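The arithmetic behind these comparisons can be checked with a short sketch. The percentages and durations are those reported above; the helper names are illustrative, the required-site count of 40 is a hypothetical example, and reading the 25% buffer as "propose 1.25 times the required number, rounded up" is an assumption rather than ICON's stated rule.

```python
import math


def relative_reduction(baseline, observed):
    """Fractional reduction of observed versus baseline (0.25 means 25% faster)."""
    return (baseline - observed) / baseline


def percentage_point_gain(observed_pct, baseline_pct):
    """Difference between two percentages, expressed in percentage points."""
    return observed_pct - baseline_pct


def sites_to_propose(required_sites, buffer=0.25):
    """Required sites plus a buffer, rounded up (assumed interpretation)."""
    return math.ceil(required_sites * (1 + buffer))


# Time to first site selected: 4.6 months versus 6.2 months.
selection_speedup = relative_reduction(baseline=6.2, observed=4.6)

# Baseline compliance: One Search studies versus non-ICON studies.
first_patient_gain = percentage_point_gain(75, 56)
last_patient_gain = percentage_point_gain(69, 48)

# Hypothetical study that needs 40 sites.
proposed = sites_to_propose(40)

print(f"Selection speed-up: {selection_speedup:.1%}")
print(f"First patient randomised gain: {first_patient_gain} points")
print(f"Last patient randomised gain: {last_patient_gain} points")
print(f"Sites to propose for 40 required: {proposed}")
```

The 4.6 versus 6.2 month comparison works out to a reduction of about 26%, in line with the reported figure of 25%, and the two compliance gaps are 19 and 21 percentage points respectively.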

Data Delivers Quality

The results of this analysis confirm that leveraging data can improve start-up timelines, and they also show an increase in site quality as measured by patient recruitment. When ICON reviewed all One Search sites that had been active for at least six months in non-vaccine studies, we documented a 47% decrease in the percentage of non-enrolling sites versus sites that had been identified by clients alone. This improved performance was also present, to a lesser extent, at three months. In turn, improved levels of site participation are driving higher rates of compliance with baseline ‘last patient in’ targets, leading to greater predictability and reliability in clinical trial delivery.


Clinical trials are lengthy and costly, and have seen dwindling returns on investment as site selection, poor patient enrolment, and a host of other complexities threaten their success. CROs and sponsors can proactively improve this process by leveraging the abundance of healthcare, patient, site, and investigator data, and applying it to the site selection process. Advancements in ML enable this process and support data-driven decision-making as guided by human insight. ICON’s One Search site selection platform combines AI’s technical capabilities with our medical expertise and clinical development wherewithal to turn an ocean of data into actionable wisdom.

Travis Caudill has over 17 years’ experience in clinical trial strategy, global feasibility, and site identification, and is an expert in clinical trial modelling and simulation.
As Vice President, Feasibility and Clinical Informatics at ICON, he oversees a team of world-class feasibility experts and data scientists who process, interrogate, and analyse data to support effective data-driven decisions for pharma, biotech, and medical device sponsors.