Digital: AI and ML

DEL, AI, and ML: The Digital Future of Research Tools

IPT talks to Noor Shaker at X-Chem about the adoption of AI and ML in the life sciences and how pre-trained algorithms can aid research

IPT: What assistance can AI-based DNA-encoded library (DEL) technology provide in life sciences research?
Noor Shaker: DEL technology offers one of the most powerful approaches to hit discovery available today. It provides access to enormous libraries that can be screened for hit identification with unparalleled speed and efficiency. The fact that these libraries are so huge means that each screen produces a wealth of high-quality labelled data.
Such data are ideal for machine learning (ML), which can be used to discover hidden patterns in chemical libraries, differentiating between compounds that can potentially hit a certain target and those that won’t. By using a combination of DEL data and AI, we have shown that we can increase the efficiency of hit identification – finding more hits from a more diverse chemical space – and also speed up the process by finding hits that are commercially available. These can therefore be tested directly, unlike hits that require weeks of synthesis.

What are the biggest changes that the adoption of AI and ML has caused in areas such as protein-protein interactions (PPI) and G protein-coupled receptors (GPCR)?
The use of a special class of AI methods, called generative models, has opened new possibilities for the design and identification of novel chemistries. Generative models can design novel chemotypes for certain therapeutic targets that might not be intuitive for humans. By training these models on known chemistry, we gain the ability to explore, in an unbiased way, larger, more novel, and diverse chemical spaces that form more interesting starting points for virtual screening. This has proven useful for the discovery of hits for challenging targets such as PPIs and GPCRs.

What is the biggest challenge in integrating AI and ML into life sciences research?
One of the biggest challenges that has hindered adoption is the lack of access to diverse, high-quality data. Drug discovery is a long and expensive process that requires finding the right starting chemical compounds and optimising them for human efficacy. This means evaluating hundreds of different properties that are necessary for safe and efficacious drugs. Data for many of these endpoints do not exist or are very expensive to acquire, so training ML for drug discovery is a particularly challenging task.

How can the prediction of chemical properties be improved by using pre-trained ML models?
When data sets are unavailable, very small, or not diverse, we have what we call a cold-start problem. One way to approach this challenge is to start from an ML model pre-trained on the data available. This model can then either be gradually improved as new batches of data become available, or it can be fine-tuned on a company’s specific or proprietary data catalogues.
In principle, the pre-trained model forms the starting point for model refinement and improvement. Other classes of ML methods, such as active learning, can be combined with this approach to ensure that useful information is collected with each new data batch.
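As a rough illustration of the workflow described above, the following Python sketch pre-trains a small classifier on a larger general data set, fine-tunes it on a small "proprietary" batch, and then uses uncertainty sampling (one simple form of active learning) to choose which candidates to label next. It is a toy under stated assumptions, not X-Chem's method: the data are synthetic, the two numeric features stand in for real chemical descriptors, the model is a plain logistic regression, and every name in it (`make_data`, `train`, `most_uncertain`, and so on) is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    # weights[0] is the bias; the rest pair up with the features in x.
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def train(weights, data, epochs=100, lr=0.1):
    # Stochastic gradient descent on log loss, starting from `weights`.
    # Starting from pre-trained weights instead of zeros is the fine-tuning step.
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w[0] -= lr * err
            for i, xi in enumerate(x):
                w[i + 1] -= lr * err * xi
    return w


def log_loss(weights, data):
    # Mean negative log-likelihood, with clipping to avoid log(0).
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(predict(weights, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def most_uncertain(weights, pool, k):
    # Uncertainty sampling: pick the k unlabelled candidates whose
    # predicted hit probability is closest to 0.5.
    return sorted(pool, key=lambda x: abs(predict(weights, x) - 0.5))[:k]


def make_data(rng, n, threshold):
    # Synthetic stand-in for labelled screening data (an assumption):
    # a "hit" is any point whose two features sum to more than `threshold`.
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1 if x[0] + x[1] > threshold else 0))
    return data


rng = random.Random(0)
public_data = make_data(rng, 300, threshold=0.0)    # large, general data set
private_train = make_data(rng, 30, threshold=0.5)   # small, shifted data set
private_test = make_data(rng, 200, threshold=0.5)   # held out for evaluation

pretrained = train([0.0, 0.0, 0.0], public_data)    # pre-training
tuned = train(pretrained, private_train)            # fine-tuning

pretrained_loss = log_loss(pretrained, private_test)
tuned_loss = log_loss(tuned, private_test)
print(f"held-out loss before fine-tuning: {pretrained_loss:.3f}")
print(f"held-out loss after fine-tuning:  {tuned_loss:.3f}")

# Active-learning step: choose the next five candidates to send for labelling.
pool = [x for x, _ in make_data(rng, 100, threshold=0.5)]
next_batch = most_uncertain(tuned, pool, k=5)
```

In a real setting the candidates in `next_batch` would be tested experimentally, their labels added to the training batch, and the model updated again, so that each round of data collection targets the compounds the model is least sure about.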

Where do you see AI and ML taking life science research in the future?
The past few years have been transformative for AI in the life sciences, and this trend will continue. AI will become an integrated part of every step of the drug discovery process, from target identification, to selecting the right patients for trials, to marketing the drugs to the most responsive patient population. We’ve seen bits and pieces of this already happening, and we will witness more advances and better integration of these components within the overall healthcare system.

Noor Shaker, Senior Vice President and General Manager at X-Chem, is a biotech entrepreneur and recognised healthcare innovator. Prior to Glamorous AI (acquired by X-Chem in October 2021), she was an assistant professor at Aalborg University, Denmark.
Noor is passionate about data and AI and is on a mission to cure disease by pushing the boundaries of what is possible with AI. She has a record of achievements in AI, including numerous published papers on the field and its application to drug discovery, and she holds a number of AI patents.
She sits on AI and diversity advisory boards for prestigious organisations and universities and is a recognised healthcare leader – she was one of the 2018 winners of Innovators Under 35 Europe from MIT Technology Review and was named one of the BBC’s 100 Women of 2019.