Digital: Retrosynthesis
Retrosynthesis is the chemist’s toolkit for building complex molecules from simpler ones. Once rooted in intuition and experience, the process has been transformed by artificial intelligence and big data. This article explores how retrosynthesis works, why it matters and how it’s shaping the future of chemistry – from drug discovery to materials science.
Arthur Li at Chemical AI
Imagine being handed an intricate Lego sculpture and asked how to recreate it from scratch. You’d likely start by dismantling it, figuring out what parts it’s made of and how those parts might be assembled from even smaller components. This is essentially what chemists do with retrosynthesis.
But instead of Lego bricks, they’re working with atoms and molecules, often incredibly complex ones. Retrosynthesis is the process of mentally deconstructing a target molecule into simpler, more accessible precursors. It’s chemistry in reverse: breaking down the finished product to figure out how to make it.
The approach was formally introduced in the 1960s by E J Corey, who went on to receive the 1990 Nobel Prize in Chemistry for his development of the theory and methodology of organic synthesis. His retrosynthetic method gave chemists a new way to approach synthesis planning, one that allowed them to move from desired product back to feasible starting materials in a rational, strategic way.
A pill starts on paper
Retrosynthesis isn’t just a clever intellectual exercise; it’s a critical step in the development of everything from pharmaceuticals to new materials. Every medicine, every pigment, every polymer had to be synthesised before it could be tested or produced. In drug discovery, chemists often begin with a promising compound that shows biological activity. But identifying a potential drug is only the beginning. Making it in the lab – efficiently, reliably and at scale – is where retrosynthesis comes in.
Many modern drug candidates are highly complex, with features that make their synthesis difficult or costly. Designing a smart synthetic route can mean the difference between a viable medicine and a dead-end idea. Yet, for much of its history, retrosynthesis was a slow and painstaking process, heavily reliant on the chemist’s own mental library of reactions, and a lot of trial and error.
When retrosynthesis was an art
For decades, the process was more art than algorithm. Chemists would analyse a target molecule by eye, mentally disconnecting bonds and picturing possible reaction sequences. This required deep domain knowledge and a good deal of intuition. A successful plan might take into account dozens of factors – functional group reactivity, reaction conditions, possible side reactions, the need for protective groups and more. No two chemists would approach a problem in quite the same way. The task was both creative and analytical, a blend of inspiration and logic. The best retrosynthetic minds were revered for their ability to see elegant, efficient solutions where others saw only a tangle of atoms. But even the best minds have limits.
When computers entered the lab
As the molecules chemists were asked to make became more complicated, retrosynthetic planning became more challenging. By the 1980s and 1990s, researchers had begun exploring ways to digitise the retrosynthetic process.
Early computer-aided synthesis planning tools worked by encoding known reaction rules into software. These rule-based systems could suggest possible disconnections or reaction sequences, following a kind of decision tree logic. While helpful, they were constrained by their rigidity. They could only suggest transformations that were explicitly programmed in. They couldn’t adapt to unfamiliar chemistry or propose creative solutions. The chemical imagination of these tools was, in a word, limited.
In the past decade, something changed. The combination of machine learning (ML), cloud computing and an explosion in available chemical data has ushered in a new era for retrosynthesis, one where machines aren’t just following hand-written rules, but learning chemistry directly from reaction data.
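The rigidity of the early rule-based systems can be seen in a minimal Python sketch. This is a toy under stated assumptions, not a reconstruction of any historical program: molecules are reduced to functional-group labels, and the `RULES` table, the `PURCHASABLE` set and the `disconnect` function are hypothetical names invented for illustration. Real tools operate on full molecular structures.

```python
# Toy rule-based retrosynthesis: every transformation must be hand-coded.
# RULES, PURCHASABLE and disconnect() are illustrative, not a real library's API.

# Each rule maps a functional group in the target to the precursor
# types that a known reaction would combine to form it.
RULES = {
    "amide": ["carboxylic acid", "amine"],
    "ester": ["carboxylic acid", "alcohol"],
    "carboxylic acid": ["primary alcohol"],  # via oxidation
}

# Assumed stock of purchasable starting materials.
PURCHASABLE = {"amine", "alcohol", "primary alcohol"}


def disconnect(target, depth=0, max_depth=5):
    """Recursively expand a target into a tree of precursors."""
    if target in PURCHASABLE:
        return {"molecule": target, "status": "purchasable", "precursors": []}
    if target not in RULES or depth >= max_depth:
        # No programmed rule applies, so the system has nothing to suggest.
        return {"molecule": target, "status": "no rule", "precursors": []}
    children = [disconnect(p, depth + 1, max_depth) for p in RULES[target]]
    return {"molecule": target, "status": "disconnected", "precursors": children}


tree = disconnect("amide")           # expands down to purchasable materials
unknown = disconnect("sulfonamide")  # not in RULES, so no suggestion at all
```

The second call shows the limitation: a target outside the programmed rule set yields no route, however closely it resembles chemistry the system already knows.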
Modern AI-powered synthesis tools are trained on millions of real-world reactions, gleaned from scientific literature, patents and lab notebooks. These platforms can predict which chemical bonds to break, identify which reagents to use and plan out multistep syntheses with impressive fluency. Rather than simply mimicking what’s already been done, these models can extrapolate from patterns in the data to suggest new, untested but chemically plausible solutions. They can sometimes see options that a human chemist might overlook.
What makes artificial intelligence (AI) retrosynthesis powerful?
One of the most transformative shifts is the ability of these systems to offer diverse synthetic routes. Instead of proposing a single ‘best’ answer, AI tools often generate multiple options. Some closely follow known literature, offering safe and well-established strategies. Others are more speculative, proposing creative shortcuts or novel transformations that could reduce the number of steps or lower the cost of raw materials.
Importantly, these tools don’t operate in a vacuum. They evaluate each proposed route based on practicality: does the chemistry work for this type of molecule? Are the starting materials commercially available? Is the reaction scalable? Will it produce the right stereochemistry?
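The practicality questions above can be pictured as a simple filter over candidate routes. The following Python sketch is illustrative only: the `Route` fields, the `CATALOGUE` set and the `practicality_issues` function are assumptions made for this example, and the scalability and stereochemistry flags are supplied by hand here, whereas a real platform would have to predict them.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """A candidate synthetic route (hypothetical, simplified record)."""
    name: str
    starting_materials: list
    scalable: bool
    stereocontrolled: bool


# Hypothetical catalogue of commercially available compounds.
CATALOGUE = {"aniline", "acetic anhydride", "L-proline"}


def practicality_issues(route, catalogue=CATALOGUE):
    """Return human-readable reasons a route may be impractical."""
    issues = []
    missing = [m for m in route.starting_materials if m not in catalogue]
    if missing:
        issues.append("not purchasable: " + ", ".join(missing))
    if not route.scalable:
        issues.append("not expected to scale")
    if not route.stereocontrolled:
        issues.append("stereochemistry not controlled")
    return issues


routes = [
    Route("A", ["aniline", "acetic anhydride"], scalable=True, stereocontrolled=True),
    Route("B", ["aniline", "exotic reagent"], scalable=False, stereocontrolled=True),
]

# Keep only the routes with no flagged issues.
practical = [r.name for r in routes if not practicality_issues(r)]
```

Returning the reasons rather than a bare pass or fail keeps the chemist in the loop: a route flagged only for an unavailable starting material might still be worth pursuing if that material can be made in-house.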
In the case of pharmaceuticals, where many drugs are chiral – meaning their molecules exist in left- and right-handed versions – this last point is crucial. AI systems can help identify suitable chiral building blocks or suggest stereoselective reactions to ensure the correct form is synthesised.
The most effective platforms also remain interactive. They allow chemists to guide the retrosynthetic process by accepting or rejecting proposed disconnections, modifying plans on the fly or integrating their own experimental knowledge. This creates a powerful collaboration between human and machine, each complementing the other’s strengths.
Not just a shortcut – a strategic tool
Despite their promise, AI retrosynthesis tools aren’t magic wands. They still face significant challenges. For one, no chemical database is complete. Some rare or newly discovered reactions simply aren’t represented in the training data. Models also tend to learn from the most common reactions, which means they can struggle with exotic or niche chemistry.
There’s also the problem of too much possibility. As molecules get more complex, the number of potential synthetic routes grows exponentially. Even the fastest algorithms need smart ways to navigate this vast search space efficiently and meaningfully.
And then there’s evaluation. Deciding which route is ‘best’ isn’t straightforward. Some chemists prioritise shorter syntheses; others focus on cost, yield, environmental impact or ease of purification. A good AI system needs to offer not just answers, but the ability to filter and compare options based on the priorities of the chemist using it.
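One way to picture ranking by the chemist’s own priorities is a weighted score. The Python sketch below is a simplified illustration, not how any particular platform works: the route data are invented, and the `rank` function and its `weights` argument are hypothetical names. Each criterion is rescaled to the range 0 to 1 so that step count, cost and yield can be combined.

```python
# Hypothetical route summaries; all numbers are made up for illustration.
routes = [
    {"name": "short", "steps": 3, "cost": 900.0, "yield": 0.30},
    {"name": "cheap", "steps": 6, "cost": 200.0, "yield": 0.25},
    {"name": "high-yield", "steps": 5, "cost": 500.0, "yield": 0.60},
]


def rank(routes, weights):
    """Order routes by a weighted sum of normalised criteria (lower is better)."""
    def normalise(key):
        values = [r[key] for r in routes]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid dividing by zero when all values match
        return {r["name"]: (r[key] - low) / span for r in routes}

    steps, cost, yld = normalise("steps"), normalise("cost"), normalise("yield")

    def score(route):
        n = route["name"]
        # Yield is a benefit rather than a penalty, so it enters as 1 - yield.
        return (weights["steps"] * steps[n]
                + weights["cost"] * cost[n]
                + weights["yield"] * (1 - yld[n]))

    return [r["name"] for r in sorted(routes, key=score)]


# The same three routes, ordered under two different sets of priorities.
by_length = rank(routes, {"steps": 1.0, "cost": 0.0, "yield": 0.0})
by_cost = rank(routes, {"steps": 0.0, "cost": 1.0, "yield": 0.0})
```

With these invented numbers, prioritising step count puts the three-step route first, while prioritising cost puts it last: the same candidates, reordered by whoever is asking.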
Chemistry’s future, faster
The dream of AI-powered retrosynthesis is not about replacing chemists; it’s about amplifying their capacity. By offloading the tedious work of generating and filtering ideas, these tools allow scientists to focus on strategy, creativity and experimental execution. In the future, we may see systems that can integrate real-time feedback from lab automation platforms, adapting their suggestions based on what works – or doesn’t – on the bench. We may see retrosynthesis tools tailored to specific labs, customised to suggest reactions based on available equipment, reagents and in-house expertise.
Some researchers even envision a day when retrosynthesis becomes nearly instantaneous: a chemist draws a molecule, hits ‘Go’ and the system returns a ranked list of complete, optimised synthetic routes. We’re not there yet, but we’re closer than ever.
The molecules yet to be made
Retrosynthesis has always been about possibility: looking at a finished structure and imagining the steps that could bring it into being. What began as a mental exercise practised by a handful of master chemists has evolved into a dynamic field at the intersection of chemistry, computer science and data engineering.
As the tools grow smarter and more integrated, they promise to accelerate discovery in fields ranging from drug development to green chemistry to advanced materials. In doing so, they don’t just help chemists make molecules; they help them make progress.
Arthur Li is a scientific business leader with over ten years of experience building cutting-edge companies at the intersection of AI and chemistry. He currently leads global growth at Chemical AI, a rapidly growing company applying ML to chemical reaction informatics. Prior to this role, he was an early member of Cyclica, a Canadian start-up that uses AI and ML in drug design, and led partnerships for BlueDot, a pioneer that uses AI to survey biological and chemical risks. He holds an MBA and a Master of Science in Pharmaceutical Sciences from the University of Toronto, Canada.