Coding of Cell Identity

Many diseases are caused by gene mutations. A mutant gene can change the function of a protein, which has a direct molecular consequence, and the first arena in which this consequence plays out is the cellular level. By manipulating cells’ genetic mechanisms in both basic science and medical research, we are recognising the advantages of cells over small-molecule drugs, ushering in regenerative approaches and a future in which ‘smart’ cells can be programmed to recognise and attack disease. In fact, this is already happening with so-called ‘CAR T therapies’, in which human T cells are reprogrammed to seek out and destroy cancers.

This contrasting finding in neuroligin-4 provides just one example of poor evolutionary genetic conservation and low translatability of animal research to human health and disease. A more classic manifestation is the high failure rate of drugs developed to treat Alzheimer’s disease – drugs initially tested in mice, despite mice lacking the ability to develop the disease. For impactful human medical research, what we need are human-relevant cell models that recapitulate human disease. The outdated and inappropriate use of animal models is now recognised in the 2022 FDA Modernization Act 2.0, which welcomes human-translatable in vitro models earlier during drug discovery and eliminates the federal mandate for animal testing. With in vitro models now approaching the necessary reproducibility and disease-context fidelity, the likelihood of successful human drug discovery is rising.

For clarifying our basic understanding of what makes cells tick, some useful workhorses are the classic HeLa, HEK293 and some immortalised T cell lines, which have the advantage of being easy to grow in vast quantities. However, when we move beyond cellular biology into medical research, a different playground is required. Human induced pluripotent stem cells (hiPSCs) provide the platform for today’s most human-relevant research. At this point, scientists start to care about cell identity: the cell function, context and any disease-relevant mutations. By modelling disease in cells with exactly the right identity, genetic screening or expression profiling to identify therapeutic targets acquires a greater level of precision.

Despite offering great promise, protocols for differentiating hiPSCs to a target cell fate still bring some bottlenecks to the bench. Directed differentiation protocols, in which stem cells are shepherded through a series of stages to become a new cell type of interest, are very labour-intensive. These protocols essentially try to mimic embryonic development in a dish via the carefully scheduled application of patterning factors, growth factors and small molecules, which activate or inhibit certain signalling pathways to give rise to authentic populations of the target cell type. Because these protocols are so complex, their user-to-user reproducibility is low: many factors can affect the outcome, such as the initial cell density, how the cells are dispersed and how they are counted. As a result, the final cell population can be highly variable and heterogeneous, with differences in efficiency and purity.

Luckily, new technologies such as ‘precision cellular reprogramming’ are emerging that address the limitations of current methods for generating hiPSC-derived cells. Precision cellular reprogramming combines an inducible gene expression system with a unique combination of transcription factors which, when expressed in hiPSCs, drives the consistent adoption of a new cell identity. Because it is powered by intrinsic signals from transcription factors, it overcomes the limitations that other cell-generation methods inherit from interference by external environmental factors, so that cells are consistently reprogrammed into a highly defined cell identity and population.

With this highly synchronous and consistent precision cellular reprogramming technology, the required scale-up to billions of human cells for disease modelling and high-throughput drug screening applications, or even to trillions of cells for therapeutic purposes, can proceed with greater speed, consistency and physiological relevance. These advantages facilitate the translation from laboratory to clinic, and from clinic to patient. This game-changing technology fills me with excitement for cell therapies; for example, in Parkinson’s disease, transplantation of dopaminergic neurons offers patients hope for the future.

If cell identity is a product of certain transcription factors being switched on, how do we establish which transcription factors code for which cell type? Our next challenge, then, is to identify these unique combinations, of which there are 8.8 × 10¹⁶, assuming that my calculations are correct (Figure 1)! No easy feat. Some of the transcription factors that researchers have stumbled upon have proven to be very powerful lineage-determination factors. But being able to predict the precise code that defines cell fate would open up an entirely new level of capability to the technology.
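
To give a feel for how a search space of this size can arise, here is a minimal sketch of the counting argument. The article does not state the inputs behind its figure, so the pool of about 2,000 candidate transcription factors and the code length of six factors used below are illustrative assumptions only, chosen because they give a number of the same order; the function name is likewise hypothetical.

```python
from math import comb

# Hypothetical inputs, NOT taken from the article: a pool of roughly
# 2,000 candidate transcription factors, and a "code" made of exactly
# six factors switched on together (order does not matter).
POOL_SIZE = 2000
CODE_SIZE = 6


def count_codes(pool_size: int, code_size: int) -> int:
    """Count unordered combinations of code_size factors drawn from pool_size."""
    return comb(pool_size, code_size)


n_codes = count_codes(POOL_SIZE, CODE_SIZE)
print(f"{n_codes:.1e} possible combinations")
```

Under these assumed inputs the count comes to roughly 8.8 × 10¹⁶. The result is very sensitive to both numbers: changing the assumed code length by a single factor shifts it by orders of magnitude, which is why exhaustive testing at the bench is not an option.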

Dr Emmanouil Metzakopian is currently the vice president of Research and Development at bit.bio, a synthetic biology company providing human cells for research, drug discovery and cell therapy. Dr Metzakopian studied biochemistry and biotechnology at the University of Thessaly in Greece, before obtaining his PhD from University College London in 2010 and completing his postdoctoral training at the Wellcome Sanger Institute. Dr Metzakopian joined bit.bio in 2021 as the Head of Innovation and was appointed as vice president of Research and Development in 2022.