How AI will replace animal testing
We have reached a tipping point. Societal and technological pressures are sparking a shift from animal testing to AI.
Animal testing is currently a pillar of biomedical research. Experiments in mice, rats, dogs, and monkeys are the gold standard for evaluating the safety and efficacy of new medicines before launching first-in-human trials.
While animal testing has driven tremendous advancements in human health and life-saving discoveries, it is fair to ask whether we can do better. The ethics of animal testing are hotly debated, but there are also questions about accuracy and economics. Why do animal tests cost so much and take so long to run? Are animals really the best models of human biology? Why do roughly 90% of drugs fail clinical trials even after passing animal tests?
These all point to a larger question: is there a better way?
Even though we have been wrestling with this question for ages, animal testing is deeply ingrained as a crucial part of most chemical development programs. This is starting to change. Societal and technological pressures are converging to shift the world from animal testing to AI.
This shift to AI will enable a completely new class of experimentation, one that combines the best aspects of basic early-stage experiments and premium late-stage animal tests while avoiding the problems of each. Trained on massive amounts of high-quality data, AI could give us the cost efficiency and speed of early-stage experiments, the human and physiological relevance of late-stage experiments, and new, deeper insights into mechanism of action that contextualize the "why" behind the results. Given high-quality training data, AI can do this without synthesizing a chemical, working in a lab, or harming an animal. AI will let scientists answer their most important questions sooner, with richer, more accurate, and more interpretable information than existing tests provide. So let's examine the pressures driving this change and use that understanding to chart a path forward.
Tailwinds for replacing animal testing
Public concern has created tremendous pressure to replace, reduce, and refine animal testing, and policymakers have responded.
In 2022, Congress passed the FDA Modernization Act 2.0, which changed an 85-year-old law requiring new drugs to undergo animal testing before human testing. While animal tests remain the status quo for all drugs, the FDA can now accept modern methods like computer models, engineered organs, and organs-on-chips in their place. This signals a slow turning of the tide throughout Congress, the FDA, and other regulatory agencies such as the EPA. “Fully phasing out animal testing is the goal, and we will always have that goal,” says Chris Frey, the EPA’s research and development administrator. “But I don’t want to get ahead of our scientists.” The same goal is shared by regulatory agencies outside the US. Canada banned animal testing for cosmetic products in 2023, and the EU is going a step further, seeking to accelerate the phase-out of all animal tests after 1.2 million citizens signed a petition to end animal testing.
Progress is also being driven by industry leaders. Many of the world’s largest companies recognize the limitations of animal tests and are leading efforts to develop alternative solutions.
Unilever does not test products on animals and has announced its support for a global ban on animal testing. It has doubled down on the next generation of science by investing in an institute to pioneer modern methods such as computer models and human cell-based experiments. L'Oréal also does not test its products on animals; instead, it invented reconstructed human skin models as an alternative. Roche, one of the world’s largest pharmaceutical companies, has committed to reducing animal testing, cutting its animal tests by almost 40% over the past 10 years, and is doubling down on alternatives through the recent opening of its Institute of Human Biology. In a similar vein, the Merck Group has committed to a three-step roadmap for phasing out animal testing. Clinical trials are underway for the first drug submitted to regulators primarily on the strength of tissue-chip data instead of animal data. The main preclinical efficacy studies for Vertex’s new blockbuster pain medication were likewise done primarily with genetically engineered cells and human neurons. Nearly every large pharmaceutical company, from Pfizer to Novo Nordisk to J&J to Eli Lilly, has committed to replacing, reducing, and refining animal testing. Finally, Charles River Laboratories, one of the largest providers of animal testing in the world, is investing half a billion dollars to find alternatives.
Technological tailwinds
Human Model Systems
The objective of most animal testing is to act as a substitute for humans in understanding how a novel chemical affects healthy or diseased human biology. It’s all about understanding how chemistry interacts with human biology.
Let’s consider three categories of experimental systems for modeling human biology:
Simple 2D cell systems - human cells are grown in the lab on a flat, 2D surface
Complex 3D cell systems - human cells are grown in the lab to replicate the 3D structure and microfluidics of human organs/tissues
Animal systems - whole animal organisms are used as proxies for whole human organisms
2D lab-based cell systems have seen a wave of innovation cementing their value in chemical discovery. Recently, a simple 2D cell system that tests whether a chemical causes skin allergies was shown to provide as much or more information than the gold-standard animal test that had been used for decades. This was one of the first “new approach methodologies” to show that non-animal systems can match or surpass animal performance, and it has inspired scientists to start work on hundreds of new lab-based methods with the potential to be as informative or more informative than animal tests. Companies like Eurofins have already built phenotypic biomaps using simple 2D cell systems to profile how a chemical affects over 60 different tissue types throughout the human body; these phenotypic panels have surfaced information that animal models missed. With additional research building on breakthroughs like induced pluripotent stem cells and cells grown directly from human donors, we should expect the power of these systems to keep increasing over the coming years.
More complex 3D cell systems seek to grow human cells in structures that mimic the physiology of organs in the human body. Emulate recently showed that its liver organs-on-chips can predict liver toxicity with 87% sensitivity and 100% specificity, performance it believes could generate over $3B a year through increased R&D productivity. A new company called Vivodyne is going all-in on human data at every stage of drug discovery by building an automated lab that grows human tissues; it recently raised over $48M to continue building out its complex 3D cell systems and scale usage within pharma. As we’ve seen, Big Pharma is also actively investing in this technology: Roche’s recently launched Institute of Human Biology aims to be the world’s largest research institution for developing 3D human model systems such as organoids. These are just a handful of the commercial investments in fundamental lab technologies that may prove more predictive of human biology than current animal testing.
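To make those two numbers concrete: sensitivity is the fraction of genuinely toxic drugs the chip flags, and specificity is the fraction of safe drugs it correctly clears. A small illustration with made-up counts (the actual confusion matrix behind Emulate's figures is not reproduced in this post):

```python
# Illustration of the two reported metrics using made-up counts; these are
# NOT the actual numbers from Emulate's study.
true_positives, false_negatives = 20, 3   # toxic drugs: flagged vs missed
true_negatives, false_positives = 10, 0   # safe drugs: cleared vs flagged

sensitivity = true_positives / (true_positives + false_negatives)  # ~0.87
specificity = true_negatives / (true_negatives + false_positives)  # 1.00
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```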
The problem is that human biology is really, really complex, so none of these individual systems captures the entire picture. Each system has its own pros and cons for evaluating any given chemical or biological mechanism. For example, a specific biological mechanism may be present in human cells but not in animal cells, the biology may differ in 2D compared to 3D, and human-engineered systems may have artifacts or biases that are not present in natural systems.
Every one of these model systems is also very expensive to run, relying on substantial lab infrastructure, complex reagent supply chains, highly trained technicians, and protocols that can take weeks to months to complete. This all adds up to millions of dollars and years of work. And once all is said and done, the results from these systems do not give scientists all the information they need; they do not answer the “why”. Which part of the chemical structure is to blame: the scaffold, or a particular functional group? What exactly is driving the result: the DNA or the mitochondria? What can be learned from the thousands of other chemicals already tested in the same system, and which of them performed similarly to the one being tested? Which alternative chemicals might have better results?
Artificial Intelligence
What is needed is a method that leverages information from all these different systems, using it to unlock better performance and new capabilities for digging into the “why”. This is exactly where breakthroughs in large-scale neural network research are taking us. In the past few years, there has been an explosion in multimodal AI research, which ties together different informational modalities such as text, images, audio, and video. AI models like ImageBind, Gemini, and CLIP connect modalities to build a more powerful model of the collective information. When plentiful multimodal data is available, training models to relate information across modalities yields models that perform better than those trained on a single modality.
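To make the idea concrete, here is a minimal sketch of CLIP-style contrastive training that aligns a chemical representation with a cellular readout of that chemical. Everything here, the encoder architectures, feature sizes, and random stand-in data, is an illustrative assumption, not a description of any production system:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps one modality (e.g., a chemical fingerprint or a vector of
    cell-assay readouts) into a shared embedding space."""
    def __init__(self, in_dim: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, embed_dim)
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-length embeddings

def contrastive_loss(chem_emb, bio_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched (chemical, readout) pairs are pulled
    together; mismatched pairs within the batch are pushed apart."""
    logits = chem_emb @ bio_emb.T / temperature
    targets = torch.arange(len(logits))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# Toy batch: 32 chemicals as 2048-bit fingerprints, each paired with a
# 512-dimensional cell-assay readout (random stand-ins for real data).
chem_encoder, bio_encoder = Encoder(2048), Encoder(512)
chems, readouts = torch.randn(32, 2048), torch.randn(32, 512)
loss = contrastive_loss(chem_encoder(chems), bio_encoder(readouts))
loss.backward()  # a real training loop would step an optimizer here
```

This is the same basic recipe behind CLIP and ImageBind: once the modalities share an embedding space, information from one system can inform predictions about another.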
The application of these approaches to biological experimentation is still in its earliest stages. Over the next few years, we will build AI that understands the relationship between chemicals and human biology. Achieving this will require unifying chemistry and biology in a multimodal AI model, and it will require massive datasets: tens of thousands to hundreds of thousands of chemicals, each paired with high-content, multi-readout data measuring how it affects simple 2D cell systems, complex 3D cell systems, animal systems, and more.
We will use this data to train multimodal models to understand every facet of high-content data coming from cellular and animal systems, and how these systems change in response to different chemicals. The models coming out of this multimodal training will perform better on downstream tasks, like predicting outcomes in humans, than models trained on a single modality alone. For example, an AI system for predicting human liver outcomes may outperform mice or monkeys if it is pre-trained on a massive dataset that includes data from the following (a rough sketch of this step follows the list):
2D primary human liver cell systems
3D human liver spheroid systems
histology images of livers from animal systems
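As a rough sketch of the fine-tuning step described above: embeddings from separately pre-trained encoders for each of these systems could be fused to predict a human liver outcome. The dimensions, data, and fusion design below are all assumptions for illustration:

```python
import torch
import torch.nn as nn

class LiverOutcomeHead(nn.Module):
    """Concatenates one embedding per modality and predicts the probability
    that a chemical causes human liver injury."""
    def __init__(self, embed_dim: int = 128, n_modalities: int = 3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim * n_modalities, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, embeddings):  # list of (batch, embed_dim) tensors
        fused = torch.cat(embeddings, dim=-1)
        return torch.sigmoid(self.classifier(fused)).squeeze(-1)

# Pre-trained embeddings for a batch of 16 chemicals from the three systems
# listed above (random stand-ins here); the head would then be trained on a
# small labeled set of known human outcomes.
emb_2d, emb_3d, emb_histology = (torch.randn(16, 128) for _ in range(3))
head = LiverOutcomeHead()
p_liver_injury = head([emb_2d, emb_3d, emb_histology])  # shape: (16,)
```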
The diversity of system modalities and the sheer scale of the training data will allow the AI to overcome the limitations of physical experiments. With a high-quality training dataset, scientists will no longer need to worry about the inherent run-to-run variation of physical experiments, manage the complex logistics of synthesizing chemicals and ordering reagents, or maintain large amounts of lab infrastructure. They will instantly have highly accurate results from the comfort of their desks, results that not only give them the answer but also help them understand the “why” like never before.
A path forward
Scientists running and developing the next generation of chemical products are the most important filters and guardrails on these coming changes. A core part of a scientist’s job is rigorously assessing and contextualizing the validity of different experimental designs; the experiments chosen for a program must be held to an extremely high bar, as anything less jeopardizes the quality of the program’s data. For AI to be useful, scientists must put it through a rigorous assessment process where it is proven accurate, relevant, and trustworthy. That makes scientists the most important piece of the puzzle for implementing AI: they lead these assessments and determine how AI will be used within their programs. Having talked to hundreds of scientists, I’ve found that most have an intuitive grasp of how they want to assess AI, since it resembles how they have assessed other new experimental paradigms. You can read more on this in my other post, but here is a quick summary:
Ensure the AI is useful in determining the efficacy or toxicity of a chemical
Understand all the details of the AI’s training data
See how AI responds to well-studied controls in the field
Perform head-to-head blinded tests on novel chemicals where the scientists already know what the outcome should be (a toy scoring example follows this list)
Use the AI side by side with existing experiments to understand its shortcomings, contextualize its benefits, and see how it best fits into their workflow
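As one toy illustration of the blinded head-to-head step, the scoring itself is simple once the outcomes are unblinded. The chemicals, calls, and accuracy metric below are all hypothetical:

```python
# Hypothetical blinded test: the AI and an existing assay each classify the
# same novel chemicals; both are scored against outcomes known to the
# evaluating scientists but hidden from the model builders.
known_outcomes = {"chem_A": 1, "chem_B": 0, "chem_C": 1, "chem_D": 0}  # 1 = toxic
ai_calls       = {"chem_A": 1, "chem_B": 0, "chem_C": 1, "chem_D": 1}
assay_calls    = {"chem_A": 1, "chem_B": 0, "chem_C": 0, "chem_D": 0}

def accuracy(calls, truth):
    return sum(calls[c] == truth[c] for c in truth) / len(truth)

print(f"AI accuracy:    {accuracy(ai_calls, known_outcomes):.2f}")     # 0.75
print(f"assay accuracy: {accuracy(assay_calls, known_outcomes):.2f}")  # 0.75
```

In practice the readouts are richer than binary calls, but the principle is the same: the AI earns trust by being scored on the same blinded chemicals as the experiments it aims to replace.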
So rather than starting with the hardest experiments first, we should aim to win trust by demonstrating value on simpler experiments. We should train this AI on massive high-quality datasets from multiple experimental systems, working with thousands of scientists to transparently assess its performance and value at each step in the journey. We should publish these assessments so we can iterate together as we train more and more powerful AIs to predict more and more complex experiments. By having AI first replace much simpler experiments, we aim to build a robust foundation of trust and standards on top of which we can venture into replacing more complex experiments.
In the coming years, we will be building datasets, training AI systems, and deepening shared expertise to eventually replace animal testing. It will be a long and hard journey, one that depends on the same scientists who are pioneering breakthroughs in human health to also lead the validation and use of AI. As we have seen with the innovative use of CRISPR, sequencing, antibodies, cell therapies, mRNA, and more, scientists can change the world when they are empowered with great technology. If you are a scientist interested in contributing to the very early stages of advancing and assessing AI in chemical development, I would love to chat with you and find a way to partner! Subscribe below to follow along as we publicly chronicle this long journey with specific details on the datasets, performance readouts, interpretability methods, scientist assessments, and more.
Brandon
Email: brandon@axiombio.ai
Thank you Elliot, Alex, Sunil, Barr, Kat, Alec, Daniil, Sanjeev, Nate, and Carl for your reviews. Thank you Justin for the graphics.