Future of Drug Discovery: Generative AI in Pharma and Medicine

Explore the future of generative AI in pharma and drug discovery with Dr. Alex Zhavoronkov, CEO of Insilico Medicine, on CXOTalk. Learn about generative AI, molecular drug design, and the impact of AI on pharmaceutical R&D.


Mar 24, 2023

In this episode of CXOTalk, we have the pleasure of speaking with Dr. Alex Zhavoronkov, the founder and CEO of Insilico Medicine.

Insilico Medicine uses artificial intelligence to enhance drug discovery. By combining generative adversarial networks (GANs), reinforcement learning, and other AI techniques, Insilico streamlines the design, synthesis, and testing of new molecules. Their approach has garnered attention, raising $400 million in funding so far.

Dr. Zhavoronkov shares insights into Insilico's goals, such as the accelerated development and testing of small molecules targeting specific diseases. We also explore how their software impacts pharmaceutical R&D by enabling researchers to investigate new targets, design molecules with certain properties, and potentially predict the outcomes of clinical trials.

Join us as we discuss the evolving landscape of pharmaceuticals and how generative AI can help discover new treatments for chronic diseases and promote a healthier future.


Alex Zhavoronkov, Ph.D. is the founder and CEO of Insilico Medicine, a leader in next-generation artificial intelligence technologies for drug discovery and biomarker development. He is also the founder of Deep Longevity, Inc, a spin-off of Insilico Medicine developing a broad range of artificial intelligence-based biomarkers of aging and longevity servicing healthcare providers and life insurance industry. In 2020, Deep Longevity was acquired by Endurance Longevity (HK: 0575). Beginning in 2015, he invented critical technologies in the field of generative adversarial networks (GANs) and reinforcement learning (RL) for the generation of novel molecular structures with the desired properties and generation of synthetic biological and patient data. He also pioneered applications of deep learning technologies for the prediction of human biological age using multiple data types, and transferred learning from aging into disease, target identification, and signaling pathway modeling. Under his leadership, Insilico has raised over $400 million in multiple rounds from expert investors, opened R&D centers in six countries or regions, and partnered with multiple pharmaceutical, biotechnology, and academic institutions, nominated 11 preclinical candidates, and has generated positive topline Phase 1 data in human clinical trials with an AI-discovered novel target and AI-designed novel molecule for idiopathic pulmonary fibrosis that received Orphan Drug Designation from the FDA and is nearing Phase 2 clinical trials. Insilico also recently announced that its generative AI-designed drug for COVID-19 and related variants was approved for clinical trials.

Prior to founding Insilico, he worked in senior roles at ATI Technologies (a GPU company acquired by AMD in 2006), NeuroG Neuroinformatics, and the Biogerontology Research Foundation. Since 2012, he has published over 150 peer-reviewed research papers and two books, including "The Ageless Generation: How Biomedical Advances Will Transform the Global Economy" (Macmillan, 2013). He serves on the advisory or editorial boards of Trends in Molecular Medicine, Aging Research Reviews, Aging, and Frontiers in Genetics, and he founded and co-chairs the Annual Aging Research and Drug Discovery conference, the world's largest event on aging in the pharmaceutical industry. He is an adjunct professor of artificial intelligence at the Buck Institute for Research on Aging.


Michael Krigsman: We are exploring generative AI in pharma and drug discovery with someone who is a true pioneer in this field who has been working with generative AI for years. Alex Zhavoronkov is the founder and the CEO of Insilico Medicine.

Early generative AI experiments & adversarial networks

What I find so interesting is generative AI is the latest rage and hype, but you've been working in this field for years and years, so tell us about your work and tell us about Insilico Medicine.

Alex Zhavoronkov: Our first experiments in generative AI were in 2015-2016. We started in generative chemistry, originally, and utilized the technology called generative adversarial networks.

That is actually at the core of the many generative AI platforms in use today, and that technology, to put it simply, is a combination of two deep neural networks competing with each other. That is why it's called adversarial. One is generating meaningful data in response to a query and another one is evaluating this response to see if it's true or false (or how close it is to the ground truth).
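To make the "two networks competing" idea concrete, here is a minimal sketch of an adversarial training loop on plain numbers rather than molecules. It is an illustration only, not Insilico's code: the one-parameter-pair `Generator` and `Discriminator`, the target distribution (numbers near 4.0), and the learning rate are all assumptions chosen to keep the example small.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

class Discriminator:
    """Scores a number: near 1 means 'looks real', near 0 means 'looks generated'."""
    def __init__(self):
        self.w, self.c = 0.1, 0.0
    def score(self, x):
        return sigmoid(self.w * x + self.c)

class Generator:
    """Turns random noise z into a candidate sample."""
    def __init__(self):
        self.a, self.b = 1.0, 0.0
    def sample(self, z):
        return self.a * z + self.b

def discriminator_loss(d, real, fake):
    # Binary cross-entropy: real samples should score 1, generated ones 0.
    eps = 1e-12
    loss = -sum(math.log(d.score(x) + eps) for x in real)
    loss -= sum(math.log(1.0 - d.score(x) + eps) for x in fake)
    return loss / (len(real) + len(fake))

def discriminator_step(d, real, fake, lr):
    # One gradient-descent step on the cross-entropy loss above.
    n = len(real) + len(fake)
    grad_w = grad_c = 0.0
    for x in real:                      # gradient of -log(sigmoid(s)) w.r.t. s
        g = -(1.0 - d.score(x))
        grad_w += g * x
        grad_c += g
    for x in fake:                      # gradient of -log(1 - sigmoid(s)) w.r.t. s
        g = d.score(x)
        grad_w += g * x
        grad_c += g
    d.w -= lr * grad_w / n
    d.c -= lr * grad_c / n

def generator_step(g, d, noise, lr):
    # The generator is rewarded when the discriminator scores its output as real:
    # it minimizes -log D(G(z)), with the gradient passed back through D.
    grad_a = grad_b = 0.0
    for z in noise:
        ds = -(1.0 - d.score(g.sample(z)))
        grad_a += ds * d.w * z
        grad_b += ds * d.w
    g.a -= lr * grad_a / len(noise)
    g.b -= lr * grad_b / len(noise)

def train(steps=200, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    g, d = Generator(), Discriminator()
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # assumed "ground truth"
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [g.sample(z) for z in noise]
        discriminator_step(d, real, fake, lr)   # the critic learns to tell them apart
        generator_step(g, d, noise, lr)         # the generator learns to fool the critic
    return g, d
```

Real generative models replace these two-parameter functions with deep networks, and adversarial training is known to be unstable, so this sketch makes no claim about convergence.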

Since then, this technology, generative AI, has advanced quite considerably. In 2017, Google published a wonderful concept where they introduce an attention layer into generative networks. It's called Transformer Architecture, and that changed everything. Those attention layers allow deep neural networks to generalize and later generate meaningful output with the desired properties.
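For readers unfamiliar with the attention layer mentioned above, the core operation in the 2017 Transformer paper is scaled dot-product attention: each query is compared with every key, the similarities are turned into weights with a softmax, and the output is the weighted average of the values. The following is a minimal pure-Python sketch of that single operation; the function names and the toy vectors in the usage note are illustrative assumptions, and a real Transformer adds learned projections, multiple heads, and much more.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights): one output vector and one weight row per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Output is the weighted average of the value vectors.
        outputs.append([
            sum(w * v[i] for w, v in zip(row, values))
            for i in range(len(values[0]))
        ])
    return outputs, weights
```

For example, with `keys = [[1, 0], [0, 1]]`, `values = [[10, 0], [0, 10]]`, and a single query `[5, 0]`, nearly all of the weight goes to the first key, so the output lies close to the first value vector.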

We've been in this field pretty much since the very beginning. Ian Goodfellow pioneered generative adversarial networks, so we did not invent them. We started building on top of them in drug discovery in 2016, and my first paper was published in 2016 demonstrating the applications of adversarial autoencoders to small molecule drug design using molecular fingerprint representations of molecular structures.
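A molecular fingerprint, as mentioned above, is a fixed-length bit vector that summarizes a structure so that a neural network can consume it. The sketch below is a deliberately crude toy, not the fingerprint used in the 2016 paper and not a chemically meaningful one: it hashes short substrings of a SMILES string into bits, where real fingerprints encode chemical substructures. The names `toy_fingerprint` and `tanimoto`, the 64-bit length, and the substring scheme are all assumptions for illustration.

```python
import zlib

def toy_fingerprint(smiles, n_bits=64, max_len=3):
    """Hash every substring of length 1..max_len into a fixed-length bit vector."""
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            # crc32 is stable across runs, unlike Python's built-in hash() for strings.
            bits[zlib.crc32(fragment.encode("utf-8")) % n_bits] = 1
    return bits

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity: shared set bits divided by bits set in either vector."""
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 0.0

ethanol = toy_fingerprint("CCO")        # SMILES for ethanol
benzene = toy_fingerprint("c1ccccc1")   # SMILES for benzene
similarity = tanimoto(ethanol, benzene)
```

The point of the representation is that every molecule, whatever its size, becomes a vector of the same length, which is what lets an autoencoder be trained on it.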

Think of it as painting or imagining new molecules with the desired properties just like you do it with images today. You say, okay, DALL-E or Midjourney, draw me an image with those properties.

Generative AI in molecular drug design

We thought that it would be cool to do the same with molecules, and our first paper was (believe it or not) 2016, so submitted in June, published in December. It actually made some shockwaves in the AI community, originally, because it was a really cool application.

In October of the same year, Alán Aspuru-Guzik from Harvard also put his article with a very similar idea but with a variational autoencoder on arXiv, so on a preprint server, and we were at that time already in the peer review process in a peer-reviewed journal. Yeah, he actually has more citations for his paper because, later, he published in a peer-reviewed journal in 2018. But it went on arXiv when we were still in review.

But then he joined us as an adviser anyway, so who cares about who did it first. [Laughter] But we were probably the first who used generative AI to design a molecule.

Then it took us several years, so we, of course, published multiple theoretical works around the generation of small molecules with exact properties. Using different approaches, we came up with reinforced adversarial threshold neural computer, and many, many other techniques where we also started incorporating reinforcement learning, so rewarding and punishing some of those models, and published a lot of theory.

In 2017, for the first time, so actually just a year before we talked.

Insilico Medicine's funding journey & challenges

Michael Krigsman: You've raised $400 million. When you say, "Invest the money over this time," you've raised a significant amount of money.

Alex Zhavoronkov: The road to this money was not easy. Many companies in our field, especially those that are designed by investors to access financial markets because it's a trend. Right?

They like to invest a lot of money right away. Give the founder some stock (and sometimes it's not even the founder). It's an engineered company with very polished messaging, but usually, the technology there is just starting to be developed.

In our case, we grew organically, and we were founded in 2014. But only in 2019, we raised really serious money. We raised the first $37 million (like big money) in 2019, in September.

Michael Krigsman: Raising this money, what is the objective? What are you trying to accomplish?

Alex Zhavoronkov: In 2017, we were poor. [Laughter] We actually didn't have a lot of money to invest in synthesis and just trying to explain why generative AI in drug discovery is different from generative AI anywhere else.

In drug discovery, you really need to synthesize and test. The probability that you are going to get something that works is very low because the level of precision has to be much, much higher than when you paint a painting that you like.

The molecule has to bind to a very specific, very tiny site on a protein of interest. If it doesn't, you fail.

Before that, you need to actually make this molecule, and you make this protein. You need to make the assay, an experimental assay.

In this case, you spend maybe a few months building the generative model, but then you can spend a year validating just a few molecules. That's something for your listeners to understand.

Also, the process is very expensive. It's not just the training of the model. You need to synthesize the molecule and test it.

Think of it as launching a spacecraft. You have to design it. You generated a spacecraft. Then you need to launch it. That's the synthesis part for us.

We were cash poor at that time, so I remember I sold my apartment at that time and invested everything in Insilico.

The molecules that we synthesized at WuXi AppTec (a very famous contract research organization). That's another thing I can probably talk about today.

They synthesized, tested, and it worked. In 2018, we published the first paper on that. That led to the first kind of round of investment that we got from credible investors, so we got our first $6 million after synthesizing and testing.

Advancements: AI techniques & reinforcement learning

The first AI-generated molecules after the papers were published. In 2018, in August, we published the first experimental validation of AI-generated molecules.

Then we, of course, advanced even more. In 2019, we published a really big paper, in "Nature Biotechnology," showing that WuXi AppTec actually decided to challenge us.

They said, "Okay, well, can we give you any target, and how quickly can you design small molecules and test them?"

We showed that, in 46 days, we can very rapidly come up with small molecules for the target that they give. It wasn't a hard target. Synthesize them and test them in many assays, so metabolic stability, microsomal stability, activity, so enzymatic assay, and then all the way into mice in 46 days. That was pretty cool.

We, of course, did this experiment actually in 2018, so people knew about it and gave us the $37 million. Nowadays, if you were to do something like that somewhere else (in some other industry) you can get probably $0.5 billion. [Laughter]

But for us, it wasn't easy, and our first step after we got the first money, round B, so $37 million in 2019, that was actually after we talked with you. Before that, we didn't have much money.

We developed the software that other people can use. But not only for chemistry; also, for biology. We allow people to use generative AI to discover new targets. Then we also allow them to decipher the mechanism of disease, and then generate the small molecules with the desired properties.

Then also using generative AI, using transformer neural networks, we predict the outcome of clinical trials, Phase II to Phase III, so basically trying to replicate the entire process of pharmaceutical R&D and turn it into a generative exercise. We let people use the software.

Then in 2019-2020, we actually thought, "Okay, well, how do we make people believe that it really works? How do we make the ChatGPT moment in pharma?"

We decided to actually synthesize and test our own molecules for a novel target, so go all the way ourselves. That required a very substantial capital raise, and I can show you a slide so your readers and listeners will understand what I'm talking about.

Let me do that now, and just use a visual aid. But I'll talk through this just in case.

Unique challenges in AI-based drug discovery

Here is the slide which depicts the pharmaceutical drug discovery and development process. It comes from a very famous research paper in 2010 by Steven Paul in "Nature Reviews Drug Discovery," so I highly recommend reading this paper. It's called "How to improve R&D productivity: the pharmaceutical industry's grand challenge."

It shows you the many steps of drug discovery and development starting from disease hypothesis and target discovery, so that's where you are trying to understand why the disease happens and what is driving it. What are the critical components, protein targets that are driving the disease?

The probability of success of this exercise or this test, it fails most of the time, 1% to 5% success rate, 95% to 99% failure rate. Most of the time it's done in academia. It takes decades and costs billions of dollars, usually funded by the government.

Very often, people rely on scientific serendipity to find a good target. We still don't understand why Alzheimer's happens. We still don't understand many cancers, ALS, multiple sclerosis, so many of those age-associated processes we actually don't understand. That's actually one of the reasons why I'm focusing so much on aging because it's a big challenge.

But once you identify a target, once you validate it also in animal models, you start chemistry exercises. Here you've got target to hit, hit to lead, lead optimization, pre-clinical exercises.

Here you can see, it takes you 5.5 years from the time you found the target to the time you start human clinical trials. Again, the failure rate there is pretty significant, so only less than 50% of those are going to get there, and it will cost you $0.5 billion by this time.

Michael Krigsman: Please subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can send you our newsletter.

Traditional drug development vs. Insilico Medicine's approach

How is that distinct from what you're doing then?

Alex Zhavoronkov: It's not distinct. We still have to go through all of those steps. We have to go through every single step. As a matter of fact, we have to generate even more data than usually the pharmaceutical company would do internally because we need to learn.

Also, we try to come up with redundant data sets. When we do an experiment, we try to do this experiment twice in two different labs just to be sure and also to generate more data.

If the experimental results differ from one lab to another, we need to understand why and also teach our AI why it happened. It is extremely important to have this redundancy because many of the pharmaceutical industry failures happen because somebody did the experiment wrong and just didn't report the data.

We want to set the new standard in quality of the delivery of those new molecules for new targets so that when the pharmaceutical company looks at it, they are like, "Okay, it's not only our level. It's above our level (in a traditional approach)."

AI in drug development can increase success probability

You can see that here it costs you $0.5 billion, so AI can make this process significantly less expensive. Also, the most important part here is increasing the probability of success.

Your major objective function in this entire exercise (when you are looking at this slide) is to pass Phase II clinical trials in humans. Phase II is when you test efficacy, when you see if the drug is not only safe but also effective.

This task usually fails 66% of the time. You can see from the slide, actually sometimes it's an even higher probability of failure.
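To see how the stage-by-stage failure rates quoted in this conversation compound, here is a small worked calculation. It is a back-of-the-envelope sketch under loud assumptions: it uses only the round numbers mentioned above (a 1% to 5% success rate for target discovery, under 50% for getting from target to the clinic, and a 66% failure rate in Phase II), takes the optimistic end of each range, treats the stages as independent, and leaves out Phase I and Phase III entirely. The stage names and function name are made up for the example.

```python
def cumulative_success(stage_probabilities):
    """Multiply per-stage success probabilities, assuming the stages are independent."""
    p = 1.0
    for prob in stage_probabilities.values():
        p *= prob
    return p

# Optimistic ends of the ranges quoted in the conversation (assumed independent).
stages_optimistic = {
    "target discovery": 0.05,   # "1% to 5% success rate"
    "target to clinic": 0.50,   # "less than 50% of those are going to get there"
    "phase II": 0.34,           # "fails 66% of the time"
}

p_success = cumulative_success(stages_optimistic)
programs_per_success = 1.0 / p_success
print(f"Chance of clearing Phase II: {p_success:.2%}")
print(f"Target programs needed per Phase II success: {programs_per_success:.0f}")
```

Even with the optimistic figures, the product is below 1%, which is why a modest improvement in the probability of success at any single stage matters so much to the overall economics.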

This slide comes from 2010. Since then, the situation actually got worse. We've got Eroom's law working in pharma. I'm not sure if you have ever heard about Eroom's law, but it's Moore's law backwards.

Michael Krigsman: [Laughter]

Alex Zhavoronkov: Moore's law is when things become exponential and you've got a doubling of performance every few years. Here you've got a reduction of performance every few years because, in pharma, many of the low-hanging fruits have been picked up already and it's very difficult to find something that is novel and, at the same time, can be done in reasonable timeframes on a reasonable budget.
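The contrast between the two laws can be written as a pair of one-line formulas. The sketch below is illustrative: the two-year doubling period for Moore's law and the nine-year halving period for Eroom's law are commonly cited estimates (the latter from Scannell and colleagues' 2012 analysis of new drugs approved per billion dollars of R&D spending), not figures given in this conversation, and the function names are made up for the example.

```python
def moore_performance(years_elapsed, doubling_period_years=2.0):
    """Relative performance under exponential growth: doubles every period."""
    return 2.0 ** (years_elapsed / doubling_period_years)

def eroom_efficiency(years_elapsed, halving_period_years=9.0):
    """Relative R&D efficiency under exponential decline: halves every period."""
    return 0.5 ** (years_elapsed / halving_period_years)

for years in (0, 9, 18, 27):
    print(years, moore_performance(years), eroom_efficiency(years))
```

Under these assumed periods, 18 years is enough for R&D efficiency to fall to a quarter of its starting value while computing performance grows by a factor of 512.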

We have to go expensive, and that's why many pharmaceutical companies have to raise a lot of capital to take the drug into the clinic. We had to do this, too. So, we had to become a biotech.

That's the reason why we raised so much money that you've mentioned originally. But we raised all this money in 2021-2022, mostly, because that's when we started investing a lot in our own drugs.

The beauty of those drugs is that if they succeed, if you have a Phase II complete asset and you are addressing a chronic disease with no cure and also potentially a blockbuster disease, it means that after you pass Phase II, this drug can be worth $10 billion.

We just saw one story like that, a company called Nimbus Therapeutics. They completed a Phase II clinical study for a very old target.

I usually say that my grandmother was working on this target. It's not novel. The level of novelty is low.

They completed the clinical trial for a novel molecule for this target for psoriasis – it's called TYK2 target – and sold it for $6 billion to Takeda.

Yes, it takes you a long time to polish this diamond. Yeah, very often this diamond cracks. But if you do polish it until Phase II, the payout is very significant.

That is how biotechnology works. When you have an AI tool that allows you to move quickly and increase the probability of success, the best strategy for any AI company is to develop its own drugs because if you can demonstrate that you can do it, people will believe in AI.

Pharma is very conservative. They've seen many transformations over the years. They've seen the human genome being sequenced. They've seen the revolution of computing, mobile, social networking, globalization, the emergence of contract research organizations, CRISPR, iPSC. But we only had 50 drugs approved by the FDA last year, and the year before was the second record year in history in terms of the drug approvals.

In my opinion, only seven of those drugs approved last year were innovative, small-molecule drugs. Sorry, but the industry is getting worse.

If you have the AI tool that really is transformative, you want to develop your own drugs. You want to sell them to pharma. You want to enable pharma with your own tools.

Early partnerships with large pharma and lessons learned

But one mistake that we were making in the past when we just started, we started doing a lot of pilots with big pharma. That's actually the topic I can expand on a lot.

Michael Krigsman: Why was that a mistake? That seems like a kind of natural course of action to partner with these larger companies.

Alex Zhavoronkov: It seems like a reasonable course of action and, at that time, we also got lucky that we started partnering with them in the early days, like 2014 to 2017. We were always partnering on large-scale projects.

At that time, the pharmaceutical companies did not have massive AI teams themselves (internally). They were willing to share data. They were willing to share the experience. They were much more collaborative. But at the same time, they were much more cautious and the budgets for AI were very small.

We learned a lot during those collaborations, so I actually decided to go end-to-end back when we partnered with one big pharmaceutical company. They actually challenged us in many different departments to try to apply AI to the most complex problems they've got.

We solved many of those problems. We realized, "Oh, but Department A does not talk to Department B, and Department B does not talk to Department C. They don't even know all the processes internally." Big pharma, it's disconnected.

If we could connect it in one seamless workflow, we could actually increase the performance of the pharma dramatically even by just this connectivity, even if AI does not result in huge performance gains. That's when the idea of end-to-end originated. But it was a mistake also to partner with them because what I found as a major problem in big pharmaceutical companies is that people come and go.

You see this timeline. To develop a drug and put it on the market, it may take you a decade. But the chief science officer of a big pharma company or one of the top R&D people will not be there for that long. Very often, they move.

Every time there is a new CEO, they change R&D. You get a new chief science officer, they change R&D. Very few projects actually mature in this volatile, rapidly changing environment.

If you start a project, and if you don't control it, some people change within pharma, they go somewhere else, and the project is either discontinued or deprioritized or killed. We've experienced that once.

I'll actually tell you. Well, I'm not going to tell you. Sorry. [Laughter]

A big pharmaceutical company we partnered with in 2015, the new boss comes in, kills 75% of internal projects, starts his own agenda, partners with his buddies, and kills many of our projects, and we actually had in mind that plan. We actually already partnered the plan to go end-to-end, discover new targets, generate small molecules, go all the way into the clinic.

They had a small budget for that. There was internal commitment. But that entire group that did a deal with us was eliminated within a year after the new CSO came in.

Guess what. Four years later, he is out doing something else. Now his new replacement is probably going to change.

We decided not to do pilots anymore.

Michael Krigsman: That's always the risk when a small company partners with a large company, and then also how can they absorb those innovations, which is often a challenge as well.

We have an interesting question from Twitter. This is from Arsalan Khan. He wants to know what are the negative and challenging aspects of using AI in drug discovery. Is it inflated expectations? What are the negative aspects?

Alex Zhavoronkov: If you are really using AI, and if you are developing and you are committed to that task, there are no negative aspects. There are only positive aspects.

The negative aspect is that you've got a lot of very smart financiers around the world, mostly in the U.S., and actually in China as well, who would look at the trend, who would try to predict the trend. They say, "Okay, well, AI is hot. I need to have an AI play. I can come and invest in Insilico and take a small piece of the pie and help somebody with working technology, or I can build my own from scratch or from a starting block by somebody, so engineer the company to access financial markets."

What they would do, they would put a lot of capital into the company from scratch, so from zero, so the company doesn't need to go through the same process we went through where it's organically generated to try to build from scratch. You bring a lot of big executives, so you think, "Okay, well, how do I access financial markets if I don't have the tech? I find great people, and they buy."

It becomes like football. They try to get somebody from Google, from Stanford, from big pharma, somebody very old from big pharma.

It becomes a Tatooine. You've got many, many different species with high profiles in one company, in one bar. Then they start building.

It becomes a chimera with a lot of egos where technology takes the backseat. It's not the main objective. Their main objective is to get the company listed on the public market.

They try to in-license the compounds and say, "Look, now we have done it using AI." Nobody saw that AI, right? But, yeah, we've in-licensed something.

Or very often they hype it up and say, "Oh, I've got this new technology," or "I've got this new idea of generating machine learnable data using robotics. I have never discovered a target, but I think that with this technology I can."

Even before they lay the ground foundation and the ground floor of the lab, they go to big pharma. Again, because the big bosses are involved, they make a few big deals saying, "Okay, well, here's $50 million, $25 million upfront, and a billion dollars in arrears, and we are going to – in five years, two years, three years – deliver something to this big pharma."

Big pharma has those budgets and, when big bosses are involved, it's easier for them to make those deals. The bigger the deal, the easier it is to make (believe it or not) because you are not trying to get a small budget from a team that can use it internally. You are getting it from the big company.

Very often, those deals, they later fail. The company recognizes that "Oh, the AI company did not deliver?" Then they think that all companies in AI cannot deliver because, "Oh, this person was a super-executive at Google and somebody was a big professor at Stanford and somebody was a bigshot in big pharma. They came together, and they couldn't deliver. That's why AI doesn't work."

That's the real danger of building the company from scratch to access the financial market to being a trend instead of organically being in the field.

Michael Krigsman: Alex, behind you is a photograph that I believe is your lab.

Alex Zhavoronkov: [Laughter] Yeah. That's me. [Laughter]

Generative AI and public data

Michael Krigsman: There you are. Your lab is run using generative AI and robotics. Tell us about that.

Alex Zhavoronkov: When we started, we decided that we were going to use as much publicly available data as possible and use generative AI to figure out how to work with this publicly available data. This publicly available data is usually biological data and also text data; it is usually not very clean, and it's not exactly designed for machine learning. However, we've seen with ChatGPT and many of the other generative tools that they also take publicly available data and provide very useful content – in imaging, in text.

It's not only about the quality of the data. It's about the algorithm. We focused on the algorithm first and developed a system that can generate small molecules on demand and also identify novel targets on demand without the generation of new data – just using publicly available stuff.

We've demonstrated that it works by taking the AI-discovered target, AI-generated molecule, all the way into human clinical trials. Now we are starting Phase II and demonstrated it works.

Anybody who wants to refute this argument, show me your molecule for a new target that you've taken into the clinic. There's nobody else that I know of that managed to do that.

Reinforcing generative AI with real experiments

Michael Krigsman: I love that challenge: Show me your molecule that you've taken into the clinic. That's awesome.

Alex Zhavoronkov: That's the new kind of way to cut through BS, right? I think that this is the new benchmark for companies entering this field. I think, before you start raising a big pile of money, you need to demonstrate that you've at least got a pre-clinical candidate.

Let's say you raised $5 million (if you are doing it in the U.S. because maybe it's a little bit more expensive). Let's say you raise $5 million and, with $5 million, you deliver a pre-clinical candidate. A pre-clinical candidate meaning that you've completed at least two or three efficacy experiments in mice, so you cured cancer in mice.

Those experiments, by the way, are not exactly very expensive. You can outsource them.

But if you don't have that, you shouldn't be raising $500 million or even like $50 million. Maybe $50 million is fine, but not a few hundred.

But that's the benchmark.

But now, coming to the lab, in 2019, we thought about and we started partnering with companies that generate data automatically using robotics. We saw that when you've got still humans in the loop, you would still be very biased when picking the targets.

Any time the human looks at a target, it's like quantum mechanics. You look at the particle, and it's either a wave or not. Right? [Laughter] Or it's a particle. Depending on whether you looked at it or not, right?

Here, if the big pharmaceutical executive looks at the target list, that's it. They already know. They are biased. They know the targets that they know. They saw something that is logical, and they will try to cling to this target choice.

We actually didn't want to even show humans the target lists before the targets get validated. That was the big idea. How do we de-bias people to allow for greater exploration, because every pharmaceutical executive wants confidence? They actually want to go after something that is more likely to work within their short career in the pharmaceutical company.

We decided to build our own lab that would allow for generative AI exploration from scratch, not just to train AI on machine-learnable data. No. That's not what we are trying to do. We wanted to allow for genuine AI imagination to take place, and then you validate those hypotheses that come from this AI imagination with real experiments and reinforce those pathways that actually worked.

Let me show you what we did.

By the way, during that time, COVID hit. [Laughter] You probably nowadays would kind of tend to forget about it, but 2020, we got COVID and the world decoupled.

China went on lockdown, so actually, inside China, you could still work but you could not just travel there very easily. I spent 14 weeks in quarantines building my lab.

Every time you go there, it's two weeks quarantine, and I loved it. Again, I think that right now China is being demonized by the entire world, especially by the U.S. But most of the things you hear on TV are not true. It's kind of complete garbage.

Nowadays, I actually think that fake news is a real term because people there are extremely hardworking, they're extremely friendly, they're very cooperative, and I think that if aliens landed in China during COVID, if they didn't have COVID and they were friendly, they would be welcome.

I landed in Suzhou and decided to build my lab there because they have robotics capabilities like none other. And, of course, the country itself, internally, they are open. It's just that you need to spend two weeks in quarantine.

I'll show you the footage of what happened. This is our company. We are truly global. We have eight R&D sites.

This is one in Shanghai. That's where we do drug discovery and many people in drug discovery work there to supervise many contract research organizations that make drugs and synthesize and test them. We have a pretty epic floor, a great presentation area, super high-tech.

One and a half hours' drive away from Shanghai is Suzhou (or it's 30 minutes by train). This is April 2022, so exactly a year ago, we got this space. We have a logo on the building, and that is me promising to build a lab in April 2022, so last year.

Then again, the COVID pandemic hit, so there were some restrictions. But as I mentioned, these people are true heroes. They worked and slept there in this lab. I have never seen people more hardworking.

Here you have a gas leak and they're still working in respirators. That's me sleeping in the lab.

That is today, so you walk in. We wanted to make sure that it looks like a spaceship as well, so people understand that they are working in the future.

It's face activated. You walk in. To your right, you've got dimmed windows or where we can un-dim them and see the robots. But most of the time, we have some confidential stuff there that we don't want to show. But we can also show the workflows.

We've got a presentation area. You've got miniature copies of every room with the robots and stuff we have. You can actually see the workflow.

On December 29th, so just three months ago, I opened this lab and invited a few partners from big pharmaceutical companies to see how it works. This is my co-CEO, Dr. Ren, who is actually a literal hero of this revolution.

This is a real workflow from the lab. We take an animal tissue, send it to the robot, the robot picks it up, grinds it, microplates it, does quality control, passes it to another room. By the way, after that, the human work is over after we put in the sample.

We've got AGVs, autonomous guided vehicles that work around the lab to ensure that there are no human errors. You get imaging, high-content imaging. You've got high-resolution imaging.

In parallel, you start the workflow for the next-generation sequencing, so you prepare several libraries. You prepare the library for whole genome sequencing, for RNA sequencing, for methylation, and we also collect a few other data types that I don't want to talk about because people will say that they also have it (even if they don't).

We prepare those libraries, give it to the sequencer. We get methylation data, transcriptomic data, and a few other data types that we feed into AI in addition to sequencing. Again, this AI has been validated and we know that it can discover targets.

Now it starts the exploration phase. It picks the targets. It looks at those that already have compounds and picks those compounds from the compound hotel, puts those compounds onto the liquid handler.

Here they get micro-plated and aliquoted, and are being prepared for what's yet to come.

We also do a bunch of quality control experiments. Here we can also do Echo and some enzymatic assays.

In parallel, you pick up the samples from the incubator that you put in there originally, incubate them with the predicted compounds, put them back. Put the compounds in. Put them back in the incubator. After that, you have three parallel workflows.

Again, you get high-content, high-resolution imaging. You get methylation, transcriptomics, and a few other data types.

AI learns if it picked the right targets, if those compounds worked and did what you wanted them to do. For the most promising targets, humans would get the signal, and they can also do human-level validation, and we can pursue some of those targets. But most of the time, the AI just picks those compounds and trains to get the right targets.

In parallel, we have a CRISPR workflow. For those targets that don't have the compound, we can also do CRISPR screens. But originally, you actually want to find those targets that are addressable with small molecules, and it learns all the time.

Generative AI is being reinforced using real experiments. I don't think that there is anything like that, so some people talk about using machine-learnable data for training. This is not what we do. We already trained, so those are pre-trained models running the lab.
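The closed loop described above (the model picks targets, the robots run the experiment, and the result feeds back into the next pick) can be illustrated with a small sketch. This is not Insilico's software: a simple UCB1 multi-armed bandit stands in for whatever learning method the lab actually uses, and the target names, the `run_assay` function, and the effect values are all hypothetical placeholders.

```python
import math

# Hypothetical hidden assay responses per target (made-up numbers, not real data).
TRUE_EFFECT = {"target_A": 0.2, "target_B": 0.8, "target_C": 0.5}


def run_assay(target: str) -> float:
    """Stand-in for the robotic experiment: returns the measured effect."""
    return TRUE_EFFECT[target]


def closed_loop(targets, rounds):
    """Pick a target, run the experiment, update estimates, repeat."""
    counts = {t: 0 for t in targets}
    totals = {t: 0.0 for t in targets}
    for step in range(1, rounds + 1):
        untested = [t for t in targets if counts[t] == 0]
        if untested:
            # Test every target once before comparing them.
            choice = untested[0]
        else:
            # UCB1 score: observed mean plus an exploration bonus that
            # shrinks as a target accumulates experiments.
            choice = max(
                targets,
                key=lambda t: totals[t] / counts[t]
                + math.sqrt(2 * math.log(step) / counts[t]),
            )
        reward = run_assay(choice)  # the "real experiment" step
        counts[choice] += 1
        totals[choice] += reward
    means = {t: totals[t] / counts[t] for t in targets if counts[t]}
    return counts, means


counts, means = closed_loop(list(TRUE_EFFECT), rounds=60)
best = max(means, key=means.get)
print(best, counts)
```

The point of the sketch is only the shape of the loop: each experimental result changes which target is chosen next, so the selection policy is reinforced by measurements rather than by a fixed training set.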

Michael Krigsman: Alex, we're just about out of time. Let's take one last question quickly from Twitter. This is again from Arsalan Khan who wants to know what happens when the underlying data is biased. For example, let's say that you don't have sufficient data gathered on the effects of a certain medication or on certain groups of people since maybe those groups could not afford the medication. How do you handle that situation?

Alex Zhavoronkov: In our case, we can actually pursue many, many, many alternatives. Usually, you start with the alternatives and the pathways where you do have the data. Again, getting one drug to the market with the traditional approach costs $2 billion. Our approach is going to be significantly cheaper and faster, but still, it costs you a lot of money.

You have very, very few shots on goal, so usually, you select the cases where you do have the data. If you don't have the data, you need to generate the data either experimentally or use a generative approach. But then the probability of success is going to be lower.

“Drug discovery is brutal”

Again, drug discovery is a brutal thing. Right now, this year, we're going to see maybe 60 million die of aging and other diseases because there are no cures. By the way, there is aging anyway, so if diseases do not kill us, aging will.

Life is not fair, right? Whatever you do, you still keep losing after a certain age.

You need to start solving problems in the order of priority. The priority is A) demonstrate the clear case that AI can be used to get the drug through human clinical trials discovered from scratch with a novel target. That takes a long time and a lot of capital.

Before you complete that, you shouldn't be thinking about other outlying cases and significant democratization of this because it's going to continue to be a very expensive process. The way we want to democratize this, and I'm going to show you a slide just so you understand this is not just words, we do have plans for that.

The idea that I currently carry is to validate this lab as much as possible with my own projects and also customers' projects. But then miniaturize this lab to the level where I can maybe make it into even two rooms or maybe even one room if I am lucky.

We want to expand the lab and add additional capabilities, but we also want to miniaturize the lab, optimize it so that it becomes small enough. This lab, I would need to build in 3D, so it has to go to the ceiling of the laboratory, and humans would not be able to walk in, so that's very dangerous because very often you need repairs or reagent changes or something breaks. So, you need to be very good at this to miniaturize this kind of technology.

My idea is to miniaturize and put it in hospitals so that the hospital would acquire all the capabilities that my company has, so they don't need to share the data with anybody. I don't need their data. That's what people misunderstand usually about my company.

We have a policy. We don't touch your data. I don't want your data. We actually will refuse your data most of the time because it has dangers in there, including all kinds of geopolitical dangers.

We want you to be able to discover targets. If I give you the lab and the software to run it, you can discover targets at hospitals' premises.

What do doctors do well? They can take biopsies. They can throw biopsies in the lab, and the lab would help them identify a pathway to treat the patients better now. But also, as you get more samples, you can identify new targets. That's the way to democratize drug discovery globally for the first time in human history.

If you can do drug discovery at hospital premises, it is run by physicians, and physicians are usually not at that level (unless you are talking about physicians who work in pharma drug discovery). Physicians don't usually have the capabilities to discover drugs. But with AI and this robotics capability, physicians will have the ability to acquire those superpowers.

Imagine the countries that never discovered a target that resulted in a drug. Naturally, that's most of the countries. You put a few of those robots in hospitals and now countries like Saudi Arabia or the UAE (oil-rich countries), they can now convert their petrodollars into viable drugs. How cool is that? And you can put it in Africa.

Michael Krigsman: We have a couple of other questions. I'm just going to ask you to answer them very briefly because we are past time.

AI in medical writing

This first one is from Jedd McKenzie who says, "What is your opinion on the use of AI in medical writing and regulatory submissions?" Very quickly, please.

Alex Zhavoronkov: In medical writing, currently the accuracy of generative systems is very low. Before you have massive benchmarking and validation of medical writing AI, you should probably not use it. By the way, if you get caught doing that later on, you might get prosecuted.

To just demonstrate how it works, I published a paper with ChatGPT recently and made a few cases around that. You can look at my article in Nature Medicine about the dangers of using generative AI for medical writing.

Specifically regulatory submissions, no way. You don't want to screw up the most important time of your life when you are doing anything with the FDA. That has to be pristine. It has to be super-regulated. Double, triple quality controlled, and you want to put more effort into doing so.

However, in parallel, you can actually do some benchmarking with AI just for internal experiments. But you shouldn't be using this for submission. Yeah.

IP risks and generative AI

Michael Krigsman: Another question, again very quickly – it's a complicated question – with generative AI, do you run the risk of violating IP because of the publicly available data that you're using?

Alex Zhavoronkov: It depends. First of all, if the data is publicly available and generated by the NIH, it is generated with a purpose of exactly this. You use it for experimentation and the government doesn't have any IP in that.

Usually, this data is available, so Google is doing it for you right now. They are helping you with search.

You've been using other people's data to make discoveries all the time. You are a generative system. Anybody who is creative, they are also generative.

Michael Krigsman: [Laughter]

Alex Zhavoronkov: I try to nowadays use creativity as generativity. You can substitute those terms.

There will be IP issues when you are utilizing proprietary data. If you get somebody's proprietary data, full-text articles, for example, without paying the publisher for them, then it is possible to watermark and notice that this data was used for the generation of your content. The company that is doing this might get prosecuted.

There's a long debate about this because, in this case, AI is just like a human. It needs to be treated as a human.

If the human saw something and learned and then generated something new, the level of novelty is very high and there is no way to trace where you got the original, you know, maybe pre-training data for yourself. Then there should not be IP issues. But if there is a clear trace, then yeah, there will be IP issues.

In chemistry, this shouldn't be a problem because there you rely on massive chemistry data sets. If your molecule is not similar to anything that's published or patented or in the process of being patented, it's very diverse, then you should not have any IP issues. You own the asset.
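The "not similar to anything that's published or patented" test mentioned above is commonly expressed as a similarity score between molecular fingerprints. The sketch below is a minimal illustration using the Tanimoto coefficient on toy fingerprints. The fingerprints are made-up sets of integers, the reference names are hypothetical, and the 0.4 threshold is an arbitrary assumption; a real workflow would compute fingerprints with a cheminformatics toolkit, and a low score alone does not settle any IP question.

```python
from typing import Dict, FrozenSet

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def max_similarity(candidate: Fingerprint, known: Dict[str, Fingerprint]) -> float:
    """Highest similarity between the candidate and any known molecule."""
    return max((tanimoto(candidate, fp) for fp in known.values()), default=0.0)


def is_structurally_novel(
    candidate: Fingerprint, known: Dict[str, Fingerprint], threshold: float = 0.4
) -> bool:
    """True if the candidate is below the (assumed) similarity threshold."""
    return max_similarity(candidate, known) < threshold


# Toy reference set standing in for published and patented structures.
KNOWN = {
    "published_1": frozenset({1, 2, 3, 4}),
    "patented_1": frozenset({3, 4, 5, 6}),
}
candidate = frozenset({3, 10, 11, 12})

print(round(max_similarity(candidate, KNOWN), 3))
print(is_structurally_novel(candidate, KNOWN))
```

Comparing against the single closest known structure, rather than an average, matches the idea in the passage: one close match is enough to raise a question, however diverse the rest of the reference set is.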

Michael Krigsman: With that, I'm afraid we are out of time. Alex, thank you so much for taking your time to be with us today. I really appreciate it.

Alex Zhavoronkov: Happy to be with you. Again, I think AI and robotics are going to do great things.

A big goal for all of us is to solve aging. It's a big statement to make, but I think that once you set this goal high enough and the bar high enough, everything else starts looking achievable. I think even aging is achievable, but that's where AI is going to make the biggest difference.

Actually, my biggest contribution, I think, in generative AI was starting to train generative systems on longitudinal data to generate synthetic data with age as a generation condition. That allows you to play with much more data, synthetic data, than you can think of.

I think aging should be a priority for all of us.

Michael Krigsman: Absolutely. There's no doubt that AI is going to have a profound impact on all parts of our lives including drug discovery and medicine.

Thank you, everybody, for watching. Now, before you go, please subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can send you our newsletter.

With that, thanks for watching, everybody. I hope you have a great day. Check out CXOTalk.com, and we will see you again next week. Have a great one.

Published Date: Mar 24, 2023

Author: Michael Krigsman

Episode ID: 782