Research topic: Machine learning models for the identification of compounds likely to interfere with biological assays
Project Description: Today, drug discovery is still a capital-intensive, largely inefficient process, in which pharmaceutical companies invest more than a billion dollars and a decade of work to discover a single new valuable molecular entity. During the last twenty years, advancements in technology have allowed the development of more productive strategies for modern drug discovery. Above all, High Throughput Screening (HTS) is one of the most effective methods for finding novel bioactive compounds. However, HTS comes with a major challenge: understanding which of the preliminary positive hits are worth further investment. Indeed, the bioactivity detected for some of the screened molecules could stem from non-specific binding or interference with the assay technology, producing false positive hits which, if they remain undetected, represent a considerable waste of time and resources. False positive hits in biological assays are associated with two major categories of compounds: bad actors and frequent hitters. Bad actors can be further divided into three groups: colloidal aggregators, pan-assay interference compounds (PAINS) and reactive compounds. Frequent hitters, by contrast, are compounds showing higher-than-expected hit rates in assay panels due to interference with the screening technology or, rarely, due to the ability to bind multiple targets. Considering the amount of resources invested in HTS, the development of new approaches for the prediction of bad actors and frequent hitters is a central problem for the drug discovery community.
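To make the notion of a "higher-than-expected hit rate" concrete, the following is a minimal sketch of one possible way to flag candidate frequent hitters, and not the method developed in this project. It assumes each compound has a count of hits and a count of assays it was tested in, and compares that count with a baseline hit rate using a binomial upper-tail probability. The compound names, the baseline rate of 2% and the significance threshold are all illustrative assumptions.

```python
from math import comb


def binomial_tail(hits: int, n_assays: int, p: float) -> float:
    """Return P(X >= hits) for X ~ Binomial(n_assays, p)."""
    return sum(
        comb(n_assays, k) * p**k * (1 - p) ** (n_assays - k)
        for k in range(hits, n_assays + 1)
    )


def flag_frequent_hitters(records, baseline_rate=0.02, alpha=0.001):
    """Return compounds whose hit count is unlikely under the baseline rate.

    records maps a compound name to (hits, n_assays).
    baseline_rate and alpha are hypothetical values chosen for illustration.
    """
    flagged = []
    for name, (hits, n_assays) in records.items():
        if binomial_tail(hits, n_assays, baseline_rate) < alpha:
            flagged.append(name)
    return flagged


# Hypothetical panel: compound -> (number of hits, number of assays tested)
panel = {"cmpd_A": (1, 50), "cmpd_B": (12, 50), "cmpd_C": (0, 40)}
print(flag_frequent_hitters(panel))
```

A statistical flag of this kind only says that a compound hits more often than expected. It cannot by itself tell assay interference apart from genuine binding to multiple targets, which is why follow-up assays or dedicated predictive models are still needed.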
Personal introduction: Early-stage researcher (ESR) in the Advanced machine learning for Innovative Drug Design (AIDD) consortium, focusing on hit de-prioritization through machine learning.
After a Bachelor's degree in Industrial Chemistry and a Master's in Bioinformatics, I was keen to find a PhD position in which I could merge my passion for small molecules, data science and proteins. The AIDD program ticks all my boxes: not only can I make the best use of my background, working at the interface between organic chemistry and machine learning, but I also have the unique opportunity to make an impact in the drug discovery field.
During my time as an ESR, I aim to advance the current state of hit de-prioritization in HTS. HTS is plagued by false positive hits arising from non-drug-like interactions with the target or interference with the assay technology. Overlooking false positives significantly slows the hit-to-lead step of the drug discovery pipeline, triggering time-consuming follow-up assays and depleting resources that could have been applied to more profitable studies. Hence, the development of novel tools for HTS triaging is a pivotal step in optimizing the early drug discovery pipeline. In my work, machine learning (ML) and deep learning will serve as the backbone for new hit de-prioritization software. The design of new ML tools will include merging public and private datasets, exploiting state-of-the-art computational resources, and applying the most recent findings in artificial intelligence (AI). From these building blocks we aim to deliver new AI-based methods that have a direct impact on the drug discovery pipeline.
Contact: GitHub LinkedIn Twitter
University of Vienna, Austria, March 1st, 2023 - August 31st, 2024
AstraZeneca AB, Sweden, March 2023