TY - JOUR
T1 - DeepEGFR a graph neural network for bioactivity classification of EGFR inhibitors
AU - Malik, Aijaz Ahmad
AU - Khyriem, Costerwell
AU - Hauns, Sven
AU - Khan, Imran
AU - Pinto, Frederico Garcia
AU - Al-Sadi, Azzat
AU - Mohammad, Rasheed
AU - Tran, Van Dinh
AU - Backofen, Rolf
AU - Soares, Nelson
AU - Uddin, Mohammed
AU - Alkhnbashi, Omer S.
PY - 2025/10/31
Y1 - 2025/10/31
N2 - Epidermal Growth Factor Receptor (EGFR) plays a critical role in the development of several cancers. Thus, modulation/inhibition of EGFR activity is an appealing target of developing novel cancer therapeutics. With the advent of modern machine learning technologies, it is now possible to simulate interactions with high precision between EGFR and small molecules to predict inhibitory/modulatory activity at an unprecedented scale. In this work, we propose a novel machine-learning method to fast and precise classification of small compounds that are active, intermediate or inactive in inhibiting/modulating EGFR activity. We developed DeepEGFR, a novel multi-class graph neural network (GNN) model, to classify compounds into Active, Inactive, and Intermediate functional categories. DeepEGFR leverages complementary molecular representations, combining SMILES strings and molecular fingerprint matrices (Klekota-Roth and PubChem) to capture both structural and property-based features of compounds. The model constructs an advanced molecular graph representing atom type, formal charge, bond type, and bond order, through nodes and edges. DeepEGFR achieved superior performance compared to baseline machine learning algorithms (e.g., SVM, Random Forest, ANN), with approximately 94% F1-scores across training and test datasets for all activity classes. To ensure interpretability, the top 20 features identified by DeepEGFR were validated against the five key characteristics of FDA-approved EGFR inhibitors (Afatinib, Gefitinib, Osimertinib, Dacomitinib, Erlotinib), confirming the biological relevance of the features. Moreover, DeepEGFR successfully identified 300 underexplored EGFR-targeting compounds, demonstrating its potential to accelerate the discovery of therapeutic agents. These results highlight the effectiveness of graph neural networks in advancing molecular activity classification, setting a potential new benchmark for EGFR inhibitor prediction. These findings demonstrate the DeepEGFR’s ability to highlight the promising EGFR inhibitors, that have received limited prior investigation, thereby supporting its role in facilitating the rational development of targeted therapies for precision oncology.
AB - Epidermal Growth Factor Receptor (EGFR) plays a critical role in the development of several cancers. Thus, modulation/inhibition of EGFR activity is an appealing target of developing novel cancer therapeutics. With the advent of modern machine learning technologies, it is now possible to simulate interactions with high precision between EGFR and small molecules to predict inhibitory/modulatory activity at an unprecedented scale. In this work, we propose a novel machine-learning method to fast and precise classification of small compounds that are active, intermediate or inactive in inhibiting/modulating EGFR activity. We developed DeepEGFR, a novel multi-class graph neural network (GNN) model, to classify compounds into Active, Inactive, and Intermediate functional categories. DeepEGFR leverages complementary molecular representations, combining SMILES strings and molecular fingerprint matrices (Klekota-Roth and PubChem) to capture both structural and property-based features of compounds. The model constructs an advanced molecular graph representing atom type, formal charge, bond type, and bond order, through nodes and edges. DeepEGFR achieved superior performance compared to baseline machine learning algorithms (e.g., SVM, Random Forest, ANN), with approximately 94% F1-scores across training and test datasets for all activity classes. To ensure interpretability, the top 20 features identified by DeepEGFR were validated against the five key characteristics of FDA-approved EGFR inhibitors (Afatinib, Gefitinib, Osimertinib, Dacomitinib, Erlotinib), confirming the biological relevance of the features. Moreover, DeepEGFR successfully identified 300 underexplored EGFR-targeting compounds, demonstrating its potential to accelerate the discovery of therapeutic agents. These results highlight the effectiveness of graph neural networks in advancing molecular activity classification, setting a potential new benchmark for EGFR inhibitor prediction. These findings demonstrate the DeepEGFR’s ability to highlight the promising EGFR inhibitors, that have received limited prior investigation, thereby supporting its role in facilitating the rational development of targeted therapies for precision oncology.
UR - https://www.open-access.bcu.ac.uk/16712/
U2 - 10.1038/s41598-025-22126-8
DO - 10.1038/s41598-025-22126-8
M3 - Article
SN - 2045-2322
VL - 15
JO - Scientific Reports
JF - Scientific Reports
M1 - 38236
ER -