GNN and Transformer Fusion Learning for Molecular Classification of BACE1 Inhibitors

Shadid, Md. Abu Hena; Tabassum, Mahajabin

Files

18 Fulltext_CSE_GNN andTransformer Fusion Learning for Molecular Classification _200041101.pdf (2.39 MB)

Date

2025-10-25

Authors

Shadid, Md. Abu Hena

Tabassum, Mahajabin

Publisher

Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh

Abstract

Alzheimer’s disease (AD) is a progressive and devastating neurodegenerative disorder, primarily manifested through memory loss and cognitive decline [1], [2]. One of the central pathological hallmarks of AD is the accumulation of amyloid-beta (Aβ) plaques, formed via the sequential cleavage of the amyloid precursor protein (APP) by β-secretase (BACE1) and γ-secretase [3]. Inhibiting BACE1 is therefore regarded as a compelling therapeutic strategy, as it can impede the formation of neurotoxic Aβ aggregates [4], [5]. Nevertheless, the identification of effective BACE1 inhibitors remains arduous and resource-intensive when approached through conventional experimental pipelines. In this study, we propose a hybrid deep learning framework that fuses Graph Neural Networks (GNNs) with ChemBERTa, a transformer model pretrained on large chemical corpora. While GNNs capture atom-level and bond-level interactions (local structural dependencies), ChemBERTa encodes long-range dependencies and semantic patterns from SMILES representations (global chemical context). By unifying these complementary modalities, our model overcomes the limitations of prior GNN+CNN approaches, where CNNs process sequential SMILES in a strictly local fashion and fail to capture non-linear long-range dependencies across molecular structures. Our GNN–ChemBERTa fusion model achieved an accuracy of 92.77% in classifying active versus inactive BACE1 inhibitors, demonstrating superior predictive power and generalization. Beyond its performance, the model contributes to reducing drug discovery costs, accelerating virtual screening, and minimizing the need for extensive laboratory experimentation. Moreover, a recall value of 93% indicates that almost all potential active molecules were successfully identified by the model, minimizing the risk of missing true inhibitors.
Similarly, a high precision value of 93% demonstrates that the model produces very few false positives, thereby reducing unnecessary laboratory costs associated with testing inactive compounds. Additionally, the ROC–AUC score of 87.88% confirms that the model can effectively distinguish between active and inactive molecules, reflecting strong overall classification performance. By enabling efficient in silico identification of potential inhibitors, this approach not only streamlines the early stages of Alzheimer’s drug development but also holds promise for broader application to other therapeutic targets associated with neurodegenerative diseases.
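
The fusion idea in the abstract can be illustrated with a toy sketch. This is not the thesis implementation. The neighbour-averaging step is only a stand-in for a GNN layer, `text_vec` is a hypothetical placeholder for a ChemBERTa embedding of the SMILES string, and concatenation followed by a logistic head is an assumed fusion strategy, since the abstract does not say how the two representations are combined.

```python
import math

def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def message_pass(atom_features, bonds):
    """One round of neighbour averaging (a simplified stand-in for a GNN layer)."""
    neighbours = {i: [i] for i in range(len(atom_features))}  # each atom sees itself
    for u, v in bonds:
        neighbours[u].append(v)
        neighbours[v].append(u)
    return [
        mean_pool([atom_features[j] for j in neighbours[i]])
        for i in range(len(atom_features))
    ]

def classify(fused_vec, weights, bias):
    """Logistic head: probability that the molecule is an active inhibitor."""
    logit = sum(w * x for w, x in zip(weights, fused_vec)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy molecule "CCO": per-atom features are one-hot [is_carbon, is_oxygen].
atoms = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]

# Local structural view: message passing, then pooling to a graph-level vector.
graph_vec = mean_pool(message_pass(atoms, bonds))

# Global sequence view: hypothetical placeholder values, not real ChemBERTa output.
text_vec = [0.1, -0.2, 0.3]

# Assumed fusion: concatenate the two views and feed them to one classifier.
fused = graph_vec + text_vec
weights = [0.0] * len(fused)  # untrained weights, for illustration only
prob = classify(fused, weights, bias=0.0)
print(prob)  # 0.5, because all weights and the bias are zero
```

In a real pipeline the graph vector would come from a trained GNN and the sequence vector from a pretrained ChemBERTa checkpoint, with the classifier weights learned from labelled BACE1 data.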

Description

Supervised by Mr. Tareque Mohmud Chowdhury, Assistant Professor, and Mr. Njayou Youssouf, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirement for the degree of Bachelor of Science in Computer Science and Engineering, 2025.

URI

https://repository.iutoic-dhaka.edu/handle/123456789/2639

Collections

2025
