GNN and Transformer Fusion Learning for Molecular Classification of BACE1 Inhibitors

Shadid, Md. Abu Hena; Tabassum, Mahajabin

dc.contributor.author	Shadid, Md. Abu Hena
dc.contributor.author	Tabassum, Mahajabin
dc.date.accessioned	2026-06-25T03:48:46Z
dc.date.issued	2025-10-25
dc.description	Supervised by Mr. Tareque Mohmud Chowdhury, Assistant Professor, and Mr. Njayou Youssouf, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2025.
dc.description.abstract	Alzheimer’s disease (AD) is a progressive and devastating neurodegenerative disorder, primarily manifested through memory loss and cognitive decline [1], [2]. One of the central pathological hallmarks of AD is the accumulation of amyloid-beta (Aβ) plaques, formed via the sequential cleavage of the amyloid precursor protein (APP) by β-secretase (BACE1) and γ-secretase [3]. Inhibiting BACE1 is therefore regarded as a compelling therapeutic strategy, as it can impede the formation of neurotoxic Aβ aggregates [4], [5]. Nevertheless, the identification of effective BACE1 inhibitors remains arduous and resource-intensive when approached through conventional experimental pipelines. In this study, we propose a hybrid deep learning framework that fuses Graph Neural Networks (GNNs) with ChemBERTa, a transformer model pretrained on large chemical corpora. While GNNs capture atom-level and bond-level interactions (local structural dependencies), ChemBERTa encodes long-range dependencies and semantic patterns from SMILES representations (global chemical context). By unifying these complementary modalities, our model overcomes the limitations of prior GNN+CNN approaches, where CNNs process sequential SMILES in a strictly local fashion and fail to capture non-linear long-range dependencies across molecular structures. Our GNN–ChemBERTa fusion model achieved an accuracy of 92.77% in classifying active versus inactive BACE1 inhibitors, demonstrating superior predictive power and generalization. Beyond its performance, the model contributes to reducing drug discovery costs, accelerating virtual screening, and minimizing the need for extensive laboratory experimentation. Moreover, a recall value of 93% indicates that almost all potential active molecules were successfully identified by the model, minimizing the risk of missing true inhibitors.
Similarly, a high precision value of 93% demonstrates that the model produces very few false positives, thereby reducing unnecessary laboratory costs associated with testing inactive compounds. Additionally, the ROC–AUC score of 87.88% confirms that the model can effectively distinguish between active and inactive molecules, reflecting strong overall classification performance. By enabling efficient in silico identification of potential inhibitors, this approach not only streamlines the early stages of Alzheimer’s drug development but also holds promise for broader application to other therapeutic targets associated with neurodegenerative diseases.
dc.identifier.uri	https://repository.iutoic-dhaka.edu/handle/123456789/2639
dc.language.iso	en
dc.publisher	Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh
dc.title	GNN and Transformer Fusion Learning for Molecular Classification of BACE1 Inhibitors
dc.type	Thesis
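
The abstract reads recall as the share of true inhibitors the model finds and precision as the share of predicted inhibitors that are truly active. A minimal sketch of that arithmetic follows, assuming a binary confusion matrix; the function name `classification_metrics` and the counts used are hypothetical illustrations, not figures or code from the thesis.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from binary confusion counts.

    tp: active molecules predicted active      (true positives)
    fp: inactive molecules predicted active    (false positives)
    tn: inactive molecules predicted inactive  (true negatives)
    fn: active molecules predicted inactive    (false negatives)
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion counts must not all be zero")

    accuracy = (tp + tn) / total
    # Precision: of everything flagged as active, how much really is active.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all truly active molecules, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts chosen only to illustrate the formulas;
# they are NOT the thesis's confusion matrix.
metrics = classification_metrics(tp=93, fp=7, tn=90, fn=7)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, 7 missed actives out of 100 correspond to a recall of 0.93, and 7 false alarms out of 100 predicted actives correspond to a precision of 0.93.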

Files

Original bundle

Name:: 18 Fulltext_CSE_GNN andTransformer Fusion Learning for Molecular Classification _200041101.pdf
Size:: 2.39 MB
Format:: Adobe Portable Document Format

Collections

2025