GNN and Transformer Fusion Learning for Molecular Classification of BACE1 Inhibitors

dc.contributor.author: Shadid, Md. Abu Hena
dc.contributor.author: Tabassum, Mahajabin
dc.date.accessioned: 2026-06-25T03:48:46Z
dc.date.issued: 2025-10-25
dc.description: Supervised by Mr. Tareque Mohmud Chowdhury, Assistant Professor, and Mr. Njayou Youssouf, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2025.
dc.description.abstract: Alzheimer’s disease (AD) is a progressive and devastating neurodegenerative disorder, primarily manifested through memory loss and cognitive decline [1], [2]. One of the central pathological hallmarks of AD is the accumulation of amyloid-beta (Aβ) plaques, formed via the sequential cleavage of the amyloid precursor protein (APP) by β-secretase (BACE1) and γ-secretase [3]. Inhibiting BACE1 is therefore regarded as a compelling therapeutic strategy, as it can impede the formation of neurotoxic Aβ aggregates [4], [5]. Nevertheless, identifying effective BACE1 inhibitors remains arduous and resource-intensive when approached through conventional experimental pipelines. In this study, we propose a hybrid deep learning framework that fuses Graph Neural Networks (GNNs) with ChemBERTa, a transformer model pretrained on large chemical corpora. While GNNs capture atom-level and bond-level interactions (local structural dependencies), ChemBERTa encodes long-range dependencies and semantic patterns from SMILES representations (global chemical context). By unifying these complementary modalities, our model overcomes the limitations of prior GNN+CNN approaches, in which CNNs process sequential SMILES in a strictly local fashion and fail to capture non-linear long-range dependencies across molecular structures. Our GNN–ChemBERTa fusion model achieved an accuracy of 92.77% in classifying active versus inactive BACE1 inhibitors, demonstrating superior predictive power and generalization. Beyond its performance, the model contributes to reducing drug discovery costs, accelerating virtual screening, and minimizing the need for extensive laboratory experimentation. Moreover, a recall of 93% indicates that almost all potential active molecules were successfully identified by the model, minimizing the risk of missing true inhibitors. Similarly, a precision of 93% demonstrates that the model produces very few false positives, thereby reducing unnecessary laboratory costs associated with testing inactive compounds. Additionally, the ROC–AUC score of 87.88% confirms that the model can effectively distinguish between active and inactive molecules, reflecting strong overall classification performance. By enabling efficient in silico identification of potential inhibitors, this approach not only streamlines the early stages of Alzheimer’s drug development but also holds promise for broader application to other therapeutic targets associated with neurodegenerative diseases.
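As a reading aid for the recall and precision figures quoted in the abstract, the following is a minimal sketch of how accuracy, precision, and recall follow from confusion-matrix counts. The function name `classification_metrics` and the counts used are hypothetical illustrations chosen to give round numbers; they are not the thesis's code or its test-set results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and recall from confusion-matrix counts.

    tp: active molecules predicted active
    fp: inactive molecules predicted active
    fn: active molecules predicted inactive
    tn: inactive molecules predicted inactive
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of everything flagged active, how much really is active.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all truly active molecules, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for illustration only (NOT taken from the thesis).
metrics = classification_metrics(tp=93, fp=7, fn=7, tn=93)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts, 7 of 100 true inhibitors are missed (recall) and 7 of 100 predicted inhibitors are inactive (precision), which is the trade-off the abstract describes in terms of missed candidates and wasted laboratory tests.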
dc.identifier.uri: https://repository.iutoic-dhaka.edu/handle/123456789/2639
dc.language.iso: en
dc.publisher: Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh
dc.title: GNN and Transformer Fusion Learning for Molecular Classification of BACE1 Inhibitors
dc.type: Thesis

Files

Original bundle

Name: 18 Fulltext_CSE_GNN andTransformer Fusion Learning for Molecular Classification _200041101.pdf
Size: 2.39 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections