Bioinformatics Analysis of Differentially Expressed Gene's in Breast Cancer Using DESeq2

dc.contributor.author: Malick, Sow Bocar Amadou
dc.contributor.author: Conteh, Fatoumatta
dc.contributor.author: Sawo, Muhammed
dc.date.accessioned: 2023-04-28T06:50:17Z
dc.date.available: 2023-04-28T06:50:17Z
dc.date.issued: 2022-05-30
dc.description: Supervised by Mr. Tareque Mohmud Chowdhury, Asst. Professor, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022. [en_US]
dc.description.abstract: Differential gene expression analysis is a powerful tool for determining whether genes in two or more sample groups are expressed at significantly different levels. We use the DESeq2 software to estimate gene counts and identify differentially expressed genes (DEGs). When determining whether genes are differentially expressed, we must account for variation in the data: the purpose is to see whether the differences between groups are substantial for each gene, given the variation between biological replicates. Using normalized read count data (NRCD) and statistical analysis, DEG analysis finds quantitative differences in expression levels between experimental groups. For example, statistical testing is used to decide whether, for a given gene, an observed difference in read counts is significant, i.e., whether it is greater than what would be expected from natural random variation alone. The analysis requires gene expression values to be compared between sample group types, and the goal is to determine which genes are expressed at different levels between conditions. This approach has become widely used for effective genome-wide quantification of relative gene expression, and it is the method of choice for identifying differentially expressed genes between two or more biological conditions of interest. The primary challenges surrounding such differential expression (DE) analysis have been highlighted from the start, and several methodologies and tools have been proposed in the literature. One of the most difficult aspects of this study, as with any other statistical research, has been determining the probabilistic model that best fits the data, as well as the model's optimal parameter estimates. Another significant challenge was the need for data normalization in order to compare two biological conditions appropriately by identifying and removing potential technical and/or biological biases. Last but not least, several studies have emphasized the practical need to determine the ideal number of biological replicates per condition and the optimal library size. This thesis covers the use of the DESeq2 method and its tools for DE analysis. The resulting genes can offer biological insights into the processes affected by the conditions. [en_US]
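The abstract refers to normalizing read counts before comparing groups. The record itself does not show the thesis code, so the following is only a minimal sketch of the median-of-ratios idea that DESeq2 uses for its size factors, written in plain Python rather than R. The function names, the toy counts, and the pseudocount are illustrative assumptions, and the sketch does not reproduce DESeq2's negative binomial model or its significance tests.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Genes with a zero count in any sample are skipped when forming the
    per-gene reference (their geometric mean would be zero).
    """
    n_samples = len(next(iter(counts.values())))
    # Log of the geometric mean of each usable gene across all samples.
    log_geo_means = {
        gene: sum(math.log(c) for c in row) / n_samples
        for gene, row in counts.items()
        if all(c > 0 for c in row)
    }
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's reference value.
        log_ratios = [
            math.log(counts[gene][j]) - ref
            for gene, ref in log_geo_means.items()
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide every raw count by the size factor of its sample."""
    return {
        gene: [c / f for c, f in zip(row, factors)]
        for gene, row in counts.items()
    }


def log2_fold_change(norm_row, group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean normalized counts, group_b over group_a.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    mean_a = sum(norm_row[i] for i in group_a) / len(group_a)
    mean_b = sum(norm_row[i] for i in group_b) / len(group_b)
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [50, 100],
    "geneD": [0, 4],
}
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts, toy_factors)
```

In this toy case the difference between the two samples is purely a matter of sequencing depth, so after normalization genes A to C have matching values in both samples. Deciding whether a remaining difference is larger than expected from random variation is the statistical testing step, which DESeq2 handles and this sketch leaves out.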
dc.identifier.uri: http://hdl.handle.net/123456789/1866
dc.language.iso: en [en_US]
dc.publisher: Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh [en_US]
dc.subject: Bioinformatics, Differentially Expressed Genes, DESeq2, Breast Cancer [en_US]
dc.title: Bioinformatics Analysis of Differentially Expressed Gene's in Breast Cancer Using DESeq2 [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Malick_fulltext_thesis.pdf
Size: 2.72 MB
Format: Adobe Portable Document Format
Description: Full text of the Thesis

Name: Malick_30%_turnitin similarity.pdf
Size: 487.46 KB
Format: Adobe Portable Document Format
Description: Turnitin report_30% similarity

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
