
MSc Thesis Defense Announcement of Saiteja Danda: "Identification of Cell-types in scRNA-seq data via Enhanced Local Embedding and Clustering"

Thursday, April 22, 2021 - 13:30 to 15:30


The School of Computer Science is pleased to present… 

MSc Thesis Defense by: Saiteja Danda 

Date: Thursday, April 22nd, 2021 
Time:  1:30pm – 3:30pm 
Passcode: If interested in attending this event, contact the Graduate Secretary at csgradinfo@uwindsor.ca


Identifying relevant disease modules, such as target cell types, is a significant step in studying diseases and can consequently lead to better diagnosis, drug discovery, and prognosis. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Unsupervised learning techniques such as clustering are the most suitable approach for scRNA-seq data analysis when the cell types have not been characterized. These techniques can be used to identify groups of genes that belong to a specific cell type based on their similar expression patterns. However, because scRNA-seq data are sparse and high-dimensional, classical clustering methods are not efficient. Non-linear dimensionality reduction techniques are therefore crucial for improving clustering results. We introduce a pipeline that identifies representative clusters of different cell types by combining non-linear dimensionality reduction techniques, such as modified locally linear embedding (MLLE), with clustering algorithms. We assess the impact of different dimensionality reduction techniques combined with clustering on thirteen publicly available scRNA-seq datasets of different tissues, sizes, and technologies. We evaluate the intra- and inter-cluster performance based on the Silhouette score before performing a biological assessment. We further perform gene enrichment analysis across biological databases to evaluate the proposed method's performance. Our results show that MLLE combined with independent component analysis yields the best overall performance relative to the existing unsupervised methods across different experiments.
Keywords: non-linear dimensionality reduction, clustering, single-cell RNA sequencing 
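
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch chains MLLE, independent component analysis, clustering, and the Silhouette score using scikit-learn. It is not the thesis implementation: the ordering of MLLE before ICA, the choice of k-means as the clustering algorithm, all parameter values, the helper name `embed_and_cluster`, and the synthetic data standing in for an expression matrix are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import FastICA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.metrics import silhouette_score


def embed_and_cluster(X, n_clusters, n_neighbors=15,
                      n_mlle_components=5, n_ica_components=3, seed=0):
    """Hypothetical helper: MLLE -> ICA -> k-means, scored by Silhouette.

    X is a (cells x genes) matrix. Every default here is an assumed,
    illustrative value, not a setting reported in the thesis.
    """
    # Modified locally linear embedding (non-linear dimensionality reduction).
    # The "modified" method needs n_neighbors >= n_components.
    mlle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_mlle_components,
        method="modified",
        eigen_solver="dense",
        random_state=seed,
    )
    Z = mlle.fit_transform(X)

    # Independent component analysis applied to the MLLE embedding.
    ica = FastICA(n_components=n_ica_components, random_state=seed,
                  max_iter=1000)
    S = ica.fit_transform(Z)

    # Cluster in the reduced space (k-means is an assumed choice).
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(S)

    # Silhouette score lies in [-1, 1]; higher means tighter, better
    # separated clusters.
    return labels, silhouette_score(S, labels)


# Synthetic stand-in for a small expression matrix: 300 "cells", 50 "genes".
X, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)
labels, score = embed_and_cluster(X, n_clusters=3)
print(f"clusters found: {len(np.unique(labels))}, silhouette: {score:.3f}")
```

On real scRNA-seq data the input would typically be a filtered and normalized count matrix, and the number of clusters would usually not be known in advance, so comparing Silhouette scores across several candidate values is one common way to choose it.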

MSc Thesis Committee:  

Internal Reader: Dr. Ahmad Biniaz
External Reader: Dr. Phillip Karpowicz
Advisor: Dr. Luis Rueda
Chair: Dr. Hossein Fani



 5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 csgradinfo@uwindsor.ca