MSc Thesis Proposal by:
Lian Duan
Date: Friday, October 6th, 2023
Time: 11:00 AM - 12:00 PM
Location: Essex Hall Room 122
Abstract:
Graph neural networks (GNNs) have become a popular and powerful tool for node classification in complex networks, among other tasks. However, the traditional design of GNNs assumes homophily, where connected nodes have similar class labels and features. In real-world networks, connected nodes often have different class labels and dissimilar features, a scenario known as heterophily, which can degrade the performance of GNNs. To address this issue, recent studies have proposed design choices that enhance the representation power of GNNs under heterophily. These include higher-order neighborhoods, ego- and neighbor-embedding separation, and the combination of intermediate representations. It remains unclear, however, whether these approaches are effective on real-world datasets with high heterophily, especially in the single-cell RNA sequencing field. In this study, we designed a pipeline that processes single-cell RNA sequencing data of pancreatic cells from a healthy human donor in the Baron Human Pancreas dataset and feeds the data into multiple GNN models to predict cell types. Our early experiments show that H2GCN, which incorporates these design choices, outperforms the other GNN models evaluated on the Baron Human Pancreas dataset: GCN, GAT, GraphSAGE, and MixHop.
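To make the homophily/heterophily distinction concrete, the sketch below computes an edge homophily ratio: the fraction of edges whose endpoints share a class label. Values near 1 indicate homophily and values near 0 indicate heterophily. This is an illustration only; the abstract does not state which homophily measure the study uses, and the function name, the toy graph, and the cell-type labels are all hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a class label.

    edges  : list of (u, v) node pairs, one entry per undirected edge
    labels : mapping from node id to class label
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graph: four cells with made-up cell-type labels.
toy_labels = {0: "alpha", 1: "alpha", 2: "beta", 3: "beta"}
toy_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Two of the four edges join cells of the same type.
toy_ratio = edge_homophily(toy_edges, toy_labels)
print(toy_ratio)
```

In this toy graph half of the edges connect cells with different labels, so a model that simply averages neighbor features would mix signals from both classes.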
Thesis Committee:
Internal Reader: Dr. Jianguo Lu
External Reader: Dr. Brian DeVeale
Advisor: Dr. Luis Rueda