MSc Thesis Proposal: Deep Multi-View Network for Protein Sequence Classification by Jaber Al Siam

Wednesday, January 28, 2026 - 13:00

MSc Thesis Proposal by: Jaber Al Siam

Date: Wednesday, 28th January 2026

Time: 1:00 PM

Location: Essex Hall 122

 

Abstract:

Protein family classification is a fundamental problem in computational biology with applications in protein function annotation, structural inference, and drug discovery. While recent deep learning approaches have shown promising results, they often rely on single-view sequence representations, limiting their ability to capture complementary biological signals such as physicochemical properties, contextual semantics, and interaction information.

This thesis investigates a multi-view deep learning framework for protein sequence classification that integrates heterogeneous protein representations. The proposed approach combines four views: physicochemical property time series encoded as Gramian Angular Fields, frequency-based geometric representations of amino acid composition, contextual embeddings learned from protein language models, and graph-based embeddings derived from protein–protein interaction networks. Each representation is processed through a dedicated neural branch, and the resulting features are fused to enable joint learning across the biological views.
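To illustrate the first view, the following is a minimal sketch of a Gramian Angular Summation Field computed from a per-residue physicochemical profile. The abstract does not say which property or which GAF variant the thesis uses; the Kyte-Doolittle hydropathy scale, the summation variant, and the function names `hydropathy_series` and `gramian_angular_summation_field` are assumptions made for illustration only.

```python
import numpy as np

# Kyte-Doolittle hydropathy index per amino acid (assumed choice of property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_series(sequence):
    """Map a protein sequence to a 1-D series of hydropathy values."""
    return np.array([KYTE_DOOLITTLE[aa] for aa in sequence], dtype=float)


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a 2-D GASF image.

    The series is min-max rescaled to [-1, 1], each value is mapped to an
    angle phi = arccos(value), and entry (i, j) is cos(phi_i + phi_j).
    """
    lo, hi = series.min(), series.max()
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        scaled = np.zeros_like(series)
    else:
        scaled = 2.0 * (series - lo) / (hi - lo) - 1.0
    # Guard against floating-point drift outside the domain of arccos.
    scaled = np.clip(scaled, -1.0, 1.0)
    phi = np.arccos(scaled)
    return np.cos(phi[:, None] + phi[None, :])


# Toy sequence, not taken from any dataset.
gasf = gramian_angular_summation_field(hydropathy_series("MKTAYIAKQR"))
```

The result is a symmetric L x L matrix for a sequence of length L, which can be treated as a single-channel image by a convolutional branch. Real sequences vary in length, so a practical pipeline would also need resizing, padding, or piecewise aggregation, none of which is shown here.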

The objective of this work is to systematically analyze how different protein representations contribute to classification performance and robustness. The proposed framework will be evaluated on large-scale PDB-derived datasets using cross-validation and ablation studies to assess the contribution of each view.
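One common way to organize such an ablation study is to compare the full model against variants with a single view removed. The sketch below assumes leave-one-view-out ablation and fusion by simple feature concatenation; the abstract specifies neither, and the view names, feature dimensions, and the helpers `ablation_configs` and `fuse` are hypothetical.

```python
import numpy as np

# Hypothetical labels for the four views described in the abstract.
VIEWS = ("gaf", "frequency", "language_model", "ppi_graph")


def ablation_configs(views):
    """Return the full view set followed by each leave-one-view-out subset."""
    configs = [tuple(views)]
    configs += [tuple(v for v in views if v != dropped) for dropped in views]
    return configs


def fuse(features, active_views):
    """Fuse per-view feature matrices by concatenating along the feature axis."""
    return np.concatenate([features[v] for v in active_views], axis=1)


# Placeholder branch outputs for 8 proteins; dimensions are arbitrary.
rng = np.random.default_rng(0)
dims = {"gaf": 16, "frequency": 4, "language_model": 32, "ppi_graph": 8}
features = {view: rng.normal(size=(8, dim)) for view, dim in dims.items()}

configs = ablation_configs(VIEWS)
fused_shapes = {config: fuse(features, config).shape for config in configs}
```

In an actual experiment, each configuration would be trained and scored inside the same cross-validation folds so that differences in performance can be attributed to the removed view rather than to the data split.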

 

Thesis Committee:

Reader 1: Dr. Jessica Chen

Reader 2: Dr. Ikjot Saini

Advisor: Dr. Alioune Ngom