MSc Thesis Defense Announcement of Yakin Patel: "DTA+VAE: Drug Target Affinity prediction with SELFIES String via variational autoencoder and Transformer6 protein model"

Thursday, May 4, 2023 - 10:00 to 12:00


The School of Computer Science is pleased to present…

MSc Thesis Defense by: Yakin Patel

Date: Thursday May 04, 2023
Time:  10:00 am – 12:00 pm
Location: Essex Hall, Room 122
Reminders:
1. Two-part attendance mandatory (sign-in sheet, QR Code).
2. Arrive 5-10 minutes prior to event starting - LATECOMERS WILL NOT BE ADMITTED. Note that due to demand, if the room has reached capacity, admission is not guaranteed even if you are "early".
3. Please be respectful of the presenter by NOT knocking on the door for admittance once the door has been closed, whether the presentation has begun or not. If the room is at capacity, overflow (i.e. sitting on floors) is not permitted, as this is a violation of the Fire Safety code.
4. Be respectful of the decision of the advisor/host of the event if you are not given admittance.
The School of Computer Science has numerous events occurring soon.


A crucial step in drug discovery is identifying drug-target interactions. Over the years, many computational methods have been developed to determine whether a drug and a target will interact. Drug-target binding affinity goes further by predicting the strength of the binding interaction between the drug and the target, and so it captures information that drug-target interaction prediction leaves out. Many methods have been proposed to predict binding affinity, all of which use the SMILES representation. Learning accurate drug representations is essential for tasks such as computational drug repositioning, drug-target affinity prediction, drug-target interaction prediction, and drug repurposing. A drug can be represented in several ways in computational methods; one family of representations is string-based, and SMILES is among the most widely used. However, SMILES has a few limitations due to its complex grammar. In this work, we change the representation to SELFIES (SELF-referencIng Embedded Strings) to determine whether changing the drug representation improves the model. We developed a model (DTA+VAE) that predicts binding affinity using a variational autoencoder and a pre-trained protein model.
Keywords: Drug Target Binding Affinity, SELFIES, Variational Autoencoder, Generative Adversarial Network
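
As a rough illustration of the representation change described in the abstract, the sketch below shows one way a SELFIES string could be split into its bracketed symbols and one-hot encoded before being passed to an autoencoder. It is not the thesis implementation: the example strings, the helper names (split_symbols, build_vocab, one_hot) and the fixed length of 4 are assumptions made for illustration only. The "[nop]" padding symbol follows the convention of the open-source selfies package.

```python
import re

# Hand-written SELFIES strings, assumed for illustration (not thesis data).
# "[C][C][O]" corresponds to the SMILES string "CCO" (ethanol).
EXAMPLES = ["[C][C][O]", "[C][=C]", "[C][#N]"]

PAD = "[nop]"  # padding symbol, as used by the selfies package


def split_symbols(selfies_string):
    """Split a SELFIES string into its bracketed symbols."""
    symbols = re.findall(r"\[[^\[\]]+\]", selfies_string)
    # Reject input that contains anything outside well-formed brackets.
    if "".join(symbols) != selfies_string:
        raise ValueError(f"not a well-formed SELFIES string: {selfies_string!r}")
    return symbols


def build_vocab(strings):
    """Map each symbol to an integer index, with padding at index 0."""
    symbols = {s for text in strings for s in split_symbols(text)}
    return {sym: i for i, sym in enumerate([PAD] + sorted(symbols))}


def one_hot(selfies_string, vocab, max_len):
    """Encode a SELFIES string as a max_len x len(vocab) one-hot matrix."""
    symbols = split_symbols(selfies_string)
    if len(symbols) > max_len:
        raise ValueError("string is longer than max_len")
    symbols = symbols + [PAD] * (max_len - len(symbols))
    matrix = []
    for sym in symbols:
        row = [0] * len(vocab)
        row[vocab[sym]] = 1
        matrix.append(row)
    return matrix


vocab = build_vocab(EXAMPLES)
encoded = one_hot("[C][C][O]", vocab, max_len=4)
for row in encoded:
    print(row)
```

Because every SELFIES symbol is a self-contained bracketed token, the vocabulary can be built with a single regular expression. SMILES tokenization, by contrast, has to handle multi-character atoms, ring-closure digits and parentheses.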

MSc Thesis Committee:

Internal Reader: Dr. Sherif Saad
External Reader: Dr. Majid Ahmadi
Advisor: Dr. Alioune Ngom             
Chair: Dr. Fani Hossein

MSc Thesis Defense Announcement  VI, approved artificial intelligence topic