PhD Seminar Presentation Announcement by Ala Alam Falaki: "A Robust Approach to Fine-tune Pre-trained Transformer-based Models for Text Summarization through Latent Space Compression"

Thursday, May 11, 2023 - 13:00 to 14:00

SCHOOL OF COMPUTER SCIENCE

The School of Computer Science at the University of Windsor is pleased to present …

PhD Seminar by: Ala Alam Falaki

 
Date: Thursday, May 11th, 2023
Time: 1:00pm – 2:00pm
Location: Essex Hall, Room 122
 
Reminders:
1. Two-part attendance is mandatory (sign-in sheet and QR code).
2. Arrive 5-10 minutes before the event starts - LATECOMERS WILL NOT BE ADMITTED. Due to demand, admission is not guaranteed once the room has reached capacity, even if you arrive early.
3. Please be respectful of the presenter by NOT knocking on the door for admittance once the door has been closed, whether or not the presentation has begun. If the room is at capacity, overflow seating (i.e., sitting on the floor) is not permitted, as this violates the Fire Safety code.
4. Be respectful of the decision of the advisor/host of the event if you are not given admittance. The School of Computer Science has numerous events occurring soon.
 

Abstract:

We propose a technique to reduce the number of parameters in the decoder of a sequence-to-sequence (seq2seq) architecture for automatic text summarization. The approach uses an Autoencoder (AE), pre-trained on the encoder's output, to reduce its embedding dimension, which significantly shrinks the summarizer's decoder. Two experiments were performed to validate the idea: training a custom seq2seq architecture with various pre-trained encoders, and incorporating the approach into an encoder-decoder model (BART) for text summarization. Both studies showed promising results in terms of ROUGE score. The most notable outcome, however, is a 54% decrease in inference time and a 57% drop in GPU memory usage during fine-tuning, with minimal quality loss (4.5% in ROUGE-1 score). This significantly reduces the hardware requirements for fine-tuning large-scale pre-trained models. We also show that our approach can be combined with other network size reduction techniques (e.g., distillation) to further reduce the parameter count of any encoder-decoder model.
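For readers unfamiliar with the general idea, the following is a minimal, hypothetical PyTorch sketch of inserting an autoencoder bottleneck between a frozen pre-trained encoder and a smaller decoder. It is not the presenter's implementation; the class name, dimensions, and training loop are illustrative assumptions only.

```python
# Illustrative sketch (NOT the presenter's code): compress a pre-trained encoder's
# hidden states with a small autoencoder bottleneck so a narrower decoder can be used.
import torch
import torch.nn as nn

class BottleneckAE(nn.Module):
    """Maps encoder states (d_model) to a smaller latent size and back (assumed sizes)."""
    def __init__(self, d_model=768, d_latent=256):
        super().__init__()
        self.compress = nn.Linear(d_model, d_latent)
        self.reconstruct = nn.Linear(d_latent, d_model)

    def forward(self, h):
        z = self.compress(h)              # compressed states the decoder would attend to
        return z, self.reconstruct(z)     # reconstruction used for the AE training loss

# Hypothetical usage: the AE is first trained to reconstruct frozen encoder outputs;
# a summarization decoder sized to d_latent then consumes z instead of the full states.
encoder_states = torch.randn(2, 128, 768)     # (batch, seq_len, d_model) from a frozen encoder
ae = BottleneckAE()
z, reconstruction = ae(encoder_states)
recon_loss = nn.functional.mse_loss(reconstruction, encoder_states)
print(z.shape)  # torch.Size([2, 128, 256]): cross-attention now sees 256-dim states
```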
 
Keywords: Transformers, Summarization, Autoencoder (AE), Sequence-to-sequence (seq2seq), Compression
 


Doctoral Committee:

Internal Reader: Dr. Luis Rueda
Internal Reader: Dr. Dan Wu
External Reader: Dr. Jonathan Wu
Advisor(s): Dr. Robin Gras


[Logo: Vector Institute artificial intelligence approved topic]

 

5113 Lambton Tower, 401 Sunset Ave., Windsor, ON N9B 3P4 | (519) 253-3000 Ext. 3716 | csgradinfo@uwindsor.ca