The School of Computer Science would like to present…
Text Classification in Natural Language Processing
PhD Comprehensive Exam by: Ryan Bluteau
Date: Thursday, January 25, 2024
Time: 12:30 pm – 2:00 pm
Location: Essex Hall Room 122
Abstract:
The field of Natural Language Processing (NLP) has grown rapidly since the release of the Transformer and of recent large language models such as GPT-4 and ChatGPT. Within this field, we focus on the history of text classification and on current trends.
Up until 2017, text classification largely depended on the Recurrent Neural Network (RNN) to process text. RNNs resolved problems encountered when processing sequences in a neural network, allowing text inputs of arbitrary length. An RNN consumes tokens, or pieces of text, one at a time and optionally produces an output at each step; text classification often uses just a single output. The RNN captured some contextual dependencies through bidirectional connections, allowing the model to take both preceding and following tokens into account. Contextual information was later improved with the development of embeddings and attention. Embeddings represent tokens and encode relationships between a token and other relevant tokens; they can also be detached from the network and used in conjunction with other models (pre-trained embeddings). Attention is a mechanism that compares the embeddings of a text sequence in order to focus the model on relevant information: it scales each embedding according to its contextual relationship to the network's hidden states, bypassing an information bottleneck in the RNN.
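To make the attention idea concrete, the following minimal sketch implements scaled dot-product self-attention in Python/NumPy. It is a generic illustration of the mechanism rather than code from the presented work; the function name and toy inputs are assumptions for the example.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each value vector by how relevant its key is to each query."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V                                         # context-aware mixture of values

# Toy example: 3 token embeddings of dimension 4 attending to themselves.
np.random.seed(0)
X = np.random.randn(3, 4)
print(scaled_dot_product_attention(X, X, X).shape)             # (3, 4)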
In 2017, the Transformer was released, taking full advantage of embeddings and attention. The model uses a multi-headed self-attention mechanism to encode and decode text. The field grew with a wide array of pre-trained models, such as BERT and GPT, released in a variety of sizes, trending both larger and smaller. Large language models later took over, including GPT-4 and ChatGPT, which complete prompts using text generation. Classification could then take advantage of these pre-trained models and their stored knowledge.
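As an illustration of how classification can build on pre-trained models and their stored knowledge, the short sketch below uses the Hugging Face transformers library (assumed here for illustration; it is not referenced in the abstract) to classify a sentence with a pre-trained Transformer.

from transformers import pipeline

# Load a pre-trained Transformer fine-tuned for sentiment classification
# (the library picks a default checkpoint; any suitable model name also works).
classifier = pipeline("sentiment-analysis")

print(classifier("The seminar on text classification was excellent."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]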
Keywords: Natural Language Processing (NLP), Text Classification, Transformer
PhD Committee:
Internal Reader: Dr. Boubakeur Boufama
Internal Reader: Dr. Dan Wu
External Reader: Dr. Jonathan Wu
Advisor: Dr. Robin Gras