MSc Thesis Proposal by Tanvi Sandhu

Monday, March 11, 2024 - 13:00 to 14:30

The School of Computer Science is pleased to present…

Exploration of Word Embeddings with Graph-Based Context Adaptation for Enhanced Word Vectors

MSc Thesis Proposal by: Tanvi Sandhu

Date: Monday, 11 Mar 2024

Time: 1:00 pm – 2:30 pm

Location: Chrysler Hall South, Room CS53

 

Abstract:
In information storage, text plays a central role, which calls for streamlined and effective methods for fast retrieval. Among text representations, the vector form stands out for its efficiency, especially on large datasets. Placing words with similar meanings close to each other in the vector space improves system performance across Natural Language Processing (NLP) tasks. Previous methods, which primarily capture word context through neural language models, have fallen short of delivering high scores on word similarity problems. This work investigates the connection between representing words in vector form and the improved performance and accuracy observed in NLP tasks. It introduces a method that represents words as a graph, aiming to preserve their inherent relationships and to enhance semantic representation. Experimental deployment of this technique across diverse text corpora underscores its superiority over conventional word-embedding approaches. The findings not only contribute to the evolving landscape of semantic representation learning but also illuminate its implications for text classification tasks, especially in the context of dynamic embedding models.
 
Keywords: Graphs, Natural Language Processing, Word Embedding
 
Thesis Committee:
Internal Reader: Dr. Jianguo Lu, School of Computer Science
External Reader: Dr. Lori Buchanan, Dept. of Psychology
Advisor: Dr. Ziad Kobti, School of Computer Science
 