Wednesday, September 1, 2021 - 14:00 to 16:00
SCHOOL OF COMPUTER SCIENCE
The School of Computer Science is pleased to present...
MSc Thesis Defense by: Vinay Kiran Manjunath
Date: Wednesday, September 1, 2021
Time: 2:00 PM to 4:00 PM
Meeting URL: https://us06web.zoom.us/j/89023374443?from=addon
Passcode: If you are interested in attending this event, contact the Graduate Secretary at firstname.lastname@example.org with sufficient notice before the event to obtain the passcode.
Social media platforms have opened a window onto users' opinions and perceptions. Text remains the most popular means of communication on social media, despite the availability of other media (audio, video, and images). Twitter is one such microblogging platform, allowing people to express their thoughts within 280 characters per message. This freedom of expression makes it difficult to determine the polarity (positive, negative, or neutral) of tweets and posts. Given a corpus of microblog texts (e.g., "the new iPhone battery life is good, but camera quality is bad"), mining the aspects (e.g., battery life, camera quality) and opinions (e.g., good, bad) of the products discussed is challenging because of the vast volume of data being generated. Aspect-Based Opinion Mining (ABOM) combines aspect extraction and opinion mining, allowing an enterprise to analyze such data in detail automatically, saving time and money.
Systems such as Hate Crime Twitter Sentiment (HCTS) and Microblog Aspect Miner (MAM) have recently been proposed to perform ABOM on Twitter. These systems generally follow a four-step approach: obtaining microblog posts, identifying frequent nouns (candidate aspects), pruning the candidate aspects, and determining opinion polarity. However, they differ in how well they prune their candidate aspects. HCTS uses Apriori-based association rule mining to find the important aspects (single- and multi-word) of a given product. However, the Apriori-based approach generates a large number of candidate sequences, which yields redundant candidate aspects, and HCTS also fails to summarize the category of the aspects (e.g., camera, battery). MAM follows an approach similar to that of HCTS for finding the relevant aspects, but it further clusters the frequent nouns (aspects) to obtain the relevant ones. However, it does not identify multi-word aspects or the aspect category of a product.
This thesis proposes a system called Microblog Aspect Sequence Miner (MASM) as an extension of Microblog Aspect Miner (MAM), replacing the Apriori algorithm with a modified frequent sequential pattern mining algorithm. The system uses the power of sequential pattern mining for aspect extraction in ABOM. Because the sentiments of the tweets are unknown, we build our approach in an unsupervised manner. The input posts are first classified to separate tweets that contain an opinion (subjective) from those that do not (objective). We then extract part-of-speech tags for the explicit aspects to identify the frequent nouns. The novel frequent pattern mining framework (CM-SPAM) is applied to segment the single- and multi-word aspects, generating fewer sequences than previous approaches. This prior knowledge lets us apply a topic modeling framework (Latent Dirichlet Allocation) to summarize the most common aspects (aspect categories) and their sentiments for a product. The findings demonstrate that the MASM model shows promising performance in finding relevant aspects, with a reduction in average vector size (the cost of candidate/aspect generation) compared with MAM and HCTS on the Sanders Twitter corpus dataset. Experimental results with the evaluation metrics of execution time, precision, recall, and F-measure indicate that our approach achieves higher recall and precision than the existing systems.
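As a rough illustration of the frequent-noun and multi-word aspect step described in the abstract, the following Python sketch counts how many posts contain each contiguous noun sequence and keeps those that meet a minimum support. It is not CM-SPAM and not the MASM implementation: the function name `frequent_noun_sequences`, the `min_support` and `max_len` parameters, the restriction to contiguous noun runs, and the hand-assigned part-of-speech tags in the sample posts are all assumptions made for the example.

```python
from collections import Counter


def frequent_noun_sequences(posts, min_support=2, max_len=3):
    """Return contiguous noun sequences that occur in at least min_support posts.

    posts: list of posts, each a list of (token, pos_tag) pairs.
    This is a simplified stand-in for sequential pattern mining,
    not the CM-SPAM algorithm named in the abstract.
    """
    support = Counter()
    for tagged in posts:
        # Collect maximal runs of consecutive noun tokens (tags starting with "NN").
        runs, current = [], []
        for token, tag in tagged:
            if tag.startswith("NN"):
                current.append(token.lower())
            else:
                if current:
                    runs.append(current)
                current = []
        if current:
            runs.append(current)

        # Enumerate every contiguous subsequence of each run, up to max_len tokens.
        # A set ensures each sequence is counted at most once per post.
        seen = set()
        for run in runs:
            for i in range(len(run)):
                for j in range(i + 1, min(i + max_len, len(run)) + 1):
                    seen.add(tuple(run[i:j]))
        support.update(seen)

    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical, hand-tagged sample posts (tags are illustrative only).
posts = [
    [("the", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"), ("good", "JJ")],
    [("battery", "NN"), ("life", "NN"), ("is", "VBZ"), ("bad", "JJ"),
     ("but", "CC"), ("camera", "NN"), ("works", "VBZ")],
    [("camera", "NN"), ("quality", "NN"), ("is", "VBZ"), ("bad", "JJ")],
]

aspects = frequent_noun_sequences(posts, min_support=2)
for seq, count in sorted(aspects.items()):
    print(" ".join(seq), count)
```

With these sample posts, the multi-word candidate "battery life" survives the support threshold, while "camera quality" appears in only one post and is pruned. A real pipeline would obtain the tags from a part-of-speech tagger and use a full sequential pattern miner.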
Keywords: Twitter Sentiment Analysis; Aspect based opinion mining; Sequential pattern mining; Topic modeling
MSc Thesis Committee:
Internal Reader: Dr. Ahmad Biniaz
External Reader: Dr. Dennis Borisov
Advisor: Dr. Christie Ezeife
Chair: Dr. Boubakeur Boufama
MSc Thesis Defense Announcement
5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 email@example.com (working remotely)