MSc Thesis Proposal of Priyanka Motwani: "Discovering High Profit Product Feature Groups by mining High Utility Sequential Patterns from Feature-Based Opinions"

Tuesday, April 13, 2021 - 10:00 to 12:00

SCHOOL OF COMPUTER SCIENCE                                          

The School of Computer Science is pleased to present…

MSc Thesis Proposal by: Priyanka Anilkumar Motwani 
 
Date: Tuesday April 13th, 2021 
Time:  10:00 AM to 12:00 PM 
Passcode: If interested in attending the event, contact the Graduate Secretary at csgradinfo@uwindsor.ca
 

Abstract:  

In the age of big data, customer opinions available online on social media platforms like Amazon, Epinions, Twitter, and Facebook are used to examine consumer preferences and facilitate product redesigns. While buying a product from a social networking or e-commerce website, a customer might want other users’ or reviewers’ opinions on individual product features rather than the item's overall rating (stars). As a result of the growing popularity of online review sites and social media, Opinion Mining (OM) has emerged as a growing research area. Extracting high-profit product feature groups, such as {battery life, camera, design} of a smartphone, from the mined opinions will help retailers analyze and identify the properties of market products more effectively and understand the more preferred aspects of a particular product. The accuracy of opinion-feature extraction can be improved if more complex sequential patterns of customer reviews are learned and included in the user-behavior analysis to obtain relevant frequent feature groups.
 
Existing Opinion-Feature Extraction systems that use data mining techniques with some sequences include those referred to in this thesis as Rashid13OFExt, Rana18OFExt, and HPFG19_HU. The Rashid13OFExt system compares the accuracy of techniques such as Sequential Pattern Mining and Association Rule Mining in obtaining frequent product features and opinion words from customers’ opinions. Rana18OFExt uses Class Sequential Rules (CSR) to extract product features and opinions from free-format reviews. However, these systems only capture frequent aspects of the customers’ opinions and do not discover frequent high-profit features that take into account utility values (internal and external) such as cost, profit, quantity, or other user preferences. The HPFG19_HU system uses High Utility Itemset Mining and Aspect-Based Sentiment Analysis to extract High Utility Aspect (HUA) groups based on feature-opinion sets, and it works on transaction databases. HPFG19_HU aims to find high-profit aspect groups by considering the utility values (e.g., which aspects are more profitable to the seller) of the frequent patterns extracted from a set of opinion sentences, each of which corresponds to an itemset of aspects or features. However, it does not consider the order of occurrence (sequence) of product features in the customers' opinion sentences, which helps distinguish similar users and identify more relevant and related high-profit product features.
 
This thesis proposes a system called High-Profit Feature Sequential Groups based on High Utility Sequences (HPFSG_HUS), an extension of the HPFG19_HU system that replaces frequent high utility itemset patterns with frequent high utility sequential patterns. The system combines Feature-Based OM and High Utility Sequential Pattern Mining (HUSPM) to extract high-profit feature groups from product reviews or opinions mined from Amazon review datasets. The input to the proposed system is text obtained from a corpus of product reviews. Next, the features and opinions extracted from the text are formed into sequences and combined with utility values to form a sequence database. Lastly, USpan, a HUSPM algorithm, is applied to obtain high utility sequential patterns of product feature-opinion sets from the sequence database. The output is the set of high-profit product feature groups. We aim to improve the HPFG19_HU system by considering the order of occurrences in sequence databases to identify sequential patterns in the features extracted from opinions. This method increases the existing system's accuracy in extracting relevant frequent feature groups that increase the retailer’s sales profit and user satisfaction. Experiments performed on extracted patterns show that the proposed HPFSG_HUS system provides more accurate feature groups and higher revenue than the tested existing systems.
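
As a rough illustration of what a high utility sequential pattern over product features looks like, the sketch below scores ordered feature patterns on a toy sequence database. It is not the USpan algorithm and not the thesis implementation: it enumerates candidates by brute force, and the feature names, the profit weights in PROFIT, the review sequences in REVIEWS, and all function names are hypothetical. It assumes the common HUSPM convention that a pattern's utility in one sequence is the maximum over its order-preserving occurrences, and that its total utility is the sum over all sequences containing it.

```python
from itertools import combinations

# Hypothetical external utilities (profit weight per feature); not from the thesis.
PROFIT = {"battery": 5, "camera": 3, "design": 2}

# Hypothetical sequence database: each review is an ordered list of
# (feature, internal utility) pairs, e.g. an opinion-strength score per mention.
REVIEWS = [
    [("battery", 2), ("camera", 1), ("design", 1)],
    [("camera", 2), ("battery", 1)],
    [("battery", 1), ("design", 3), ("camera", 2)],
]


def pattern_utility_in_sequence(pattern, sequence):
    """Best utility over all order-preserving occurrences, or None if absent."""
    best = None
    for positions in combinations(range(len(sequence)), len(pattern)):
        if all(sequence[p][0] == feat for p, feat in zip(positions, pattern)):
            utility = sum(sequence[p][1] * PROFIT[sequence[p][0]] for p in positions)
            best = utility if best is None else max(best, utility)
    return best


def pattern_utility(pattern, database):
    """Sum of the per-sequence utilities over sequences that contain the pattern."""
    total = 0
    for sequence in database:
        utility = pattern_utility_in_sequence(pattern, sequence)
        if utility is not None:
            total += utility
    return total


def high_utility_patterns(database, min_utility):
    """Brute-force search: score every subsequence seen in the database."""
    candidates = set()
    for sequence in database:
        features = [feat for feat, _ in sequence]
        for k in range(1, len(features) + 1):
            for positions in combinations(range(len(features)), k):
                candidates.add(tuple(features[p] for p in positions))
    scored = {pattern: pattern_utility(pattern, database) for pattern in candidates}
    return {pattern: u for pattern, u in scored.items() if u >= min_utility}


if __name__ == "__main__":
    for pattern, utility in sorted(high_utility_patterns(REVIEWS, 20).items()):
        print(pattern, utility)
```

On this toy data, the ordered pattern ("battery", "camera") has utility 24, while the reverse order ("camera", "battery") has utility 11. An itemset-based approach would treat the two as the same group, which is the distinction the sequential formulation is meant to capture.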
 
KEYWORDS: Social network, Sentiment Classification, Opinion mining, High Utility Sequential Pattern Mining, Feature Extraction 
 
 

MSc Thesis Committee:  

Internal Reader: Dr. Hossein Fani              
External Reader: Dr. Mohamed Belalia 
Advisor: Dr. Christie Ezeife 
 

MSc Thesis Proposal Announcement

 
 5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 csgradinfo@uwindsor.ca