MSc Thesis Defense Announcement of Priyanka Anilkumar Motwani: "Discovering High Profit Product Feature Groups by mining High Utility Sequential Patterns from Feature-Based Opinions"

Tuesday, July 27, 2021 - 11:00 to 13:00


The School of Computer Science is pleased to present… 

MSc Thesis Defense by: Priyanka Anilkumar Motwani 

Date:  Tuesday July 27th, 2021 
Time:  11:00 AM to 1:00 PM 
Passcode: If interested in attending this event, contact the Graduate Secretary at 


In the age of big data, customer opinions available online on platforms such as Amazon, Epinions, Twitter, and Facebook are used to examine consumer preferences and to guide product redesigns. When buying a product from a social networking or e-commerce website, a customer may want other users’ or reviewers’ opinions on specific product features rather than only the item's overall rating (stars). As a result of the growing popularity of online review sites and social media, Opinion Mining (OM) has emerged as a growing research area. Grouping features together, instead of extracting single features from the mined opinions, helps retailers analyze the properties of products on the market more effectively and understand which features of a particular product customers prefer. Product feature groups extracted as a whole, such as {battery life, camera, design} for a smartphone, that yield higher profit to manufacturers and higher customer satisfaction can be called High-Profit Feature Groups. The accuracy of opinion-feature extraction can be improved if more complex sequential patterns in customer reviews are learned and included in the user-behavior analysis to obtain relevant frequent feature groups.
Existing opinion-feature extraction systems that use data mining techniques with some sequences include those referred to in this thesis as Rashid13OFExt, Rana18OFExt, and HPFG19_HU. The Rashid13OFExt system compares the accuracy of techniques such as Sequential Pattern Mining and Association Rule Mining for obtaining frequent product features and opinion words from customers’ opinions. Rana18OFExt uses Class Sequential Rules (CSR) to extract product features and opinions from free-format reviews. The HPFG19_HU system uses High Utility Itemset Mining and Aspect-Based Sentiment Analysis to extract High Utility Aspect (HUA) groups based on feature-opinion sets. It works on transaction databases of itemsets, where each opinion sentence corresponds to an itemset of aspects or features, and considers the utility values (e.g., how profitable a pattern is to the seller) of the frequent patterns extracted from those sentences. However, existing systems such as Rashid13OFExt and Rana18OFExt capture only frequent aspects from customers’ opinions and do not discover frequent high-profit features that take into account utility values (internal and external) such as cost, profit, quantity, or other user preferences. The HPFG19_HU system does not consider the order of occurrence (sequence) of product features in customers' opinion sentences, which helps distinguish similar users and identify more relevant and related high-profit product features.
This thesis proposes a system called High-Profit Sequential Feature Groups based on High Utility Sequences (HPSFG_HUS), an extension of the HPFG19_HU system that replaces frequent high utility itemset patterns with frequent high utility sequential patterns. The system combines Feature-Based OM and High Utility Sequential Pattern Mining (HUSPM) to extract high-profit feature groups from product reviews or opinions mined from Amazon review datasets. The input to the proposed system is text obtained from a corpus of product reviews. Next, sequences are formed from the text with its extracted features and opinions and are combined with utility values to form a sequence database. Lastly, USpan, a HUSPM algorithm, is applied to the sequence database to obtain high utility sequential patterns of product feature-opinion sets. The output is the set of high-profit product feature groups. The approach seeks to improve the HPFG19_HU system by considering the order of occurrences in sequence databases to identify sequential patterns in the features extracted from opinions. This method improves on the accuracy of existing systems in extracting relevant frequent feature groups. Retailer graphs of the extracted high-profit sequential feature groups show that the proposed HPSFG_HUS system yields more accurate high-profit feature groups and better sales profit and user satisfaction. Experimental evaluation of execution time, accuracy, and precision, together with a comparison of revenue, shows higher revenue than the tested existing systems.
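To make the notion of a high utility sequential pattern concrete, the following is a minimal sketch, not the thesis implementation and not USpan itself. It only evaluates the utility of one given candidate pattern over a small sequence database, whereas USpan searches the space of candidates efficiently. The feature names, quantities, profit weights, and threshold are hypothetical illustration values, and the utility convention assumed here (internal utility times external utility, taking the best-scoring occurrence in each sequence) is one common HUSPM convention.

```python
# Hypothetical external utilities (e.g., profit weight per product feature).
PROFIT = {"battery": 5, "camera": 3, "design": 2}

# Hypothetical sequence database: each review is an ordered list of sentences,
# and each sentence maps a feature to its internal utility (e.g., a quantity).
REVIEWS = [
    [{"battery": 2}, {"camera": 1, "design": 1}, {"camera": 3}],
    [{"camera": 2}, {"battery": 1}],
    [{"battery": 1, "design": 2}, {"camera": 1}],
]


def utility_in_sequence(pattern, sequence, profit):
    """Maximum utility of `pattern` over all its ordered occurrences in `sequence`.

    `pattern` is a list of feature sets; returns 0 if the pattern does not occur.
    """
    missing = float("-inf")
    # best[j] = best utility found so far for matching the first j pattern elements.
    best = [0] + [missing] * len(pattern)
    for sentence in sequence:
        # Go backwards so one sentence matches at most one pattern element.
        for j in range(len(pattern), 0, -1):
            wanted = pattern[j - 1]
            if best[j - 1] == missing:
                continue
            if all(feature in sentence for feature in wanted):
                gain = sum(sentence[f] * profit[f] for f in wanted)
                best[j] = max(best[j], best[j - 1] + gain)
    return 0 if best[-1] == missing else best[-1]


def pattern_utility(pattern, database, profit):
    """Total utility of `pattern`: sum of its per-sequence maximum utilities."""
    return sum(utility_in_sequence(pattern, seq, profit) for seq in database)


def is_high_utility(pattern, database, profit, min_utility):
    """A pattern is high utility if its total utility reaches the threshold."""
    return pattern_utility(pattern, database, profit) >= min_utility


if __name__ == "__main__":
    battery_then_camera = [{"battery"}, {"camera"}]
    camera_then_battery = [{"camera"}, {"battery"}]
    print(pattern_utility(battery_then_camera, REVIEWS, PROFIT))
    print(pattern_utility(camera_then_battery, REVIEWS, PROFIT))
```

In this toy data the same two features score differently depending on the order in which they appear, which is the information an itemset-based view of the reviews would discard.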
KEYWORDS: Sentiment Matching, Opinion Mining, High Utility Sequential Pattern Mining, Feature Extraction

MSc Thesis Committee:  

Internal Reader: Dr. Hossein Fani    
External Reader: Dr. Mohamed Belalia 
Advisor: Dr. Christie Ezeife 
Chair: Dr. Dima Alhadidi 


5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 (working remotely)