PhD Dissertation Proposal Announcement of Mahreen Nasir Butt: "Semantic Enhanced Sequential Recommendation for E-Commerce Products through Mining Customers' Historical Interactions and Products' Meta Data"

Friday, December 10, 2021 - 09:30 to 11:30


The School of Computer Science is pleased to present… 

PhD Dissertation Proposal by: Mahreen Nasir  

Date:  Friday, December 10th, 2021 
Time:  9:30 a.m. to 11:30 a.m.
Passcode: If interested in attending this event, contact the Graduate Secretary at with sufficient notice before the event to obtain the passcode    


E-commerce recommendation systems facilitate customers’ purchase decisions by recommending products or services of interest. Designing a recommender system targeted towards an individual customer’s needs is crucial for retailers to increase revenue and retain customers’ loyalty. Content-based (keyword-based) approaches generate recommendations based on the content (features) of the item. Because they use specific features only, they suffer from content overspecialization (lack of diversity in recommended products) and cannot find semantics (meaningful relationships between items). For example, “Diet Coke” and “Coca-Cola Cherry” are two different products but are semantically similar, as both are beverages. Collaborative Filtering (CF), another common recommendation technique, takes as input a user-item interaction matrix, which represents user-item interactions either explicitly (users’ ratings) or implicitly (users’ browsing or buying behavior), and outputs the top item recommendations for each target user by finding similarities among users or items. The input matrix suffers from (i) sparsity (few user-item interactions), (ii) cold start (an item cannot be recommended if no ratings exist for it), and (iii) lack of interpretability (e.g., if two movies are rated highly by different users, it does not show that these movies are semantically similar). Therefore, to create user profiles that better reflect user preferences and to interpret the meaning of items beyond their features, semantics can be learned through (i) external knowledge sources, such as taxonomies (IS-A hierarchy), dictionaries, and ontologies, or (ii) the distributional hypothesis, under which an item’s representation is learned by analyzing the context (neighborhood) in which it is used (e.g., Word2Vec, Prod2Vec). The idea is that items co-occurring in a context are likely to be similar to each other (e.g., items in a purchase sequence).
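The co-occurrence idea can be illustrated with a minimal sketch. It is not the method proposed in the thesis and it is not Word2Vec or Prod2Vec: it simply counts which items appear near each other in purchase sequences and compares the resulting count vectors with cosine similarity. The function names, the window size, and the toy purchase sequences are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sequences, window=2):
    """Map each item to a Counter of the items seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for seq in sequences:
        for i, item in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[item][seq[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical purchase sequences (assumed data, not from the thesis).
purchase_sequences = [
    ["chips", "diet_coke", "salsa"],
    ["chips", "cherry_coke", "salsa"],
    ["soap", "shampoo", "towel"],
]

vectors = cooccurrence_vectors(purchase_sequences)
sim_colas = cosine(vectors["diet_coke"], vectors["cherry_coke"])
sim_unrelated = cosine(vectors["diet_coke"], vectors["soap"])
print(sim_colas, sim_unrelated)
```

In this toy data the two colas never appear in the same basket, yet they share the same neighbors (chips and salsa), so their similarity is close to 1, while the cola and the soap share no context and score 0. Embedding models such as Word2Vec learn dense vectors from the same kind of contextual signal.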
Users’ interests and preferences change with time, so the timestamp of a user’s interaction (click or purchase event) is an important characteristic. Learning the sequential patterns of user interactions based on their timestamps is useful for understanding users’ long- and short-term preferences and for predicting the next items to recommend. Sequential Pattern Mining mines frequent or high-utility sequential patterns from a sequential database comprising historical purchase or click sequences. Conventional recommendation systems (ChoiRec12, SuChen15, SainiRec17, HPCRec18, HSPRec19) utilize mining techniques such as clustering and frequent and sequential pattern mining, along with click and purchase similarity measures, for next-item recommendation. However, the performance of these systems is still limited when the matrix is sparse, as the number of items keeps increasing rapidly. Additionally, models utilizing sequential pattern mining suffer from (i) lack of personalization: patterns are not targeted at a specific customer, as decisions are inferred from a global view of sequences, and (ii) lack of contextual similarity among recommended items: they can only recommend items that result from a matching rule generated from frequent sequential purchase pattern(s).
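As a rough illustration of frequent sequential pattern mining, the sketch below counts ordered item pairs across a small sequence database and keeps those that meet a minimum support. It is a deliberately simplified stand-in for full algorithms and is not taken from any of the systems cited above; the function name, the support threshold, and the sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(sequences, min_support):
    """Return ordered item pairs (a before b) whose support meets min_support.

    Support is the number of sequences that contain the pair in that order,
    not necessarily adjacently.
    """
    support = Counter()
    for seq in sequences:
        # combinations() preserves input order, so (a, b) means a occurs before b.
        # A set ensures each pair is counted at most once per sequence.
        pairs_in_seq = {(a, b) for a, b in combinations(seq, 2) if a != b}
        support.update(pairs_in_seq)
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical click/purchase sequences, each ordered by timestamp (assumed data).
sequence_db = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "phone", "charger"],
]

patterns = frequent_ordered_pairs(sequence_db, min_support=2)
print(patterns)
```

With this data, only "phone then charger" (support 3) and "case then charger" (support 2) survive the threshold. The patterns are the same for every customer, which is the lack of personalization noted above, and an item that appears in no frequent pattern can never be recommended.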
To better understand users’ preferences and to infer the inherent meaning of items beyond their features, this thesis explores the effectiveness of utilizing semantic relationships (associations) between various customer interactions (clicks, purchases, browsing and reviews). The semantics of these interactions will be obtained through the distributional hypothesis and then integrated into different phases of the recommendation process: (i) pre-processing, to learn associations between items; (ii) candidate generation, both while mining sequential patterns and in Collaborative Filtering, to select Top-N candidates that show semantic and sequential associations between items; and (iii) output (recommendation). Hence, the semantic associations between items learned from customers’ historical data can better represent users’ preferences, can address the issues of sparsity, cold start and content overspecialization, and can provide recommendations that are diverse, similar in context and better reflective of users’ long- and short-term interests.
Keywords: Recommendation Systems, Sequential Pattern Mining, Semantics, Clickstream data, Historical Purchases, RecSys, Collaborative Filtering, TF-IDF, Vector Space Model, Cold Start, Sparsity, E-commerce 

PhD Dissertation Committee:  

Internal Reader:  Dr. Boubaker Boufama 
Internal Reader:  Dr. Sherif Saad                
External Reader: Dr. Dennis Borisov (Department of Mathematics) 
Advisor:               Dr. Christie Ezeife 


5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 (working remotely)