Wednesday, September 16, 2020 - 14:30 to 16:30
SCHOOL OF COMPUTER SCIENCE
The School of Computer Science would like to present…
PhD Comprehensive Exam by: Mahreen Nasir
Date: Wednesday, September 16th, 2020
Time: 2:30 p.m. to 4:30 p.m.
Zoom URL: https://zoom.us/j/94581959070?
Recommendation Systems (RS) facilitate customers’ purchase decisions by recommending products or services of interest. Designing a recommendation system targeted towards an individual customer’s needs is crucial for retailers to increase revenue and retain customer loyalty. Content-based (keyword-based) approaches generate recommendations based on the content (features) of the item. Because they rely on specific features only, they suffer from content overspecialization (lack of diversity in recommended products) and cannot capture semantics (meaningful relationships between items). For example, “Diet Coke” and “Coca-Cola Cherry” are two different products, but they are semantically similar because both are beverages. Collaborative Filtering (CF), another common recommendation technique, takes as input a user-item interaction matrix, which represents user-item interactions either explicitly (users’ ratings) or implicitly (users’ browsing or buying behavior), and outputs the top item recommendations for each target user by finding similarities among users or items. The input matrix suffers from (i) sparsity (few user-item interactions), (ii) cold start (an item cannot be recommended if no ratings exist for it), and (iii) lack of interpretability (e.g., if two movies are rated highly by different users, this does not show that the movies are semantically similar). Therefore, to create user profiles that better reflect user preferences and to interpret the meaning of items beyond their features, semantics can be learned through (i) external knowledge sources, such as taxonomies (IS-A hierarchies), dictionaries, and ontologies, or (ii) the distributional hypothesis, under which an item’s representation is learned by analyzing the context (neighborhood) in which it is used (Word2Vec, Prod2Vec). The idea is that items co-occurring in a context (e.g., items in a purchase sequence) are likely to be similar to each other.
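As a rough illustration of the distributional idea described above, the following minimal Python sketch builds co-occurrence vectors from toy purchase sequences and compares items by cosine similarity. It is not the method proposed in the thesis and it is not Word2Vec or Prod2Vec; the item names, the `window` parameter, and the helper functions `cooccurrence_vectors` and `cosine` are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import sqrt

def cooccurrence_vectors(sequences, window=2):
    """Count, for each item, which items appear within `window` positions of it."""
    vectors = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, item in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[item][seq[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical purchase sequences (toy data, not from the thesis).
sequences = [
    ["chips", "diet_coke", "salsa"],
    ["chips", "cherry_coke", "salsa"],
    ["bread", "butter", "jam"],
]

vectors = cooccurrence_vectors(sequences)

# The two colas never co-occur, yet they share the same context (chips, salsa),
# so their similarity is close to 1; the cola and butter share no context.
print(cosine(vectors["diet_coke"], vectors["cherry_coke"]))
print(cosine(vectors["diet_coke"], vectors["butter"]))
```

In this toy data the two beverages are never bought together, so a purely feature-based or co-purchase-based comparison would not relate them; the shared context is what links them.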
Users’ interests and preferences change with time, so the timestamp of a user’s interaction (click or purchase event) is an important characteristic. Learning the sequential patterns of user interactions based on their timestamps is useful for understanding users’ long- and short-term preferences and for predicting the next items to recommend. Sequential Pattern Mining mines frequent or high-utility sequential patterns from a sequential database comprising historical purchase or click sequences. Conventional recommendation systems (ChoiRec12, SuChen15, SainiRec17, HPCRec18, HSPRec19) utilize mining techniques such as clustering and frequent and sequential pattern mining, along with click and purchase similarity measures, for next-item recommendation. However, the performance of these systems is still limited when the user-item matrix is sparse, as the number of items keeps increasing rapidly. Additionally, models utilizing sequential pattern mining suffer from (i) lack of personalization: patterns are not targeted at a specific customer, as decisions are inferred from a global view of sequences, and (ii) lack of contextual similarity among recommended items: they can only recommend items that appear in a matching rule generated from frequent sequential purchase pattern(s).
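To make the two limitations above concrete, here is a deliberately simplified Python sketch of pattern-based next-item recommendation. It mines only frequent ordered item pairs rather than full sequential patterns, and it is not the algorithm used by any of the systems cited above; the function names, the `min_support` threshold, and the purchase data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_ordered_pairs(sequences, min_support=2):
    """Return ordered pairs (a, b), a bought before b, found in >= min_support sequences."""
    support = Counter()
    for seq in sequences:
        # combinations() keeps positional order, so (a, b) means a precedes b.
        # The set ensures each sequence contributes at most once per pair.
        pairs = {(a, b) for a, b in combinations(seq, 2) if a != b}
        support.update(pairs)
    return {pair: count for pair, count in support.items() if count >= min_support}

def recommend_next(history, patterns, top_n=3):
    """Score items that follow something in `history` according to the mined pairs."""
    scores = Counter()
    for (a, b), count in patterns.items():
        if a in history and b not in history:
            scores[b] += count
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical historical purchase sequences (toy data).
purchases = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]

patterns = frequent_ordered_pairs(purchases, min_support=2)
print(recommend_next(["phone"], patterns))
print(recommend_next(["laptop"], patterns))
```

Every customer whose history contains `phone` receives the same list, since the patterns reflect a global view of all sequences, and a customer who bought `laptop` receives nothing because no frequent pattern matches, even though related items exist in the data.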
To better understand users’ preferences and to infer the inherent meaning of items beyond their features, this thesis explores the effectiveness of utilizing semantic associations between various customer interactions (clicks, purchases, browsing and reviews). The semantics of these interactions will be obtained through the distributional hypothesis and then integrated into different phases of the recommendation process: (i) pre-processing, to learn associations between items, (ii) candidate generation, both while mining sequential patterns and in Collaborative Filtering, to select Top-N candidates that show semantic and sequential associations between items, and (iii) output (recommendation). The expectation is that the learnt semantic associations between items, extracted from customers’ historical data, can better represent users’ preferences, address the issues of sparsity, cold start and content overspecialization, and provide recommendations that are diverse, similar in context and better reflect users’ long- and short-term interests.
Keywords: Recommendation Systems, Sequential Pattern Mining, Semantics, Clickstream data, Historical Purchases, RecSys, Collaborative Filtering, TF-IDF, Vector Space Model, Cold Start, Sparsity, E-commerce
Internal Reader: Dr. Boubaker Boufama
Internal Reader: Dr. Sherif Saad
External Reader: Dr. Eugene Kim
(Faculty/department) Department of Physics
Advisor: Dr. Christie Ezeife
PhD Comprehensive Exam Announcement
5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 email@example.com