MSc Thesis Defense Announcement of Shanu Kumar: "Inference-Based Personalized Recommendation Via Uncertainty-aware Dual Actor-Critic Using Reinforcement Learning"

Tuesday, March 21, 2023 - 13:30 to 15:00


The School of Computer Science is pleased to present…

MSc Thesis Defense by: Shanu Kumar

Date: Tuesday, March 21, 2023
Time: 1:30pm – 3:00pm
Location: Essex Hall, Room 122
Reminders: 1. Two-part attendance is mandatory (sign-in sheet, QR code).
2. Arrive 5-10 minutes before the event starts. LATECOMERS WILL NOT BE ADMITTED. Note that due to demand, admission is not guaranteed once the room has reached capacity, even if you arrive early.
3. Please be respectful of the presenter by NOT knocking on the door for admittance once it has been closed, whether or not the presentation has begun. If the room is at capacity, overflow (i.e., sitting on floors) is not permitted, as this violates the Fire Safety code.
4. Be respectful of the decision of the advisor/host of the event if you are not admitted. The School of Computer Science has numerous events occurring in the near future.


Item ranking is the core of an efficient, personalized recommendation system: it drives the system's overall performance and directly affects the consumer's experience. Models that rely on explicit feedback to capture consumer satisfaction usually ignore non-rated interactions. Some consumed items never receive ratings, so user satisfaction with them remains unknown, and an implicit system that maximizes interaction is typically used in such cases. We aim to extend an explicit system, via inference, to suggest items to these less expressive users so that their satisfaction is treated as a pertinent element of the recommendation. Our goal is to use the interaction data of such users to suggest items that could positively affect them. In this work, the model is trained on explicit data obtained in the same environment from other users. We use reinforcement learning to obtain the model because recommendation can be viewed as a sequential decision-making task whose aim is to maximize long-term cumulative reward by making efficient transitions. Twin-actor twin delayed deep deterministic policy gradient is the underlying framework. Our approach treats uncertainty as a determining element, which is a significant feature of this work: every user's behavior is different and uncertain, and this may be reflected in the data. This can induce uncertainty in the model, as the data may be insufficient and the model itself may be unable to capture all the patterns. The final policy, which we refer to as uncertainty-aware dual actor-critic, is acquired via policy aggregation, which is theoretically motivated by deep ensembles in reinforcement learning with multiple deep deterministic policy gradients. The results of numerous experiments on various benchmark datasets show that our aggregated policy-based approach enhances recommendation performance by improving the generalization capability of the agent.
Keywords: Inference, Deep reinforcement learning, Actor-critic, Recommendation systems, Ensemble, Uncertainty.

MSc Thesis Committee: 

Internal Reader: Dr. Xiaobu Yuan
External Reader: Dr. Ning Zhang
Advisor: Dr. Luis Rueda
Chair: Dr. Kalyani Selvarajah

Vector Institute, artificial intelligence approved topic logo


5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716