MSc Thesis Proposal Announcement of Shanu Kumar: "Enhanced Action Selection Using Actor’s Ensemble in Reinforcement Learning Based Recommendation Systems"

Monday, August 15, 2022 - 09:00 to 11:00


The School of Computer Science is pleased to present… 

MSc Thesis Proposal by: Shanu Kumar 

Date: Monday, August 15th, 2022
Time: 9:00 AM – 11:00 AM
Passcode: If you are interested in attending this event, contact the Graduate Secretary with sufficient notice before the event to obtain the passcode.


Recommendation systems (RS) improve customers’ experience by recommending the products or services most relevant to them. To boost sales and keep consumers loyal and satisfied, companies must build recommender systems tailored to each customer's needs. Reinforcement learning (RL) techniques make it possible to capture the sequential nature of the recommendation task more effectively. Another advantage of RL-based techniques is that they can handle long-term user engagement: rather than prioritizing only the immediate next item, they also estimate the future rewards of the items expected to be recommended afterwards. Additionally, an RL-based model can be trained to handle multiple objectives such as accuracy, diversity, and novelty. RL has improved enormously, progressing from multi-armed bandits to DQN and DDQN. However, actor-critic (AC) methods, which are frequently used in RL-based recommendation systems, commonly rely on Deep Deterministic Policy Gradient (DDPG), which has certain limitations. While searching for the global maximum, DDPG may get trapped in a local maximum, which often results in a sub-optimal policy. Another issue is that a single policy network may not be efficient at searching the entire item space and selecting the item that maximizes the critic.
In this proposal, we aim to use an ensemble of actors to maximize the model’s performance by selecting the action that satisfies a predefined condition. In recent years, the integration of ensemble models into RL-based game playing has opened new avenues for experimentation and for finding optimal policies that can significantly affect agents' decision making. Preliminary results with the actor ensemble show not only improved performance but also a more sample-efficient policy.
Keywords: Reinforcement learning, Recommendation, Ensemble model, Actor-critic, DDPG 
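The idea of selecting an action from an ensemble of actors can be illustrated with a minimal sketch. The announcement does not state the selection condition or the network architectures, so everything here is an assumption made for illustration: the condition is taken to be "highest critic score", the actors are random linear maps with a tanh output, the critic is a random linear scorer, and all names and dimensions (`propose`, `q_value`, `select_action`, `STATE_DIM`, and so on) are hypothetical. It is not the method proposed in the thesis.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen only for illustration.
STATE_DIM, ACTION_DIM, N_ACTORS = 8, 4, 5


def make_matrix(rows, cols):
    """Random weight matrix standing in for a trained actor network."""
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumption: each actor is a linear map followed by tanh.
actors = [make_matrix(STATE_DIM, ACTION_DIM) for _ in range(N_ACTORS)]
# Assumption: the critic is a linear scorer over the concatenated (state, action).
critic_w = [random.gauss(0.0, 1.0) for _ in range(STATE_DIM + ACTION_DIM)]


def propose(weights, state):
    """One actor's proposed action (a continuous vector) for the given state."""
    return [
        math.tanh(sum(s * weights[i][j] for i, s in enumerate(state)))
        for j in range(ACTION_DIM)
    ]


def q_value(state, action):
    """Critic's estimate of the value of taking `action` in `state`."""
    return sum(w * x for w, x in zip(critic_w, state + action))


def select_action(state):
    """Ask every actor for a candidate and keep the one the critic scores highest."""
    candidates = [propose(w, state) for w in actors]
    scores = [q_value(state, a) for a in candidates]
    best = max(range(N_ACTORS), key=scores.__getitem__)
    return candidates[best], scores


state = [random.gauss(0.0, 1.0) for _ in range(STATE_DIM)]
action, scores = select_action(state)
print("chosen action:", [round(x, 3) for x in action])
```

Because each actor starts its search from a different point, the critic compares several candidates rather than one, which is one way an ensemble might reduce the risk of a single policy settling on a local maximum.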

MSc Thesis Committee:  

Internal Reader: Dr. Xiaobu Yuan             
External Reader: Dr. Ning Zhang               
Advisor: Dr. Luis Rueda 



5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716