Monday, August 15, 2022 - 09:00 to 11:00
SCHOOL OF COMPUTER SCIENCE
The School of Computer Science is pleased to present…
MSc Thesis Proposal by: Shanu Kumar
Date: Monday, August 15, 2022
Time: 9:00 AM – 11:00 AM
Meeting URL: https://us06web.zoom.us/j/84053685073?from=addon
Passcode: If interested in attending this event, contact the Graduate Secretary at email@example.com with sufficient notice before the event to obtain the passcode.
Recommendation systems (RS) improve customers' experience by recommending the products or services most relevant to them. To boost sales and keep consumers loyal and satisfied, companies must build recommender systems tailored to each customer's needs. Reinforcement learning (RL) techniques make it possible to capture the sequential nature of the recommendation task more effectively. Another advantage of RL-based techniques is that they can handle long-term user engagement: rather than prioritizing only the immediate next item, they also estimate the future rewards of the items expected to be recommended after it. Additionally, an RL-based model can be trained to handle multiple objectives, such as accuracy, diversity, and novelty. From multi-armed bandits to DQN and DDQN, RL has improved enormously. However, actor-critic (AC) methods, which are frequently used in RL-based recommendation systems, commonly rely on Deep Deterministic Policy Gradient (DDPG), which has certain limitations. For example, while searching for the global maximum, the DDPG algorithm may get trapped in a local maximum, which often results in a sub-optimal policy. Another issue is that a single policy network may not be efficient at searching the entire item space and then selecting the item that maximizes the critic.
In this proposal, we aim to use an ensemble of actors to maximize the model's performance by selecting the action that satisfies a predefined condition. In recent years, the integration of ensemble models into RL-based games has opened new avenues for experimentation and for finding optimal policies that can substantially affect agents' decision making. Preliminary results with the ensemble of actors show not only improved performance but also a more sample-efficient policy.
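To illustrate the ensemble-of-actors idea, here is a minimal sketch in Python. It is not the method from the proposal, which the abstract does not specify. It assumes the "predefined condition" is "the candidate action with the highest critic value". The linear actors, the linear critic, and all names and dimensions (`actor_weights`, `critic_w`, `select_action`, `STATE_DIM`, and so on) are hypothetical stand-ins for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
STATE_DIM, ACTION_DIM, N_ACTORS = 8, 4, 3

# Stand-ins for trained networks: each actor is a random linear map,
# and the critic is a random linear function of (state, action).
actor_weights = [rng.normal(size=(ACTION_DIM, STATE_DIM)) for _ in range(N_ACTORS)]
critic_w = rng.normal(size=STATE_DIM + ACTION_DIM)


def actor(weights, state):
    """Deterministic policy: map a state to a continuous action in [-1, 1]."""
    return np.tanh(weights @ state)


def critic(state, action):
    """Estimate Q(state, action) as a scalar."""
    return float(critic_w @ np.concatenate([state, action]))


def select_action(state):
    """Let every actor propose an action and keep the one the critic rates highest.

    Assumption: "highest critic value" plays the role of the predefined
    selection condition mentioned in the abstract.
    """
    candidates = [actor(w, state) for w in actor_weights]
    q_values = [critic(state, a) for a in candidates]
    best = int(np.argmax(q_values))
    return best, candidates[best], q_values


state = rng.normal(size=STATE_DIM)
best, action, q_values = select_action(state)
print("chosen actor:", best)
print("critic values:", q_values)
```

Because each actor proposes a different candidate, the critic compares several points in the action space instead of one. This is one plausible way an ensemble could reduce the risk of a single policy settling on a poor local maximum.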
Keywords: Reinforcement learning, Recommendation, Ensemble model, Actor-critic, DDPG
MSc Thesis Committee:
Internal Reader: Dr. Xiaobu Yuan
External Reader: Dr. Ning Zhang
Advisor: Dr. Luis Rueda
MSc Thesis Proposal Announcement
5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 firstname.lastname@example.org