MSc Thesis Proposal Announcement of Shanu Kumar: "Enhanced Action Selection Using Actor’s Ensemble in Reinforcement Learning Based Recommendation Systems"

Monday, August 15, 2022 - 09:00 to 11:00

SCHOOL OF COMPUTER SCIENCE

The School of Computer Science is pleased to present…

MSc Thesis Proposal by: Shanu Kumar

Date: Monday, August 15, 2022

Time: 9:00 AM – 11:00 AM

Meeting URL: https://us06web.zoom.us/j/84053685073?from=addon

Passcode: If interested in attending this event, contact the Graduate Secretary at csgradinfo@uwindsor.ca with sufficient notice before the event to obtain the passcode.

Abstract:

Recommendation Systems (RS) improve customers' experience by recommending the products or services most relevant to them. To boost sales and keep consumers loyal and satisfied, companies must build a recommender system that is tailored to each customer's needs. Reinforcement learning (RL) techniques make it possible to capture the sequential nature of the recommendation task more effectively. Another advantage of RL-based techniques in recommendation systems is that they can handle long-term user engagement: they not only prioritize the immediate next item but also estimate the future rewards for items that are expected to be recommended after it. Additionally, an RL-based model can be trained to handle multiple objectives such as accuracy, diversity, and novelty. From multi-armed bandits to DQN and DDQN, RL has improved enormously. However, the actor-critic (AC) technique commonly used in RL-based recommendation systems uses Deep Deterministic Policy Gradient (DDPG) internally, which has certain limitations. While searching for the global maximum, the DDPG algorithm might get trapped in a local maximum, which often results in a sub-optimal policy. Another issue is that a single policy network may not be efficient at searching the entire item space and then selecting the item that maximizes the critic.

In this proposal, we aim to use an ensemble of actors to maximize the model's performance by selecting the action that satisfies a predefined condition. In recent years, the integration of ensemble models into RL-based games has opened new avenues for experimentation and for finding optimal policies that could substantially affect agents' decision making. Preliminary results with the ensembled actors show not only improved performance but also a more sample-efficient policy.
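
For readers unfamiliar with the idea, the following is a minimal illustrative sketch of ensemble-based action selection, not the method of the proposal itself. It assumes the "predefined condition" is "highest critic value", which the abstract does not specify. The actors and the critic are toy stand-ins, and all names (`make_linear_actor`, `toy_critic`, `select_action`) are hypothetical.

```python
import random


def make_linear_actor(weights):
    """Return a toy deterministic actor mapping a state to an action vector.

    Hypothetical stand-in for a trained policy network.
    """
    def actor(state):
        return [sum(w * s for w, s in zip(row, state)) for row in weights]
    return actor


def toy_critic(state, action):
    """Hypothetical critic Q(s, a): higher when the action is closer to the state."""
    return -sum((a - s) ** 2 for a, s in zip(action, state))


def select_action(actors, critic, state):
    """Let every actor propose an action and keep the one the critic scores highest.

    Assumed selection condition: maximum critic value.
    """
    proposals = [actor(state) for actor in actors]
    scores = [critic(state, action) for action in proposals]
    best = max(range(len(actors)), key=scores.__getitem__)
    return best, proposals[best], scores[best]


rng = random.Random(0)
dim = 3
# Four actors with different random weights play the role of the ensemble.
actors = [
    make_linear_actor([[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)])
    for _ in range(4)
]
state = [0.5, -0.2, 0.1]
best_index, best_action, best_score = select_action(actors, toy_critic, state)
print(f"actor {best_index} selected, critic score {best_score:.3f}")
```

Because each actor proposes a different candidate, the critic chooses among several starting points rather than relying on a single policy network, which is the intuition behind using an ensemble to reduce the risk of settling on a local maximum.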

Keywords: Reinforcement learning, Recommendation, Ensemble model, Actor-critic, DDPG

MSc Thesis Committee:

Internal Reader: Dr. Xiaobu Yuan

External Reader: Dr. Ning Zhang

Advisor: Dr. Luis Rueda


5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 csgradinfo@uwindsor.ca