MSc Thesis Proposal by Yogeswar Lakshmi Narayanan: "Matches Made in Heaven or Somewhere: Personalized Query Refinement Gold Standard Generation Using Transformers"

Thursday, June 8, 2023 - 11:00 to 12:00

SCHOOL OF COMPUTER SCIENCE

The School of Computer Science is pleased to present…

MSc Thesis Proposal by: Yogeswar Lakshmi Narayanan

 
Date: Thursday June 8th, 2023
Time: 11:00 AM – 12:00 PM
Location: Essex Hall, Room 122
 
Reminders:
1. Two-part attendance is mandatory (sign-in sheet, QR code).
2. Arrive 5-10 minutes before the event starts. LATECOMERS WILL NOT BE ADMITTED. Note that, due to demand, admission is not guaranteed once the room has reached capacity, even if you arrive early.
3. Please be respectful of the presenter by NOT knocking on the door for admittance once the door has been closed, whether or not the presentation has begun. If the room is at capacity, overflow (i.e., sitting on floors) is not permitted, as this is a violation of the Fire Safety code.
4. Be respectful of the decision of the advisor/host of the event if you are not given admittance.
The School of Computer Science has numerous events occurring soon.
 

Abstract:

Search engines, the foremost means of information retrieval, have difficulty searching knowledge repositories such as the web because they are not tailored to users' differing information needs. User queries are, more often than not, under-specified or contain ambiguous terms, and so also retrieve irrelevant documents. Query refinement is the process of transforming a user's query into a refined version, without semantic drift, to enhance the relevance of search results. Prior query refiners have been benchmarked on ad-hoc web retrieval datasets under the weak assumption that users' input queries improve gradually within a search session. In this paper, we contribute RePair, an open-source configurable toolkit for generating large-scale gold standard benchmark datasets from a variety of domains for the task of query refinement. RePair takes a dataset of queries and their relevance judgements (e.g., msmarco or aol), a sparse or dense information retrieval method (e.g., bm25 or colbert), and an evaluation metric (e.g., map), and outputs refined versions of the queries, each guaranteed to improve relevance under the retrieval method in terms of the evaluation metric. RePair leverages the text-to-text transfer transformer (t5) to generate gold standard datasets for any input query set and is designed with extensibility in mind. Out of the box, we have generated and publicly shared gold standard datasets for aol and msmarco.passage, benchmarked these datasets with state-of-the-art supervised query suggestion models, and explored t5 as an alternative model for query suggestion. RePair's codebase is publicly available at https://github.com/fani-lab/RePair
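The selection step described in the abstract (keep only refined queries that score better than the original under a given retriever and metric) can be illustrated with a minimal sketch. This is not RePair's code or API: the function names `average_precision`, `overlap_retriever`, and `select_refinements` are hypothetical, a toy term-overlap ranker stands in for bm25, and the candidate refinements are hard-coded where the toolkit would obtain them from t5.

```python
from typing import Callable, Dict, List, Set, Tuple


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked list against the relevant document ids."""
    if not relevant_ids:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def overlap_retriever(corpus: Dict[str, str]) -> Callable[[str], List[str]]:
    """Toy sparse retriever (assumption: a stand-in for bm25).

    Ranks documents by the number of terms shared with the query,
    breaking ties by document id; documents with no overlap are dropped.
    """
    def retrieve(query: str) -> List[str]:
        terms = set(query.lower().split())
        scores = {
            doc_id: len(terms & set(text.lower().split()))
            for doc_id, text in corpus.items()
        }
        ranked = sorted(scores, key=lambda d: (-scores[d], d))
        return [d for d in ranked if scores[d] > 0]
    return retrieve


def select_refinements(
    query: str,
    candidates: List[str],
    relevant_ids: Set[str],
    retrieve: Callable[[str], List[str]],
    metric: Callable[[List[str], Set[str]], float] = average_precision,
) -> List[Tuple[str, float]]:
    """Keep candidates whose metric score strictly exceeds the original query's."""
    baseline = metric(retrieve(query), relevant_ids)
    scored = [(c, metric(retrieve(c), relevant_ids)) for c in candidates]
    better = [(c, s) for c, s in scored if s > baseline]
    return sorted(better, key=lambda cs: -cs[1])


# Hypothetical three-document corpus and relevance judgement for the query "python".
corpus = {
    "d1": "python snake habitat and diet",
    "d2": "python programming language tutorial",
    "d3": "java programming language tutorial",
}
retrieve = overlap_retriever(corpus)
candidates = ["python programming tutorial", "python snake", "python"]
refined = select_refinements("python", candidates, {"d2"}, retrieve)
print(refined)
```

In this toy setting only the candidate that moves the relevant document to the top of the ranking survives the filter; candidates that merely tie with the original query are discarded, which mirrors the "relevance improvement" condition stated in the abstract.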
 
Keywords: Query Refinement, Query Suggestion, Information Retrieval, Transformers
 


MSc Thesis Committee:

Internal Reader: Dr. Jianguo Lu      
External Reader: Dr. Mohammad Hassanzadeh
Advisor: Dr. Hossein Fani



 

5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 csgradinfo@uwindsor.ca