Friday, August 19, 2022 - 11:00 to 13:00
SCHOOL OF COMPUTER SCIENCE
The School of Computer Science is pleased to present…
MSc Thesis Proposal by: Surajsinh Prakashchandra Parmar
Date: Friday, August 19, 2022
Meeting URL: https://us06web.zoom.us/j/89316803831?from=addon
Passcode: If interested in attending this event, contact the Graduate Secretary at email@example.com with sufficient notice before the event to obtain the passcode.
Hedge funds aim to achieve higher returns in the stock market. A typical hedge fund buys high-quality data at a very high price and has 100-200 employees working in isolated teams, which leads to duplicated research because of secrecy. The data cannot be shared publicly, because the fund could lose its edge in the market. Numerai devised an innovative way of encrypting high-quality stock market data without losing its predictive structure, so that the data can be shared publicly. This allows anyone to load the data and make predictions on it. If 200 employees at a hedge fund can generate innovative signals, one can only imagine what happens when Numerai combines the predictions of 10,000 data scientists worldwide. Submitted predictions can influence how Numerai allocates its capital in the global stock market. The provided time-series data is cleaned and regularized, with millions of samples and 1191 features; the dataset grew from 310 features in V2 to 1191 in V4. The task is to predict the probability that a sample gives positive returns. The non-stationary nature of the features makes this a very challenging problem. Participants also want their predictions to differ from what other participants submit. Thus, a data scientist would like to learn from the features without being overexposed to any single feature, while producing predictions that are correlated with the given target and less correlated with other participants' predictions. The task can be treated as a supervised regression problem. Tree-based models have been shown to perform well on supervised tabular data tasks, whereas neural networks with the Transformer architecture have shown their potential to excel at vision and NLP tasks. This thesis explores incorporating state-of-the-art deep learning methods, both unsupervised and supervised, to achieve better results. It also explores the impact of various tabular data generation methods.
Keywords: Finance, Stocks, Big data, Machine learning, Deep learning, Generative Modelling
MSc Thesis Committee:
Internal Reader: Dr. Alioune Ngom
External Reader: Dr. Gurupdesh Pandher
Advisor: Dr. Imran Ahmad
5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 firstname.lastname@example.org