The School of Computer Science is pleased to present…
Mining for Product Recommendation on Document-Based NoSQL Big Data
PhD Dissertation Proposal by: Abdulrauf Aremu Gidado
Date: Friday, November 24, 2023
Time: 10:00 AM – 12:00 PM
Location: Essex Hall, Room 105
To date, the majority of large corporations, such as Amazon and Facebook, still keep their core solutions (e.g., payments) on relational databases and use non-relational big data (i.e., NoSQL) database management systems only for non-core systems (e.g., shopping carts) that favor availability and scalability through partitioning while trading off consistency. NoSQL systems are designed around the CAP (Consistency, Availability and Partition tolerance) theorem, which states that a distributed data store can guarantee at most two of these three properties at once. The need for system availability and scalability drives the use of NoSQL models, while the lack of the consistency and robust query engines obtainable in relational databases impedes their usage. To mitigate these drawbacks, researchers and companies like Amazon, Google and Facebook developed ‘SQL over NoSQL’ systems such as Amazon’s Dynamo, Google’s Spanner, Facebook’s Memcache, Zidian2019, Apache Hive and SparkSQL. These systems create a query engine layer over NoSQL systems but suffer from data redundancy due to the lack of normalized database relations, and they lack the consistency obtainable in relational databases. Also, their query engines are not relationally complete, because they cannot process all of the relational algebra-based queries that a relational database can.
This thesis presents a ‘NoSQL over SQL’ system, the inverse of existing approaches such as Zidian2019, which transform data into a key-value format and then build an SQL query engine layer on the NoSQL data. This approach is motivated by (i) the need for existing systems to fully deploy NoSQL data store functionalities without the limitation of building an extra SQL layer for querying, and (ii) the ability to integrate image similarities into the e-commerce mining process by taking advantage of the ease of storing and retrieving images as text in document-oriented NoSQL databases.
To allow appropriate storage and retrieval of data on document-based NoSQL databases without data redundancy and inconsistency, while encouraging both horizontal and vertical partitioning, this work proposes a NoSQL over SQL Block as a Value (BaaV) data storage strategy...
Additionally, we vectorize item images and integrate item-item image similarity scores into the e-commerce customer historical purchase database to enhance sequential pattern recommendation, using a proposed Image Enhanced Historical Sequential Pattern Recommendation (iHSPRec) system. To improve the accuracy of pattern mining on NoSQL databases for recommendation, and to allow existing corporations with large relational databases to take advantage of NoSQL databases, this thesis (i) proposes a Block as a Value (BaaV) framework for extracting data and mapping it from a relational schema into NoSQL to enable faster data retrieval for existing large relational databases, (ii) integrates item-item image similarity scores into customers’ purchase histories for enhanced sequential pattern recommendation, using item images stored in a document-based NoSQL database, and (iii) proposes a sequential pattern mining technique for the NoSQL BaaV document-oriented database. Using existing ‘SQL over NoSQL’ benchmark systems, relational databases and real-life datasets for our experiments, we demonstrated that our NoSQL over SQL system outperforms existing relational databases and SQL over NoSQL systems, and that it is novel in ensuring data consistency, scalability and query execution while improving data storage and retrieval in large database systems without data loss and improving performance on NoSQL databases.
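As a rough illustration of the item-item image similarity idea described above, the sketch below scores items by the cosine similarity of their image vectors and attaches visually similar items to a customer's purchase sequence. It is not the iHSPRec method itself: the vectors, item names, the 0.9 threshold and the helper functions are all hypothetical, and cosine similarity is assumed here only because it is a common choice for comparing image vectors.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical image vectors; a real system would derive them from item images.
image_vectors = {
    "laptop": [0.9, 0.1, 0.0],
    "laptop_bag": [0.8, 0.2, 0.1],
    "mug": [0.0, 0.1, 0.9],
}

def similar_items(item, vectors, threshold=0.9):
    """Return the other items whose image similarity to `item` meets the threshold."""
    return sorted(
        other
        for other, vec in vectors.items()
        if other != item and cosine_similarity(vectors[item], vec) >= threshold
    )

def enrich_sequence(purchases, vectors, threshold=0.9):
    """Build one document-style record per purchase, listing visually similar items."""
    return [
        {"item": item, "similar_by_image": similar_items(item, vectors, threshold)}
        for item in purchases
    ]

if __name__ == "__main__":
    for record in enrich_sequence(["laptop", "mug"], image_vectors):
        print(record)
```

Each enriched record is a plain dictionary, so it could be stored as-is in a document-oriented database and fed to a sequential pattern miner as an extended purchase history.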
Keywords: Big Data, Document-Oriented NoSQL Databases, Recommendation Systems, Sequential Pattern Mining, Block as a Value, E-Commerce.
Internal Reader: Dr. Alioune Ngom
Internal Reader: Dr. Curtis Bright
External Reader: Dr. Christian Trudeau
Advisor: Dr. Christie Ezeife