MSc Thesis Defense Announcement by William Briguglio: "Machine Learning Interpretability in Malware Detection"

Tuesday, March 17, 2020 - 13:00 to 14:00

SCHOOL OF COMPUTER SCIENCE

The School of Computer Science is pleased to present…

 

MSc Thesis Defense by: William Briguglio

 
Date: Tuesday March 17, 2020
 
Time:  1:00pm – 2:00pm
 
Location: LT 3015
 
 

Abstract:

 
The ever-increasing processing power of modern computers, together with the growing availability of large and complex data sets, has driven an explosion in machine learning research. This, in turn, has produced increasingly complex machine learning algorithms, such as Convolutional Neural Networks, applied to increasingly complex problems, such as malware detection.
 
Recently, malware authors have become increasingly successful at bypassing traditional malware detection methods, partly due to advanced evasion techniques such as obfuscation and server-side polymorphism. Further, new programming paradigms such as fileless malware, that is, malware that exists only in the main memory (RAM) of the infected host, add to the challenges of modern-day malware detection. This has led security specialists to turn to machine learning to augment their malware detection systems. However, with this new technology come new challenges. One of these challenges is the need for interpretability in machine learning.
 
Machine learning interpretability is the process of explaining a machine learning model's predictions to humans. Rather than trying to understand everything the model has learnt, the aim is to find intuitive explanations that are simple enough to grasp yet provide relevant information for downstream tasks. Cybersecurity analysts prefer interpretable solutions because they need to fine-tune them: if malware analysts cannot interpret the reason behind a misclassification, they will not accept a non-interpretable or "black box" detector.
 
In this thesis, we provide an overview of machine learning and discuss its role in cybersecurity, the challenges it faces, and potential improvements to current approaches in the literature. We demonstrate its necessity in the face of new computing paradigms by implementing a proof-of-concept fileless malware in JavaScript. We then present techniques for interpreting machine-learning-based detectors that leverage n-gram analysis, and put forward a novel, fully interpretable approach to malware detection using convolutional neural networks. We also define a novel approach for evaluating the robustness of a machine-learning-based detector.
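For readers unfamiliar with interpretability over n-gram features, the following is a minimal illustrative sketch, not the thesis's actual pipeline, data, or model: it assumes scikit-learn, toy opcode sequences, and a simple logistic regression whose per-n-gram contributions (coefficient times count) explain a single prediction.

    # Illustrative sketch only: explain an n-gram-based detector's prediction
    # by ranking each n-gram's contribution (coefficient x count).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Hypothetical opcode sequences rendered as space-separated tokens (toy data).
    samples = ["mov push call jmp", "xor xor nop ret", "push call call jmp", "nop ret mov add"]
    labels = [1, 0, 1, 0]  # 1 = malicious, 0 = benign (toy labels)

    # Extract 2-gram features from the token sequences.
    vectorizer = CountVectorizer(ngram_range=(2, 2), token_pattern=r"\S+")
    X = vectorizer.fit_transform(samples)
    model = LogisticRegression().fit(X, labels)

    # Explain one prediction: contribution of each present n-gram = weight * count.
    x = vectorizer.transform(["push call jmp ret"])
    contributions = x.multiply(model.coef_).toarray()[0]
    ranked = sorted(zip(vectorizer.get_feature_names_out(), contributions),
                    key=lambda pair: abs(pair[1]), reverse=True)
    for ngram, score in ranked[:5]:
        if score != 0.0:
            print(f"{ngram!r} contributed {score:+.3f} toward the malicious class")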
 
Key Words: Machine Learning, Interpretability, Malware Detection, Neural Networks
 
 

Thesis Committee:

 
Internal Reader: Dr. Ngom         
 
External Reader: Dr. Mirhassani
 
Advisor: Dr. Sherif Saad Ahmed
 
Chair: Dr. Samet
 
 


 

5113 Lambton Tower 401 Sunset Ave. Windsor ON, N9B 3P4 (519) 253-3000 Ext. 3716 csgradinfo@uwindsor.ca