Representation Learning for Anomaly Detection in Multimodal Data
PhD Dissertation Proposal by: Atefeh Gilvari
Date: Thursday, January 29, 2026
Time: 11:00 AM
Location: EH Room 122
Abstract:
This study investigates representation learning for anomaly detection in multimodal data, with a focus on industrial inspection scenarios in electric vehicle (EV) manufacturing. The work progresses systematically from classical anomaly detection theory to modern few-shot and zero-shot learning paradigms. Early phases establish a structured taxonomy of anomaly types and detection models, followed by the development of FSOME++, a few-shot framework that improves anomaly discrimination through abnormal sample augmentation. The research further investigates zero-shot anomaly detection using pretrained vision–language models (VLMs), where defects are recognized without model training by using samples and textual prompts to derive anchor representations, together with a new engineering synthesis for deployment. Building on these foundations, the proposed dissertation will introduce a novel framework for adaptive inference in frozen VLMs, in which lightweight expert representations are selectively weighted to capture domain-specific visual and textual features while the encoders themselves remain frozen, with no retraining or fine-tuning. In addition, the research will evaluate the practicality of the proposed methods through on-device deployment, assessing real-time inference performance across embedded camera systems. Overall, this dissertation aims to span the full pipeline, from theoretical anomaly modeling and representation learning to adaptive inference and real-world deployment, for reliable and efficient industrial anomaly detection.
Thesis Committee:
Internal Reader: Dr. Boubakeur Boufama
Internal Reader: Dr. Arunita Jaekel
External Reader: Dr. Ning Zhang
Special Member: Dr. Rajeev Verma
Advisor: Dr. Ziad Kobti
Advisor: Dr. Narayan C. Kar