Comprehensive Security Assessment of LLM Guardrails
MSc Thesis Proposal by:
Ahmed Abdalmagid
Date: 12-5-2026
Time: 11 AM
Location: 122 Essex Hall
Abstract:
Large language models (LLMs) have been widely adopted across many domains, including critical applications where failures or adversarial attacks could be catastrophic. Businesses therefore need mechanisms that block inputs and outputs that do not comply with their policies. Two kinds of defenses have been proposed to meet this need: internal guardrails built in through post-training, and external guardrails. However, the vulnerability landscape of external guardrail models remains poorly understood. In this research, we investigate vulnerabilities in external guardrail systems by treating them as policy enforcement mechanisms and mapping the well-established vulnerability landscape of network firewalls onto them, since both share the same goal and philosophy. We aim to conduct a systematic empirical and comparative analysis that maps known vulnerabilities and security threats of network firewalls to external guardrails. This work is expected to provide researchers and practitioners with new methods for the security analysis of guardrails and to help them understand the potential vulnerabilities and threats these systems face.
Keywords: Guardrails, LLMs
Thesis Committee:
Internal Reader: Dr. Alioune Ngom
Internal Reader: Dr. Jianguo Lu
Advisor: Dr. Sherif Saad
