The School of Computer Science would like to present…
Training Vision Transformers for Image Classification Tasks
PhD Comprehensive Exam by: Chris Khalil
Date: Wednesday, November 22, 2023
Time: 10:00 am
Location: Essex Hall, Room 105
Abstract: In recent years, Vision Transformers (ViTs) have emerged as a strong alternative to Convolutional Neural Networks (CNNs), which long dominated image classification. This comprehensive exam examines how Vision Transformers are trained for image classification, covering the fundamental principles, key methodologies, and recent developments in the field. Because ViTs adapt the Transformer architecture from natural language processing to computer vision, this research area offers a fresh perspective on image analysis. Through an examination of ViT architecture, pre-training strategies, fine-tuning procedures, and data augmentation techniques, the exam aims to provide a thorough understanding of how to use Vision Transformers effectively for image classification. It will also discuss the potential advantages, limitations, and directions for future work on ViTs, with the goal of improving image recognition systems.
Keywords: Vision Transformers (ViTs), Image Classification, Pre-training, Fine-tuning, Computer Vision