The School of Computer Science is pleased to present…
Date: Monday, May 12, 2025
Time: 1:00 pm
Location: Lambton Tower, Room 3105
Group Activity Recognition (GAR) plays an important role in video surveillance, sports analytics, and human-computer interaction by identifying and classifying interactions among multiple individuals in video sequences. Unlike single-person action recognition, GAR must capture complex spatiotemporal dependencies and group dynamics, which makes it considerably more challenging. Recent advances in deep learning, particularly Graph Neural Networks (GNNs) and Transformer-based architectures, have improved GAR by capturing hierarchical relationships and enhancing interaction modelling. However, challenges such as occlusion, high computational costs, limited dataset diversity, and weak contextual modelling persist, requiring more robust and scalable solutions.
In this study, a multiscale Transformer-based deep learning model was trained in two stages using a keypoint-only modality to enhance relational reasoning in group activity recognition. Evaluated on the Volleyball dataset, the model achieved 94.6% group-level classification accuracy and 79.0% person-level action accuracy, outperforming prior state-of-the-art keypoint-only GAR methods by up to +1.5% at the group level and +2.0% at the person level. These results demonstrate that a lightweight, privacy-conscious architecture can effectively model complex group dynamics while maintaining computational efficiency.
Internal Reader: Dr. Muhammad Asaduzzaman
External Reader: Dr. Leo Oriet
Advisor: Dr. Saeed Samet
Chair: Dr. Curtis Bright