GPU Performance Metrics for CNNs: Profiling and Optimization (1st Offering)

Monday, November 24, 2025 - 14:00

School of Computer Science – JLR Challenge #3 Technical Workshop

Presenter: Farzaneh Kazemzadeh

Date: Monday, November 24th, 2025

Time: 2:00 PM

Location: 4th Floor - 300 Ouellette Ave., School of Computer Science, Advanced Computing Hub

Abstract:

Efficiently deploying Convolutional Neural Networks (CNNs) on GPUs requires not only implementation knowledge but also a solid understanding of performance profiling. This workshop focuses on the GPU-level performance metrics that determine the speed and scalability of CNN models. Participants will learn about key profiling indicators such as kernel execution time, GPU occupancy, and memory throughput, as well as how tensor shapes affect runtime. The session also explores how NVIDIA Nsight and Triton profiling tools can be used to interpret GPU utilization and identify bottlenecks. By the end of the workshop, attendees will have a clear understanding of how to analyze CNN performance beyond model accuracy, in line with real-world practices in CUDA and Triton optimization.

Workshop Outline:

- Overview of CNN computation on GPUs
- GPU performance indicators: execution time, SM usage, memory throughput
- Introduction to CUDA profiling and Nsight tools
- Understanding tensor dimensions and batch-size effects
- Comparing kernel fusion and optimization techniques in CUDA and Triton
- Discussion of how to interpret profiling results and of strategies for improving performance

Prerequisites:

- Basic understanding of CNN architecture and GPU computing
- Familiarity with CUDA programming concepts

Biography:

Farzaneh Kazemzadeh is a PhD student in Computer Science at the University of Windsor. Her research focuses on trustworthy AI, particularly on privacy-preserving machine learning, with applications in genomics and social networks. Her current work explores memorization and privacy risks in large language models.

Registration Link (only MAC students need to pre-register)