Project 2 · Jun 2025 – Nov 2025

GPU VDI Benchmark Automation

Technical Support Engineer @ PuzzleSystems · Python, PyTorch, TensorFlow, VMware vSphere, PowerShell

The Problem

A client (SK E&S) needed to validate the performance of VMware's virtualized GPU (vGPU) environment before launching a subscription-based GPU rental service. Three questions drove the work: How does GPU performance degrade under concurrent VM workloads? What is the maximum number of VDI sessions that can run GPU-intensive tasks simultaneously? How do resource utilization patterns change over extended stress periods? Manual benchmarking was time-consuming and produced inconsistent results.

The Solution

I built an automated GPU benchmarking pipeline:

Stress-test scripts using PyTorch and TensorFlow that simulate realistic GPU workloads (matrix operations, model training, inference) inside VMware vGPU environments
Automated data collection capturing GPU utilization, memory, temperature, and throughput at configurable intervals
Statistical analysis and reporting that processes accumulated data and generates incremental reports with trend analysis
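
The collection and analysis steps above can be illustrated with a minimal sketch. This is not the original pipeline, which is confidential. The function names (parse_sample, read_gpu_sample, collect, summarize) are hypothetical, and the sketch assumes the guest VM exposes nvidia-smi. The query fields are real nvidia-smi fields, but some of them (temperature in particular) may report "[N/A]" inside a vGPU guest, so unparseable values are stored as None and skipped in the summary.

```python
import statistics
import subprocess
import time
from typing import Callable, Dict, List, Optional

# Real nvidia-smi query fields. Whether each one is populated inside a
# vGPU guest depends on the driver and profile (assumption: verify per host).
QUERY_FIELDS = ["utilization.gpu", "memory.used", "temperature.gpu"]

Sample = Dict[str, Optional[float]]


def parse_sample(csv_line: str) -> Sample:
    """Parse one 'csv,noheader,nounits' line from nvidia-smi into a dict."""
    values = [v.strip() for v in csv_line.strip().split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    sample: Sample = {}
    for name, value in zip(QUERY_FIELDS, values):
        try:
            sample[name] = float(value)
        except ValueError:
            # e.g. "[N/A]" when the guest cannot read this metric
            sample[name] = None
    return sample


def read_gpu_sample() -> Sample:
    """Query the first GPU once via nvidia-smi (hypothetical helper)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def collect(reader: Callable[[], Sample], samples: int, interval_s: float,
            sleep: Callable[[float], None] = time.sleep) -> List[Sample]:
    """Take `samples` readings, pausing `interval_s` seconds between them."""
    rows: List[Sample] = []
    for i in range(samples):
        rows.append(reader())
        if i < samples - 1:
            sleep(interval_s)
    return rows


def summarize(rows: List[Sample]) -> Dict[str, Dict[str, float]]:
    """Compute min, mean, and max per metric, ignoring missing readings."""
    summary: Dict[str, Dict[str, float]] = {}
    for name in QUERY_FIELDS:
        values = [r[name] for r in rows if r.get(name) is not None]
        if values:
            summary[name] = {
                "min": min(values),
                "mean": statistics.fmean(values),
                "max": max(values),
            }
    return summary
```

On a machine with an NVIDIA driver, summarize(collect(read_gpu_sample, samples=60, interval_s=5.0)) would sample for about five minutes and return per-metric statistics. Passing the reader and sleep functions as parameters keeps the sampling loop testable without a GPU.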

Results

Provided the quantitative data SK E&S used to launch their GPU rental service
Turned a multi-day manual test cycle into an unattended overnight run
Delivered reproducible results across different vGPU configurations

Source code remains confidential to my prior employer.