Benchmarking Storage with AI Workloads

Abstract

Modern data centers invariably face performance challenges due to the rising volume of datasets and the complexity of deep learning workloads. Considerable research and development has been devoted to understanding AI/ML workloads. These workloads are not only computationally intensive but also require vast amounts of data to train models and draw inferences. The impact of storage on AI/ML pipelines therefore merits additional study. This work addresses the following research questions: (1) whether AI/ML workloads benefit from high-performance storage systems, and (2) whether such storage can be showcased through realistic, vision-based training and inference workloads. The study evaluated the following storage-intensifying approaches: limiting system memory, performing data ingestion and training simultaneously, running parallelized training workloads, and performing inference on streaming data. Simultaneous data ingestion and training, along with inference, are observed to be the most storage-intensive and are therefore recommended as ways to showcase storage. To support the analyses, we discuss the system resources taxed by AI workloads. Additionally, the work presents I/O analyses for these approaches in terms of access locality, I/O sizes, read/write ratio, and file offset patterns. The I/O traces of inference indicated a remarkably diverse distribution of random write request sizes. The PM9A3 SSD could support such a challenging workload, generating 25x the I/O with only a 3.4% overhead on inference time and 3x higher throughput compared to the MLPerf inference implementation.
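To make the "simultaneous data ingestion and training" idea concrete, the sketch below overlaps a writer thread (ingesting new samples onto storage) with a reader loop standing in for training-time data loading, so the device sees mixed read/write traffic. This is only an illustrative sketch, not the study's actual harness: the paths, sample sizes, and thread counts are assumptions, and a real experiment would drive an actual training framework and capture device-level I/O traces.

```python
"""Minimal sketch of overlapping data ingestion with a read-heavy
'training' phase. All names, paths, and sizes are illustrative."""
import os
import threading
import time

DATA_DIR = "/tmp/ai_dataset"      # hypothetical staging directory
SAMPLE_SIZE = 4 * 1024 * 1024     # assumed 4 MiB per sample
NUM_SAMPLES = 64

def ingest(num_samples: int) -> None:
    """Writer thread: stream new samples onto storage while 'training' runs."""
    os.makedirs(DATA_DIR, exist_ok=True)
    payload = os.urandom(SAMPLE_SIZE)
    for i in range(num_samples):
        path = os.path.join(DATA_DIR, f"sample_{i:05d}.bin")
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the write to reach the device
        os.replace(tmp, path)     # publish the sample atomically

def train(num_samples: int) -> None:
    """Reader loop: re-read samples, standing in for training data loading."""
    read_bytes = 0
    for i in range(num_samples):
        path = os.path.join(DATA_DIR, f"sample_{i:05d}.bin")
        while not os.path.exists(path):  # wait for the ingestion thread
            time.sleep(0.01)
        with open(path, "rb") as f:
            read_bytes += len(f.read())
    print(f"'trained' on {read_bytes / 2**20:.0f} MiB")

if __name__ == "__main__":
    writer = threading.Thread(target=ingest, args=(NUM_SAMPLES,))
    reader = threading.Thread(target=train, args=(NUM_SAMPLES,))
    start = time.time()
    writer.start(); reader.start()
    writer.join(); reader.join()
    print(f"mixed read/write phase took {time.time() - start:.2f} s")
```

While such a script runs, block-level tracing tools (e.g., blktrace or iostat) can be used to collect the access-locality, request-size, and read/write-ratio statistics of the kind reported in this work.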

Devasena Inupakutika
Samsung Semiconductor Inc.