Leveraging Computational Storage for Simulation Science Storage System Design

Abstract

High-performance computing data centers that support large-scale simulation applications routinely generate large amounts of data. To minimize time-to-result, this data must be promptly absorbed, processed, and potentially even multidimensionally indexed so that it can be retrieved efficiently when scientists need it for insights. Many recently deployed systems have moved from HDDs to all-flash storage in their hot storage tiers to boost raw storage bandwidth, yet bottlenecks remain due to legacy software, severe server CPU and memory bandwidth limitations for certain data-intensive operations, and excessive data movement. Computational storage, with its ability to map and distribute storage functions to various computing units along the data processing path, offers opportunities to overcome these bottlenecks and substantially improve performance and cost. In this talk, we will discuss various computational storage efforts carried out at Los Alamos National Laboratory in collaboration with partners including Aeon Computing, Eideticom, NVIDIA, SK hynix, and Seagate. We will explore topics such as transparent ZFS I/O pipeline offloads, analytics acceleration with flash-based key-value storage devices, and in-drive SQL-like query processing in an erasure-coded data lake tier. We will conclude by discussing lessons learned, next steps, and the need for an open, standards-based approach to computational storage, in the form of an object storage system, to ease development, adoption, and innovation.

Gary Grider
Los Alamos National Laboratory