LANL’s Journey Toward Computational Storage

Abstract

LANL simulations can generate a petabyte of data per time step, with thousands to tens of thousands of time steps, so data gravity is a major concern at the lab. Performing analytics on this data to find and understand interesting features in simulation output is extremely expensive: it requires moving petabytes of data and a data analytics platform with enough memory, not storage, to hold a full time step (a petabyte). Often the features of interest occupy far less data than the total output; when tracking the front of a shock wave moving through a material, for example, only the region at the very front of the shock wave matters. Sometimes the feature of interest is captured in several orders of magnitude less data than the overall simulation output. We have therefore been keenly interested in computational storage techniques that reduce the data as close to the storage as possible, shrinking both the amount of data that has to be moved and the size of the analytics platform by many times, if not orders of magnitude. LANL and its many partners have explored many avenues, including row-, column-, object-, file-, and even block-based approaches. This talk will provide an overview of this exploration and its future direction.
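
To make the shock-front example concrete, here is a minimal Python sketch of the general idea of filtering next to the data so that only the region of interest is moved. It is an illustration under stated assumptions, not LANL's implementation: the function names, the toy 1-D pressure field, and the threshold are all hypothetical.

```python
# Hypothetical sketch: every name, size, and threshold is an illustrative assumption.

def make_timestep(n_cells, front_index):
    """Build a toy 1-D pressure field: high behind the shock, low ahead of it."""
    return [10.0 if i < front_index else 1.0 for i in range(n_cells)]

def select_shock_front(pressure, threshold=1.0):
    """Stand-in for a filter run near storage.

    Returns (index, value) pairs only for adjacent cells whose pressure
    difference exceeds the threshold, i.e. the cells at the shock front.
    """
    selected = []
    for i in range(len(pressure) - 1):
        if abs(pressure[i + 1] - pressure[i]) > threshold:
            selected.append((i, pressure[i]))
            selected.append((i + 1, pressure[i + 1]))
    return selected

# One toy "time step" with a single shock front.
timestep = make_timestep(n_cells=1_000_000, front_index=400_000)

# Only the selected cells would need to leave the storage system.
front = select_shock_front(timestep)
reduction = len(timestep) / len(front)
```

In this toy case the analytics side receives two cells instead of one million. Real simulation output is multidimensional and the selection criteria are far richer, but the shape of the saving is the same: the filter runs where the data lives, and only its result is moved.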

Gary Grider
Los Alamos National Laboratory