Design Modern Object Store Server for Lustre file system in the Era of Solid State Storage and Persistent Memory

Abstract

Fast, scalable parallel file system performance is a key enabler of massively parallel computing as well as of emerging big data and machine learning applications. Released almost two decades ago, Lustre has long been the storage solution of choice for many supercomputing data centers. But as the world slowly retires rotational disks in favor of fast SSDs and persistent memory for performance tiers, Lustre is increasingly unable to fully utilize the available storage bandwidth because of its old, disk-oriented object storage server design based on the Ext4-derived ldiskfs. In this talk, we will first introduce the high-level design of Lustre’s key networking and storage components (Object Storage Server and Object Storage Target). Then we will describe the limitations and shortcomings that we consider major impediments to further Lustre innovation and hurdles to a faster development cadence. To address these issues, we propose a completely new OSS/OST architecture and implementation that natively incorporates the latest advancements in hardware (SSDs, persistent memory), software (SPDK/PMDK, KV-Store) and other relevant technologies such as flexible erasure coding data protection schemes. We believe this proposal will lower the barrier to future innovative contributions and can invigorate developer community activity, which is essential to ensuring Lustre’s continued success in end-user adoption and support.

Yong Chen
Samsung Electronics