2021 Presentation Abstracts

Break Out Sessions and Agenda Tracks Include:

Note: This agenda is a work in progress. Check back for updates on additional sessions as well as the agenda schedule.

Blockchain

Applying Blockchain for Digital Identity Management

Abirami Ravichandran, Senior DevOps Engineer, MSys Technologies

Abstract

In this session, we will start by identifying the problems with traditional identity management methods, such as weak password practices, failures caused by manual provisioning and de-provisioning processes, a lack of rules for restricting access, out-of-date applications, and others.

We’ll then walk through the blockchain features of decentralization, immutability, and transparency in detail and learn how these features help solve the aforementioned problems. We will see how decentralized identifiers (DIDs), which are cryptographically secured, are stored on the immutable ledger for digital identity management, and also learn how public-key-cryptography-based authentication works.
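
As an illustration of that last point, here is a minimal sketch of the challenge-response pattern used to prove control of a DID; the wallet/verifier roles and key handling below are illustrative assumptions, not a specific DID method.

    # Sketch: the verifier issues a random challenge; the DID holder signs it with the
    # private key kept in their wallet; the verifier checks the signature against the
    # public key published in the DID document stored on the ledger.
    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    holder_key = Ed25519PrivateKey.generate()        # kept in the user's wallet
    did_public_key = holder_key.public_key()         # published in the DID document

    challenge = os.urandom(32)                       # verifier's random nonce
    signature = holder_key.sign(challenge)           # holder proves key possession

    try:
        did_public_key.verify(signature, challenge)  # checked against the ledger's key
        print("DID holder authenticated")
    except InvalidSignature:
        print("authentication failed")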


Analysis of Distributed Storage on Blockchain

Tejas Chopra, Senior Software Engineer, Netflix, Inc.

Abstract

Blockchain has revolutionized decentralized finance, and with smart contracts has enabled the world of non-fungible tokens, set to revolutionize industries such as art, collectibles, and gaming. Blockchains, at their core, are distributed chained hashes. They can be leveraged to store information in a decentralized, secure, encrypted, durable, and available format. However, some of the challenges in blockchain stem from storage bloat: since each participating node keeps a copy of the entire chain, the same data gets replicated on every node, and even a 5 MB file stored on the chain can exhaust systems.

Several techniques have been used by different implementations to make blockchains suitable for distributed storage of data. The advantages compared to cloud storage are the decentralized nature of storage, the security provided by encrypting content, and the costs.

In this session, we will discuss how different blockchain implementations such as Storj, InterPlanetary File System (IPFS), YottaChain, and ILCOIN have solved the problem of storing data on the chain while avoiding bloat. Most of these solutions store the data off-chain and keep only the transaction metadata on the blockchain itself.

IPFS and Storj, for example, use content-addressing to uniquely identify each file in a global namespace connecting all the participating devices. The incoming file is encrypted and split into smaller chunks, and each participating node stores a chunk with zero knowledge of the other chunks. ILCOIN relies on the RIFT protocol to enable a two-level blockchain: one level for the standard blocks, and the other for the mini-blocks that comprise the transactions and that are not mined but generated by the system. YottaChain deduplicates content after encrypting it, which is not how cloud data storage is generally designed, to reduce the footprint of data on the chain.
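
A minimal sketch of the content-addressed, encrypted chunking pattern described above; the chunk size and cipher choice are illustrative assumptions, not any particular network's on-wire format.

    # Encrypt each chunk on the client, then derive its content address from the
    # ciphertext; nodes can store and serve chunks without ever seeing plaintext.
    import hashlib
    from cryptography.fernet import Fernet

    CHUNK_SIZE = 256 * 1024                      # illustrative chunk size
    fernet = Fernet(Fernet.generate_key())       # key stays with the data owner

    def chunk_and_address(data: bytes) -> list[tuple[str, bytes]]:
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            ciphertext = fernet.encrypt(data[i:i + CHUNK_SIZE])
            cid = hashlib.sha256(ciphertext).hexdigest()   # content address
            manifest.append((cid, ciphertext))             # each chunk is distributed by cid
        return manifest

    for cid, chunk in chunk_and_address(b"example payload " * 50000):
        print(cid, len(chunk))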

We will discuss the tradeoffs of these solutions and how they aim to disrupt cloud storage. We will compare the benefits they provide in terms of security, scalability, and cost, and how organizations such as Netflix, Box, and Dropbox can benefit from leveraging these technologies.

Cloud Storage

Compression, Deduplication & Encryption Conundrums for Cloud Storage

Tejas Chopra, Senior Software Engineer, Netflix, Inc.

Abstract

The cloud storage footprint is in the exabytes and growing exponentially, and companies pay billions of dollars to store and retrieve data. In this talk, we will cover some of the space and time optimizations that have historically been applied to on-premises file storage and how they can be applied to objects stored in the cloud.

Deduplication and compression are techniques that have traditionally been used to reduce the amount of storage used by applications. Data encryption is table stakes for any remote storage offering, and today cloud providers support both client-side and server-side encryption.

Combining compression, encryption, and deduplication for object stores in the cloud is challenging due to the nature of overwrites and versioning, but the right strategy can save an organization millions. We will cover some strategies for employing these techniques depending on whether an organization prefers client-side or server-side encryption, and discuss online and offline deduplication of objects. Companies such as Box and Netflix employ a subset of these techniques to reduce their cloud footprint and provide agility in their cloud operations.
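
One way the ordering matters, shown as a minimal sketch under the assumption of client-side encryption: compressing and fingerprinting the plaintext before encrypting keeps deduplication effective even though ciphertexts differ. The in-memory dictionaries below stand in for a dedup index and an object store; this is one possible strategy, not necessarily the one the talk proposes.

    # Client-side dedup before encryption: compress, fingerprint the compressed
    # plaintext, and only encrypt + upload objects whose fingerprint is new.
    import hashlib, zlib
    from cryptography.fernet import Fernet

    fernet = Fernet(Fernet.generate_key())     # client-side key
    dedup_index = {}                           # fingerprint -> existing object id
    store = {}                                 # object id -> ciphertext ("the cloud")

    def put(object_id: str, plaintext: bytes) -> str:
        compressed = zlib.compress(plaintext)
        fingerprint = hashlib.sha256(compressed).hexdigest()
        if fingerprint in dedup_index:                 # duplicate: store a reference only
            return dedup_index[fingerprint]
        store[object_id] = fernet.encrypt(compressed)  # unique: encrypt and upload
        dedup_index[fingerprint] = object_id
        return object_id

    put("a", b"hello" * 1000)
    print(put("b", b"hello" * 1000))   # prints "a": deduplicated against the first object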


Compacting Smaller Objects in Cloud for Higher Yield

Tejas Chopra, Senior Software Engineer, Netflix, Inc.

Abstract

In file systems, large sequential writes perform better than small random writes, and hence many storage systems implement a log-structured file system. In the same way, the cloud favors large objects over small objects. Cloud providers place throttling limits on PUTs and GETs, so it takes significantly longer to upload many small objects than one large object of the aggregate size. Moreover, there are per-PUT request costs associated with uploading each small object.

At Netflix, many media assets and their relevant metadata are generated and pushed to the cloud. Most of these files range from tens of bytes to tens of kilobytes and are saved as small objects in the cloud.

In this talk, we would like to propose a strategy to compact these small objects into larger blobs before uploading them to the cloud. We will discuss the policies for selecting relevant smaller objects, and how to manage the indexing of these objects within the blob. We will also discuss how different cloud storage operations such as reads and deletes would be implemented for such objects.

Finally, we would showcase the potential impact of such a strategy on Netflix assets in terms of cost and performance.
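
A minimal sketch of the compaction idea described above; the blob layout, index format, and asset names are illustrative assumptions, not Netflix's actual scheme.

    # Pack many small objects into one blob plus an index of (offset, length).
    # The blob would be uploaded with a single PUT; individual objects are later
    # served with ranged GETs against the blob using the index.
    import json

    def compact(objects: dict[str, bytes]) -> tuple[bytes, dict]:
        """Pack {key: payload} into one blob and an index of offsets and lengths."""
        blob = bytearray()
        index = {}
        for key, payload in objects.items():
            index[key] = {"offset": len(blob), "length": len(payload)}
            blob.extend(payload)
        return bytes(blob), index

    def read(blob: bytes, index: dict, key: str) -> bytes:
        """Equivalent of a ranged GET against the compacted blob."""
        entry = index[key]
        return blob[entry["offset"]: entry["offset"] + entry["length"]]

    blob, index = compact({"asset-1.json": b'{"codec":"h264"}', "asset-2.json": b'{"codec":"av1"}'})
    print(json.dumps(index))
    print(read(blob, index, "asset-2.json"))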


Managing Cloud Infrastructure by using Terraform HCL

Kannadasan Palani, Senior Test Manager, MSys Technologies

Abstract

Terraform is an infrastructure-as-code tool from HashiCorp for building, changing, and managing infrastructure. We can use it to manage multi-cloud environments with a configuration language called the HashiCorp Configuration Language (HCL), which codifies cloud APIs into declarative configuration files.

We will learn how to write Terraform configuration files to run a single application or manage an entire data center by defining a plan and then executing it to build the described infrastructure. As the configuration changes, Terraform determines what changed and creates incremental execution plans accordingly.

Using Terraform, we can manage low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries and SaaS features.


Transparent Encryption and Dual End Point Access Controls to Secure AWS S3 Buckets

Sridharan "Sri" Sudarsan, Director, Ciphertrust Transparent Encryption Engineering, Thales Group

Carlos Wong, Senior Principal Software Engineer, Ciphertrust Transparent Encryption Engineering, Thales Group

Abstract

Amazon AWS S3 storage is widely deployed to store everything from customer data and server logs to software repositories. Poorly secured S3 buckets have resulted in many publicized data breaches. The cloud service provider's shared responsibility model places responsibility on customers for protecting the confidentiality, availability, and integrity of their data.

Thales CipherTrust Transparent Encryption Cloud Object Storage (CTE COS) for S3 secures S3 objects by enabling advanced encryption along with dual endpoint access controls. Access controls are enforced both at the client host running the AWS S3 application and at the AWS S3 server end. The encryption offered by CTE COS for S3 is independent of AWS's S3 server-side encryption.

Encryption and access controls are completely transparent to applications while AWS S3 administrative procedures remain unchanged after software agent deployment. Continuously enforced encryption policies protect against unauthorized access even in the case of AWS misconfigurations. Data access to 'protected' S3 buckets is tracked through detailed audit logs.

CTE's granular, least-privileged user access policies protect sensitive data in S3 buckets from external attacks and misuse by other privileged users. CTE security administrators can frame client host policies to allow or deny ACL-related actions such as reading, writing, enumerating, and deleting S3 buckets or even individual objects in a bucket. In addition, client policies can specify which users and applications are permitted to access protected AWS S3 buckets.

AWS S3 server-side access controls can also be simultaneously and transparently enabled with custom AWS IAM policies and roles. S3 bucket data accesses are only allowed from hosts configured with CipherTrust Transparent Encryption. Cloud access controls and their management can therefore be offloaded to client hosts, with additional control points for permitting specific local identities and applications. CTE COS S3 dual endpoint access controls and encryption therefore prevent S3 data breaches from unauthorized access even in the presence of misconfigured buckets and rogue insider threats. CTE COS S3 is FIPS 140-2 Level certified and is part of the CipherTrust Data Security Platform.
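
As a generic illustration of the AWS-side control the abstract refers to (not the CTE COS policy itself), a bucket policy can deny access from anywhere except approved networks; the bucket name, CIDR range, and statement are hypothetical.

    # Hedged, generic sketch of restricting an S3 bucket with an IAM bucket policy.
    # Requires valid AWS credentials; all names below are placeholders.
    import json
    import boto3

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyAccessExceptFromApprovedHosts",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::example-protected-bucket",
                "arn:aws:s3:::example-protected-bucket/*",
            ],
            # deny any request that does not originate from the approved network
            "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }],
    }

    s3 = boto3.client("s3")
    s3.put_bucket_policy(Bucket="example-protected-bucket", Policy=json.dumps(policy))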


FlexAlloc: A Lightweight Building Block for User Space Data Management

Joel Granados, Staff Software Engineer, Samsung Electronics

Jesper Devantier, Staff Software Engineer, Samsung Electronics

Adam Manzanares, Software Engineer, Samsung Electronics

Abstract

In this talk we present FlexAlloc: a lean block allocator focused on bridging the gap between raw block device access in user space and data management through monolithic file systems.

By removing the metadata and features of general-purpose file system layers between applications and storage devices, we have developed a thin block allocation layer meant to exist within a heterogeneous storage stack. Traditional file systems include features (like the management overhead of growing and shrinking files) that are often unused or sub-optimal and in some cases lead to performance degradation. By eliminating those that are not needed and replacing the sub-optimal ones with stripped-down versions that cater to the specific characteristics of applications such as RocksDB (sequential writes and immutable files, or SSTables), we arrive at a thin layer that can be integrated into applications with minimal changes while supporting emerging device characteristics such as zoned namespace SSDs.

FlexAlloc provides a lean block allocation layer that avoids paying the price of additional abstractions, which makes it an attractive building block within a heterogeneous storage stack. We demonstrate its viability by running a RocksDB benchmark on top of a zoned namespace SSD through FlexAlloc and subsequently copying out the resulting database files through a traditional file system interface. We hit the sweet spot where RocksDB maintains its read/write path through raw block device access while still providing a read-only, user-space-mountable file system that can be used by existing infrastructure.


Ozone - Architecture and Performance at Billions’ Scale

Lokesh Jain, Software Engineer, Cloudera

Abstract

Object stores are known for ease of use and massive scalability. Unlike other storage solutions like file systems and block stores, object stores are capable of handling data growth without an increase in complexity or developer intervention. Apache Hadoop Ozone is a highly scalable object store which extends the design principles of HDFS while providing 10-100x the scale of HDFS. It can store billions of keys and hundreds of petabytes of data. At this scale, Ozone must deliver very high throughput while maintaining low latency.

This talk discusses the Ozone architecture and the design decisions that were significant for achieving high throughput and low latency. It will detail how Ozone supports the metadata and data layers without compromising throughput or latency. Further, the talk discusses the hardships and challenges related to resource management and sustaining good performance in Ozone. It will cover the major pain points and present the performance issues in broad categories.


Apache Ozone - Balancing and Deleting Data At Scale

Lokesh Jain, Software Engineer, Cloudera

Abstract

Apache Ozone is an object store which scales to tens of billions of objects, hundreds of petabytes of data, and thousands of datanodes. Ozone supports not only high-throughput data ingestion but also high-throughput deletion, with performance similar to HDFS. Further, at massive scale data can become non-uniformly distributed due to the addition of new datanodes, deletion of data, and so on. Non-uniform distribution can lead to lower utilisation of resources and can affect the overall throughput of the cluster.

The talk discusses the balancer service in Ozone, which is responsible for uniform distribution of data across the cluster. It will cover the service design and how the service improves upon the HDFS balancer. Further, the talk discusses how Ozone matches the HDFS deletion performance of one million keys per hour but can scale much further. A simple design and asynchronous operations enable Ozone to achieve this deletion scale. The talk will dive deeper into the design and performance enhancements.

Composable Infrastructure

Future of Storage Platform Architecture

Mohan Kumar, Intel Fellow, Intel

Anjaneya ‘Reddy’ Chagam, Principal Engineer and Chief SDS Architect, Intel’s Data Center Group / Board Member at SODA Foundation, Intel Corp.

Abstract

A traditional storage node consists of compute, networking, and storage elements. In this case, the entire node is a single failure domain, and both data and metadata are maintained in storage. The emergence of CXL allows us to rethink the traditional storage node architecture. In the future, the storage (behind CXL.io) and metadata memory (behind CXL memory) can be disaggregated locally or across a group of storage nodes to improve the availability of the data. Further, memory persistence can be achieved at a granular level using CXL memory devices. Future extensions to CXL with fabric-like attributes have the potential to further extend the data replication capabilities of the storage platform. In this talk, we will discuss the various platform architecture options that are emerging for the storage node and how they can change the face of traditional storage node organization.


Disaggregated Data Centers: Challenges and Opportunities

Jason Vanvalkenburgh, Technical Product Management, Fungible

Abstract

Data centers are under pressure. Competing needs for asset homogenization, supporting specialized workloads, and delivering predictable performance lead to deployment tradeoffs that drive down data center utilization and drive up costs. Businesses need to continue to innovate, but budgets rarely keep pace. A "one cloud fits all" approach is expensive or simply does not work, yet the flexibility of the cloud remains essential as workloads are repatriated to containers on bare metal. These factors are driving organizations to adopt composable infrastructure for more value and flexibility. In this session, we describe an existing solution whose architecture and approach reflect the challenges of disaggregation. This presentation includes the design decisions made while looking at these problems and the resulting architecture that delivers on the promise of composable infrastructure.


Paravirtualized NVMe with vfio-user

Benjamin Walker, Technical Lead, Intel

Abstract

Presenting paravirtualized devices to virtual machines has historically required specialized drivers to be present in the guest operating system. The most popular examples are virtio-blk and virtio-scsi. These devices can be constructed either by the host operating system (KVM, for example, can emulate virtio-blk and virtio-scsi devices for use by the guest) or by a separate user-space process (the vhost-user protocol can connect to these targets, typically provided by SPDK). However, only Linux currently ships with virtio-blk and virtio-scsi drivers built in. The BIOS does not typically have drivers available, making it impossible to boot from these devices, and operating systems like Windows need to have drivers installed separately.

In this talk, we'll cover vfio-user, a new standardized protocol that virtual machines can use to communicate with other processes and that can be used to emulate any PCI device. This protocol will be supported by QEMU in an upcoming release. We'll then cover how SPDK has used this new protocol to present paravirtualized NVMe devices to guests, allowing the guest BIOS and guest OS to use their existing NVMe drivers without modification to access these disks. We'll close with benchmarks demonstrating the extremely low overhead of this virtualization.


NextGen Connected Cloud Datacenter with Programmable and Flexible Infrastructure

Bikash Roy Choudhury, Technical Director, Pure Storage

Abstract

Many businesses like ours are challenged with on-demand infrastructure provisioning at the speed of software while the datacenter footprint is diminishing. Pure Storage as a company is getting out of the datacenter business and is adopting more of an OpEx cost model with strict legal and data compliance guidelines for many of our business application pipelines. We chose a programmable infrastructure and connected-cloud architecture with open APIs: an integrated model with Kubernetes on bare-metal hosts that can burst to the cloud, with a native storage layer providing flexible infrastructure and predictable performance for core and edge applications. The storage layer in the infrastructure stack provides data management capabilities using Stork, which offers built-in high availability, data protection, data security, and compliance with multi-cloud mobility. Automating the entire stack with Kubernetes manifests helped us provision and modify the entire infrastructure stack quickly and on demand.


Computational Storage

Computational Storage Update from the Working Group

Scott Shadley, Co-Chair, Computational Storage TWG - VP Marketing, NGD Systems

Jason Molgaard, Co-Chair, Computational Storage TWG - Senior Principal Storage Device Architect, Arm

Abstract

In this presentation, the Co-Chairs of the Computational Storage Technical Working Group (CS TWG) will provide a status update on the work done over the last year, including the release of the new public review materials around the architecture and APIs. We will update the status of the definition work and address the growing market and adoption of the technology, with contributions from the 47+ member organizations participating in the effort.

We will show use cases, customer case studies, and efforts to continue to drive output from the technical work.


NVMe Computational Storage Update

Kim Malone, Storage Software Architect, Intel

Stephen Bates, CTO, Eideticom

Abstract

Learn what is happening in NVMe to support computational storage devices. The development is ongoing and not finalized, but this presentation will describe the direction the proposal is taking. Kim and Stephen will describe the high-level architecture being defined in NVMe for computational storage. The architecture provides for programs based on standardized eBPF.

We will describe how this new command set fits within the NVMe I/O Command Set architecture, and the commands necessary for computational storage. We will also discuss a proposed new controller memory model that can be used by computational programs.


Computational Storage APIs

Oscar Pinto, Principal Engineer, Samsung Semiconductor Inc

Abstract

Computational Storage is a new field that is addressing performance and scaling issues for compute with traditional server architectures. This is an active area of innovation in the industry where multiple device and solution providers are collaborating in defining this architecture while actively working to create new and exciting solutions. The SNIA Computational Storage TWG is leading the way with new interface definitions with Computational Storage APIs that work across different hardware architectures. Learn how these APIs may be applied and what types of problems they can help solve.


Computational Storage Moving Forward with an Architecture and API

William Martin, SSD IO Standards, Samsung Electronics Co., Ltd.

Abstract

The SNIA Computational Storage TWG is driving forward with both a CS Architecture specification and a CS API specification. How will these specifications affect the growing industry computational storage efforts? Learn what is happening in industry organizations to make computational storage something that you can buy from a number of vendors to move your computation to where your data resides. Hear what is being developed in different organizations to make your data processing faster and allow scale-out storage solutions to multiply your compute power.


Computational Storage Architecture Simplification and Evolution

Jason Molgaard, Sr. Principal Storage Solutions Architect, Arm Inc.

Abstract

Computational Storage continues to gain interest and momentum as standards that underpin the technology mature. Developers are realizing that moving compute closer to the data is a logical solution to the ever-increasing storage capacities. Data-driven applications that benefit from database searches, data manipulation, and machine learning can perform better and be more scalable if developers add computation directly to storage.

Flexibility is key in the architecture of a Computational Storage device hardware and software implementation. Hardware flexibility minimizes development cost, controller cost, and controller power. Software flexibility leverages existing ecosystems and software stacks to simplify code development and facilitate workload deployment to the compute on the drive. Leveraging the hardware and software flexibility enables Computational Storage devices to deploy technologies available today, such as eBPF, while also providing a pathway to upcoming technologies like CXL and type 2 accelerators.

This presentation will show how to simplify computational storage architectures today and an evolutionary pathway to CXL. Attendees will walk away understanding how to reduce power, area, and complexity of their computational storage controller, leverage Linux and the Linux ecosystem of software to facilitate software development and workload management, and consider how new technologies may evolve computational storage capabilities.


The Building Blocks to Design a Computational Storage Device

Jerome Gaysse, Senior Technology and Market Analyst, Silinnov Consulting

Abstract

The computational storage ecosystem is growing fast, from IP providers to system integrators. How can you benefit from this promising technology: by designing your own device and adding your value at the device level, or by saving time and selecting an available product?

This talk will present guidelines to use computational storage technology and a review of the computational storage building blocks, including computing, non-volatile memories, interfaces, embedded software, host software and tools.


Accelerating File Systems and Data Services with Computational Storage

Brad Settlemyer, Storage Researcher, Los Alamos National Laboratory

Abstract

Standardized computational storage services are frequently touted as the Next Big Thing in building faster, cheaper file systems and data services for large-scale data centers. However, many developers, storage architects and data center managers are still unclear on how best to deploy computational storage services and whether computational storage offers real promise in delivering faster, cheaper – more efficient – storage systems. In this talk we describe Los Alamos National Laboratory’s ongoing efforts to deploy computational storage into the HPC data center.

We focus first on describing the quantifiable performance benefits offered by computational storage services. Second, we describe the techniques used at Los Alamos to integrate computational storage into ZFS, a fundamental building block for many of the distributed storage services provided for Los Alamos scientists. By developing ZIA, the ZFS Interface for Accelerators, Los Alamos is able to embed data processing elements along the data path and provide hardware acceleration for data intensive processing tasks currently performed on general purpose CPUs. Finally, we describe how computational storage is leading to a fundamental re-architecture of HPC platform storage systems and we describe the lessons learned and practical limitations when applying computational storage to data center storage systems.


Computational Storage Deployment with Kubernetes and Containerized Applications

Scott Shadley, Co-Chair, Computational Storage TWG - VP Marketing, NGD Systems

Abstract

With the growth of containerized applications and Kubernetes as an orchestration layer, the ability to leverage these technologies within the storage device directly adds support for the implementation and parallel processing of data. Using an OS-based Computational Storage Drive (CSD), we will present a Spark deployment and the steps required to achieve it, and show how a distributed processing operation can be orchestrated across the host and the CSDs at the same time to maximize the benefits of the application deployment.


Computational Storage Directions at Fungible

Jai Menon, Chief Scientist, Fungible

Abstract

Computational storage is paramount for truly composable infrastructure. While there isn't yet broad adoption, computational storage is ready to move beyond test benches into widespread deployment. This presentation explores today’s cloud data center requirements, using real-world use cases to show how to move compute to the data instead of the data to the compute. Computational storage systems seek to address the limitations of hyper-converged infrastructure, in which users can only scale compute and storage by purchasing additional nodes. These newer technologies enable organizations to maximize the value of their compute and storage hardware spending.


Scientific Data Powered by User-Defined Functions

Lucas Villa Real, Research Software Engineer, IBM Research

Abstract

Scientific data is commonly represented by static datasets. There is a myriad of sources for such data: snapshots of the Earth, as captured by satellites, sonars, and laser scanners; the output of simulation models; the aggregation of existing datasets; and more. Some of the problems faced by consumers of that data relate to data transformation (e.g., harmonizing data layout and format), inefficient data transfer over the network (for instance, the creation of derivative data leads to even larger datasets), expensive data processing pipelines (that often preprocess data in advance, hoping that the produced data will be consumed at some point by the applications), CPU occupation, and difficulty tracking data lineage (especially when data processing scripts are lost for good).

This talk will present a pluggable extension to the HDF5 file format, used extensively by the scientific community, that enables the attachment of user-defined functions that help solve many of the aforementioned problems. Dubbed HDF5-UDF, the project allows scripts written in Lua, C/C++, and Python to be attached to existing HDF5 files. Such scripts are disguised as regular datasets that, once read, execute the associated source code and produce data in a format that existing applications already know how to consume.
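
To an application, such a UDF-backed dataset reads like any other dataset. A minimal consumer-side sketch with h5py follows; the file name and dataset path are hypothetical, and it assumes the HDF5-UDF plugin is installed so the attached function can run when the data is accessed.

    # Reading a (hypothetical) UDF-backed dataset: from the reader's perspective it
    # behaves like any other HDF5 dataset; the attached UDF executes transparently.
    import h5py

    with h5py.File("observations.h5", "r") as f:        # hypothetical file name
        derived = f["derived/temperature_celsius"][:]    # hypothetical UDF dataset
        print(derived.shape, derived.dtype)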

HDF5-UDF has been designed from scratch to enable seamless integration with hardware accelerators. Its memory allocation model allows for input datasets (i.e., datasets that a user-defined function depends on) to be allocated in device memory (such as a SmartSSD) and the backend architecture enables the incorporation of different programming languages and compiler suites (such as Xilinx's Vitis compiler or Nvidia's nvcc). The talk will present current and future work in the context of computational storage.

To date, HDF5-UDF is tied to HDF5, but it is feasible to port this work to other formats such as TIFF (and GeoTIFF). Some guidance will be provided for those wishing to experiment with different data formats.

Last, but not least, HDF5-UDF is open source. All examples shown in the presentation are readily available for downloading, modification, and testing. Please visit https://github.com/lucasvr/hdf5-udf and https://hdf5-udf.readthedocs.io for more details on the project covered by this talk.


Stop Wasting 80% of Your Infrastructure Investment!

Tony Afshary, Sr. Director, Product Line Management, Pliops

Abstract

There is a new architectural approach to accelerating storage-intensive databases and applications, one that takes advantage of new techniques to dramatically accelerate database performance, improve response times, and reduce infrastructure cost at massive scale. This new architecture efficiently stores fundamental data structures, increasing space savings (up to 80%) over host- and software-based techniques that cannot address the inherent inefficiencies in today’s fastest SSD technology. With novel data structures and algorithms, it is a unique data processing approach that fundamentally simplifies the storage stack.


SkyhookDM: An Arrow-Native Storage System

Jayjeet Chakraborty, IRIS-HEP Fellow, IRIS-HEP and CROSS, UC Santa Cruz

Abstract

With ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth, at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck, trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the internals.

SkyhookDM introduces a new design paradigm for building computational storage systems by extending existing storage systems with plugins. Our design allows extending programmable object storage systems by embedding existing and widely used data processing frameworks and access libraries into the storage layer with minimal modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging the scale-out, availability, and failure recovery properties of distributed storage systems.

SkyhookDM is a data management system that allows data processing tasks to be pushed down to the storage layer, reducing client-side CPU, memory, and network traffic for increased scalability and reduced latency. On the storage side, SkyhookDM uses the existing Ceph object class mechanism to embed Apache Arrow libraries in the Ceph OSDs and uses C++ methods to facilitate data processing within the storage nodes. On the client side, the Arrow Dataset API is extended with a new file format that bypasses the Ceph filesystem layer and invokes storage-side Ceph object class methods on the objects that make up a file in the filesystem layer. SkyhookDM currently supports Parquet as its object storage format, but support for other file formats can be added easily due to the use of Arrow access libraries.
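
For context, here is a minimal sketch of the client-side Arrow Dataset API pattern that SkyhookDM extends. It uses the stock Parquet format rather than the SkyhookDM file format, so nothing is pushed into Ceph here; the path, column names, and filter are illustrative assumptions.

    # Standard Arrow Dataset usage: SkyhookDM substitutes a file format that pushes
    # the same column projection and filter down into the Ceph OSDs.
    import pyarrow.dataset as ds

    dataset = ds.dataset("data/events", format="parquet")   # illustrative path
    table = dataset.to_table(
        columns=["event_id", "energy"],                      # projection pushdown
        filter=ds.field("energy") > 100.0,                   # predicate pushdown
    )
    print(table.num_rows)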


Container Storage

Istio Service Mesh: A Primer

Binitha MT, Software Senior Engineer, Dell Technologies

Deepika K, Software Senior Engineer, Dell Technologies

Abstract

Microservices architectures enhance the ability of modern software teams to deliver applications at scale, and they have expanded the distributed nature of applications. But as an application’s footprint grows, the challenge is to understand and control the interactions among services within these environments. Enabling a service mesh controls the communication, configuration, and behavior of microservices in an application. A service mesh is a dedicated infrastructure layer for handling service-to-service communication in any microservice, public cloud, or Kubernetes architecture. Istio, an open source project complementary to Kubernetes, allows you to set up a service mesh of your own to start learning more about how it works. It offers service (request) discovery, monitoring, reliability, and security to applications.


CSI Driver Design: Bringing a Parallel File System to Containerized Workloads

Eric Weber, Software Engineer, NetApp

Joe McCormick, Software Engineer, NetApp

Abstract

The advent of cloud and the everything-as-a-service (XaaS) model requires storage developers to rethink how their products are consumed. Organizations are looking to develop infrastructure and processes that are agnostic to any one cloud vendor, including their own on-premises datacenters. Container orchestrators (COs) like Kubernetes enable this ideal by allowing entire application deployments to be packaged up (in containers and manifests) and moved from environment to environment with relative ease. The Container Storage Interface (CSI) provides a standard for exposing arbitrary block and file storage to these COs, but the user experience for a particular CSI driver is heavily dependent on its implementation. Storage developers must carefully consider the key attributes of a storage product when developing a driver to ensure a cloud-like experience.

Parallel file systems provide an excellent opportunity to take storage software often perceived as complex and expose advanced functionality and capabilities through a CO’s simple and familiar interfaces. In this presentation, Eric Weber and Joe McCormick will break down the thought processes and design considerations that went into developing a CSI driver for BeeGFS. We will walk through how the CSI spec’s gRPC endpoints were mapped onto BeeGFS specific functions and discuss specific CSI/CO functionality and sub-features that do or do not make sense in the context of BeeGFS. While the CSI spec is intended to be CO agnostic, we will tend to discuss our design in the context of Kubernetes. In addition to existing BeeGFS CSI driver functionality, we will also discuss potential future features in order to give our audience a well-rounded view of the available capabilities for container storage.


Containerized Machine Learning Models Using NVMe

Ramya Krishnamurthy, Senior QA Architect [Test Expert], HPE

Ajay Kumar, Senior Test Specialist, HPE

Ashish Neekhra, Senior Test Specialist, HPE

Abstract

Machine learning, referred to as ML, is the study and development of algorithms that improve with the use of data: as an algorithm processes training data, it changes and grows. Most machine learning models begin with “training data” which the machine processes and begins to “understand” statistically. Machine learning models are resource intensive: to anticipate, validate, and recalibrate millions of times, they demand a significant amount of processing power. Training an ML model can slow down your machine and hog local resources.

The proposed solution is to containerize ML with NVMe, putting your ML models in a container. This talk covers aspects of containerizing ML with NVMe and its benefits:

  • Containers are lightweight software packages that run in isolation on the host computing environment. Containers are predictable, repeatable, and immutable, and are also easy to coordinate (or “orchestrate”). Containerizing the ML workflow requires putting your ML models in a container (Docker is sufficient), then deploying it on a machine. We can create a cluster of containers with a configuration suited to machine learning requirements.
  • Artificial Intelligence (AI) at scale sets the standard for storage infrastructure in terms of capacity and performance, making storage one of the most crucial factors for containers; hence storage plays an important role in containerized deployments.
  • NVMe is a storage access and transport protocol for flash and next-generation solid-state drives (SSDs) that provides high throughput, improved system-level CPU utilization, and fast response times.
  • The combination of NVMe over Fabrics (NVMe-oF) with an NVMe SSD solution allows Kubernetes orchestration to scale data-intensive workloads and increase data mining speed.

Data Protection for Stateful Applications: the Open Source Way!

Ashit Kumar, Lead Architect, SODA Foundation/Huawei Technologies

Abstract

Currently, stateful applications make up the major share (more than 50%) of containerized application deployments in the enterprise. To maintain uninterrupted availability of these applications, backing them up and recovering them so they are DR-ready is a necessary and important enterprise strategy. One important aspect is the ability to back up and recover into an alternative environment, meaning a hybrid environment: for example, a Kubernetes environment managed by one cloud being recovered into a Kubernetes environment managed by another cloud.

Today, there are multiple proprietary solutions available, but an open source solution driven by community development will add more insight as well as flexibility.

This session aims to provide:

  1. Background on data protection for stateful applications in Kubernetes
  2. The strategy for developing such solutions, and the considerations and important aspects to look at
  3. A look at some of the available and upcoming open source solutions, such as Velero and a future solution from the SODA Foundation, and how SODA Foundation is driving community development of this solution

Data Management Going Beyond Storage Boxes

Sanil Kumar D, Chief Architect, TOC, Arch Lead, Huawei / SODA Foundation

Abstract

Most of the top storage vendors are moving to storage-as-a-service solutions, exploring all the ecosystem components, including open source solutions, to build a complete data management solution on top of their existing storage capabilities. This move is accelerating as more cloud-native and hybrid use cases are in demand. In this session, we will see an overall architecture for a typical stack with key data management functions, including data protection and monitoring, across existing storage infrastructure. We will also see how this trend is shaping container and cloud-native storage. We will conclude with a working demo of data protection and monitoring illustrating how the overall stack works with some of the open source projects.


Accelerating CI/CD Pipelines with Kubernetes Storage Orchestration

Jacob Cherian, Chief Product Officer, ionir

Abstract

Continuous Integration/Continuous Delivery (CI/CD) platforms such as Jenkins, Bamboo, CircleCI and the like dramatically speed deployments by allowing administrators to automate virtually every process step. Automating the data layer is the final frontier. Today, hours are wasted manipulating datasets in the CI/CD pipeline. Allowing Kubernetes to orchestrate the creation, movement, reset, and replication of data eliminates dozens of wasted hours from each deployment, accelerating time to market by 500X or more.

Using real-life examples, Jacob will describe the architecture and implementation of a true Data as Code approach that allows users to:

  • Automatically provision compute, networking, and storage resources in requested cloud provider / BM environments
  • Automatically deploy the required environment-specific quirks (specific kernels, kernel modules, NIC configurations)
  • Automatically deploy multiple flavours of Kubernetes distributions (Kubeadm & OCP supported today; additional flavors as pluggable modules)
  • Apply customized Kubernetes deployments (Feature Gates, API server tunables, etc.)
  • Automate recovery following destructive testing
  • Replicate datasets to worker nodes instantly

All based on predefined presets/profiles (with obvious customization/overrides as required by the user), across multiple cloud environments (today AWS & BM, additional environments as pluggable modules into the architecture).

Data Protection Technologies


LTO Technology and Two-Dimensional Erasure-Coded Long-Term Archival Storage with RAIL Architecture

Turguy Goker, Director of Advanced Development LTO, Quantum Corporation

Abstract

This is a two-part detailed technology presentation on LTO, erasure codes, and archival storage. In the first part, we will cover core LTO technology by reviewing areal density roadmaps, physical and logical LTO formats, and data durability for long-term archival applications including environmental requirements. In the second part, we will discuss new two-dimensional erasure-coded tape for durable and available long-term data storage where random access is used.

Tape has recently reemerged as the lowest-cost medium for various tiers of cold data in archival data storage. The old cliche about tape being dead no longer holds; instead, the new consensus is that tape has found a new home in hyperscaler data centers. This is due to three key factors: total cost of ownership (TCO), roadmap, and durability, a must for the new zettabyte era of cold data storage. In this era, the consensus among analysts is that data is growing at approximately 40 to 50 percent per year. IDC estimates that by 2025 there will be 7 trillion gigabytes of cold archive data to manage, which is why tape has reemerged as the cold storage medium of choice.

However, these massive scaling predictions have resurfaced some of the old challenges, with new requirements as areal densities increase, such as typical tape and drive errors, human interactions, and environmental conditions. When we consider that systems (due to low TCO) may operate in an open environment using cartridge-to-drive ratios of around 100:1 with complicated robotic libraries, the need for a different data protection policy, with adaptive drive and media management rather than legacy copy systems, becomes important. In this presentation we will introduce a new erasure-coded tape architecture using a Redundant Array of Independent Libraries (RAIL) to solve this complex problem by offering the lowest-TCO cold storage system with high data durability and availability.


Hardening OpenZFS to Further Mitigate Ransomware

Michael Dexter, CTO, Gainframe

Abstract

While the open source, cross-platform OpenZFS file system and volume manager provides advanced block-level features including checksumming, snapshotting, and replication to mitigate ransomware at the POSIX level, the evolving nature of ransomware dictates that no technology should rest on its laurels. This developer- and administrator-focused talk will explore strategies for hardening OpenZFS and the operating systems that support it, toward a goal of storage that cannot be altered without authorization. Achieving this goal will require operating system-level mitigations with a focus on multi-factor authentication (MFA) and authorization.


A Quintuple Parity Error Correcting Code - a Game Changer in Data Protection

Marek Rychlik, CEO of Xoralgo Inc., Professor of Mathematics at the University of Arizona, co-inventor of a patented RAID and ECC technology, Xoralgo, Inc

Abstract

We describe a replacement for RAID 6, based on a new linear, systematic code, which detects and corrects any combination of E errors (unknown location) and Z erasures (known location) provided that Z+2E≤4. The code is at the core of a RAID technology called PentaRAID, for which the two co-inventors were awarded a US utility patent. The problem we address is the weak data protection of RAID 6 systems. The known vulnerability is that RAID 6 can recover from at most 2 failed disks in an array, and only if we know which disks failed; if we do not know which disks are the source of errors, we are protected from only 1 disk failure. Moreover, if 1 disk fails, the failed data needs to be recovered, and with hard disks reaching 40TB the recovery process lasts for weeks (degraded mode). While in degraded mode, a second disk failure results in a system with no error detection or correction. In addition, undetected disk errors (UDE) can only be detected, but not corrected, even with one failed disk.
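
For concreteness, the condition Z + 2E ≤ 4 admits the following combinations of correctable errors and erasures (a worked enumeration, not taken from the abstract):

\[
Z + 2E \le 4 \;\Longrightarrow\; (E,Z) \in \{(0,0),(0,1),(0,2),(0,3),(0,4),\,(1,0),(1,1),(1,2),\,(2,0)\}
\]

The extreme cases E = 2, Z = 0 (two failures at unknown locations) and E = 0, Z = 4 (four failures at known locations) correspond to the tolerances quoted below.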

The natural solution is to increase redundancy from 2 to more disks. There is very little payoff from using 3 disks. It turns out that a practical solution is possible with 5 redundant disks, and this solution is employed in PentaRAID. The payoff is immense, as the RAID extends Mean Time to Data Loss (MTDL) from days to far beyond the age of the universe (100 quadrillion years) under typical assumptions in regard to disk error rates. The new RAID can tolerate a loss of 2 drives at unknown locations (thus seemingly operating normally but generating UDE), and up to 4 disks at known locations, e.g. due to power failure (typically detected by the disk controller).

In addition, the recovery process involves a fixed, small number of Galois field operations per error, and therefore has a virtually fixed computational cost per error, independent of the number of disks in the array. Parity calculation also takes constant time per byte of data if distributed computation is utilized. In short, the computational complexity is on a par with that of RAID 6. Notably, the solution does not require the Chien search commonly used in Reed-Solomon coding, which makes it possible to utilize large Galois fields. The new technology will dramatically increase data durability while significantly reducing the number of hard disks necessary to maintain data integrity, and it will simplify the logistics of operating a RAID in degraded mode and of recovery from disk failures.


Fighting Ransomware using Intelligent Storage

Chris Lionetti, Senior Technical Marketing Engineer, Hewlett Packard Enterprise

Abstract

Ransomware is an acknowledged threat, and protecting your data must be a security-in-depth exercise. We discuss how intelligent storage can detect and recover from an attack while maintaining administrative isolation from compromised servers. While this method is only a single layer of a defence-in-depth infrastructure, it can be implemented invisibly on existing workloads and storage that can gather the proper sets of metrics.

  1. Understand the value of baseline measurements of metrics for your workloads.
  2. Identify patterns of anomalies in metrics which indicate probable infections.
  3. Utilize metrics to pinpoint the infection's day zero for restorations, and the process to validate that restore.

DOTS - A Simple, Visual Method for Preserving Digital Information

Robert Hummel, President, Group 47, Inc.

Abstract

DOTS is the only digital storage designed to be read with a camera employing standard image processing techniques. Data is recorded visually on patented phase-change metal alloy tape at a microscopic density that rivals the capacity of current magnetic tapes. DOTS can record any digital file format, visible text, and imagery on the same media. Using a visual method to represent the data ensures that, as long as cameras and imaging devices are available, the information will always be recoverable and backward compatible to the first generation.

DOTS™ is Write-Once Read Many (WORM) storage: tamper-proof, impossible to erase, and supporting external compression and data encryption, making it a secure and robust archive technology. It is non-magnetic, chemically inert, immune to electromagnetic fields (including EMP), and can be stored in normal office environments or extremes ranging from -9º to 66º C (16º to 150º F). With visual technologies such as photographic prints, negatives, or paper text documents, one can look directly at the medium to access the information. With all magnetic media and complex optical, biologic, or holographic storage, a machine and software are required to read and translate the data into a human-observable and comprehensible form. If the machine or software is obsolete or lost, the data is likely to be lost as well. It is critical that the method employed to protect the data be unencumbered by complicated technology. DOTS is designed to ensure both of those preservation and comprehension demands can be met.

The presentation will not only explain how DOTS works, but also explain how, by taking advantage of DOTS' visual characteristics, we have come up with a patented method for preserving digital images and sound that ensures they will always be readable for hundreds of years without any file format or operating system dependencies.


Implementing WORM for Backup Protection: a Story about Integrating Disk Storage with Backup Application

Kornel Jakubczyk, Tech Lead, 9LivesData

Abstract

Ransomware attacks have raised the need for an appropriate defense, bringing back the mature idea of WORM (write once, read many). Having pre-existing WORM capability in the disk storage alone is not enough for backup use cases, due to problems like usage complexity and the potential for the backup application to mishandle WORM partitions. Instead, integrated WORM support, delivered through cooperation between the backup application and the disk storage, provides a complete WORM solution addressing these problems. We present the trade-offs and advantages of various integration approaches based on the development of WORM in NEC HYDRAstor for Veritas NetBackup.


“Streaming Replication” of Object Storage System Data to achieve Disaster Recovery

Gowtham Alluri, Staff Software Engineer, Nutanix Inc

Sarthak Moorjani, Member of Technical Staff, Nutanix Inc

Abstract

Object storage is increasingly used on premises for a variety of use cases, some of which include primary and critical applications. Such use cases and applications require enterprise-grade data protection and disaster recovery capabilities. Replication of S3-compatible scale-out object storage presents unique challenges not found in traditional block or file storage. Replication of an object storage system is done at the individual bucket and object level, and it must deal not only with data replication but also with object-specific constructs like metadata and tags. Streaming replication provides a near-synchronous, eventually consistent approach to replicating object storage data.

DNA Data Storage

DNA Data Storage and Near-Molecule Processing for the Yottabyte Era

Karin Strauss, Senior Principal Research Manager, Microsoft

Luis Ceze, Co-founder and CEO, OctoML

Abstract

DNA data storage is an attractive option for digital data storage because of its extreme density, durability and eternal relevance. This is especially attractive when contrasted with the exponential growth in world-wide digital data production. In this talk we will present our efforts in building an end-to-end system, from the computational component of encoding and decoding to the molecular biology component of random access, sequencing and fluidics automation. We will also discuss some early efforts in building a hybrid electronic/molecular computer system that can offer more than just data storage, for example, image similarity search.


DNA storage with ADS Codex: the Adaptive Codec for Organic Molecular Archives

Latchesar Ionkov, Scientist, Los Alamos National Laboratory

Abstract

The information explosion is making the storage industry look to new media to meet increasing demands. Molecular storage, and specifically synthetic DNA, is a rapidly evolving technology that provides high levels of physical data density and longevity for archival storage systems. Major challenges to the adoption are the higher error rates for synthesis and sequencing, as well as the different nature of errors. Unlike traditional storage media, erroneous insertions and deletions are a common source of errors in DNA-based storage systems. These errors require a different approach to encoding and recovery of binary data.

Further, the quickly evolving fields of synthesis and sequencing require a codec that can accommodate rapid technology changes in parallelism, error rates, and DNA lengths while allowing the writing and reading technologies to be mixed and matched to best effect. Here we describe ACOMA, an open source end-to-end codec that has been demonstrated to achieve 0.99 bits of data per nucleotide while successfully recovering data across a variety of real industrial DNA processes for writing and reading data.
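
As context for the 0.99 bits-per-nucleotide figure, the naive baseline is a fixed two-bits-per-base mapping, which the insertions and deletions common in synthesis and sequencing quickly break. A minimal sketch of that baseline follows; it is not the ACOMA codec, only the reference point such codecs improve on.

    # Naive 2-bits-per-nucleotide mapping: no protection against insertion or
    # deletion errors, which is exactly what DNA storage codecs must add.
    TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
    FROM_BASE = {v: k for k, v in TO_BASE.items()}

    def encode(data: bytes) -> str:
        bases = []
        for byte in data:
            for shift in (6, 4, 2, 0):            # four 2-bit symbols per byte
                bases.append(TO_BASE[(byte >> shift) & 0b11])
        return "".join(bases)

    def decode(strand: str) -> bytes:
        out = bytearray()
        for i in range(0, len(strand), 4):
            byte = 0
            for base in strand[i:i + 4]:
                byte = (byte << 2) | FROM_BASE[base]
            out.append(byte)
        return bytes(out)

    strand = encode(b"hi")
    print(strand, decode(strand))   # CGGACGGC b'hi'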


From DNA Synthesis on Chips to DNA Data Storage

Andres Fernandez, Senior Director Silicon Engineering, Twist Bioscience

Abstract

Enabling data storage on DNA relies on advancements in semiconductor technology to make DNA synthesis cheaper, which is a must-have for this field to emerge. The talk will introduce storage people to the concept of how semiconductors are used to create DNA and how the two are tied together, as well as how the advancements in semiconductors are crucial to bringing DNA data storage costs down.


Storing Data over Millennia. Long term Room Temperature Storage of DNA

Marthe Colotte, Platform Operations Manager, Imagene

Abstract

The most expensive factor in traditional archival storage is that it is not durable and, thus, over the years it is necessary to perform many migrations due to degradation and technology obsolescence. In contrast, DNA reading technology, thanks to the immutable format of the DNA molecule, mitigates this obsolescence. However, while low cost, DNA storage does come with some imperatives. DNA outside the cell, like any biological molecule, is subject to aggressive degradation factors, the main one being water. Even when dehydrated, degradation due to water cannot be completely avoided because no plastic container is watertight. Imagene has developed and industrialized a process that keeps dehydrated DNA at room temperature, under an inert atmosphere, in a hermetic stainless-steel capsule. This standalone storage system makes it possible to store and retrieve digital data encoded in DNA for millennia.


Scalable and Dynamic File Operations for DNA-based Data Storage

James Tuck, Professor, NC State University / DNAli Data Technologies

Abstract

DNA-based data storage systems have the potential to offer unprecedented increases in density and longevity over conventional storage media. Starting from the assumption that advances in synthesis and sequencing technology will soon make DNA-based storage cost competitive with conventional media, we will need ways of organizing, accessing, and manipulating the data stored in DNA to harness its full potential. There is a range of possible storage system designs. This talk will cover three systems that the speaker co-developed and prototyped at NC State / DNAli Data Technologies. First, we'll show how we expanded the set of uniquely addressable files by nesting primers, the chemical labels that identify each file, to ensure that system capacity can reach the high densities afforded by DNA. Second, in our File Preview system, we exploit the thermodynamics of primer binding to create a new file access operation that allows either full or partial access to a file's data, thereby saving sequencing bandwidth when a partial file read is sufficient. While the first two systems rely on double-stranded DNA, the third system, DORIS, is comprised of a T7 promoter and a single-stranded overhang domain (ss-dsDNA). The overhang serves as a physical address for accessing specific DNA strands as well as enabling a range of in-storage file operations like renaming and deletion. Meanwhile, the T7 promoter enables repeatable information access by transcribing information from DNA without destroying it.


DNA Sequencing at Scale

Craig Ciesla, VP Advanced Engineering, Illumina

Abstract

DNA sequencing using sequencing-by-synthesis (SBS) technology is today responsible for the majority of sequencing done worldwide. This presentation will cover the fundamentals behind SBS, the steps involved in going from a DNA sample to data, and the current state of the art of sequencing platforms. The presentation will end by discussing how DNA sequencing can be applied to DNA-based data storage.


Emerging Interfaces

Compute Express Link 2.0: A High-Performance Interconnect for Memory Pooling

Andy Rudoff, Persistent Memory SW Architect, Intel

Abstract

Data center architectures continue to evolve rapidly to support the ever-growing demands of emerging workloads such as artificial intelligence, machine learning and deep learning. Compute Express Link™ (CXL™) is an open industry-standard interconnect offering coherency and memory semantics using high-bandwidth, low-latency connectivity between the host processor and devices such as accelerators, memory buffers, and smart I/O devices. CXL technology is designed to address the growing needs of high-performance computational workloads by supporting heterogeneous processing and memory systems for applications in artificial intelligence, machine learning, communication systems, and high-performance computing (HPC). These applications deploy a diverse mix of scalar, vector, matrix, and spatial architectures through CPU, GPU, FPGA, smart NICs, and other accelerators.

During this session, attendees will learn about the next generation of CXL technology. The CXL 2.0 specification, announced in 2020, adds support for switching for fan-out to connect to more devices; memory pooling for increased memory utilization efficiency and providing memory capacity on demand; and support for persistent memory. This presentation will explore the memory pooling features of CXL 2.0 and how CXL technology will meet the performance and latency demands of emerging workloads for data-hungry applications like AI and ML.


PCIe® 6.0: A High-Performance Interconnect for Storage Networking Challenges

Mohiuddin Mazumder, Principal Engineer, Data Center Group, PCI-SIG, Intel

Abstract

Over the past nearly three decades, PCI-SIG® has delivered a succession of industry-leading specifications that remain ahead of the curve of the increasing demand for a high-bandwidth, low-latency interconnect for compute-intensive systems in diverse market segments, including data centers, PCs and automotive applications. Each new PCI Express® (PCIe®) specification consistently delivers enhanced performance, unprecedented speeds, and low latency – doubling the data rate over the previous generation. The PCIe 6.0 specification – targeted for final release in 2021 – will deliver a 64 GT/s data rate (256 GB/s via a x16 configuration), while maintaining backward compatibility with previous generations.

In this session, attendees will learn the nuts and bolts of the PCIe 6.0 architecture and how it will enable high-performance networking. Some key features of the upcoming specification include PAM4 encoding, low-latency Forward Error Correction (FEC), and backward compatibility with all previous generations of PCIe technology. This presentation will also highlight PCIe 6.0 technology use cases and the heterogeneous computing applications that will be accelerated by PCIe 6.0 technology, including artificial intelligence, machine learning and deep learning. Finally, attendees will receive an update on the release timeline of the PCIe 6.0 specification later this year and the rollout of the interoperability and compliance program.
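
A quick back-of-the-envelope sketch of how the quoted x16 figure follows from the 64 GT/s per-lane rate (the 256 GB/s number counts both directions of the full-duplex link; FLIT and FEC overhead are ignored here):

# Back-of-the-envelope check of the PCIe 6.0 headline bandwidth quoted above.
# PAM4 signaling carries 2 bits per symbol at 32 GBaud, giving 64 GT/s per lane,
# where each transfer moves one bit; protocol overhead is ignored in this sketch.

gt_per_sec_per_lane = 64        # giga-transfers per second per lane
lanes = 16                      # x16 configuration

gbytes_per_sec_one_direction = gt_per_sec_per_lane * lanes / 8      # 128.0 GB/s
gbytes_per_sec_both_directions = gbytes_per_sec_one_direction * 2   # 256.0 GB/s

print(gbytes_per_sec_one_direction, gbytes_per_sec_both_directions)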


Enabling Heterogeneous Memory in Python

Daniel Waddington, Principal Research Staff Member, IBM Research

Abstract

Adopting new memory technologies such as Persistent Memory and CXL-attached Memory is a challenge for software. While libraries and frameworks (such as PMDK) can help developers build new software around emerging technology, legacy software faces a more severe challenge. At IBM Research Almaden we are exploring a new approach to managing heterogeneous memory in the context of Python. Our solution, PyMM, focuses on ease-of-use and is aimed primarily at the data science community. This talk will outline PyMM and discuss how it is being used to manage Intel Optane persistent memory. We will review the PyMM programming abstractions and some early data science use-cases. PyMM is currently an early research prototype with open source availability.
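
To give a flavor of the ease-of-use goal, here is a minimal, hypothetical sketch of a shelf-style persistent-memory abstraction in Python; the module name, constructor arguments, and attribute semantics are illustrative assumptions rather than the documented PyMM API.

# Hypothetical sketch of a shelf-style heterogeneous-memory abstraction in Python.
# The module name, constructor arguments, and attribute semantics below are
# illustrative assumptions, not the documented PyMM API.
import numpy as np
import pymm  # assumed package name

# Create (or reopen) a "shelf" backed by persistent memory mounted at /mnt/pmem0.
shelf = pymm.shelf('experiment42', size_mb=4096, pmem_path='/mnt/pmem0')

# Objects assigned to the shelf live in persistent memory rather than the DRAM
# heap, so a data-science workflow can survive a restart without reloading data.
shelf.features = np.random.rand(10_000, 128)
shelf.labels = np.zeros(10_000, dtype=np.int8)

print(shelf.features.mean())   # operate on the persistent array in place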


SNIA SDXI Roundtable: Towards Standardizing a Memory to Memory Data Movement and Acceleration Interface

Moderator:
Shyamkumar Iyer, Distinguished Engineer, Dell

Panelists:
Philip Ng, Sr. Fellow, AMD
Alexandre Romana, Principal System Architect, ARM
Jason Wohlgemuth, Partner Software Engineering Lead, Microsoft
Donald Dutile, Principal Software Engineer, Red Hat, Inc
Rich Brunner, Principal Engineer and CTO of Server Platform Technologies, VMware
Paul Hartke, Xilinx

Abstract

Smart Data Accelerator Interface (SDXI) is a proposed standard for a memory-to-memory data movement and acceleration interface. Software memcpy is the current de facto data movement standard in software implementations, thanks to a stable CPU ISA. However, it takes away from application performance and incurs software overhead to provide context isolation. Offload DMA engines and their interfaces are vendor-specific and not standardized for user-level software.

SNIA’s SDXI TWG is tasked with developing and standardizing an extensible, forward-compatible memory to memory data mover and acceleration interface that is independent of actual data mover implementations and underlying I/O interconnect technology.
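
As a purely conceptual illustration of what an interface independent of the actual data mover implementation means in practice, the sketch below contrasts copying via the CPU with posting copy descriptors to an abstract mover; the descriptor fields and queue protocol are invented for illustration and are not taken from the SDXI specification.

# Purely conceptual sketch of a descriptor-based memory-to-memory data mover.
# Field names and the queue protocol are illustrative only; they are not taken
# from the SDXI specification.
from dataclasses import dataclass

@dataclass
class CopyDescriptor:
    src: memoryview   # source buffer
    dst: memoryview   # destination buffer
    length: int       # bytes to move

class DataMover:
    """Stand-in for an offload engine consuming descriptors from a queue."""
    def __init__(self):
        self.queue = []

    def submit(self, desc: CopyDescriptor):
        # Software only builds and posts a descriptor; it does not touch the data.
        self.queue.append(desc)

    def process(self):
        # In hardware this copy would run asynchronously, off the host CPU.
        while self.queue:
            d = self.queue.pop(0)
            d.dst[:d.length] = d.src[:d.length]

src = memoryview(bytearray(b'hello sdxi').ljust(64, b'\0'))
dst = memoryview(bytearray(64))
mover = DataMover()
mover.submit(CopyDescriptor(src, dst, 64))
mover.process()
assert bytes(dst[:10]) == b'hello sdxi'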

In this panel discussion, experts and representatives of SDXI TWG member companies will talk about their motivations in joining this industry-standard effort.


Rethinking Software Defined Memory (SDM) for Large-Scale Applications with Faster Interconnects and Memory Technologies

Anjaneya "Reddy" Chagam, Senior Principal Engineer, Intel Corporation

Manoj Wadekar, Senior Storage Architect, Facebook

Abstract

Software-Defined Memory (SDM) is an emerging architectural paradigm that provides a software abstraction between applications and underlying memory resources, with dynamic memory provisioning to achieve the desired SLA. With the emergence of newer memory technologies and faster interconnects, it is possible to optimize memory resources deployed in cloud infrastructure while achieving the best possible TCO. SDM provides a mechanism to pool disjoint memory domains into a unified memory namespace. The SDM foundation architecture and implementation framework is currently being developed in the OCP Future Technology Initiative (FTI-SDM) project. The goal of this talk is to share the OCP work and explore deeper collaboration with SNIA.

This talk will cover the SDM architecture, the current industry landscape, academic research, and leading use cases (e.g., Memcached, databases) that can benefit from the SDM design. It will cover how applications can consume different tiers of memory (e.g., DDR, SCM, HBM) and the interconnect technologies (e.g., CXL) that are foundational to the SDM framework in providing load-store access for large-scale application deployments. The SDM value proposition will be demonstrated with caching benchmarks and tiering to show how memory can be accessed transparently.


InfiniBand/RoCE RDMA Specification Update

Chet Douglas, Principal SW Architect, Intel Corporation

Abstract

The first phase of the IBTA Memory Placement Extensions (MPE), supporting low-latency RDMA access to persistent memory on InfiniBand and RoCE networks, was published in August. In this talk, the MPE will be introduced, the motivations for the additions discussed, and the performance advantages of the MPE over current techniques reviewed. In addition to these new MPE protocol enhancements in the new specification, additional operations, currently under development and planned for the next version, will also be presented.

 

Hard Drives / HDD

Unleashing the Performance of Multi-Actuator Drives

Arie van der Hoeven, Principal Product Manager, Seagate Technology

Tim Walker, Principal Engineer, Seagate Technology

Abstract

Applications' desire for higher IOPS and bandwidth is driving innovation at the hardware level. Harnessing the benefits requires software stack awareness. Multi-actuator drives represent the next big innovation in performance.

Dual-actuator drives are being deployed today in large numbers for specific storage applications with Seagate MACH.2 SAS drives. This meets a growing need for large disk sizes that can satisfy both TCO and IOPS goals. As areal densities and drive sizes increase, multi-actuator drives are migrating from early-adopter deployments to the mainstream. This offers the promise of deploying larger disk sizes while continuing to meet performance goals, but only if the software stack is aware that a single drive, whether SATA, SAS or NVMe, has two or more actuators that can be treated as logically independent disks for optimal performance. In this talk, storage experts will explain what has been learned from early adoption, how this can be deployed for scenarios including CDN, object store, big data, AI and machine learning, and what types of IO perform best in this new storage hardware paradigm. Upcoming support for SATA presents new challenges as well as deployment and software options, including the use of optimized Linux IO schedulers such as BFQ. Are you ready for the future of hard disk storage?
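
A minimal sketch of the kind of stack awareness described above, assuming the common arrangement in which each actuator serves a contiguous half of the LBA space; real drives report the actual ranges, and the numbers here are illustrative only.

# Minimal sketch of treating a dual-actuator drive as two logically independent
# regions. Assumes each actuator owns a contiguous half of the LBA space; real
# drives report the actual ranges, and the capacity below is just an example.

DRIVE_CAPACITY_LBAS = 35_156_250_000      # example: ~18 TB at 512-byte sectors
ACTUATOR_COUNT = 2
LBAS_PER_ACTUATOR = DRIVE_CAPACITY_LBAS // ACTUATOR_COUNT

def actuator_for_lba(lba: int) -> int:
    """Return which actuator services a given LBA under the assumed split."""
    return min(lba // LBAS_PER_ACTUATOR, ACTUATOR_COUNT - 1)

def balanced_queues(requests):
    """Bucket queued requests per actuator so both can stay busy in parallel."""
    queues = {a: [] for a in range(ACTUATOR_COUNT)}
    for lba, length in requests:
        queues[actuator_for_lba(lba)].append((lba, length))
    return queues

reqs = [(1_000, 8), (20_000_000_000, 8), (2_000, 8), (30_000_000_000, 8)]
print(balanced_queues(reqs))   # half the work lands on each actuator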


BFQ Linux IO Scheduler Optimizations for Multi-Actuator SATA Hard Drives

Tim Walker, Principal Engineer, Seagate Technology

Paolo Valente, Assistant Professor, University of Modena and Reggio Emilia

Abstract

Reaching higher IOPS becomes increasingly important as drive capacities grow. Multi-actuator drives are an effective response to this need. They appear as a single device to the I/O subsystem. Yet they address commands to different actuators internally, as a function of LBAs. This poses the following important challenge: information on the destination actuator of each command must be used cleverly by the I/O subsystem. Otherwise, the system has little or no control over the load balance among actuators; some actuators may be underutilized or remain totally idle.

I/O schedulers are the best candidates for tackling this problem, as their role is exactly to decide the order in which to dispatch commands. In consultation with Seagate Technology, we have enriched the BFQ I/O scheduler with extra logic for multiple actuators. Exploiting knowledge of the destination actuators of commands, this extended version of BFQ provides dramatic performance improvements over a wide range of workloads. At the same time, it preserves the original bandwidth and latency guarantees of BFQ. As a more general contribution, the concepts and strategies used in BFQ show effective ways to take advantage of the IOPS gains of multi-actuator drives.


Be On Time: Command Duration Limits Feature Support in Linux

Damien Le Moal, Director, Western Digital

Abstract

Delivering high throughput and/or high IO rates while minimizing command execution latency is a problem common to many storage applications. Achieving low command latency to implement a responsive system is often not compatible with high performance. The ATA NCQ IO priority feature set, supported today by many SATA hard disks, provides a coarse solution to this problem. Commands that require low latency can be assigned a high priority, while background disk accesses are assigned a normal priority level. This binary hinting for the latency requirements of read and write commands allows a device firmware to execute high-priority commands first, therefore achieving the desired lower latency.

While NCQ IO priority can satisfy many applications, the standards do not clearly define how high-priority commands should be executed. This often results in significant differences between different device models from different device vendors.

The Command Duration Limits (CDL) feature introduces a more detailed definition of command priorities and of their execution by the device. A command duration limit descriptor defines an inactive time limit (command queuing time), an active time limit (command execution time) and a policy to apply to the command if one of the limits is exceeded (e.g. abort the command, execute it anyway, etc.). Up to seven command duration limits can be defined and controlled by the user for read and write commands, allowing very fine control over disk accesses by an application. Furthermore, unlike the ATA NCQ IO priority feature, which has no equivalent in SCSI, Command Duration Limits is defined for both ATA and SCSI disks.

This talk will present the Linux implementation of CDL support. The kernel interface for discovering and configuring the disk command duration limits is described. Examples will also illustrate how a user application can specify different limits for different IO operations using an interface based on the legacy IO priority support. The effect of Command Duration Limits will be illustrated using results from various micro-benchmarks.
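
For orientation, a conceptual model of the duration-limit descriptors described above is sketched below; the field names and values are illustrative only and do not reproduce the ATA/SCSI descriptor layout or the Linux kernel interface covered in the talk.

# Conceptual model of Command Duration Limit descriptors, for illustration only.
# Field names and values are assumptions; this is not the on-wire ATA/SCSI
# format and not the Linux kernel interface described in the talk.
from dataclasses import dataclass
from enum import Enum

class LimitPolicy(Enum):
    COMPLETE_ANYWAY = "complete"   # keep executing even if a limit is exceeded
    ABORT = "abort"                # abort the command once a limit is exceeded

@dataclass
class DurationLimitDescriptor:
    inactive_time_limit_us: int    # max time a command may sit in the queue
    active_time_limit_us: int      # max time the device may spend executing it
    policy: LimitPolicy            # action when one of the limits is exceeded

# Up to seven descriptors can be defined per direction; each I/O then references
# a descriptor index instead of carrying a single high/normal priority bit.
read_limits = [
    DurationLimitDescriptor(500, 1_000, LimitPolicy.ABORT),        # latency-critical
    DurationLimitDescriptor(20_000, 50_000, LimitPolicy.ABORT),    # interactive
    DurationLimitDescriptor(0, 0, LimitPolicy.COMPLETE_ANYWAY),    # background, no limit
]

print(read_limits[0])   # the descriptor a latency-critical read would reference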


Facts, Figures and Insights from 250,000 Hard Drives

Andrew Klein, Principal Storage Cloud Evangelist, Backblaze

Abstract

For the last eight years Backblaze has collected daily operational data from the hard drives in our data centers. This includes daily SMART statistics from over 250,000 hard drives and SSDs, representing nearly two exabytes of storage and over 200 million data points. We'll use this data to examine the following:

  • the lifetime failure statistics for all the hard drives we have ever used.
  • how temperature affects the failure rate of hard drives.
  • a comparison of the failure rates of helium-filled versus air-filled hard drives.
  • the SMART stats we use to indicate whether a hard drive may fail, and whether SMART stats can be used to predict hard drive failure.
  • how SSD failure rates compare to HDD failure rates.

The data we use is available for anyone to download and use for their own experimentation and education.
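
As a hint of the kind of analysis this dataset supports, here is a minimal sketch of the annualized failure rate (AFR) arithmetic commonly applied to drive-stats data; the figures are made up for illustration and are not Backblaze results.

# Minimal sketch of annualized failure rate (AFR) arithmetic: failures observed
# over accumulated drive-days, scaled to a one-year exposure. The numbers below
# are made-up illustrations, not Backblaze results.

def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """AFR as a percentage: failures per accumulated drive-year of operation."""
    drive_years = drive_days / 365.0
    return 100.0 * failures / drive_years

# e.g. a fleet of 10,000 drives observed for 90 days with 25 failures
print(round(annualized_failure_rate(25, 10_000 * 90), 2))   # ~1.01 % AFR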


Introduction to HDD Field Accessible Reliability Metrics to Machine Learning Applications

Paul Burnett, Principal Engineer, Seagate Technologies

Abstract

Data is growing at an exponential pace, and the need to manage this data at the core and edge is a multi-faceted problem. New, innovative methods to ensure data availability and utilization of the resources that store this data are being developed. Storage device health monitoring and utilization is one such issue. Developing models that apply machine learning principles to predict drive degradation is highly desirable.

A recent Google blog described the machine learning techniques being used to promote and improve drive maintenance across their fleet. Enabling this capability requires the ability to efficiently pull drive information. Seagate has developed a method that allows data to be extracted quickly and reliably while combining many different data sets into a single command operation. Customers including Google and Tencent implement such drive health monitoring in their ecosystems. This presentation will review what is openly available from these high-capacity devices and how it can be used to create novel ML models to predict device behavior and make future utilization decisions.

 


Drive Health Monitor (DHM) for Drives On-Prem (or Core Data Center) and Cloud

Mahmoud Jibbe, Technical Director, NetApp

Charles Binford, Software Architect, NetApp

Abstract

This paper looks back at the analysis that has been done for the drive wear-out issue on the different E-Series array systems running at different customers' sites, and uses that data to give more specific guidance on thresholds for preemptive drive removal.

Motivation of DHM:
Customers with old-vintage or refurbished replacement drives may experience a data loss event and continue to see a high drive failure rate (>5% AFR, Annual Failure Rate). Storage vendors expect to see high fallout rates across the hard drive population as drives age.

The high utilization and the age of the drives are likely to sustain, and possibly increase, the drive failure rate. Failures manifest as:

  • Outages: Loss of access
  • Performance impact
  • Possible data loss

Keynotes

Power-Efficient Data Processing with Software-Defined Computational Storage

Yang Seok Ki, Sr. Director, Samsung Electronics

Abstract

CPU performance improvements based on Dennard scaling and Moore's Law have already reached their limits, and domain-specific computing has been considered as an alternative to overcome the limitations of traditional CPU-centric computing models. Domain-specific computing, seen in early graphics cards and network cards, has expanded into the field of accelerators such as GPGPUs, TPUs, and FPGAs as machine learning and blockchain technologies become more common. In addition, hyperscalers, where power efficiency is particularly important, use ASICs or FPGAs to offload and accelerate OS, security, and data processing tasks.

Meanwhile, technologies such as cloud, machine learning, big data, and the edge are generating data explosively, and the recent emergence of high-performance devices such as over-hundred-gigabit networks, NVMe SSDs and SCMs has made CPU-centric computing more of a bottleneck. Processing large amounts of data in a power-efficient manner requires re-examining the existing model of moving data from storage to the CPU, which consumes a lot of power and limits performance due to bandwidth limitations. Eventually, we expect that each device will extend the functions it performs into the realm of computing per its needs, and each device will participate in heterogeneous computing coordinated by the CPU. Samsung believes that near-data processing, or in-storage computing, is another important piece of the puzzle.

In this keynote, we look back at the system architecture that has changed to handle a variety of data and discuss the changes we expect from system architecture in the future. And we'll talk about what Samsung can contribute to these changes, including the evolution of computational storage, form factors, features, roles, benefits, and components. We'll also look at the ecosystem elements this computational storage needs to settle into, and talk about areas in which various industry players need to work together.


Quantum Technology and Storage: Where Do They Meet?

Doug Finke, Managing Editor, Quantum Computing Report

Abstract

Although quantum technology can be leveraged to do many amazing things, it is not able to provide a general replacement for the storage capabilities we have today with HDDs and SSDs. However, there are a few things where quantum can be leveraged to provide some capabilities that are related to storage and this presentation will cover them. The presentation will start with a quick overview of some of the basic concepts of quantum technology and the reasons why quantum computing may potentially provide significant performance improvements over classical computing for certain applications. It will discuss how quantum computing does implement something similar to computational storage and follow that by explaining how quantum memories can be utilized in certain applications. It will wrap up by explaining how quantum computers work very closely with classical computers to form hybrid classical/quantum processing systems and mention that traditional SSD and HDD storage devices will still be needed on the classical side to support these types of systems.


Panel Discussion: From DNA Synthesis on chips to DNA Data Storage

Hosted by:
Dave Landsman, DNA Data Storage Alliance; Director Industry Standards, Western Digital

Moderator:
John Hoffman, Lead Scientist, QS-2

Panelists:
Karin Strauss, Sr. Principal Research Manager, Microsoft
Craig Ciesla, VP Advanced Engineering, Illumina
Daniel Chadash, Sr. Director DNA Data Storage, Twist Bioscience
Steffen Hellmold, VP Strategic Initiatives, Western Digital

Abstract

Today, information is being digitized on a massive scale, by servers in datacenters, by mobile devices, and by networks of sensors everywhere around us. Artificial intelligence techniques and ubiquitous processing power are making it possible to mine this massive ocean of data; however, integral to harnessing this data as knowledge is the ability to store it for long periods of time. Legacy storage solutions have scaled extensively over the years, but the areal density of magnetic media (HDD and tape), which enables today’s mainstream archival storage solutions, is slowing, and the size of libraries is becoming unwieldy.

The industry needs a new storage medium that is more dense, durable, sustainable, and cost effective in order to cope with the expected future growth of archival data. DNA, nature’s data storage medium, enters this picture at a time when synthesis and sequencing technologies for advanced medical and scientific applications are enabling the manipulation of synthetic DNA in ways previously unimagined.

In this panel discussion, the founders of the DNA Storage Alliance will discuss their views of this emerging technology and the challenges we face to make it a commercially viable part of the storage ecosystem.


Unlocking the potential of NVMe-oF and Software Defined Storage Thanks to Programmable DPUs

Sebastien Le Duc, Software Engineering Director, Kalray

Abstract

During the last two decades, the data center world has been moving to a “Software Defined Everything” paradigm. Until recently, this has been handled mostly by hypervisors running on x86.

In parallel, a new communication protocol for interfacing with SSDs has been specified from the ground up, allowing the parallelism and performance of all-flash storage to be fully exploited: NVMe, and NVMe-oF. NVMe-oF promises to deliver the performance of direct-attached all-flash storage with the flexibility and TCO savings of shared storage. To fully unlock the benefits of NVMe-oF while keeping the software-defined paradigm, we believe a new kind of processor is needed: the Data Processing Unit, or DPU.


Maximizing Flash Value with the Software-Enabled Flash™ SDK

Eric Ries, SVP Memory and Storage Strategy Division, KIOXIA

Rory Bolt, Principal Architect, Sr. Fellow KIOXIA America, Inc.

Abstract

The hyperscale cloud innovates through relentless optimization. Cloud providers are always looking for ways to maximize the efficiency of every hardware and software component they deploy. To help them achieve that goal, KIOXIA released the open source Software-Enabled Flash™ API, which redefines the relationship between the host and flash devices, and allows cloud-scale users to unlock the most value from their flash. By removing the impediment of legacy protocols used by current designs, and architecting a completely new way of using flash, Software-Enabled Flash technology delivers software-defined flash to hyperscale users.

Collaborating with hyperscale developers, KIOXIA made Software-Enabled Flash technology more powerful and easier to use with a new, higher-level SDK. This new SDK enables low-code prototyping and evaluation by providing high-level abstractions to speed application development and flash deployment. It enhances the Software-Enabled Flash ecosystem with a reference flash translation layer (FTL) and virtual I/O (VIRTIO) device, as well as command-line management and testing utilities. It also preserves full access to the low-level Software-Enabled Flash API primitives: workload isolation, quality of service management, latency outcome control, generation and vendor abstraction, and flash management offload. KIOXIA will present this new SDK and describe how it can be used by developers to solve storage challenges, and help derive the maximum value of deployed flash, at hyperscale.


Beyond Zoned Namespaces – ZNSNLOG Bridging the Semantic Gap

Chun Liu, Chief Architect, Futurewei Technologies

Abstract

As data processing engines rely more and more on log semantics, it is natural to extend Zoned Namespaces to provide a native log interface by introducing variable-size, byte-appendable, named zones. With the newly introduced ZNS:NLOG interface, the storage device not only achieves lower write amplification and higher log write performance, but also provides a more flexible, robust, and scalable naming service that natively facilitates distributed frameworks and applications. Given the trend towards a compute-storage disaggregation paradigm and more capable computational storage, the ZNSNLOG extension enables direct mapping and opens more opportunities for near-data processing.
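
A purely conceptual sketch of what a named, byte-appendable log interface of this kind might look like to software is shown below; the class and method names are illustrative assumptions and are not taken from any specification or device interface.

# Purely conceptual sketch of a named, byte-appendable log interface.
# The class and method names are illustrative assumptions only.

class NamedLogDevice:
    """In-memory stand-in for a device exposing variable-size, named, append-only zones."""
    def __init__(self):
        self._logs = {}

    def create(self, name: str):
        self._logs.setdefault(name, bytearray())

    def append(self, name: str, payload: bytes) -> int:
        """Append bytes to a named log and return the byte offset of the record."""
        log = self._logs[name]
        offset = len(log)
        log.extend(payload)
        return offset

    def read(self, name: str, offset: int, length: int) -> bytes:
        return bytes(self._logs[name][offset:offset + length])

dev = NamedLogDevice()
dev.create("wal/shard-7")
pos = dev.append("wal/shard-7", b'{"op":"put","k":1}')
print(dev.read("wal/shard-7", pos, 18))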


Innovations in Load-Store I/O causing Profound Changes in Memory, Storage, and Compute landscape

Dr. Debendra Das Sharma, Intel Fellow, Intel Corporation

Abstract

Emerging and existing applications with cloud computing, 5G, IoT, automotive, and high-performance computing are causing an explosion of data. This data needs to be processed, moved, and stored in a secure, reliable, available, cost-effective, and power-efficient manner. Heterogeneous processing, tiered memory and storage architecture, accelerators, and infrastructure processing units are essential to meet the demands of this evolving compute, memory, and storage landscape. These requirements are driving significant innovations across compute, memory, storage, and interconnect technologies. Compute Express Link* (CXL) with its memory and coherency semantics on top of PCI Express* (PCIe) is paving the way for the convergence of memory and storage with near memory compute capability. Pooling of resources with CXL will lead to rack-scale efficiency with efficient low-latency access mechanisms across multiple nodes in a rack with advanced atomics, acceleration, smart NICs, and persistent memory support. In this talk we will explore how the evolution in load-store interconnects will profoundly change the memory, storage, and compute landscape going forward.


The Perspective of Today’s Storage Architectures, Viewed Through a Long Lens

Randy Kreiser, Storage Specialist/FAE, Supermicro

Alan Johnson, Director of Emerging Technologies, Supermicro

Abstract

Two veterans of the storage industry discuss the impact of hardware innovation on the common deployment methods used in SDS.

At scale, embedded storage solutions fall away in favor of distributed approaches that encapsulate and aggregate the many previous steps taken, bringing us to the common SDS methods we use today. The session speakers were present for many of these baby-steps and major milestones, providing a unique perspective to our current “state of the art.”


Amazon Aurora storage – Purpose built storage for databases

Murali Brahmadesam, Director of Engineering, Aurora Databases, Amazon

Abstract

In this talk you will learn about the reasons for building a specialized storage system for databases, and how Aurora databases are built using such a storage system in the cloud.

NVMe

NVMe 2.0 Specifications: The Next Generation of NVMe Technology

Peter Onufryk, Intel Fellow, Intel

Abstract

The NVM Express® (NVMe®) family of specifications, released in June 2021, allows for faster and simpler development of NVMe solutions to support the increasingly diverse NVMe device environment, which now includes Hard Disk Drives (HDDs). The extensibility of the specifications encourages the development of independent command sets like Zoned Namespaces (ZNS) and Key Value (KV) while enabling support for the various underlying transport protocols common to NVMe and NVMe over Fabrics (NVMe-oF™) technologies. The NVMe 2.0 library of specifications has been broken out into multiple documents, including the NVMe Base specification, various Command Set specifications, various Transport specifications and the NVMe Management Interface specification.

In this session, attendees will learn how the restructured NVMe 2.0 specifications enable the seamless deployment of flash-based solutions in many emerging market segments. This session will provide an overview of, and usages for, several of the new features in the NVMe 2.0 specifications, including ZNS, KV, Rotational Media and Endurance Group Management. Finally, the session will cover how these new features will benefit the cloud, enterprise and client market segments.


Towards Copy-Offload in Linux NVMe

Kanchan Joshi, Staff engineer, Samsung Semiconductor India Research (SSIR)

Selvakumar Somasundaram, Engineer, Samsung Semiconductor India Research (SSIR)

Abstract

The de facto way of copying data in the I/O stack has been to pull it from one location and then push it to another. The farther the application requiring the copy is from storage, the longer the round trip takes. With copy-offload the trip gets shorter, as the storage device presents an interface to do internal data copying. This enables the host to optimize away the pull-and-push method, freeing up the host CPU, RAM, and the fabric elements.

The copy-offload interface has existed in SCSI storage for at least a decade through XCOPY, but faced insurmountable challenges in getting into the Linux I/O stack. As for NVMe storage, copy-offload recently made its way into the main specification with the new Simple Copy Command (SCC). This has stimulated renewed interest and effort toward copy-offload in the Linux community.

This talk presents the copy-offload work in Linux, with a focus on the NVMe Simple Copy Command. We outline the previous challenges and extensively cover the current upstream efforts towards enabling SCC; we believe these efforts can sprout more copy-offload standardization advancements. We also elaborate on the kernel/application interface and the use cases that are being built and discussed in the Linux I/O stack. The talk will conclude with evaluation data comparing SCC with a regular read-write based copy.
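
For context, the sketch below shows the baseline pull-and-push copy that copy-offload is meant to replace; with SCC the host would instead submit a single copy command naming the source ranges and destination via an NVMe passthru path, which is not shown here. The file path and sizes are illustrative.

# Sketch of the baseline "pull and push" copy that copy-offload replaces: the
# host reads every byte into its own RAM and writes it back out, paying CPU,
# memory, and fabric costs for data it never inspects.
import os

def pull_and_push_copy(fd: int, src_off: int, dst_off: int, length: int,
                       chunk: int = 1 << 20) -> None:
    """Copy a byte range within one file/block device through host memory."""
    copied = 0
    while copied < length:
        n = min(chunk, length - copied)
        data = os.pread(fd, n, src_off + copied)    # pull into host RAM
        os.pwrite(fd, data, dst_off + copied)       # push back to the device
        copied += n

# Usage (illustrative): duplicate the first 16 MiB of a scratch file.
fd = os.open("/tmp/scratch.img", os.O_RDWR | os.O_CREAT)
os.ftruncate(fd, 32 * 1024 * 1024)
pull_and_push_copy(fd, 0, 16 * 1024 * 1024, 16 * 1024 * 1024)
os.close(fd)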


Enabling Asynchronous I/O Passthru in NVMe-Native Applications

Javier Gonzalez, Principal Software Engineer, Samsung Electronics

Kanchan Joshi, Staff engineer, Samsung Electronics

Simon Lund, Staff Engineer, Samsung

Abstract

Storage interfaces have evolved more in the past 3 years than in the previous 20 years. In Linux, we see this happening at two different layers: (i) the user- / kernel-space I/O interface, where io_uring is bringing a low-weight, scalable I/O path; and (ii) and the host/device protocol interface, where key-values and zoned block devices are starting to emerge.

Applications that want to leverage these new interfaces have to at least change their storage backends. This presents a challenge for early technology adopters, as the mature part of the Linux I/O stack (i.e., the block layer I/O path) might not implement all the needed functionality. While alternatives such as SPDK tend to be available more rapidly, the in-kernel I/O path presents a limitation.

In this talk, we will present how we are enabling an asynchronous I/O path for applications to use NVMe devices in passthru mode. We will speak to the upstream efforts to make this path available in Linux. More specifically, we will (i) detail the changes in the mainline Linux kernel, and (ii) we will show how we are using xNVMe to enable this new I/O path transparently to applications. In the process, we will provide a performance evaluation to discuss the trade-offs between the different I/O paths in Linux, including block I/O io_uring, passthru io_uring, and SPDK.


Can SPDK Deliver High Performance NVMe on Windows?

Nick Connolly, Chief Scientist MayaData / DataCore Software

Abstract

Yes, it really does say Windows and SPDK in the same sentence! The Storage Performance Development Kit (SPDK) is a well-regarded set of tools and libraries for writing high performance user mode storage applications on Linux and FreeBSD. However, in a typical Data Centre, a significant percentage of the servers will be running Microsoft Windows where the options for NVMe support are more limited.

This talk looks at what is involved in making SPDK run natively on Windows. Starting with the creation of the Windows Platform Development Kit (WPDK) as a base, it covers the design options that were considered, the build environment, the trade offs and potential benefits. The current state is explained, together with examples of the changes that were required in both the SPDK and Data Plane Development Kit (DPDK).

WPDK provides the POSIX functionality needed to run SPDK on Windows, implementing a set of headers and a lightweight library which map functionality as closely as possible to existing Windows semantics. It is intended to be a production quality layer that runs as native code, with no surprises, that can be tested independently. Although still at an experimental stage, the project has successfully served NVMe over TCP and iSCSI targets that are directly attached to physical NVMe disks.

As a newcomer to open source development, there will also be a few personal reflections on the experience of gaining support from both the SPDK and DPDK communities to realise the vision.


Testing NVMe SSDs against the DatacenterSSD Specification

David Woolf, Senior Exec Technology Office, UNH-IOL

Abstract

The DatacenterSSD specification has been created by a group of hyperscale datacenter companies in collaboration with SSD suppliers and enterprise integrators. What is in this specification? How does it expand on the NVMe specification? How can devices demonstrate compliance? In this talk we’ll review important items from the DatacenterSSD specification to understand how it expands on the NVMe family of specifications for specific use cases in a datacenter environment. Since the DatacenterSSD specification goes beyond just an interface specification, we will also show what test setups can be used to demonstrate compliance.


QEMU NVMe Emulation: What's New

Klaus Jensen, Staff Software Engineer, Samsung Electronics

Padmakar Kalghatgi, Software Engineer, Samsung Electronics

Naveen Nagar, Software Engineer, Samsung Electronics

Abstract

The QEMU emulated NVMe device is used by developers and users alike to develop, test and verify device drivers and tools. The emulated device is in rapid development, and with QEMU 6.0 it was updated to support a number of additional core features, such as an update to NVMe v1.4, universal Deallocated and Unwritten Logical Block Error support, and enhanced PMR and CMB support, as well as a number of experimental features such as Zoned Namespaces, multipath I/O, namespace sharing and DIF/DIX end-to-end data protection. The addition of these features allows users to test various configurations and developers to exercise device driver code paths that would normally not be easily exercised on generally available hardware.

In this talk we present the implementation of these features and how they may be used to improve tooling and device drivers.


Boosting the Performance and QoS of MySQL with ZNS SSDs

Aravind Ramesh, Principal Engineer, Western Digital

Abstract

Zoned Namespace SSDs are SSDs that implement the ZNS command set as specified by the NVM Express organization. ZNS SSDs provide an interface to the host such that the host/applications can manage the data placement on these SSDs directly.

This presentation plans to cover the basics of ZNS SSDs and demonstrate the software stack through which MySQL can be run on ZNS SSDs. MySQL integrated with RocksDB is called MyRocks; the presentation discusses ZNS support in RocksDB and MySQL and evaluates the performance and QoS benefits for MySQL on a ZNS SSD vis-à-vis a conventional SSD.

NVMe-oF

NVMe-oF: Protocols & Transports Deep Dive

Joseph White, Fellow, Dell

Abstract

Block storage access across Storage Area Networks (SANs) has an interesting protocol and transport history. The NVMe-oF transport family provides storage administrators with the most efficient and streamlined protocols so far, leading to more efficient data transfers and better SAN deployments. In this session we will explore some of the protocol history to set the context for a deep dive into NVMe/TCP, NVMe/RoCE, and NVMe/FC. We will then examine network configurations, network topology, QoS settings, and offload processing considerations. This knowledge is critical when deciding how to build, deploy, operate, and evaluate the performance of a SAN, as well as for understanding end-to-end hardware and software implementation tradeoffs.

Agenda

  • SAN Transport History and Overview
  • Protocol History
  • Protocol Comparisons
  • NVMe/FC Deep Dive
  • NVMe/RoCE Deep Dive
  • NVMe/TCP Deep Dive
  • Networking Configurations and Topologies for NVMe-oF
  • QoS, Flow Control and congestion
  • L2 local vs L3 routed vs Overlay
  • Offload Processing Considerations and Comparisons

Automating the Discovery of NVMe-oF Subsystems over an IP Network

Erik Smith, Distinguished Member of Technical Staff, Dell

Abstract

NVMe/TCP has the potential to provide significant benefits in application environments ranging from the edge to the data center. However, to fully unlock its potential, we first need to overcome NVMe over Fabrics' discovery problem. This discovery problem, specific to IP-based fabrics, can result in the need for the host admin to configure each host to access the appropriate NVM subsystems. In addition, any time an NVM subsystem is added or removed, the host admin needs to update the impacted hosts. This process of explicitly updating the host any time a change is made does not scale when more than a few host and NVM subsystem interfaces are being used. Due to the decentralized nature of this process, it also adds complexity when trying to use NVMe-oF in environments that require high degrees of automation.

For these and other reasons, Dell Technologies, along with several other companies, has been collaborating on innovations that enable an NVMe-oF IP-based fabric to be centrally managed. These innovations, being tracked under nvme.org's Technical Proposals TP-8009 and TP-8010, enable administrators to set a policy that defines the relationships between hosts and the NVM subsystems they need to access. These policies are then used by a Centralized Discovery Controller to allow each host to automatically discover and connect to only the appropriate NVM subsystems and nothing else.


Securing an NVMe-oF IP Fabric

Claudio DeSanti, Distinguished Engineer, Dell Technologies

Abstract

Storage Area Networks (SANs) are usually used to access the most critical data of an organization, therefore ensuring their security is of paramount importance. This presentation will introduce the general SAN security threats and the methods (authentication and secure channel) to mitigate them. It will also present the authentication protocol and secure channel specifications that have been defined for NVMe-oF over IP fabrics, with special attention to the NVMe/TCP case.


Intel SmartNIC/IPU Based NVMe/TCP Initiator Offload

Ziye Yang, Staff software engineer, Intel

Yadong Li, Principal Engineer, Intel

Abstract

The Infrastructure Processing Unit (IPU) is an evolution of the SmartNIC, focusing on infrastructure processing such as networking offload and storage offload. The IPU is a critical ingredient in disaggregated computer architectures and is becoming a control point in the DC-oF (Data Center of the Future). In this talk, we would like to share an NVMe over TCP initiator implementation as an example of IPU-based storage offload, focusing on SPDK (https://spdk.io) support for an IPU-based NVMe over TCP initiator solution.

First, we will introduce our high-performance 200GbE IPU and present the software components that compose the SPDK-based NVMe over TCP software stack. Second, we will share the performance optimizations made in SPDK over the last year, which leverage the Intel® Ethernet 800 Series with Application Device Queues (ADQ) to improve SPDK-based NVMe/TCP initiator performance. Third, we will share performance results from the Intel® Ethernet 800 Series with ADQ, demonstrating that ADQ significantly improves performance (increasing IOPS and reducing long-tail latency) for the SPDK-based NVMe over TCP initiator.


Scalable Storage Performance for High Density Applications

Jayamohan Kallickal, Distinguished Engineer, Broadcom

Naveen Krishnamurthy, Sr. Product Manager, VMware

Abstract

Real-time analytics and data-intensive applications have driven the adoption of high-performance, low-latency, highly parallel NVMe solutions, and we are now on the cusp of wide adoption of NVMe over Fabrics (NVMe-oF) storage in both on-prem and cloud data centers. Since the introduction of NVMe-oF, a lot of investment and effort has gone into improving overall storage performance and features, resulting in substantial improvements in IOPS and CPU utilization and lowering TCO for the customer.

NVMe over Fabrics lends itself well to the demands of modern applications on Kubernetes platforms, and operating systems are now able to scale up and saturate the storage bandwidth.

We will go over various architectures and their benefits and pitfalls: how modern applications are driving requirements for low-latency IO, how the latest 64G Fibre Channel provides 10+ million IOPS with around 10 microseconds of latency, and what this means to the end customer. We will go over the performance benefits from the recent changes with respect to NVMe over Fabrics. We will also focus on how commonly used applications running in VMs/containers are driving an increase in IO density, and how to get optimum performance.


Challenges and Effects of EDSFF-based NVMe-oF Storage Solution

Duckho Bae, Principal Engineer, Samsung Electronics

Jungoo Kim, Staff Engineer, Samsung Electronics

Abstract

EDSFF (Enterprise and Data Center SSD Form Factor) SSDs have been widely adopted in large-scale data centers because of their superior manageability, serviceability, and power/thermal characteristics. The new form factor enables high-performance (up to PCIe Gen6) and high-capacity (up to 128TB) SSDs, and allows new types of devices such as NICs, AI accelerators, and CXL DRAM devices to be supported in a system. However, adopting EDSFF in an NVMe-oF system is non-trivial. For example, with high-capacity EDSFF SSDs, a single storage server can easily reach a petabyte of total capacity. In order to manage these resources efficiently, we must consider a number of factors such as CPU resource allocation and scheduling, memory management, and network bandwidth.

In this talk, we will compare the EDSFF E1 and E3 form factors. We will also discuss challenges and opportunities in adopting the new form factor in an NVMe-oF storage system from the HW and SW points of view. In particular, we want to share a reference server architecture and will present a user-space software stack for an NVMe-oF solution. The SW adopts a user-space driver, NUMA-aware capabilities, and a two-stage IO stack, enabling up to 400GbE performance over a TCP network. The SW also manages its metadata with low overhead, providing petabyte-scale capacity derived from high-capacity EDSFF SSDs. Finally, we will share the performance measurement methodology for systems equipped with large quantities of EDSFF NVMe SSDs.


NVMe/TCP in the Enterprise

Mukesh Gupta, Technical Staff, Dell Technologies

Murali Rajagopal, Director of Technology, VMware

Abstract

This session provides an overview of designing and implementing NVMe/TCP across shared storage for the enterprise. This session will cover enabling this technology in Dell PowerStore, and will be co-presented with VMware to discuss NVMe/TCP initiator support. An analysis on the benefits of NVMe/TCP will be covered, along with the ecosystem support needed to ensure enterprise readiness, simplicity and scale.

Security and Privacy

Ransomware!!! – an Analysis of Practical Steps for Mitigation and Recovery

Thomas Rivera, Strategic Success Manager, VMware

Mounir Elmously, Executive, Consulting Services Ernst & Young

Abstract

Malware, short for malicious software, is a blanket term for viruses, worms, trojans and other harmful software that attackers use to damage, destroy, and gain access to sensitive information; software is identified as malware based on its intended use, rather than a particular technique or technology used to build it.

Ransomware is a blended malware attack that uses a variety of methods to target the victim’s data and then requires the victim to pay a ransom (usually in cryptocurrency) to the attacker to regain access to the data (with no guarantees).

However, the landscape is changing, and ransomware is no longer just about a financial ransom. Attacks are now being aimed at the infrastructure and undermining public confidence, witness recent headlines regarding incidents affecting police informant databases and oil pipeline sensors. There is also the recent US Treasury guideline to businesses advising them not to pay the ransom.

What can we realistically do to prevent such attacks, or do we simply surrender and accept we will lose our data and that the insurance payout will cover any loss? There is increasing evidence that the insurance companies are unwilling to meet those claims, so the situation is perilous as the criminals always appear one step ahead.

As a starting point, everyone needs to start assuming they will be attacked at some stage – therefore prevention and mitigation strategies should be based on that assumption.

This session outlines the current threats, the scale of the problem, and examines the technology responses currently available as countermeasures. What can be done to prevent an attack? What works and what doesn’t? What should storage developers be thinking about when developing products that need to be more resilient to attack?


Privacy's Increasing Role in Technology

Cathleen Scerbo, VP, Chief Information Officer, International Association of Privacy Professionals

Abstract

Every organization today is in some state of digital transformation. While the understanding of security needs in the digital age has matured significantly in the last two decades, the implications for data privacy, and in particular its interaction with technology solutions, are still not well understood. As data regulations and laws continue to evolve globally, organizations require an increased understanding of privacy requirements and their impact on technology solutions. In this session, Cathy will provide a high-level overview of data privacy, including a snapshot of the evolution of privacy, key privacy principles, Privacy by Design, privacy and the SDLC, and the NIST Privacy Framework. Cathy will also discuss the overlap between security and privacy and highlight the critical role tech professionals play in data privacy today.


Designing with Privacy in Mind

David Sietz, Systems / Solution Architect, IAPP

Abstract

Business requirements are not the only influences on our technical solutions. Laws and regulations transform the technical landscape in ways that require us to redefine our architecture, as well as our skill set. This is especially true with data privacy. Since GDPR and CCPA, our industry has witnessed a new career path emerge: the Privacy Engineer. Privacy engineering today is where security was 10 years ago. Join us as we look at Privacy by Design (PbD) and introduce some architecture patterns that align with privacy strategies.

Agenda:

  • Overview
  • Data Usage Agreements
  • Data Tracker Chain
  • Data Security Guard
  • Data Privacy Inspector
  • Forward Thinking

Emerging Storage Security Landscape

Eric Hibbard, Director, Product Planning – Storage Networking & Security, Samsung Semiconductor, Inc.

Abstract

Current storage technologies include a range of security features and capabilities to allow storage to serve as a last line of defense in an organization’s defense in depth strategy. However, the threat landscape continues to change in negative ways, so new responses are needed. Additionally, the storage technology itself is changing to address the increased capacity and throughput needs of organizations.

Technical work in ISO/IEC, IEEE, NVM Express, DMTF, OpenFabrics Alliance, Trusted Computing Group (TCG), Open Compute Project (OCP), Storage Networking Industry Association (SNIA), etc. is introducing new storage technologies, specifying the way storage fits into increasingly complex ICT ecosystems, and identifying protection mechanisms for data and the systems themselves. Understanding these developments and their interrelationships will be critical for securing the storage systems of the future.

This session highlights important storage security elements of both current and emerging storage technologies, including encryption, key management, storage sanitization, roots of trust and attestations, secure communications, and support for multitenancy. Like storage, security technologies are also changing, so crypto-agility, protocol changes, and security practices (e.g., zero trust) are explored.


Quantum Safe Cryptography for Long Term Security

Basil Hess, Research Engineer, IBM Research Europe

Abstract

Quantum computers with the capability to threaten the cryptography used today may seem a long way off, but they already pose a threat to both data and systems that we are protecting today. This talk will introduce the quantum threat and discuss why this is already a topic for today and not sometime in the future when large quantum systems will emerge, with particular considerations for long-term secure storage.

This will be followed by an overview of the race to standardize new cryptographic algorithms that are secure even against large quantum computers of the future. The new quantum safe algorithms will bring a lot of diversity to the cryptographic landscape. It is expected that multiple schemes will be standardized, based on different mathematical problems such as lattices, isogenies of elliptic curves or error-correcting codes. Different performance and bandwidth characteristics will further increase the complexity of cryptographic management and will pose a demand for cryptographic agility.

We will further give an overview of ongoing quantum-safety projects in areas such as storage, and will also show how developers can already prototype quantum safe applications today using open-source projects like Open Quantum Safe.


Sanitization – Forensic-Proofing Your Data Deletion

Eric Hibbard, Director, Product Planning – Storage Networking & Security, Samsung Semiconductor, Inc.

Abstract

Almost everyone understands that systems and data both have lifecycles that typically include a disposal phase (i.e., what you do when you do not need something anymore). Conceptually, data needs to be eliminated either on a system or entirely (everywhere stored) as part of this disposal. Simply hitting the delete-key may seem like the right approach, but the reality is that eliminating data can be difficult. Additionally, failing to correctly eliminate certain data can result in costly data breach scenarios.

“Sanitization” is the term used to label actions taken to eliminate data with a given level of assurance. This assurance assumes a competent forensic professional with a full complement of forensic tools being used for data recovery attempts. To be successful, the sanitization techniques must be matched to the underlying storage and, in some cases, may require action prior to the recording of any data.

This session outlines the various forms of sanitization and methods used (e.g., clear, purge, and destruct). In addition, details are provided on representative storage to help explore what needs to be done, what can go wrong, and identify additional measures that may be needed to protect an organization. Lastly, the session will provide information on the state of sanitization standards and practices.

SMB

SMB3 Landscape and Directions

Wen Xin, Software Engineer, Microsoft Corporation.

Steven Tran, Software Engineer, Microsoft Corporation

Abstract

SMB3 has seen significant adoption as the storage protocol of choice for running private cloud deployments. With the recent advances in persistent memory technologies, we will take a look at how we can leverage the SMB3 protocol in conjunction with SMBDirect/RDMA to provide very low latency access to persistent memory devices across the network. With the increasing popularity of cloud storage, technologies like Azure Files, which provide seamless access to cloud-stored data via the standard SMB3 protocol, are seeing significant interest. One of the key requirements in this space is to be able to run SMB3 over a secure, firewall-friendly internet protocol. We will update you on some work we are doing to enable SMB3 over QUIC, a recent UDP-based transport with strong security and interop properties. We will also explore some work we have done to enable on-the-wire compression for SMBDirect and SMB3.


To the Cloud and Beyond, Accessing Files Remotely from Linux: Update on SMB3.1.1 Client Progress

Steven French, Principal Software Engineer - Azure Storage, Microsoft

Abstract

Over the past year many improvements have been made in Linux for accessing files remotely. This has been a great year for cifs.ko with the addition of new SMB3.1.1 features and optimizations. It continues to be the most active network/cluster file system on Linux.

Improvements to performance with handle leases (deferred close), multichannel, signing improvements, huge gains in read ahead performance, and directory and metadata caching improvements have been made. And security has improved with support for the strongest encryption, and more recently the exciting work on QUIC. Many other security improvements have been added and will be described. This presentation will go through the features added over the past year to the Linux client, and demonstrate how they help common scenarios, from accessing the cloud faster (like Azure) to accessing Samba, Windows, Macs and the new Linux kernel server (ksmbd).


The New Samba VFS

Ralph Böhme, Senior Software Engineer, SerNet / Samba Team

Abstract

Starting with version 4.14 Samba provides core infrastructure code that allows basing all access to the server's filesystem on file handles and not on paths. An example of this is using fstat() instead of stat(), or SMB_VFS_FSTAT() instead of SMB_VFS_STAT() in Samba parlance.

Historically Samba's fileserver code had to deal a lot with processing path based SMB requests. While the SMB protocol itself has been streamlined to be purely handle based starting with SMB2, large parts of infrastructure code remains in place that will "degrade" handle based SMB2 requests to path based filesystem access.

In order to fully leverage the handle-based nature of the SMB2 protocol, we came up with a straightforward way to convert this infrastructure code so that it makes use of a purely handle-based VFS interface.
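
The same path-based versus handle-based pattern can be illustrated with Python's os module rather than Samba's VFS (purely for orientation; this is not Samba code): resolving a path once and then operating on the open handle avoids repeated name lookups and the races that come with them.

# Path-based vs. handle-based access, illustrated with Python's os module.
# This is an illustration of the general pattern only, not Samba code.
import os

path = "/tmp/example.txt"
open(path, "w").close()

# Path-based: every call re-resolves the name (what SMB_VFS_STAT-style code does).
st1 = os.stat(path)

# Handle-based: resolve once, then keep using the handle (SMB_VFS_FSTAT-style).
fd = os.open(path, os.O_RDONLY)
try:
    st2 = os.fstat(fd)          # metadata via the handle
    data = os.pread(fd, 64, 0)  # data access via the same handle
finally:
    os.close(fd)

assert st1.st_ino == st2.st_ino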

The talk will present what we have achieved so far and what is left to do. Its intended audience is anyone working on the Samba fileserver code and anyone working on Samba VFS modules.


Samba Multi-Channel/io_uring Status Update

Stefan Metzmacher, Developer, SerNet/Samba-Team

Abstract

Samba has had experimental support for multi-channel for quite a while. SMB3 includes a few concepts for replaying requests safely.

We now implement them completely (and in parts better than a Windows Server). The talk will explain how we implemented the missing features.

With the increasing amount of network throughput, we'll reach a point where data copies are too much for a single CPU core to handle. This talk gives an overview of how the io_uring infrastructure of the Linux kernel could be used in order to avoid copying data, as well as to spread the load between CPU cores. A prototype for this exists and shows excellent results.

Solid State Storage Solutions

Challenges & Opportunities with Hyper-Scale Boot Drives

Karthik Shivaram, Storage Engineer, Facebook

Abstract

Boot devices are a critical component of servers used at scale in the data center. The needs and use cases for boot drives in the data center are very different from those of client HDD- and flash-based boot drives in laptop applications. This talk will describe the differences between client and hyperscale boot drive use cases, the unique needs hyperscalers have for flash-based boot drives, and the challenges of deploying flash-based boot drives at scale. This talk will also discuss the pros and cons of various boot drive options, now and in the future, available to resolve these challenges.


From DRAM to SSDs, Challenges with Caching at FB Scale

Sathya Gunasekar, Software Engineer, Facebook

Abstract

Large-scale caching systems are making a transition from DRAM to SSDs for cost/power trade-offs, and that brings out interesting challenges for both software and hardware. At Facebook, CacheLib is a widely deployed general-purpose caching engine enabling this transition through hybrid caching. In this talk, we introduce the hybrid cache, highlight the challenges of hybrid caches at scale, and describe the various techniques Facebook adopts to tackle these challenges. Looking forward, CacheLib offers a platform for the software, hardware and academic communities to collaborate on solving the upcoming challenges with new hardware and software solutions for caching.
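
A purely conceptual sketch of the hybrid-cache idea, where DRAM evictions spill to a flash tier instead of being dropped; this illustrates the concept only and is not the CacheLib API or its admission and eviction policies.

# Purely conceptual sketch of a hybrid (DRAM + SSD) cache: small hot items stay
# in a memory tier, evictions spill to a flash tier instead of being dropped.
# This is an illustration of the idea, not the CacheLib API or its policies.
from collections import OrderedDict

class HybridCache:
    def __init__(self, dram_items: int, flash_items: int):
        self.dram = OrderedDict()           # fast tier, strictly bounded
        self.flash = OrderedDict()          # capacity tier, also bounded
        self.dram_items, self.flash_items = dram_items, flash_items

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key]
        if key in self.flash:               # promote back to DRAM on a flash hit
            value = self.flash.pop(key)
            self.put(key, value)
            return value
        return None

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_items:
            old_key, old_val = self.dram.popitem(last=False)
            self.flash[old_key] = old_val   # spill instead of drop
            if len(self.flash) > self.flash_items:
                self.flash.popitem(last=False)

cache = HybridCache(dram_items=2, flash_items=4)
for i in range(5):
    cache.put(i, f"value-{i}")
print(cache.get(0))   # served from the flash tier after spilling from DRAM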


SPDK Implementation on a Manycore / Many Node System

Jean-François Marie, Chief Solution Architect, Kalray

Rémy Gauguey, Senior Software Architect, Kalray

Abstract

As you know, the Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. Kalray’s MPPA® manycore architecture offers a unique 80-core system.

A manycore processor is characterized by an apparent grouping from a software point of view of cores and their portion of the memory hierarchy into computing units. This grouping can delimit the scope of cache consistency and inter-core synchronization operations, include explicitly addressed local working memories (as opposed to caches), or even specific data movement engines and other accelerators. Computing units interact and access external memories and processor I/O through a communication device that can take the form of a network-on-chip (NoC).

The advantage of the manycore architecture is that a processor can scale to massive parallelism by replicating the computing units and extending the network on chip, whereas for a multi-core processor the replication applies to the core level. For storage purposes, the internal processor clusters are configured with one dedicated cluster as a control and management plane, and the remaining four clusters as four independent data planes.

We have implemented SPDK so that it provides a unique scalable platform that can deliver high performance on an 80-core system. This presentation will explain how we ported SPDK to our processor, what unique pieces of technology were developed to coordinate with the processor internals, and how the platform can scale.

Storage Architecture

Netflix Drive for Media Assets

Vikram Krishnamurthy, Senior Software Engineer, Netflix, Inc.

Kishore Kasi, Software Engineer in Data Platform, Netflix, Inc.

Abstract

Netflix Studios produces petabytes of media content, accounting for billions of media assets. These assets are managed, created, edited, encoded, and rendered by artists working on a multitude of workstation environments that run in the cloud, from different parts of the globe. Artists working on a project may only need access to a subset of the assets from a large corpus. Artists may also want to work in their personal workspaces on intermediate content, and would like to keep only the final copy of their work persisted in the cloud.

Ever wondered about the architecture that works for this scale and provides artists with a secure, performant and seamless storage interface?

In this talk, we present Netflix Drive, a Cloud Drive for Studio Media applications and a generic paved path solution for storing and retrieving all assets in Netflix. Netflix Drive ties together disparate data and metadata stores in a cogent form for creating and serving assets.

Talk Structure:
In this talk, we will share with the audience how Netflix Drive is an extensible, scalable, performant, hybrid architecture for managing Studio and Media assets. We explore how Media pipelines leverage the dynamic namespace design provided by Netflix Drive to expose pertinent assets to artists. We also highlight different instance types of Netflix Drive that open up several integrations with tools and workstations used by Studio artists.

Key Takeaways:
As studio applications generate and consume assets, there is a need to design scalable architectures that work in the cloud and on-premises, provide a globally consistent view of data, and integrate seamlessly with artist workflows. In this talk, attendees will learn about an extremely performant and scalable file system built using FUSE to provide an intuitive interface to artists, and how multiple data and metadata stores, which can be on-premises or in the cloud, can be plugged into Netflix Drive’s ecosystem. Attendees will also learn how different instances of Netflix Drive can be used by different studio applications and workflows to store and retrieve pertinent content.


From NASD to DeltaFS: CMU and Los Alamos's Efforts in Building Large-Scale Filesystem Metadata

Qing Zheng, Scientist, Los Alamos National Laboratory

Abstract

It has been a tradition that, every once in a while, we stop and reassess whether we need to build our next filesystems differently. A key previous effort was Carnegie Mellon University's NASD project, which decoupled filesystem data communication from metadata management and leveraged object storage devices for scalable data access. Now, as we enter the exascale age, we once again need bold ideas to advance parallel filesystem performance if we are to keep up with the rapidly increasing scale of today's massively parallel computing environments.

In this presentation, we introduce DeltaFS, a research project at Carnegie Mellon University and Los Alamos National Lab. DeltaFS is based on the premise that at exascale and beyond, synchronization of anything global should be avoided. Conventional parallel filesystems, with fully synchronous and consistent namespaces, mandate synchronization with every file create and other filesystem metadata operations. This must stop. At the same time, the idea of dedicating a single filesystem metadata service to meet the needs of all applications running on a single shared computing environment is archaic and inflexible. This too must stop.

DeltaFS allows parallel computing jobs to self-commit their namespace changes to logs that are later published to a registry, avoiding the cost of global synchronization. Follow-up jobs selectively merge logs produced by previous jobs as needed, a principle we term No Ground Truth, which allows for scalable data sharing without requiring a global filesystem namespace. By following this principle, DeltaFS leans on the parallelism found when utilizing resources at the nodes where job processes run, improving metadata operation throughput as job processes increase. Synchronization is limited to an as-needed basis determined by the needs of follow-up jobs, through an efficient, log-structured format that lends itself to deep metadata writeback buffering and deferred metadata merging and compaction. Our evaluation shows that no ground truth enables more efficient inter-job communication, reducing overall workflow runtime by significantly improving client metadata operation latency and resource usage.
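
A toy rendering of the no-ground-truth idea, with hypothetical names, might look as follows; the real system uses log-structured on-disk formats and parallel merging rather than in-memory dictionaries.

  # Each job records its own namespace mutations as a log and publishes it to
  # a registry with no global synchronization; follow-up jobs build their view
  # by merging only the logs they care about. Illustrative only.
  registry = {}                                    # published logs, by job name

  def run_job(name, mutations):
      log = []                                     # per-job namespace log
      for op, path in mutations:
          log.append((op, path))                   # e.g. ("create", "/out/p0")
      registry[name] = log                         # publish; no global sync

  def namespace_view(parent_jobs):
      view = set()
      for job in parent_jobs:                      # selective, as-needed merge
          for op, path in registry[job]:
              if op == "create":
                  view.add(path)
              else:
                  view.discard(path)
      return view

  run_job("sim_rank0", [("create", "/out/particles.0")])
  run_job("sim_rank1", [("create", "/out/particles.1")])
  print(namespace_view(["sim_rank0", "sim_rank1"]))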


TCG Storage Workgroup Status Update

Chandra Nelogal, Distinguished Member of Technical Staff (DMTS), Trusted Computing Group

Joseph Chen, CEO, ULINK Technology

Abstract

This is an overview of the new standards work being defined in the storage work group of the TCG. This includes an overview of the TCG Opal SSC, SIIS (Storage Interface Interactions Specification), Configurable Namespace Locking, and Key Per IO. The session may also touch upon some of the enhancements being worked on in the work group, such as Settable Trylimits and the Persistence feature set.


Fine Grain Encryption Control for Enterprise Applications

Festus Hategekimana, Senior FW Engineer, Trusted Computing Group

Frederick Knight, Principal Standards Technologist, Trusted Computing Group

Abstract

The Key Per IO (KPIO) project is a joint initiative between the NVM Express® and TCG Work Groups (WGs) to define a new KPIO Security Subsystem Class (SSC) under TCG Opal SSC for NVMe® class of Storage Devices.

Self-Encrypting Drives (SEDs) perform continuous encryption on user-accessible data based on contiguous LBA ranges per namespace. This is done at interface speeds using a small number of keys generated/held in persistent media by the storage device. KPIO will allow large numbers of encryption keys to be managed and securely downloaded into the NVM subsystem. Encryption of user data then occurs on a per-command basis (each command may request the use of a different key).

This finer granularity of data encryption enables the following use cases:

  1. Easier support of the European Union’s General Data Protection Regulation’s (GDPR) “Right to be forgotten”.
  2. Easier support of data erasure when data is spread over many disks (e.g., RAID/erasure coded).
  3. Easier support of erasure of data that is mixed with other data needing to be preserved.
  4. Assigning an encryption key to a single sensitive file or host object.

The presentation will include a brief introduction to the architectural differences between traditional SEDs and the KPIO SSC, followed by an overview of the proposed TCG KPIO SSC specification and the features in the NVMe commands that allow use of KPIO. The talk will conclude by summarizing the current state of the standardization proposals within NVM Express.org and the TCG Storage WG.
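
As a rough, purely illustrative model of per-command keying (not the NVMe or TCG wire protocol), consider a key table plus write commands that each carry a key tag; the sketch below assumes the Python cryptography package for AES-GCM, and every name in it is hypothetical.

  import os
  from cryptography.hazmat.primitives.ciphers.aead import AESGCM

  key_table = {                       # keys "downloaded into the NVM subsystem"
      1: AESGCM.generate_key(bit_length=256),
      2: AESGCM.generate_key(bit_length=256),
  }

  def write_command(lba, data, key_tag):
      # Each command names a key tag; the payload is encrypted with that key.
      nonce = os.urandom(12)
      ct = AESGCM(key_table[key_tag]).encrypt(nonce, data, None)
      return {"lba": lba, "nonce": nonce, "ciphertext": ct, "key_tag": key_tag}

  def crypto_erase(key_tag):
      # Discarding one key renders only the data written with it unreadable.
      del key_table[key_tag]

  blk = write_command(0x1000, b"tenant-A record", key_tag=1)
  crypto_erase(1)                     # "right to be forgotten" for tenant A only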


Emerging Computer Architectures Powered by Emerging Memories

Thomas Coughlin, President, Coughlin Associates, Inc.

Jim Handy, Principal Analyst, Objective Analysis

Abstract

This talk will discuss the latest trends in the growth of emerging non-volatile memories and look ahead to the emergence of new computer architectures that will use non-volatile memories for near memory and shared far memory. Emerging non-volatile memory technologies (such as 3D XPoint, MRAM, and RRAM) are now available in the data center and in the next generation of AI-based IoT devices. Major foundries are making SoCs with MRAM and RRAM replacing NOR flash and SRAM. Non-volatile memory will play a big role in the data center, at the edge, and in endpoints.


Unify Data and Storage Management with SODA ODF

Steven Tan, VP & CTO Cloud Solution, Storage at Futurewei / Board Chairman at SODA Foundation, Futurewei Technologies

Anjaneya ‘Reddy’ Chagam, Principal Engineer and Chief SDS Architect, Intel’s Data Center Group / Board Member at SODA Foundation, Intel Corp.

Abstract

The Open Data Framework (ODF) unifies data and storage management from the core to the cloud to the edge. In this talk, we will show how ODF simplifies Kubernetes storage management, provides data protection for applications, and connects on-prem data to clouds. We will also introduce how ODF can be extended with other SODA projects such as DAOS, a distributed asynchronous object storage for HPC; ZENKO, a multicloud data controller with search functionality; CORTX, an object storage optimized for mass capacity; and others (YIG, LINSTOR, OpenEBS).

SODA Foundation is a Linux Foundation project focused on building an ecosystem of open source data management and storage software for data autonomy.


How to Connect a SAS System

Alex Haser, Senior Industry Standards Engineer, Molex, Board Member, SCSI Trade Association

Abstract

The new Serial Attached SCSI (SAS)-4.1 (INCITS 567) technology is being deployed in the market for use in practically every industry, including hyperscale data centers, banking, education, government, healthcare and manufacturing. It maintains backwards compatibility with previous-generation SAS implementations, which means that older drives will be compatible with newer storage controller and SAS expander products.

In this session, you’ll be introduced to the SAS Integrator’s Guide: a quick reference to the menu of standard connectors and cables required to assemble a SAS system, from the SAS drive (a hard drive or a solid-state drive) out to the enclosure level, including the associated box-to-box cabling.


A Tiering-Based Global Deduplication for a Distributed Storage System

Sung Kyu Park, Staff Engineer, Samsung Electronics

Myoungwon Oh, Staff Engineer, Samsung Electronics

Abstract

Reducing the amount of data stored is a huge advantage in lowering the total cost of ownership of a distributed storage system. To do this, deduplication, which removes redundant data, is being used as one of the promising ways to save storage capacity. In practice, however, traditional deduplication methods designed for local storage systems are not suitable for a distributed storage system due to several challenging issues. First, I/O overhead due to additional data and metadata processing can have a huge impact on performance, and the deduplication ratio is not high enough because data is distributed across multiple nodes. Second, it is not easy to design efficient metadata management for deduplicated data alongside legacy metadata management due to scale-out characteristics.

To address these challenges, in this talk we propose a global deduplication method with a multi-tiered storage design and a self-contained metadata structure. Tiering with a deduplication-aware replacement policy can improve deduplication efficiency by filtering for the more important chunks, those with a high deduplication ratio. In addition, by adopting a self-contained metadata structure, it can also provide compatibility with existing storage features such as recovery and snapshots. As a result, our proposed tiering-based global deduplication can reduce I/O traffic and save storage cost for a distributed storage system.
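
For readers new to deduplication, the core mechanism being extended here can be sketched in a few lines: chunks are fingerprinted, and only previously unseen fingerprints consume new capacity. The sketch below is illustrative only and deliberately omits the tiering, replacement policy, and self-contained metadata that are the subject of the talk.

  import hashlib

  CHUNK = 4096
  store = {}                                   # fingerprint -> chunk data

  def dedup_write(data):
      recipe, new_bytes = [], 0
      for off in range(0, len(data), CHUNK):
          chunk = data[off:off + CHUNK]
          fp = hashlib.sha256(chunk).hexdigest()
          if fp not in store:                  # unique chunk: store it once
              store[fp] = chunk
              new_bytes += len(chunk)
          recipe.append(fp)                    # object = a list of fingerprints
      return recipe, new_bytes

  obj = b"A" * 8192 + b"B" * 4096
  _, first = dedup_write(obj)
  _, second = dedup_write(obj)                 # identical object written again
  print(first, second)                         # 8192 new bytes, then 0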


On Asymmetrical Storage Implementations

Josh Salomon, Senior Principal Software Engineer, Red Hat

Orit Wasserman, Senior Principal Software Engineer, Red Hat

Abstract

We are all used to thinking that, from a performance perspective, symmetrical distributed storage systems perform best because of the weakest-link-in-the-chain effect. This presentation discusses situations in which an asymmetrical implementation reduces the cost of implementation and improves performance, thanks to the resource variety and price structure of the public cloud. The presentation discusses the model and the details of a cheaper and more performant asymmetrical Ceph deployment.

Storage Networking

Exploiting RDMA to Create a Geographically Dispersed Storage Network over the WAN with Real-Time Access to Data

Stephen Wallo, CTO, Vcinity

Abstract

Accessing data spread across geo-diverse locations within an enterprise challenges work productivity, time, and IT resources. Existing methods require data to be transferred and replicated, leading to delayed business insights or even insights based on stale data. In addition, having to send a copy of data to every user who requires it leads to copy sprawl, data management challenges, and compromised data security. Even with data transfer, data arrives at the destination in an unpredictable time, and performance varies with data type/size and application. Utilizing network data mover appliances and applications is so cumbersome that even physical transportation of media or data is considered an acceptable solution. In summary, there is a need to access data across geographic distances without replication.

It is commonly accepted that remote application execution is not possible when a high Bandwidth Delay Product (BDP), i.e., the product of a link's capacity (in bits per second) and its round-trip delay time (in seconds), exists between the compute location and the data location. The BDP between the compute and data when they are within a data center, on a LAN, or in the same cloud provider is acceptable for most common applications. However, as soon as the BDP increases when traversing a WAN over even minor latencies, these same applications become functionally unusable due to the time required to get the data to the location of the compute. BDP has a major impact on traditional networks using the TCP/IP protocol; various methods have been used to optimize or tune TCP/IP to make it more efficient, with moderate results.
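
A quick worked example shows how large the BDP becomes over a WAN; the link speed and latency below are arbitrary illustrative figures.

  # Back-of-the-envelope BDP: the amount of data "in flight" on a link is its
  # capacity multiplied by its round-trip time.
  link_bps = 10e9          # 10 Gbit/s WAN link (example figure)
  rtt_s    = 0.080         # 80 ms round trip, e.g. across a continent

  bdp_bits  = link_bps * rtt_s
  bdp_bytes = bdp_bits / 8
  print(f"BDP = {bdp_bits/1e6:.0f} Mbit = {bdp_bytes/1e6:.0f} MB in flight")
  # A TCP sender whose window is smaller than this cannot keep the link full,
  # which is why throughput collapses as round-trip delay grows.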

The Vcinity Data Access Platform™ (VDAP) enables enterprises to instantly access and operate on data sets over any distance, with local performance, and without copying them. This is accomplished by transforming the enterprise WAN into a Global LAN enabling local application performance on data over global distances. This capability lets enterprises leverage modern business tools such as machine learning modeling and artificial intelligence innovations leading to more efficient business processes and a greater competitive advantage. The concept of turning WAN into Global LAN and enabling a Global Fabric is achieved through RDMA over WAN.

With over 32 patents underpinning the VDAP portfolio, it is the individual advancements that in aggregate achieve application reach-in to data over distance. The solution is made up of an RDMA-based network fabric at Layer 2 or Layer 3 incorporating the following:

  • Routing at layer 3 supporting L2/L3 tunneling
  • Data inflight buffering / crediting creating a lossless WAN tunnel
  • Packet loss recovery for efficient error correction
  • End-to-end flow control for forward and backward congestion signaling
  • Tunnel monitoring, heartbeat and occupancy threshold support
  • Priority classification
  • Standard NFS / SMB / S3 extensibility
  • Packet flow segmentation and reassembly
  • Security through in-flight encryption and multipath obfuscation

The result is the ability to sustain 90%+ of theoretical data throughput over global distances, providing applications the same performance experience as when the data is local to the application.

The Vcinity Data Access Platform also integrates with other HPC technologies and high-speed storage. It attaches to a standard NAS or traditional high-speed storage tier and, when connected over the MAN/WAN, provides a high-performance, geo-diverse data exchange. VDAP enables a global federated data platform for accessing data without replication, using a global namespace and network-mapped drive volumes across geographically distributed enterprise storage.


Introducing Fabric Notifications, from Awareness to Action

Howard Johnson, Technology Architect, Broadcom BSN, Member, Fibre Channel Industry Association

Abstract

Marginal links and congestion have plagued storage fabrics for years and many independent solutions have been tried. The Fibre Channel industry has been keenly aware of this issue and, over the course of the last two years, has created the architectural foundation for a common ecosystem solution. Fabric Notifications employs a simple message system to provide registered participants with information about key events in the fabric that are used to automatically address link integrity and congestion issues. This new technology has been embraced by the Fibre Channel community and has demonstrated a significant improvement in addressing the nagging issues. In this informative session, storage experts will discuss the evolution of this technology and how it is a step toward a truly autonomous SAN.


SmartNICs, The Architecture Battle Between Von Neumann and Programmable Logic

Scott Schweitzer, Sr. Mgr Product Planning - SmartNICs, Achronix

Abstract

This presentation will outline the architectures of the top three platforms in each of these two categories, Von Neumann and Programmable Logic, showing how vendors such as NVIDIA, Pensando, Marvell, Achronix, Intel, and Xilinx have chosen to architect their solutions. We will then weigh the merits and benefits of each approach while also highlighting the performance bottlenecks. By the end of the presentation, it may be fairly clear where the industry is headed and which solutions may eventually win out.

Storage Performance / Workloads

Uncovering Production Issues - with Real World Workload Emulation

Swati Chawdhary, Senior Manager, Samsung

Byju Ravindran, Associate Director, Samsung Semiconductor India Research

Abstract

Current enterprise storage devices have to service many diverse and continuously evolving application workloads (e.g., OLTP, big data/analytics, and virtualization). These workloads, combined with additional enterprise storage services such as deduplication, compression, snapshots, clones, replication, and tiering, result in complex I/O to the underlying storage. Traditional storage system tests make use of benchmarking tools, which generate a fixed, constant workload comprising a single or a few I/O access patterns, and these are not sufficient for enterprise storage testing. Workload simulation-based tools available in the market come with their own challenges, such as cost, learning curve, and workload support. Hence, it has become a very big challenge to generate, debug, and reproduce these workloads, which could eventually lead to many customer-found defects.

This gives rise to the need for a robust testing methodology that closely emulates the production environment and helps identify issues early in testing. In our solution testing lab, we have been working on a unique test framework design that leverages software-defined services and helps uncover complex production issues and reduce their turnaround time.

In our talk, we will show how we built our test framework using containers and various open-source tools, and the role it plays in our solution testing efforts for our next-generation storage products.
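
As one illustration of driving a mixed, production-like I/O pattern with open-source tooling, the sketch below launches fio from Python; fio is used here only as an example of a workload generator, since the framework described above does not name its specific tools, and the target file and parameters are placeholders.

  import json, subprocess

  # An OLTP-like mixed random read/write pattern (70/30, 8 KiB blocks).
  cmd = [
      "fio", "--name=oltp-like",
      "--filename=/tmp/testfile", "--size=1G",
      "--rw=randrw", "--rwmixread=70",
      "--bs=8k", "--iodepth=32", "--numjobs=4",
      "--time_based", "--runtime=60",
      "--output-format=json",
  ]
  result = subprocess.run(cmd, capture_output=True, text=True, check=True)
  stats = json.loads(result.stdout)
  print(stats["jobs"][0]["read"]["iops"], stats["jobs"][0]["write"]["iops"])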


Unearthing the Impact of Sanitize on Performance and Latency in Embedded Storage

Shankar Athanikar, Staff Engineer, Samsung Semiconductor India Research

Sumeet Paul, Staff Engineer, Samsung Semiconductor India Research

Abstract

It is broadly known that when a file is deleted in an operating system, a Discard is issued to the underlying storage device. When a user deletes a file through the operating system, it is not physically removed from the storage medium; rather, the file data is marked invalid but remains in the unmapped address space. Similarly, when the host overwrites previously written logical space, the previously written memory space can be invalidated by a discard operation. All of these cases can create significant fragmentation in the device and eventually slow the system down: users start seeing application lag, performance drops, high write latency, and so on.

To handle this unmapped address space effectively, the JEDEC specification provides a mechanism called "Sanitize". In a nutshell, the sanitize process removes data from the unmapped address space, either by physically erasing all of the blocks or by a vendor-defined method. To unearth the impact of sanitize, various real-time workloads taken from different automotive host patterns were examined, along with FTL data extracted using debugging firmware. This study helps show how the timely use of sanitize reduces latency, improves the user experience with applications, and improves performance. Sanitize used in accordance with storage device policy will significantly improve QoS (Quality of Service), i.e., better consistency and predictability of latency (storage response time) and performance while serving read/write commands.


A Primer on GPUDirect Storage

Kushal Datta, Senior Product Manager, Nvidia

Abstract

Extreme compute needs extreme IO. The convergence of HPC and AI is putting GPUs to use in a wider range of applications than ever before, on platforms ranging from edge devices and commodity hardware to high-performance supercomputers. Larger datasets enable more accurate AI models, which extract deeper insights and encourage enterprises to collect even more data. This virtuous cycle is driving explosive demand for processing ever-larger amounts of data, and the need to reduce IO bottlenecks is greater than ever. With a strong and growing ecosystem and a 1.0 GA release, GPUDirect Storage brings a wealth of capabilities to traditional HPC applications, to applications at the convergence of HPC and AI, and to data analytics everywhere. In this session, we will talk about what's new in this GA release, what value GDS brings to customers, and how storage partners are helping NVIDIA grow the ecosystem and developer community.
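
As a small taste of what GPU-direct file I/O looks like from user code, the sketch below assumes the RAPIDS kvikio Python bindings to cuFile (an assumption of this example, not something the session prescribes); the file path is a placeholder, and the native cuFile C API or other bindings can be used instead.

  import cupy
  import kvikio

  data = cupy.arange(1 << 20, dtype=cupy.float32)    # array resident on the GPU

  f = kvikio.CuFile("/mnt/nvme/sample.bin", "w")     # DMA from GPU memory to storage
  f.write(data)
  f.close()

  out = cupy.empty_like(data)
  f = kvikio.CuFile("/mnt/nvme/sample.bin", "r")     # DMA from storage into GPU memory
  f.read(out)                                        # bypasses a CPU bounce buffer
  f.close()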


Distributed WorkLoad Generator for Load Testing Using Emerging Technologies

Vishnu Murty Karrotu, Automation Technologist, Dell Technologies

Abstract

In the Dell EMC Enterprise Server/Storage Validation Organization, we perform load testing with different workloads (web, file, FTP, database, mail, etc.) on servers to characterize system performance under heavy load. Knowing how Dell EMC enterprise systems perform under heavy load (% CPU, % memory, % network, % disk) is extremely valuable and critical. This is achieved with the help of load-testing tools. Load-testing tools available in the market come with their own challenges, such as cost, learning curve, and workload support. In this talk, we will demonstrate how we built JAAS (JMeter As A Service), a distributed workload testing solution, using containers and open-source tools, and how this solution plays a crucial role in delivering server validation efforts.


Revolutionizing Cloud Data Center Designs

Jai Menon, Chief Scientist, Fungible

Abstract

Current workloads at data centers are changing at a blistering pace and fast becoming data-centric. Modern cloud-native applications are written as microservices distributed across network connected servers and many of these applications need to process large amounts of data quickly—data that cannot fit in a single server and therefore needs to be “sharded” or spread across many servers. In this environment, the next generation data center infrastructure must be disaggregated, secure, durable and efficient with sophisticated data reduction techniques, while still providing high performance. This presentation will review broad industry trends and cloud data center requirements before outlining how an emerging class of storage system will differ from existing storage. This presentation will also introduce the industry’s first true scale out storage system suitable for next-generation data centers.


FinTech Data Pipelines and Storage I/O Related Benchmarks in Public Cloud

Shailesh Manjrekar, Head of AI, Weka

Abstract

Data has become the new source code, and data storage and I/O are becoming the major inhibitors to deriving actionable intelligence. Data management capability for this data, DataOps, is becoming paramount to the success of AI projects, particularly in capital markets. This session will discuss the trends, solutions, and benchmarks involved in successful AI implementation for historic and real-time datasets.


SPDK Schedulers – Saving CPU Cores in a Polled Mode Storage Application

Jim Harris, Principal Software Engineer, Intel

Tomek Zawadzki, Cloud Software Engineer, Intel

Abstract

Polled mode applications such as the Storage Performance Development Kit (SPDK) NVMe over Fabrics target can demonstrate higher performance and efficiency compared to applications with a more traditional interrupt-driven threading model. But this performance and efficiency comes at a cost of increased CPU core utilization when the application is lightly loaded or idle. This talk will introduce a new SPDK scheduler framework which enables transferring work between CPU cores for purposes of shutting down or lowering frequency on cores when under-utilized. We will first describe the SPDK architecture for lightweight threads. Next, we will introduce the scheduler framework and how a scheduler module can collect metrics on the running lightweight threads to make scheduling decisions. Finally, we will share initial results comparing SPDK NVMe-oF target performance and CPU efficiency of a new scheduler module based on this framework with the default static scheduler.
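
The scheduling idea can be sketched independently of SPDK's actual C interfaces: sample how busy each lightweight thread is over a period, then pack the threads onto as few cores as possible so the remaining cores can be powered down or clocked lower. The Python below is a conceptual illustration only; the names are hypothetical.

  CORE_CAPACITY = 1.0                      # each core can absorb 100% busy time

  def schedule(thread_busy):
      """thread_busy: {thread_name: fraction of the poll period spent busy}."""
      cores, loads = [], []
      for name, busy in sorted(thread_busy.items(), key=lambda kv: -kv[1]):
          for i, load in enumerate(loads):          # first core with headroom
              if load + busy <= CORE_CAPACITY:
                  cores[i].append(name)
                  loads[i] += busy
                  break
          else:                                     # no fit: light up another core
              cores.append([name])
              loads.append(busy)
      return cores

  # Lightly loaded system: four pollers fit on one core instead of four,
  # leaving the other cores free to be idled or run at a lower frequency.
  print(schedule({"nvmf_tgt_poll0": 0.10, "nvmf_tgt_poll1": 0.05,
                  "acceptor": 0.02, "admin": 0.01}))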


The Future of the Storage World is Autonomous

Irfan Ahmad, Founder & CEO, Magnition.io

Abstract

Today, storage and memory hierarchies are manually tuned and sized at design time. But tomorrow’s workloads are increasingly dynamic, multi-tenant and variable. Can we build autonomous storage systems that can adapt to changing application workloads?

In this session, we demonstrate how breakthroughs in autonomous storage systems research can deliver impressive gains in cost, performance, latency control and customer out-of-the-box experience. Attendees will be able to see results from the latest research and development and learn:

  • Why static memory hierarchies leave so much performance on the floor.
  • What is a fully autonomous storage hierarchy and how does it automatically adapt to changing application workloads?
  • The efficiency, QoS, performance SLAs/SLOs and cost tradeoffs that fully autonomous caches and hierarchies juggle.

Software-Defined Performance Engineering

Irfan Ahmad, Founder & CEO, Magnition.io

Abstract

In mechanical engineering, CAD has enabled engineers, architects, and construction professionals to create fully featured designs, allowing them to visualize the construction and enabling the development, modification, and optimization of the design process.

Why is this missing from the world of performance engineering? Until now, it has been seen as an intractable problem to build for the exponential difficulty of complex storage and memory hierarchies. That’s no longer the case.

Join our session to learn about the engineering behind StorageLab and how it allows end-to-end storage architecture performance engineering, in real-time.

Storage Resource Management

Redfish Ecosystem for Storage

Jeff Hilland, Distinguished Technologist, President DMTF, DMTF; HPE

Abstract

DMTF’s Redfish® is a standard API designed to deliver simple and secure management for converged, hybrid IT and the Software Defined Data Center (SDDC).

This presentation will provide an overview of DMTF’s Redfish standard. It will also provide an overview of HPE’s implementation of Redfish, focusing on their storage implementation and needs.

HPE will provide insights into the benefits and challenges of the Redfish storage model, including areas where functionality added to SNIA Swordfish™ is of interest for future releases.


Open Industry Storage Management with SNIA Swordfish™

Richelle Ahlvers, Senior Storage Management Architect, Intel Corporation

Abstract

If you’ve heard about the SNIA Swordfish open industry storage management standard specification and want to get a technical overview, this presentation is for you. This presentation provides a broad look at the Redfish and Swordfish ReSTful hierarchies, maps these to some common applications, and provides an overview of the Swordfish tools and documentation ecosystem developed by SNIA’s Scalable Storage Management Technical Work Group (SSM TWG) and the Redfish Forum.
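
For readers who have not yet browsed a Redfish/Swordfish service, the hierarchy is plain HTTPS plus JSON. The short sketch below walks from the service root to storage volumes against a hypothetical endpoint with placeholder credentials.

  import requests

  BASE = "https://bmc.example.org"                      # placeholder endpoint
  AUTH = ("admin", "password")                          # placeholder credentials

  def get(path):
      # TLS verification is skipped only to keep this sketch short.
      return requests.get(BASE + path, auth=AUTH, verify=False).json()

  root = get("/redfish/v1")                             # service root
  systems = get(root["Systems"]["@odata.id"])           # computer systems collection
  for member in systems.get("Members", []):
      system = get(member["@odata.id"])
      storage = get(system["Storage"]["@odata.id"])     # storage subsystems
      for sub in storage.get("Members", []):
          subsystem = get(sub["@odata.id"])
          print(subsystem.get("Id"),
                subsystem.get("Volumes", {}).get("@odata.id"))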


What’s New in SNIA Swordfish™ in 2021

Richelle Ahlvers, Senior Storage Management Architect, Intel Corporation

Abstract

If you haven’t caught the new wave in storage management, it’s time to dive in and catch up on the latest developments of the SNIA Swordfish™ specification. These include:

  • Expanded support for NVMe and NVMe-oF Devices
  • Managing Storage Fabrics
  • Standardized Capacity and Performance Metrics Management

Managing Open Fabrics with a Standards-based Interface: Bringing GenZ, Redfish, and Swordfish Together

Erich Hanke, Sr. Principal Engineer, Storage and Memory Products, IntelliProp

Abstract

The SSM TWG and OFA OFMFWG are working together to bring to life an open source Open Fabric Management Framework, with a Redfish/Swordfish management model and interface.

This presentation will provide an overview of the status of this work, and a demo of the current state of the proof of concept, built leveraging the Redfish and Swordfish-based open source emulator.


Completing the Picture for NVMe and NVMe-oF Management: Guidelines for Implementations

Curtis Ballard, Distinguished Technologist, HPE

Abstract

The SNIA Swordfish specification has expanded to include full NVMe and NVMe-oF enablement and alignment across DMTF, NVMe, and SNIA for NVMe and NVMe-oF use cases.

This presentation will provide an overview of the most recent work adding detailed implementation requirements for specific configurations, ensuring NVMe and NVMe-oF environments can be represented entirely in Swordfish and Redfish environments.


Managing Exported NVMe-oF Resources and Fabrics in Swordfish and Redfish

Phil Cayton, Senior Staff Engineer, Intel Corporation

Abstract

As NVMe-oF specifications continue to develop in NVM Express, so do the corresponding Swordfish management capabilities. This presentation provides an update on the latest NVMe-oF configuration and provisioning capabilities available through Swordfish.


Managing Ethernet-Attached Drives using Swordfish

Mark Carlson, Principal Engineer, Industry Standards, KIOXIA

Abstract

NVMe-oF drives can support NVMe over Ethernet, but how do you manage them? This presentation will show how Swordfish has developed a standard model for NVMe Ethernet-attached drives, providing detailed profiles as guidance for implementations, including required and recommended properties. The profiles are now part of the Swordfish CTP program; Ethernet-attached drives can validate conformance to the specifications by participating.


Building a SNIA Swordfish™ Implementation: A Retrospective

Chris Lionetti, Senior Technical Marketing Engineer, Hewlett Packard Enterprise

Abstract

HPE will provide an overview of their experience developing an initial Swordfish implementation. This session will cover lessons learned from the initial proof of concept through the development phases and will include recommendations to other implementers on areas that may require additional focus.


How to Drive Adoption of Your Products with the Swordfish Conformance Test Program

Richelle Ahlvers, Senior Storage Management Architect, Intel Corporation

Abstract

The SNIA Swordfish Conformance Test Program enables manufacturers to test their products with a vendor-neutral test suite and validate conformance to the SNIA Swordfish specification.

Swordfish implementations that have passed CTP are posted on the SNIA website; this information is available to help ease the integration concerns of storage developers and increase demand for available Swordfish products. This session will provide an overview of the program and of the functionality and base requirements needed for implementations to pass Swordfish CTP.


Expanding Development of your Swordfish Implementations Using Open Source Tools

Don Deel, Storage Management Architect, SNIA Member

Abstract

The SNIA Swordfish™ ecosystem is supported by open source tools, available in open repositories managed by the SNIA Scalable Storage Management Technical Work Group on GitHub and by the DMTF Redfish Forum, also on GitHub.

This session will walk through the tools you can use to go from zero to a working SNIA Swordfish implementation: starting with generating, validating, and using static mockups; then using the emulator to make your mockups “come alive”; and finally verifying that your Swordfish service outputs match your expectations using open source validation tools, the same tools that feed into the Swordfish Conformance Test Program.


Accelerating NVMe / NVMe-oF RF/SF Development Using DEM

Rajalaxmi Angadi, Senior Engineer / Lead, Intel Corporation

Abstract

DEM is an open source simulator/emulator, providing an RF/SF interface on top of an NVMe / NVMe-oF emulation environment. DEM enables centralized, efficient, and dynamic configuration and provisioning of NVMe-oF resources. It is designed to provide:

  • Remote configuration of NVMe-oF resources through RESTful web interface or CLI
  • Enumeration of remote NVMe resources provisioned to individual consumers
  • Notification of changes to resources

In this session, Raj will provide an overview of the DEM system tool design, as well as demonstrate the end-to-end system operation using RF/SF commands to discover, create, and configure NVMe-oF systems.


Simplifying Client Interactions with SMI-S using PyWBEM

Mike Walker, Storage Management Architect, Storage Networking Industry Association

Abstract

For client applications interacting with SMI-S implementations, PyWBEM is a Python package that provides access to WBEM servers. It also provides a mechanism for mocking WBEM servers.
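
A minimal PyWBEM session looks like the following; the server URL, credentials, and namespace are placeholders for a real or mocked WBEM server.

  import pywbem

  conn = pywbem.WBEMConnection("https://smi-provider.example.org:5989",
                               creds=("user", "password"),
                               default_namespace="interop")

  # Enumerate the registered SMI-S profiles advertised by the server.
  for profile in conn.EnumerateInstances("CIM_RegisteredProfile"):
      print(profile["RegisteredName"], profile["RegisteredVersion"])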

In this presentation, Mike Walker will provide an overview of PyWBEM, with an emphasis on how to learn to interact with SMI-S through the open-source mockup SMI-S 1.8.0 WBEM servers available on GitHub. He will highlight multiple use cases for the mockup servers, including:

  • Exploring new features in SMI-S 1.8.0
  • Developing IT applications on your local system
  • Developing client software on your local system

We Addressed the Elephant in the Room with OpenStack Sahara

Ajitha Robert, Cloud Stack Engineer, MSys Technologies

Abstract

Hadoop has become imperative for processing large and complex data sets. However, the issue of architecture scalability often poses an unnecessary roadblock in this process. OpenStack, an open-source software platform, provides the operational flexibility required to scale out a Hadoop architecture.

This session will shed light on how we helped a client install Hadoop on VMs using OpenStack. We will discuss in depth the challenges of manual operations and how we overcame them with Sahara. The audience will also learn why we virtualized hardware on the computing nodes of OpenStack and deployed VMs.