SAN Shared File Systems with SSDs

In a recent post I discussed how PCIe SSDs fit in with the rise of shared-nothing scale-out clusters and why exceptionally large clusters favor direct-attached storage.  Two elements are required for this type of architecture to be successful: a logical way to partition the data and a need to scan large chunks of data from those partitions.  These elements are present in many applications, but not in all of them.  There are applications where “Total Access” to the data is required, and there the scale-out model is ineffective because most of the data is fetched across the network rather than accessed locally.
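To put a rough number on "most of the data is fetched across the network", here is a minimal Python sketch. It assumes the data is spread evenly over the nodes and that every access targets a uniformly random partition; both are simplifying assumptions of mine, and the function names are made up for illustration.

```python
import random


def remote_fraction_uniform(num_nodes: int) -> float:
    """Expected share of accesses that leave the local node.

    Assumes data is spread evenly and each access hits a uniformly
    random partition, so only 1/num_nodes of accesses are local.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return 1.0 - 1.0 / num_nodes


def simulate_remote_fraction(num_nodes: int, accesses: int, seed: int = 0) -> float:
    """Monte Carlo check of the formula above under the same assumptions."""
    rng = random.Random(seed)
    remote = 0
    for _ in range(accesses):
        worker = rng.randrange(num_nodes)  # node running the process
        owner = rng.randrange(num_nodes)   # node holding the requested data
        if owner != worker:
            remote += 1
    return remote / accesses


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, remote_fraction_uniform(n), simulate_remote_fraction(n, 100_000))
```

Under these assumptions the local share shrinks as 1/N, so adding nodes to a shared-nothing cluster pushes a total access workload further onto the network rather than relieving it.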

Applications requiring total access are common in scientific computing, financial modeling, and government.  They are characterized by processing that can be parallelized effectively, but the data that each process needs could be located anywhere in a large dataset.  This results in lots of small random accesses, which traditional cluster designs are not well suited to.  There have been three primary methods to tackle these problems: add a task-specific preprocessing step to make subsequent accesses less random, create a cluster where each compute node’s memory can be accessed over a network, or deploy the applications on a large-memory SMP server.

There is a new option that I have seen deployed more and more often: using high-capacity SSDs and a SAN shared file system.  A SAN shared file system provides the locking that allows multiple servers to directly access the block storage concurrently.  This provides the ease of use of a file system with the performance benefits of block storage access.  If you add SSDs into this setup you can build a very powerful solution.  The basic setup looks like the following:

 

Using SSDs as the primary storage allows the shared file system to handle small-block random workloads in addition to the standard high-bandwidth workloads that are the mainstay of most SAN shared file system deployments.  It is relatively easy to construct a system that has a couple hundred cores of processing power and tens to hundreds of TBs of flash.  This system can tackle the types of workloads that were previously reserved for large-memory SMP systems.
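From the application's side, a small-block random workload is simply many short reads at scattered offsets in a file on the shared mount. The following is a minimal Python sketch of that access pattern. The 4 KiB block size and the helper names are assumptions of mine, and a temporary local file stands in for a file on the shared file system.

```python
import os
import random
import tempfile

BLOCK_SIZE = 4096  # assumed "small block" size, chosen for illustration


def make_sample_file(num_blocks: int, block_size: int = BLOCK_SIZE) -> str:
    """Create a local stand-in for a file on the shared mount; returns its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(num_blocks * block_size))
        return f.name


def random_block_reads(path: str, num_reads: int,
                       block_size: int = BLOCK_SIZE, seed: int = 0) -> int:
    """Read num_reads blocks at random block-aligned offsets; return bytes read."""
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        num_blocks = os.fstat(fd).st_size // block_size
        if num_blocks == 0:
            raise ValueError("file is smaller than one block")
        total = 0
        for _ in range(num_reads):
            offset = rng.randrange(num_blocks) * block_size
            # pread takes an explicit offset, so no seek state is shared
            total += len(os.pread(fd, block_size, offset))
        return total
    finally:
        os.close(fd)


if __name__ == "__main__":
    sample = make_sample_file(256)
    try:
        print(random_block_reads(sample, 1000), "bytes read")
    finally:
        os.unlink(sample)
```

This only shows the shape of the workload. On a local temporary file the reads will mostly be served from the operating system's page cache, so it is not a benchmark of any storage device.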

Workloads that require “Total Access” to a large data set are limited in performance by the network and the processing steps that connect the compute resources and the storage. A SAN shared file system has two big benefits. First, the shared components (the SAN resources) can use simple block-level communication and multiple high-bandwidth interfaces to deliver very high performance and low latency. Second, performance can scale up almost linearly, since the coordination processes run on the servers rather than on the storage.  When customers have extreme application performance needs, this is the architecture I look to first.
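The idea of coordination running on the servers, with locking keeping concurrent access coherent, can be pictured with a single-machine analogy. The sketch below is not how any particular SAN shared file system implements its locking. It uses POSIX advisory byte-range locks (Python's fcntl.lockf) on a temporary file, with forked processes standing in for servers, and all function names are made up for illustration. It assumes a Unix-like system.

```python
import fcntl
import multiprocessing
import os
import struct
import tempfile

RECORD = struct.Struct("<Q")  # one 8-byte little-endian counter per record


def bump(path: str, index: int, times: int) -> None:
    """Increment record `index` `times` times, locking only that record's bytes."""
    fd = os.open(path, os.O_RDWR)
    try:
        offset = index * RECORD.size
        for _ in range(times):
            # Exclusive lock on this record's byte range only
            fcntl.lockf(fd, fcntl.LOCK_EX, RECORD.size, offset, os.SEEK_SET)
            try:
                (value,) = RECORD.unpack(os.pread(fd, RECORD.size, offset))
                os.pwrite(fd, RECORD.pack(value + 1), offset)
            finally:
                fcntl.lockf(fd, fcntl.LOCK_UN, RECORD.size, offset, os.SEEK_SET)
    finally:
        os.close(fd)


def run_demo(workers: int = 4, times: int = 200, records: int = 2) -> list:
    """Run `workers` processes against one shared file; return final counters."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\x00" * (RECORD.size * records))
        path = f.name
    try:
        ctx = multiprocessing.get_context("fork")  # Unix-only start method
        procs = [ctx.Process(target=bump, args=(path, w % records, times))
                 for w in range(workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        with open(path, "rb") as f:
            data = f.read()
        return [RECORD.unpack_from(data, i * RECORD.size)[0] for i in range(records)]
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(run_demo())
```

Because each process locks only the bytes it is updating, workers touching different records do not wait on each other, while workers touching the same record never lose an update.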

Categories: Architectures, SSD
  1. Robert Norman
    July 12, 2011 at 9:42 pm

    This makes a lot of sense from a topology and fast-access SSD standpoint, but with so many things on the network the software requirements are confusing. What special software is required? I assume the SSDs must be in a RAID configuration for reliability.

    • July 13, 2011 at 9:06 am

      The SSDs need to have redundancy, and a mission-critical setup may want redundancy across chassis. The software that needs to run on the hosts is a SAN shared file system (GPFS, StorNext, GFS, VCFS, etc.). For example, with IBM’s GPFS (General Parallel File System) running on the servers in this setup, each node will have direct block access to the storage. The application will access the storage through a GPFS file system mount point, and GPFS will handle locking and coherency as the multiple server nodes access the storage in parallel. GPFS effectively has an integrated cluster volume manager, so it can also be used to mirror across chassis.
