I have had the chance to meet with several analysts over the past couple of weeks and have raised the position that, with eMLC, the long-awaited price parity of Tier-1 disks and SSDs is virtually upon us. I got a mixed set of reactions, from “nope, not yet” to “sorry if I don’t act surprised, but I agree.” For the skeptics I promised that I would compile some data to back up my claim.
For years the mantra of the SSD vendor was to look at the price per IOPS rather than the price per GB. The Storage Performance Council provides an excellent source of data that facilitates that comparison in an audited forum with their flagship SPC-1 benchmark. The SPC requires quite a bit of additional information to be reported for the result to be accepted, which provides an excellent data source when you want to examine the enterprise storage market. If you bear with me I will walk through a few ways that I look through the data, and I promise that this is not a rehash of the cost per IOPS argument.
First, if you dig through the reports you can see how many disks are included in each solution as well as the total cost. The chart below is an aggregation of the HDD based SPC-1 submissions showing the reported Total Tested Storage Configuration Price (including three-year maintenance) divided by the number of HDDs reported in the “priced storage configuration components” description. It covers data from 12/1/2002 to 8/25/2011:
Now, let’s take it as a given that SSDs can deliver much higher IOPS than an HDD of equivalent capacity, and that price per GB is the only advantage disks bring to the table. The historical way to get higher IOPS from HDDs was to use lots of drives and short-stroke them. The modern-day equivalent is using low capacity, high performance HDDs rather than cheaper high capacity HDDs. With the total cost of enterprise disk at close to $2,000 per HDD, the $/GB of enterprise SSDs determines the minimum logical capacity of an HDD. Here is an example of various SSD $/GB levels and the associated minimum disk capacity points:
Enterprise SSD $/GB    Minimum HDD capacity
~$7/GB                 ~300 GB ($2,000 ÷ 7 ≈ 286 GB)
~$14/GB                ~146 GB ($2,000 ÷ 14 ≈ 143 GB)
To get to the point that 300 GB HDDs no longer make sense, the enterprise SSD price per GB just needs to be around $7/GB, and 146 GB HDDs are gone at around $14/GB. Keep in mind that this is the price of the SSD capacity before redundancy and overhead, to make it comparable to the HDD case.
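The capacity thresholds above boil down to one division. A minimal sketch of the arithmetic, assuming the ~$2,000 all-in cost per enterprise HDD taken from the SPC data (the function name and price points are illustrative):

```python
# Minimum sensible HDD capacity, derived as in the text: at a fixed
# ~$2,000 all-in cost per enterprise HDD, any drive smaller than
# COST_PER_HDD / ssd_price_per_gb costs more per GB than an SSD.
COST_PER_HDD = 2000  # assumed all-in cost per enterprise HDD, USD

def min_sensible_hdd_capacity_gb(ssd_price_per_gb):
    """Smallest HDD capacity that still beats SSD on $/GB."""
    return COST_PER_HDD / ssd_price_per_gb

for ssd_price in (14, 7, 5):
    print(f"SSD at ${ssd_price}/GB -> HDDs below "
          f"{min_sensible_hdd_capacity_gb(ssd_price):.0f} GB no longer make sense")
```

At $7/GB the break-even is roughly 286 GB, which is why the 300 GB drive class is the first casualty.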
It’s not fair (or permitted use) to compare audited SPC-1 data with data that has not gone through the same rigorous process, so I won’t make any comparisons here. However, I think that when looking at the trends, it is clear that the low capacity HDDs used for Tier-1 storage are going away sooner rather than later.
About the Storage Performance Council (SPC)
The SPC is a non-profit corporation founded to define, standardize and promote storage benchmarks and to disseminate objective, verifiable storage performance data to the computer industry and its customers. The organization’s strategic objectives are to empower storage vendors to build better products as well as to stimulate the IT community to more rapidly trust and deploy multi-vendor storage technology.
The SPC membership consists of a broad cross-section of the storage industry. A complete SPC membership roster is available at http://www.storageperformance.org/about/roster/.
A complete list of SPC Results is available at http://www.storageperformance.org/results.
SPC, SPC-1, SPC-1 IOPS, SPC-1 Price-Performance, and SPC-1 Results are trademarks or registered trademarks of the Storage Performance Council (SPC).
I started my career at a different Texas company – Texas Instruments – and I remember the 1” drive division aimed at the mobile device market. It didn’t take very long for this to get end-of-lifed. It was a neat product and a serious feat of engineering, but it just couldn’t compete with Flash. At first it was because Flash was smaller, more rugged, and used less power. However, it was ultimately just because Flash was cheaper! (Compare the disk-based iPod Mini and the Flash-based iPod Nano.)
Disks have a high fixed cost per unit and a small marginal cost per GB. Physically bigger disks have a lower cost per GB than smaller ones. This is very different from other storage media like Flash and tape. So it was bothering me recently – why are mainstream disks still shrinking, with each generation taking on a smaller form factor, from 3.5” to 2.5”? If the disk market were just concerned with cost per GB and “tape is dead,” this is crazy – disks should be getting bigger! Why do disks continue their march toward smaller form factors when that just makes SSDs more competitive?
I originally thought that this was just a holdover from the attempts to make disks faster. Bigger disks are harder to spin at a high speed, so as the RPM rate marched forward disks had to get smaller. The advent of cost effective SSDs, however, has stopped the increase in RPMs. (Remember the news in 2008 of the 20k RPM disk?) The market for performance storage at a premium has been ceded to SSDs.
After spending some time thinking on it I think there are a few basic reasons disks continue their march:
- The attempt to have a converged enterprise, desktop, and laptop standard.
- The need for smaller units to compose RAID sets, so that during a rebuild the chance of a second failure is not too high. I understand this, but RAID-6 is an alternate solution.
- Disks are not just for storage, they are for both performance and long term storage.
Simply because disks store data on a circular platter, every time the bit density increases, capacity grows as the square of the density (more bits along each track and more tracks per platter), the ability to access data randomly doesn’t change at all, and bandwidth grows only linearly. At some point the need for capacity is more than adequately met, so the performance need takes over and disks shrink to get the performance and capacity more in sync.
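The geometry argument can be sketched as a toy scaling model (the baseline figures below are made up for illustration):

```python
# Toy scaling model: if bit density scales by a factor d (both along
# a track and in tracks per inch), capacity scales as d**2,
# sequential bandwidth as d (more bits pass under the head per
# revolution), and random IOPS not at all (seek and rotational
# latency are unchanged by density).
def scaled_disk(capacity_gb, bandwidth_mbs, iops, density_factor):
    return (capacity_gb * density_factor ** 2,
            bandwidth_mbs * density_factor,
            iops)

# Doubling density: 4x capacity, 2x bandwidth, identical IOPS.
print(scaled_disk(500, 100, 200, 2))  # -> (2000, 200, 200)
```

Each density bump therefore widens the gap between how much a disk holds and how fast it can be accessed.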
Neither tape cartridges nor Flash suffer the fixed cost problem or geometry induced accessibility issues of disks. With the new high density cartridges coming online tape continually avoids being supplanted by disk for pure capacity requirements. TMS even recently had a customer that was able to leverage a Tape + SSD deployment and skip disks altogether.
Is the future of storage SSD + tape?
No. While this works for a streamlined processing application, tape just isn’t ever going to be fast enough for data that needs to feel instantly available. There is just too much data that probably won’t be needed much but, when it is, must be available right away. Disks, however, are comfortably faster than the ~1 second response time needed for a user-facing application.
With SSD handling more and more of the performance storage requirements it will be interesting to see if disks stop their march toward smaller form factors and head in the other direction by becoming bigger and slower and fully cede the “tier 1 storage” title to SSDs.
Comparing storage performance is a bit more difficult than meets the eye. This comes up quite a bit as I frequently address the differences between RAM and Flash-based SSDs, particularly when comparisons are made between TMS’s flagship Flash system, the RamSan-630, and the flagship RAM system, the RamSan-440. The Flash system has higher IOPS, but the RAM system has lower latency.
There are three independent metrics of storage performance: response time, IOPS, and bandwidth. Understanding the relationships between these metrics is the key to understanding storage performance.
Bandwidth is really just a limitation of the design or standards used to connect storage. It is the maximum number of bytes that can be moved in a specific time period; neither response time overhead nor concurrency plays a factor. IOPS are nothing more than the number of I/O transactions that can be performed in a single second. Determining the maximum theoretical IOPS for a given transfer size is as simple as dividing the maximum bandwidth by the transfer size. For a storage system with a single 1 Gbps iSCSI connection (~100 MB/s bandwidth) and a workload of 64 KB transfers, the maximum IOPS will be ~1,500. If the transfer size is a single sector (512 bytes), the maximum IOPS will be ~200,000 – a notable difference. At that small-transfer end, bandwidth will more than likely not be what limits performance.
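The bandwidth ceiling on IOPS described above can be sketched in a few lines (100 MB/s is the approximate usable bandwidth of the 1 Gbps iSCSI link in the example):

```python
# Maximum theoretical IOPS for a given transfer size is just
# bandwidth divided by transfer size.
def max_iops(bandwidth_bytes_per_s, transfer_bytes):
    return bandwidth_bytes_per_s / transfer_bytes

LINK_BW = 100e6  # ~100 MB/s usable on a 1 Gbps iSCSI connection

print(max_iops(LINK_BW, 64 * 1024))  # ~1,500 IOPS with 64 KB transfers
print(max_iops(LINK_BW, 512))        # ~200,000 IOPS with 512-byte sectors
```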
There is another relationship that ties together response time and concurrency – Little’s law. Little’s law governs the concurrency of a system needed to achieve a desired amount of throughput. For storage, Little’s Law is: (Outstanding I/Os) ÷ (response time) = IOPS. I consider this the most important formula in storage performance. If you boil this down, ultimately the limitation of IOPS performance is the ability of a system to handle Outstanding I/Os concurrently. Once that limit is reached, the I/Os get clogged up and the response time increases rapidly. This is the reason a common tactic to increase storage performance has been to simply add disks – each additional disk increases the concurrent I/O capabilities.
Interestingly, IOPS performance isn’t limited by the response time. Lower response times merely allow a given level of IOPS to be achieved at lower levels of concurrency. There are practical limits on the level of concurrency that can be achieved by the interfaces to the storage (e.g. the execution throttle setting in an HBA), and many applications have fairly low levels of concurrent I/O, but the response time by itself does not limit the IOPS. This is why, even though Flash media has a higher response time than RAM, Flash systems that handle a high level of concurrent I/O can achieve IOPS performance as good as or better than RAM systems.
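Little’s Law from above can be turned around to show how much concurrency a given IOPS target demands at a given response time. A small sketch, with illustrative response times (~5 ms for a disk, ~0.3 ms for Flash):

```python
# Little's Law for storage, as stated in the text:
#   IOPS = outstanding I/Os / response time
# Rearranged, it gives the concurrency needed to hit an IOPS target.
def iops(outstanding_ios, response_time_s):
    return outstanding_ios / response_time_s

def required_concurrency(target_iops, response_time_s):
    return target_iops * response_time_s

# To sustain 100,000 IOPS, a ~5 ms disk response time needs ~500
# outstanding I/Os, while a ~0.3 ms Flash response time needs only ~30.
print(required_concurrency(100_000, 0.005))   # ~500 outstanding I/Os
print(required_concurrency(100_000, 0.0003))  # ~30 outstanding I/Os
```

This is why lower response time makes an IOPS target easier to hit, even though it does not raise the ceiling itself.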
Recently, I was helping with a project showcasing the performance difference of Oracle on disks vs. on SSDs. A database was created on 11g, and a simple script queried random subsets of rows. The database was set up so we could flip between the disks and SSDs quickly by just changing the preferred read failure group in ASM. Something very odd was happening: the SSD was faster than the 48 disks we were comparing against, but only 3 times as fast. That would be fine for a production database doing actual work, but in a benchmark we controlled, we needed to show at least a 10x improvement.
In cases where a system isn’t behaving as expected, it is helpful to look at the basic statistics where different subsystems meet. In this case, we looked at iostat to see what the storage looked like to the host. The response time of the SSD was about 2 milliseconds even though the IOPS were only around 10,000 – well below the point where the SSD starts to get stressed. We stopped the database, pulled out a synthetic benchmark, ran a random load against the SSD, and saw that at the 10,000 IOPS level the response time was 0.3 ms – about what we would expect. We switched back to the query and again: 10,000 IOPS, 2 ms response time – why? What was Oracle doing differently than a simple benchmark?
We thought for a while and the engineer that I was working with had a “eureka moment.” We referred to the Integration guide that we had written for our customers to use:
To further improve the performance on Linux, append the kernel parameter “elevator=noop” to disable the I/O scheduler. This will help reduce latency on small-block requests to the RamSan. This will also greatly improve performance of mixed reads and writes to the filesystem. Example entry in /boot/grub/menu.lst (/etc/grub.conf):

    title Red Hat Enterprise Linux Server (2.6.18-164.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.el5 ro root=LABEL=/ elevator=noop rhgb quiet
        initrd /initrd-2.6.18-164.el5.img
We implemented the settings change, ran the query again, and saw exactly what we expected: 0.3 ms response time, >50,000 IOPS, and the CPU was now the bottleneck in the database benchmark. We added a second RAC node and hit 100,000 IOPS through the database.
The Linux I/O scheduler is designed to reorder I/Os to disks to reduce thrash and thus lower the average response time. In the case of our simple I/O test, the requests were truly random and arrived at a regular interval, so it didn’t bother trying to reorder them. With the Oracle tests, the load was less random, burstier, and had some writes to the files that control the database mixed in. The scheduler saw that it could “help” and spent time reordering I/Os under the assumption that delaying them a little to make the disks more efficient was a good trade-off. Needless to say, this is a bad assumption for SSDs, and it is a good example of optimizations put in place for disks becoming obsolete and just slowing down overall performance.
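For a system that is already running, the same scheduler change can be made at runtime through sysfs instead of the boot parameter. A minimal sketch, assuming the SSD shows up as /dev/sdb (check which block device actually backs your SSD before changing anything):

```shell
# Show the available schedulers for the device; the active one is in brackets.
cat /sys/block/sdb/queue/scheduler

# Switch this device to the noop scheduler (root required). Unlike the
# elevator= boot parameter, this does not persist across reboots and
# affects only the named device.
echo noop > /sys/block/sdb/queue/scheduler
```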
The SPC imposes strict rules on how their results are presented, so in the interest of brevity I’ll only highlight two very significant items wrapped up in these results:
First, on a “cost for capacity” basis, the scales have tipped in favor of Flash vs. a disk-based system configured for maximum performance. “Cost for capacity” is not part of the SPC Reported Data but is derived from it by dividing Total Price by Application Storage Unit (ASU) Capacity, both of which are part of the SPC Reported Data.
Second, the RamSan-630 produced highly competitive SPC-2 bandwidth results. This is significant because high bandwidth use cases have not been part of the traditional SSD discussion. The typical argument goes, “SSDs are great at IOPS and disks are good for bandwidth.” This is changing; SSDs are going to challenge disks anywhere that they are used for performance. It no longer matters whether you need random I/O or streaming sequential I/O; if performance is the primary criterion, SSDs will be the solution, period.
In other news, this week EMC announced that they are seeing demand for all-Flash configurations of VMAX and VNX (hopefully they will publish Storage Performance Council results!). While I would encourage these clients to take a look at what a purpose-built Flash system like the RamSan-630 can do, I do think it clearly highlights where SSDs are going: from a “Tier-0” configuration to “Tier-1.”
Welcome to the future, EMC, glad you could join us.
SPC, SPC-1, SPC-2, are trademarks or registered trademarks of the Storage Performance Council (SPC).
From time to time I become involved in discussions about where SSDs are making an impact in the consumer market and what I think is going to happen. The biggest knock that I hear against SSDs making serious inroads is this: consumers buy computers based on specs, and most just won’t accept a computer with less storage at the same price as one that has more. The details of the SSD benefits are lost on this mainstream market, and disks can maintain a price per GB advantage for a long time to come.
I attended a marketing presentation by David Kenyon from AMD recently and he pointed out something about computer marketing trends that I found insightful. The use of specs as a computer differentiator is becoming less prominent. The look and feel and the fitness for a particular use case are becoming more important selling points. The reduced prominence of specs started when the clock rates of CPUs stopped being promoted and instead the family name and model number were used. If you look at Apple products, it is hard to even find the specs until after you have selected the make you want and are trying to decide on a model.
Part of the shift towards use case based computing is a fragmentation of computing resources into multiple devices. People have many devices – laptops, work desktops, home desktops, tablets, and smart phones. Having devices that are accessible and convenient for a particular use is a wonderful thing. But there is one major headache that comes with this – having access to your data from a particular device that you would like to use is a pain. A Kindle is great to dive into a book on a quiet afternoon, but it is relatively inconvenient to take with you all the time. Being able to pull out a smartphone in a waiting room and pick up reading where you had left off is what people want. Multiple device shared access has already happened with email and it is just a matter of time until the rest of private data goes the same route. The access to data without physical device dependence is what cloud storage is all about.
So what does this have to do with SSDs? Besides the lower prominence of specs, using more and more devices makes it clear that having a bunch of storage on any particular device just isn’t valuable. The data needs to be accessible from the other devices. Consumers are not going to go through the hard work of setting up data synchronization, though. They will eventually pay to have it done for them by whoever wins a monopoly over access to users’ data – Microsoft, Google, Facebook, or someone new. Soon, their data is going to end up in a datacenter somewhere that all of the devices can access. There may be a full copy of everything on the computer at home, but even this could fall by the wayside. In this environment, having a disk in any of the devices is just crazy. This is simply because at low capacities, SSDs are cheaper than disks! They are also higher performance, lower power, and have a malleable form factor.
There has to be a pretty robust high-speed network available almost everywhere for this to work, but that is clearly not that far off. Once the network is in place the service offerings and vendors will coalesce to develop a clear standard and price model. At that point consumer disks will move to the datacenter. This may sound like too much complexity to occur quickly, but the benefits that come from easy access to your data and the profits that will be bestowed on the vendor that becomes the gatekeeper are just too great to prevent it from happening.
If this framework develops, the total disk capacity will grow more slowly as the efficiencies that have developed in the enterprise storage arena are brought to bear – just-in-time provisioning, deduplication, and compression. (Just imagine how much unused capacity is isolated on all of the disks in consumer computers today.) In the not too distant future, having a disk in your computing device will be the exception rather than the norm.
An aside on cloud computing frameworks, data, and network bandwidth
The biggest issue with cloud storage is the network bandwidth. I don’t mean to suggest network bandwidth needs to be high enough to use cloud storage remotely – that may never happen. Today, the data in successful cloud services is being fragmented by application. Keeping the data and the compute resources close gives a big benefit in reducing the network traffic needed for processing. The drawback is that data behind the scenes is handled differently by each service provider and managing credentials for each separate service is difficult. In effect it is creating a data management nightmare for the user. This fragmentation is bad for users – they want an easy way to access and control all of the data that belongs to them.
The efficiency of having the data near the compute resource is huge, but there is no real reason that the data has to fragment and move to the service providers. With the proper cloud computing framework, the applications could just as easily move to the data’s location and run in the same datacenter. This would provide the same benefit but make it easy for the user to see and manage the data that belongs to them. I don’t see an easy way to separate cloud storage from cloud computing – but at the end of the day the data is what everything else depends on, and frameworks have to account for this.
One of the pet peeves that comes with the territory when deploying SSD systems is being compared to the price of consumer disks. It might bother me in particular because I have seen how rapidly the price has declined since Flash entered the field. I remember that it was not very long ago (2004) that SSDs were thousands of dollars per GB! Now, as the price of SSDs comes much closer to what high performance enterprise disk systems cost, the difference does not seem that bad to SSD veterans.
There is a general disconnect between what hard drives cost in the consumer market and what disk-based enterprise storage systems cost per GB. I am sure that IT administrators get offers from end users all the time to personally buy a 1 TB drive for $80 to increase the size of their Exchange mailbox.
So where can you find what enterprise storage systems cost?
The best source available is the Storage Performance Council’s (www.storageperformance.org) published data on the benchmark results of various enterprise storage systems, and one of the requirements is that the full costs must be disclosed. When you look at this data in a few different ways you can draw some general conclusions. First, the obvious one, disks are rapidly getting cheaper per GB (below is some historical $/GB data on test results from systems with more than 100 disks):
However, disks are not getting cheaper – they are just getting bigger. Enterprise disks are very expensive once you include the costs of the storage controller, switching, and maintenance. Below is the cost of each solution divided by the number of disks:
From these costs it is easy to see how there is a business case for deploying a solid state solution to eliminate 20 disk drives (or more). You can always get more capacity with disks at a lower price point than SSDs and that will continue for a long time. However, since the price per disk is so high, for smaller capacity, high performance workloads, SSDs are just cheaper. The price point of a 15K RPM drive behind a storage controller is so high that you don’t have to be at the extreme end of the performance curve anymore to justify SSDs.
Realistically, once you are putting in 2-3 times as many drives for performance as you need for capacity, a serious investigation of SSDs should follow.
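That rule of thumb can be sketched as a quick back-of-the-envelope check. The 300 GB drive size and 180 IOPS per 15K RPM drive used here are illustrative assumptions, not measured values:

```python
# Count the drives a workload needs for capacity alone vs. for IOPS
# alone. Once the performance count is 2-3x the capacity count, the
# disk solution is paying ~$2,000 per spindle purely for IOPS that an
# SSD can deliver more cheaply.
import math

HDD_CAPACITY_GB = 300  # assumed enterprise 15K RPM drive capacity
HDD_IOPS = 180         # assumed IOPS per 15K RPM drive

def spindle_counts(capacity_gb, target_iops):
    for_capacity = math.ceil(capacity_gb / HDD_CAPACITY_GB)
    for_iops = math.ceil(target_iops / HDD_IOPS)
    return for_capacity, for_iops

# A 2 TB working set that needs 20,000 IOPS:
cap, perf = spindle_counts(2000, 20_000)
print(cap, perf)  # -> 7 112
if perf >= 2 * cap:
    print("performance-driven spindle count: investigate SSDs")
```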