
Where Does Data Management Software Belong?

It is interesting to see how some of the developments in the IT space are governed by intradepartmental realities. I see this most pronounced in the storage team’s perspective versus the rest of the IT team’s. Storage teams are exceptionally conservative by nature, and this makes perfect sense: servers can be rebooted, applications can be reinstalled, hardware can be replaced, but if data is lost there are no easy solutions.

Application teams are aware of the risk of data loss, but are much more concerned with the day-to-day realities of managing an application – providing a valuable service, adding new features, and scaling performance.  This difference in focus can lead to very real differences in viewpoints and a bit of mutual distrust between the application and storage teams.

The best example of this difference is how the storage team classifies storage array controllers as hardware solutions, when in reality most are just predefined server configurations running data management software with disk shelves attached. Although the features provided by the controllers are important (replication, deduplication, snapshots, file services, backup, and so on), they are inherently just software packages.

More and more of the major enterprise applications are now building in the same feature set traditionally found in the storage domain. Oracle has RMAN for backups, ASM for storage management, Data Guard for replication, and the Flash Recovery Area for snapshots. Microsoft SQL Server has database mirroring for synchronous or asynchronous replication, as well as database snapshots. If you follow VMware’s updates, it is easy to see they are rapidly folding in more storage features with every release (as an aside, one of VMware’s amazing successes is in making managing software feel like managing hardware). Since solutions at the application level can be aware of the layout of the data, some of these features can be implemented much more efficiently. A prime example is replication, where database-level replication tools tie into the transaction logging mechanism and send only the logs rather than blindly replicating all of the data.
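To make the replication example concrete, here is a minimal sketch (in Python, with invented sizes; not any vendor’s actual implementation) contrasting block-level replication, which resends every dirty block in full, with log shipping, which sends only the compact change records:

```python
# Toy comparison of block replication vs. log shipping.
# All sizes are illustrative assumptions, not product figures.

BLOCK_SIZE = 8 * 1024              # an 8 KB database block

def block_replication_bytes(blocks_touched):
    """A block-level replicator resends every dirty block in full."""
    return blocks_touched * BLOCK_SIZE

def log_shipping_bytes(changes, avg_record_bytes=200):
    """A log shipper sends only the change records the database already
    writes to its transaction log (the mechanism Data Guard and
    database mirroring tie into)."""
    return changes * avg_record_bytes

# A workload that updates one row in each of 10,000 blocks:
print(f"{block_replication_bytes(10_000):,}")  # 81,920,000 bytes on the wire
print(f"{log_shipping_bytes(10_000):,}")       # 2,000,000 bytes on the wire
```

The savings shrink as more rows per block change, but the principle holds: the application knows which bytes actually matter.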

The biggest hurdle I have seen at customer sites looking to leverage these application-level storage features is the storage team’s resistance to ceding control, whether due to lack of confidence in the application team’s ability to manage data, internal requirements, or turf protection. One of the most surprising reasons I have seen PCIe SSD solutions selected by application architects is to avoid having these internal discussions at all!

As the applications that support important business processes continue to grow in data management sophistication, there will be more discussions about where data management and protection belong. Should they be bundled with the storage media? Bundled within the application? Or should they be purchased as separate software, perhaps as virtual appliances?

Where do you think these services will be located going forward? Let me know below in the comments.

  1. Matt Key
    October 7, 2011 at 3:07 pm

    Jamon,

    I believe the data management features belong in the application, and the roles of ownership also belong to the application team. I see storage and infrastructure (including the OS) as resources. Apps are services, and the availability and cost of resources are based upon the app.

    I’m pretty sure you’ll find Amazon, Microsoft, and Rackspace agreeing here. There are no guarantees in the cloud, as it is just a resource community for any paying customer. If you want replication – do it yourself. If you want dedupe – do it yourself. If you want consistent latency and performance – you probably shouldn’t do cloud 🙂

    App-based management keeps resources flexible: efficient, high-performance, and hybrid resources across the board. The apps know what has changed, what hasn’t, and where things need to move at a much more granular level than the storage does. As you mentioned, most applications work at 4 KB or 8 KB, so why do we need to move data in 4 GB extents at the array? That’s a lot of money in your DR trunk’s bandwidth.
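    To put rough numbers on that (a back-of-the-envelope sketch in Python; the workload figures are invented for illustration):

    ```python
    # Back-of-the-envelope DR bandwidth at two replication granularities.
    # All numbers are illustrative, not measurements.

    PAGE = 8 * 1024            # 8 KB application write
    EXTENT = 4 * 1024**3       # 4 GB array extent

    dirty_pages = 1_000        # scattered small writes; assume the worst
                               # case where each lands in a different extent

    app_level_bytes = dirty_pages * PAGE      # ship only the changed pages
    array_level_bytes = dirty_pages * EXTENT  # ship the whole extents

    print(f"{app_level_bytes:,}")    # 8,192,000 bytes
    print(f"{array_level_bytes:,}")  # 4,294,967,296,000 bytes, ~500,000x more
    ```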

    Tiering done by the application is inevitable. Flash has made its presence felt in storage, and now we see application vendors starting to acknowledge the trend. Oracle’s Flash Cache is a prime example of extending the buffer cache into server-side flash. There’s also VMware 5.0’s ability to read the B0 and B1 inquiry pages to know which datastores are SSD, expect sub-millisecond response times, and target the hot resources.
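    The pattern of extending the buffer cache into server-side flash reduces to a read-through cache. A minimal, generic sketch (not Oracle’s actual implementation):

    ```python
    from collections import OrderedDict

    class FlashReadCache:
        """Toy read-through cache modeling a server-side flash tier.
        Generic illustration only, not any vendor's implementation."""

        def __init__(self, capacity_blocks, backing_read):
            self.capacity = capacity_blocks
            self.backing_read = backing_read    # e.g. a read from the array
            self.blocks = OrderedDict()         # kept in LRU order

        def read(self, block_id):
            if block_id in self.blocks:         # hit: flash-speed response
                self.blocks.move_to_end(block_id)
                return self.blocks[block_id]
            data = self.backing_read(block_id)  # miss: fall through to the array
            self.blocks[block_id] = data
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least-recently-used
            return data

    cache = FlashReadCache(capacity_blocks=2, backing_read=lambda b: f"block-{b}")
    print(cache.read(1), cache.read(2), cache.read(1), cache.read(3))
    ```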

    I will give credit to the feature-rich array vendors, though. There is no better way to get lock-in with a customer than to make it prohibitively expensive to change vendors: the array controls the data (a business’s second most valuable asset, after its people) in an imperial way that keeps the renewals and the product refreshes coming.

    What’s that? You need more storage or faster storage? Well, if you want to keep using these features that were licensed so heavily, you will need to stay on the current product track. Too much I/O for the features? Then you need to add more controllers. It’s a money pit that is tough to break away from.

    – Matt Key

  2. Shawn Authement
    October 10, 2011 at 3:14 pm

    Or perhaps somewhere in the middle? You mentioned VMware … if more operating systems could implement smarter storage array handling and present it up to the applications as an API (or hide it from the applications entirely), it would level the playing field, especially for legacy apps or apps with slow turnaround times for new technologies. In addition, storage vendors could spend less time building their own proprietary tuning knobs and instead follow a standard, which should shorten time-to-market.
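    Such an OS-level API might look something like this (an entirely hypothetical sketch; every name is invented and no such standard exists):

    ```python
    from dataclasses import dataclass

    # Hypothetical OS-level storage capability API. All names are
    # invented for illustration; no such standard exists today.

    @dataclass
    class VolumeCapabilities:
        is_ssd: bool                    # e.g. learned from device inquiry data
        supports_snapshots: bool
        supports_replication: bool
        expected_read_latency_ms: float

    def query_capabilities(volume_path: str) -> VolumeCapabilities:
        """An OS could populate this from the array, sparing each application
        (and each vendor) its own proprietary probing. Stubbed for illustration."""
        return VolumeCapabilities(True, True, False, 0.2)

    caps = query_capabilities("/data/db01")
    if caps.is_ssd and caps.expected_read_latency_ms < 1.0:
        print("placing hot data on this volume")
    ```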

  3. Bob Norman
    December 16, 2011 at 2:56 am

    To me the tradeoff is what you can cost-reduce in the storage controller by moving the software into the server. Also, what percentage of time is spent in the server managing the drive details? I think a good case can be made both ways.
    If I were a customer, I would be nervous letting the software manage the hardware down below. How do you check out something so open-ended and assure you’re not corrupting or losing data? A system approach may save money, but you may never get it fully tested enough to make customers feel safe.
    Then again, I’m an old timer and my view of the storage world is biased.
