The Enterprise Copy Data Mess: How Do You Clean It Up?

Catalogic 12/23/2014
I hate messes. I hate all kinds of messes. There is the kind of mess that requires cleaning, like the one left behind anytime anyone in my family makes a sandwich. I guarantee there will be a mess. Then there is the kind of mess that stems from disorganization. A picture of a hoarder comes to mind. It is really hard to find what you are looking for in that type of mess.

What kind of mess exists in our data centers and on our storage devices? In most cases, excess copy data has made the storage environment quite messy. I will be the first to admit that I close my eyes and try to avoid that mess. The first mess is the iterations of documents: draft 1, 2, 3 or version 1, 2, 3. However you label it, there is copy after copy of incremental work, plus the copies we have shared with each other. Oh, and don’t forget the secondary copies (backups, replicas, test/dev, analytics) of each of those copies. Some PowerPoint decks are over 10 MB and have over 30 versions, which adds up to a lot of wasted disk space. Just three of us are wasting well over 1 GB of space on just one document and its associated copies. Extrapolate this across a data center of any size, with any number of users, and consider the amount of wasted storage. Every organization has this problem, and every organization is ignoring the mess even though it is costing them real money.

And like a hoarder, I never delete. No one deletes. The reality is that if there were a simple way to search on the age of files and locate all the copies in all locations, I am sure I could take action and instantly save a ton of space. Instead, I just add more storage.
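To make that wish concrete, here is a minimal sketch of such a search in Python. It is only an illustration, not a description of any product: the function names, the one-year age threshold, and the choice of SHA-256 to detect identical content are all assumptions. It also finds only byte-for-byte copies, so "draft 1" and "draft 2" of the same deck would not be grouped together.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_stale_copies(roots, min_age_days=365, now=None):
    """Find identical files under `roots` where every copy is older than the cutoff.

    Returns a list of (wasted_bytes, [paths]) sorted by wasted_bytes, largest
    first. wasted_bytes counts every copy beyond the first one.
    The 365-day default is an arbitrary, hypothetical threshold.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    groups = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                    key = (st.st_size, file_digest(path))
                except OSError:
                    continue  # unreadable or vanished file: skip it
                groups[key].append((path, st.st_mtime))
    report = []
    for (size, _digest), copies in groups.items():
        if len(copies) > 1 and all(mtime < cutoff for _p, mtime in copies):
            wasted = size * (len(copies) - 1)
            report.append((wasted, sorted(p for p, _m in copies)))
    report.sort(key=lambda item: item[0], reverse=True)
    return report
```

Calling `find_stale_copies(["/some/share", "/some/backup"])` (both paths are placeholders) would list the groups of old, identical files that waste the most space. Hashing every file is slow on large shares, so a real tool would likely compare sizes first and hash only the candidates.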

And then I have to leverage my data for multiple business reasons, so of course I create more copies of all this data across all my backup tiers (data silos). I hate the mess, but I can easily forget it is there. It doesn’t show itself like those bread crumbs on my countertop. The possible savings would become very real, very fast, if I just had visibility and insight into the mess. If stale data were identified and eliminated, and the data worth keeping were leveraged across our data centers, companies would save money right away.

Not everyone minds a mess. I remember working for a department chairman at a university whose entire office was cluttered with stacks of papers, books, and magazines. One time when I was attempting to organize the mess, I found an old pizza box in a stack. It still makes me laugh today. When you can physically see the mess, it is easy to pass judgment.

This brings me back to our data center and the hidden mess. It isn’t in our face or easy for others to see. And it isn’t just about copies of files. For example, how many VMs are launched and then never used? Eventually, our users feel the impact as server performance degrades. But does anyone connect VM sprawl to server performance? They can’t see the mess.

What’s needed are purpose-built tools that can show me the mess. I want a daily report to land in my inbox and throw the mess right in my face. That would prompt me to take action and actually do something about it. I want to assess and understand my problem. Then, I want to be business savvy enough to translate that into dollars saved through storage efficiency. Once I clean up the mess, I report the savings to my supervisors and add them to my yearly performance review. Then, using the proper tools, I can keep from ever having this mess again and have a solution that helps me manage the full, true life cycle of my data.
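As a rough idea of what the body of such a daily report could look like, here is a small Python sketch that turns a list of findings into a plain-text summary with a dollar estimate. Everything in it is an assumption made for illustration: the function name, the input format of (label, wasted bytes) pairs, and especially the cost per GB per month, which you would have to supply from your own storage budget.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary)


def savings_report(findings, cost_per_gb_month, top=5):
    """Build a plain-text summary from (label, wasted_bytes) pairs.

    `cost_per_gb_month` is a hypothetical figure supplied by the caller;
    the yearly estimate is simply total GB * monthly cost * 12.
    """
    ranked = sorted(findings, key=lambda item: item[1], reverse=True)
    total_gb = sum(size for _label, size in ranked) / GB
    lines = [
        f"Reclaimable space: {total_gb:.1f} GB",
        f"Estimated savings: ${total_gb * cost_per_gb_month * 12:,.2f} per year",
        "Largest offenders:",
    ]
    # List only the biggest offenders so the report stays short enough to read.
    for label, size in ranked[:top]:
        lines.append(f"  {size / GB:8.1f} GB  {label}")
    return "\n".join(lines)
```

The returned text could be mailed by a scheduled job each morning. Ranking by wasted space keeps the worst offenders at the top, which is the part most likely to prompt action.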
