Is IT Scale Eating Up Your Resources? Copy Data Management Might be the Answer

IT keeps scaling: more workloads, more data and more demands landing on the same people. None of this is news to anyone, but what is changing is the willingness of organizations to throw bodies at the problem. Global economic uncertainty means the pressure is on to keep headcounts down. Everyone wants a company that grows and makes more money, but few are willing to invest in the wetware to make it go.

IT staffs rarely complain. They take the burden upon themselves and simply try to do more with the same staffing levels. While this can help a Pointy-Haired Boss’ (PHB’s) quarterly numbers, the long-term impact is devastating. Morale suffers. Staff burn out. There are finite limits to what any person can realistically do.

With the world’s major powers entering a new era of economic tension that may well evolve into Cold War II, the tug of war between business demands and IT requirements isn’t going to get better any time soon. The economy will bounce between outright lousy and dangerously uncertain, and the pressure on IT will continue. Faced with realities outside their control, what is a storage admin to do?

Automate

When you can’t use wetware to do a job, you turn to software. A great many jobs can be partly or even fully automated. Storage administration is no different.

A business doesn’t make money copying LUNs, and a storage administrator isn’t going to demonstrate their ongoing value to the business by doing so. Basic tasks like copying, snapshotting, cloning and so forth are scut work, unworthy of a storage administrator’s time. These are the sorts of things that should be happening in an automated fashion, freeing the storage administrator to work on grander things.
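As an illustration, here is a minimal sketch of policy-driven snapshot scheduling. The take_snapshot() helper and the volume names are hypothetical stand-ins for whatever API or CLI your array exposes; a real deployment would hang this off cron or an automation framework rather than a long-running loop.

```python
# A minimal sketch of policy-driven snapshots; take_snapshot() is a
# hypothetical wrapper around your array's API or CLI.
import sched
import time

SNAPSHOT_POLICY = {
    "prod-db-lun": 3600,   # hourly
    "dev-scratch": 86400,  # daily
}

def take_snapshot(volume: str) -> None:
    # Placeholder: swap in the real array call here.
    print(f"snapshotting {volume} at {time.ctime()}")

scheduler = sched.scheduler(time.time, time.sleep)

def run_and_rearm(volume: str, interval: int) -> None:
    take_snapshot(volume)
    # Re-arm the timer so the snapshot recurs on its interval.
    scheduler.enter(interval, 1, run_and_rearm, (volume, interval))

for vol, interval in SNAPSHOT_POLICY.items():
    scheduler.enter(0, 1, run_and_rearm, (vol, interval))

scheduler.run()  # blocks forever; illustration only
```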

Storage administrators understand storage. They understand the impact that storage has on networking, on compute workloads, on WAN bandwidth and more. Storage administrators are domain specialists in a difficult field, and they should be architecting long-term solutions, not putting out fires.

Templates, profiles and role-based administration are the future for storage administrators, virtual administrators, network administrators and every other discipline within IT. It is long past time we put our efforts there instead of into manually configuring and executing tasks.

Passing the buck

Some of this is passing the buck. Instead of being the one to push the button that causes a workload to clone, snap or replicate, a storage administrator focused on automation will define the rules and parameters within which these events can occur. The storage admin will then make an interface available to virtual admins, devs or other interested parties, who can trigger the data management event as they please.

For devs, this will probably take the form of an API call executed from a script they’ve written. Virtual admins may use a GUI to make it go. Either way, the buck is being passed: the “scut work” of executing storage management falls to a downstream administrator.
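To make that concrete, here is a sketch of the dev-side experience. The /api/v1/clones endpoint, payload shape and token are all hypothetical; the point is that policy enforcement (quotas, retention, allowed sources) lives server-side, inside the rules the storage admin defined.

```python
# A sketch of self-service cloning against a hypothetical endpoint
# published by the storage team; not any particular product's API.
import requests

def request_clone(source_vm: str, purpose: str, ttl_hours: int = 24) -> str:
    resp = requests.post(
        "https://copydata.example.com/api/v1/clones",
        json={"source": source_vm, "purpose": purpose, "ttl_hours": ttl_hours},
        headers={"Authorization": "Bearer <dev-team-token>"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["clone_id"]  # hand this off to a test harness

if __name__ == "__main__":
    print(request_clone("prod-db-01", purpose="integration-test"))
```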

The wonderful part of this arrangement is that the downstream administrators don’t feel slighted; they feel empowered. Now they can act on storage needs without a burdensome change management process. They can execute storage events both in real time and to a schedule, letting them, in turn, automate the scut work of their own jobs.

All of this sounds great in theory, but it is terribly complicated to build and implement. I certainly wouldn’t want to code such a beast from scratch. Fortunately, there are startups that have arisen to meet this need.

Copy data management software – the decent stuff, at least – allows storage administrators to do exactly what is described above. For storage administrators, copy data management is the means to cope with scale in a world where getting more bodies to push the buttons just isn’t likely to occur. Have you automated yet?

For more information on Copy Data Management, please see the previous blogs in this series, Copy Data Management is Much More than Just Making Copies and Too Many Things Demand Your Attention: Solving the Conundrum of IT Automation.

Trevor Pott is a guest writer with Catalogic Software. Trevor is a full-time nerd from Edmonton, Alberta, Canada. He splits his time between systems administration, technology writing, and consulting. As a consultant he helps Silicon Valley start-ups better understand systems administrators and how to sell to them. He currently pens a weekly column for The Register, one of the world’s largest online science and technology magazines, with a monthly readership of 7.2 million people worldwide. Trevor can be found at http://www.egeek.ca/ for those looking to engage his jedi-like guidance.

05/16/2016

Too Many Things Demand Your Attention: Solving the Conundrum of IT Automation

One school of thought holds that any task you perform often enough should be automated, simply to claw back the time. Another school has it that if you only perform a task every N times in M timeframe, you should automate it because there’s a good chance you’ll forget how to do it by the time it comes around again. Both points of view have merit, but neither really sheds any light on how to set about automating.

The traditional administrator’s path to automation is scripts. Lots and lots of scripts. Shell scripts, batch files, PowerShell scripts, you name it! Cron jobs and scheduled tasks running on a dozen different machines all executing little scripts in the background.

At first, these scripts are simple time savers; they help an administrator avoid scut work and focus on what really matters. Over time, however, the business comes to rely on those scripts. Eventually the author goes on vacation, quits or is otherwise unavailable. That’s usually right about when the scripts stop working and someone else has to fix them, and quickly, because the whole automated house of cards comes tumbling down.
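The pattern usually looks something like the sketch below: a hypothetical “temporary” script with hardcoded paths and hosts, no documentation and no error handling, quietly running from cron until the day it doesn’t.

```python
# An illustration of the anti-pattern, not a recommendation. Every path
# and hostname here is invented; only the original author knew why.
import shutil
import subprocess

# Why these three directories? Nobody remembers.
for d in ("/data/exports/acct", "/data/exports/hr", "/opt/legacy/feed"):
    shutil.copytree(d, f"/mnt/backup{d}", dirs_exist_ok=True)

# Breaks silently the day the host is renamed or the SSH key rotates.
subprocess.run(["rsync", "-a", "/mnt/backup/", "backup01:/vault/"], check=False)
```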

A large enough organization’s IT can be held together by badly documented scripts and applets written by dozens of administrators over the course of decades, most of which nobody currently employed in IT knows how to manage or maintain. This is a nightmare scenario, and one best avoided.

A better way

APIs and code versioning tools have helped make automation easier. Scripts probably still aren’t properly documented, but if version control is adhered to, then at least what they are trying to do, and the history of their evolution, can be understood by those seeking to make changes later on.

Ideally, all scripting within an IT organization would be well documented, version controlled and subjected to unit, integration and regression testing. In reality, however, it is far easier to issue an edict to that effect than to actually have it followed.
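What that edict asks for doesn’t have to be elaborate. A sketch: pull the script’s logic into a plain function, then pin its behaviour down with a pytest case living in the same repository. The retention_candidates() helper is invented for illustration.

```python
# A sketch of "tested automation": logic in a function, behaviour pinned
# by a test. retention_candidates() is a hypothetical helper.
from datetime import datetime, timedelta

def retention_candidates(snapshots, keep_days, now):
    """Return names of snapshots older than the retention window."""
    cutoff = now - timedelta(days=keep_days)
    return sorted(name for name, ts in snapshots.items() if ts < cutoff)

def test_retention_keeps_recent_snapshots():
    now = datetime(2016, 5, 16)
    snaps = {
        "daily-0501": datetime(2016, 5, 1),
        "daily-0515": datetime(2016, 5, 15),
    }
    assert retention_candidates(snaps, keep_days=7, now=now) == ["daily-0501"]
```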

Scripting frameworks evolved, and they were good. The DevOps movement was born, and Infrastructure-as-Code became a buzzword that actually meant something practical in the real world. Scripting evolved from code written in a dozen languages, depending on platform and administrator comfort, into something that organizations invested time, money and training in.

Instead of relying on scattered cron jobs, scripting became managed by centralized servers with specialized agents for all workloads under management. These could eventually control workloads in the public cloud as well as on premises.

With the likes of Puppet, Chef, Ansible and Salt, automation has grown up.

The missing component

Unfortunately, not all is well in Mudville. For all that the tools to make automation less risky have grown and adapted, vendors haven’t. Most vendors publish an API of some variety, but in far too many cases it is a mere afterthought, or was originally designed for internal use only.

The APIs are often poorly documented. Modules to talk to your favourite automation suite may or may not exist, with the quality being something of a toss-up. Automation may have grown up, but it still hasn’t moved out of its parents’ house yet.

A look at networking offers a great example of the battles being fought here. Networking competes with storage for the title of most conservative market within IT, and the dominant player – Cisco – has fought automation tooth and nail.

This is largely because automation leads to commoditization of the underlying infrastructure. Cisco has built an empire on being a virtual monopoly and they aren’t keen to see the foundations eroded.  Fortunately, they don’t have much of a choice.

Software Defined Networking (SDN) is the relevant buzzword in the ongoing effort to separate the control plane (the configuration) from the data plane (the functionality). We’ll come back to why this separation is important enough to merit a movement of its own.

Much like networking, the storage market stubbornly resists automation. APIs abound, but actually using them is a whole other discussion entirely. Unlike networking, however, storage went through a massive diversification before automation really became a mainstream concern.

The result is that no one entity dominates the storage market. There are storage solutions that focus almost entirely on automation, and virtually every storage solution (regardless of provenance) has enough of an API that if you really wanted to you could make it dance.

This does not mean storage vendors have embraced Puppet or any other automation framework. Typically, there is a great deal of antipathy towards these frameworks. For some vendors it is a desire not to be commoditized; for others, automation simply isn’t seen as a “serious” endeavour for the storage market. They want to sell SANs, and they don’t really care much what you do with them once they have your money.

Who owns the stack?

Whether we are talking networking, storage or some other recalcitrant aspect of the IT ecosystem, only one thing really matters to vendors: control.

Cisco doesn’t really want SDN to take off. It doesn’t want to lose control over the market by letting individuals other than its certified domain experts be able to design and implement complex networks. It doesn’t want customers to be able to take their scripts and simply point them at another vendor’s switches without having to go through a lot of pain during the migration. In short, Cisco doesn’t want to be commoditized.

Storage is already a commodity. Here the game is being played for control of upper layers of the stack.  Storage vendors want you to buy into their automation platforms, their integration with virtualization and containerization and their hooks into public, private and hybrid cloud solutions.

Of course, as customers, we don’t want vendors to have control. Vendors with vices locked onto our genitals have a nasty history of squeezing until there is no money left to be had. Just look at Oracle licensing.

Automation in these areas thus requires arbiters: applications that you can code against, and which in turn apply your scripts to hardware and software from multiple vendors.

These arbiters have been around for some time, but they are now maturing into full-grown frameworks in their own right. Templates, profiles and role-based administration are regularly featured. REST APIs and integration with mainstream automation frameworks like Puppet are par for the course.
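The value of the arbiter pattern is that your automation codes to one API while the arbiter speaks each vendor’s dialect underneath. The endpoint, payload and profile names below are hypothetical; every real product defines its own.

```python
# A sketch of coding to an arbiter rather than to each vendor. The
# /api/v1/apply endpoint and profile names are invented for illustration.
import requests

def apply_profile(resource: str, profile: str) -> None:
    # One call, regardless of whose array or switch sits underneath.
    resp = requests.post(
        "https://arbiter.example.com/api/v1/apply",
        json={"resource": resource, "profile": profile},
        timeout=60,
    )
    resp.raise_for_status()

# The same profile lands on any supported vendor's gear; the arbiter
# owns the per-vendor command sets.
apply_profile("vol-finance-01", "gold-tier")
apply_profile("vol-devtest-07", "bronze-tier")
```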

You cannot today simply install Puppet and automate your whole datacenter. With the help of SDN frameworks, however, you can automate networking. With the help of copy data management applications, you can automate storage. And you can do so in a reliable, sustainable fashion that will be manageable and maintainable long after the current round of administrators has moved on.

This is how you automate the datacenter. Not with a collection of scripts and cron jobs, but with frameworks and API arbitration. Sidestep the power games of vendors and get on with the job of running your datacenter. Happy coding.

For more information on Copy Data Management, please see the previous blog in this series, Copy Data Management is Much More than Just Making Copies.


05/10/2016

Copy Data Management Takes Wing Off EMC Announcement. We Have Info!

Why are we pleased about EMC’s announcement, you ask? We’re pleased because EMC brought new attention to the technology of CDM and in the process validated points that Catalogic has been making for years now. They even took a similar technical approach of doing CDM on the array – what we’ve been calling “in-place” copy data management – and offered reasons for why it’s better. In an eWeek interview, EMC President of Products and Marketing Jeremy Burton compared eCDM to Actifio, which uses a separate hardware layer, noting:

So architecturally, we’re not dependent on any intermediate layer. It’s going to take a little bit of time to build support for various different storage arrays and so on, but we firmly believe that in order for this to be pervasive, it has to be non-invasive… Non-invasive is a frictionless deployment, and it’s got to be a global view, it can’t just be on-prem.

Couldn’t have said it better myself! In fact, we had the good folks at Storage Switzerland put together their thoughts on this topic, and we released them as a white paper called Copy Data Management: In-Place vs. Rip and Replace; you can get it here.

We’re also seeing a lot of talk about the economics of proper copy data management. We’ve got good information on that as well, in an excellent analysis by Enterprise Management Associates called Impact Analysis: Copy Data Management. This report provides a great breakdown of the ways CDM can save you money.

And that’s not the end of the good stuff we have (we’ve been at this for a while!). There’s also a fine report from the folks at IDC, the same analyst group that EMC has been citing in their discussions. We’ve made that report, Solving the Copy Data Problem with In-Place Copy Data Management, available to you at no cost.

If you want a deeper dive on any of this, we have an excellent webinar series you can sign up for, or watch a previously broadcast event.

A final note. In-place CDM works with the storage array, and therefore you have to support the specific command sets of each array. That’s how you make it solid: you don’t put a translation layer in between. This is what Mr. Burton means by “non-invasive.” Currently, Catalogic supports NetApp ONTAP-based systems (including Cloud ONTAP), various IBM systems, and the EMC VNXe line (which eCDM does not support). More EMC support will be announced shortly, so stay tuned. If you’re looking for help managing snapshots and replication – also known as copies! – then look no further than the Catalogic Software ECX product. We can make your storage sing!
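For the technically inclined, the per-array point can be pictured as a set of native drivers rather than one generic translation layer. The class and method names below are invented to show the shape of the idea, not Catalogic’s actual code.

```python
# A sketch of per-array drivers: each family is spoken to in its own
# command set, with no generic translation layer in between. All names
# here are illustrative.
class OntapDriver:
    def snapshot(self, volume: str) -> None:
        print(f"ONTAP dialect: snap create {volume}")

class VnxeDriver:
    def snapshot(self, volume: str) -> None:
        print(f"VNXe dialect: /prot/snap create -source {volume}")

DRIVERS = {"ontap": OntapDriver(), "vnxe": VnxeDriver()}

def snapshot(array_family: str, volume: str) -> None:
    # Dispatch straight to the native dialect.
    DRIVERS[array_family].snapshot(volume)

snapshot("ontap", "vol_finance")
snapshot("vnxe", "lun_devtest")
```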

05/04/2016

Copy Data Management is Much More than Just Making Copies

Copy data management, as you might expect, concerns itself with creating and managing copies of data.  What’s important to bear in mind is that this isn’t all there is to copy data management.  The critical piece that is often overlooked is that more than creating and managing copies of data, copy data management is about making that data useful.

Let’s consider a virtual machine to be the data in question. Copy data management can and does deal with other forms of data, but virtual machines are the primary containers of data in today’s datacenters, so we’ll start there.

Making a copy of a virtual machine and then turning it on isn’t a particularly good idea.  A perfect copy will have the same data in the virtual machine as well as the same configuration.  This means the same name at the virtualization management layer, the same name at the operating system level as well as many other unique elements ranging from the MAC address of the network cards to the unique IDs of the hard drives, operating systems and applications.

A perfect copy of a virtual machine can’t be registered with the same virtualization management software; it doesn’t like having two VMs of the same name.  Similarly, two VMs of the same name on the same subnet makes Windows very upset, and DNS registration can get tricky as well.  Unless the copy is designed to work as part of a cluster with the original VM, data intended for the original VM could start showing up at the clone and vice versa.  All in all, not good.

To be made useful, copies of VMs have to go through a process known as genericization. The degree to which the virtual machine is genericized depends on its intended purpose and the context in which the copy is to be deployed.

A copy of a VM made for disaster recovery purposes may need no genericization, or it may need only its network address changed.  A copy of a golden master, template or development environment may need all systemwide unique IDs, names and addresses changed, as well as license keys wiped so that new ones can be entered.
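A sketch of what genericization touches is below. The clone_spec structure and its field names are invented for illustration; real tools (sysprep, virt-sysprep, vCenter customization specs and the like) go considerably further.

```python
# A minimal sketch of genericization; clone_spec and its fields are
# hypothetical, not any product's schema.
import uuid

def genericize(clone_spec: dict, purpose: str) -> dict:
    spec = dict(clone_spec)
    # New name so the virtualization manager and DNS don't collide.
    spec["vm_name"] = f"{clone_spec['vm_name']}-{purpose}-{uuid.uuid4().hex[:8]}"
    spec["mac_address"] = None  # let the hypervisor assign a fresh MAC
    spec["disk_uuids"] = [str(uuid.uuid4()) for _ in clone_spec["disk_uuids"]]
    spec["license_key"] = None  # wiped so a new key can be entered
    if purpose == "dr":
        # DR copies may keep their identity and only re-address.
        spec["vm_name"] = clone_spec["vm_name"]
        spec["ip_address"] = "10.99.0.10"  # hypothetical DR subnet
    return spec
```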

Copy Data Management Isn’t Backups

Today’s copy data management software has to deal with all of this.  It is part of what sets it apart from simple backups.  It has been quite some time since simply taking a copy of data or a VM was “good enough”: in the real world we make copies of data because we eventually want to do something with those copies, so the ability to manipulate basic configuration aspects before bringing the copy online is both critical and fundamental.

It is important to bear in mind that copy data management isn’t just backups.  It certainly can be, but making copies of data and stashing them in a corner somewhere is such a small fraction of the functionality that it is almost irrelevant in the grand scheme of things.

Copy data management is as much about packaging data together as it is about making copies.  A single service might consist of a master and slave pair of databases, file storage, several web servers, a load balancer, a security system and a firewall.  This could be packaged together as a single entity, then snapshotted, cloned, replicated and copied to dev & test as a unit.

Indeed, an entire department’s worth of data can be handled this way, or workloads could be broken into tiers, with each tier getting a different treatment regarding data protection, offsite replication and dev & test availability.
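One way to picture this packaging is as a declarative policy: a named group of workloads bound to a protection tier. The structure below is purely illustrative; every product has its own schema.

```python
# A sketch of packaging a service as one copy-data unit with per-tier
# treatment. All names and values are invented for illustration.
SERVICE = {
    "name": "webshop",
    "members": ["db-master", "db-slave", "files-01",
                "web-01", "web-02", "lb-01", "fw-01"],
    "tier": "gold",
}

TIERS = {
    "gold":   {"snapshot_every": "15m", "replicate_offsite": True,
               "devtest_clones": True},
    "bronze": {"snapshot_every": "24h", "replicate_offsite": False,
               "devtest_clones": False},
}

def protection_plan(service: dict) -> dict:
    # Every member is snapshotted, cloned and replicated together.
    return {"members": service["members"], **TIERS[service["tier"]]}

print(protection_plan(SERVICE))
```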

Copy data management, then, is about data lifecycle management: monitoring all the copies in play, creating copies as needed, modifying copies to be useful, bringing them online, and providing the means to do all of this in an orchestrated, automatable and easy-to-use fashion.  It isn’t easy, but it is increasingly a basic requirement of the modern datacenter.


05/03/2016
