By Victoria Grey, CMO, Aparavi
In the past, archives were thought of strictly as a long-term repository for infrequently accessed data – think cold storage – and so not much thought was put into intelligently managing this data. The hope was that your archive was like an insurance policy that you would never need to use.
With the advent of the cloud and its increased utilization, organizations have begun to find that, in addition to achieving huge cost savings by leveraging cloud economics, there is inherent value remaining within the archived data itself. To fully leverage this value, archived data must be organized, accessible, retrievable, and intelligently retained. Simply put, it must become an “active” archive.
Unstructured data, such as office documents, videos, audio files, images, PDFs, and anything else not in a database, has now become the lifeblood of most organizations. Intelligently storing this data over the long term is critical not only for compliance and organizational history, but increasingly for business intelligence, analysis, data mining, and other purposes.
With overall data continuing to grow unabated at tremendous rates year over year, and unstructured data leading that charge with growth rates of 60 percent or more (climbing to a predicted 90 percent or more of all data within just a few years), there is an ever-present need for active archiving that uses any combination of public, private or hybrid cloud, with the ability to move data easily at any time. To best manage this data within a multi-cloud infrastructure, organizations need to adhere to the basic tenets of active archiving, as follows:
Data should be organized
Unstructured data tends to be messy – a typical organization can have millions and millions of files not necessarily organized in any particular fashion. Some might be stored on a private cloud, some on a public cloud. To make sense of this, it’s helpful to be able to classify and tag data based on categories that are important both internally and externally. Think “confidential” or “legal” as useful flags for the ability to retrieve data in the event of an audit, or PII and similar for compliance. But more than that, all sales data, all financial data, etc., could be classified for fast and easy retrieval for future use.
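To make the idea of classifying and tagging concrete, here is a minimal sketch in Python. It assumes simple keyword and pattern rules; the tag names, the patterns, and the helper names `classify` and `build_tag_index` are hypothetical illustrations, not how any particular product works. A real classifier would likely need far richer rules.

```python
import re

# Hypothetical tagging rules: tag name -> pattern searched in the path and text.
TAG_RULES = {
    "confidential": re.compile(r"\bconfidential\b", re.IGNORECASE),
    "legal": re.compile(r"\b(contract|nda|litigation)\b", re.IGNORECASE),
    "pii": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like number, illustration only
    "sales": re.compile(r"\b(invoice|quote|purchase order)\b", re.IGNORECASE),
}


def classify(path, text):
    """Return the sorted list of tags whose rule matches the path or the text."""
    haystack = f"{path}\n{text}"
    return sorted(tag for tag, pattern in TAG_RULES.items() if pattern.search(haystack))


def build_tag_index(documents):
    """Map each tag to the sorted list of paths that carry it."""
    index = {}
    for path, text in documents.items():
        for tag in classify(path, text):
            index.setdefault(tag, []).append(path)
    return {tag: sorted(paths) for tag, paths in index.items()}
```

With an index like this, a request such as "everything tagged legal" becomes a single lookup, no matter which cloud each file actually lives in.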
Data should be accessible
You need to be able to store your data where you want, and get at it easily. This could mean on-premises in a private cloud, in the public cloud, or even across clouds. We’re beginning to see increased competition among cloud vendors, and having the ability to take advantage of changing cloud economics is extremely valuable. An Active Archive should support both on-premises and true multi-cloud with the ability to dynamically migrate data, at will, across cloud destinations, and not require the administrator to remember where that data is.
Data should be retrievable
Complementary to classification and tagging is full-content search. Imagine the ability to quickly and easily search through petabytes of data with millions (or billions) of files to find that needle in a haystack you were looking for, using a word, a phrase, or its metadata, rather than having to know where or when a file was saved (think Google search). This opens up what has been a black hole of practically unusable data into a usable repository. Being able to locate archived data in a cloud repository and easily re-hydrate it so that it becomes a part of an active data set leverages the full value of that information.
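One common way to get this kind of search is an inverted index, which maps each word to the files that contain it. The sketch below is a simplified illustration under stated assumptions: the `ArchiveIndex` class, its methods, and the location strings are hypothetical, and it matches files containing all of the query words rather than exact phrases.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ArchiveIndex:
    """Toy inverted index: word -> set of document ids, plus where each document lives."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._locations = {}

    def add(self, doc_id, text, location):
        """Index a document's words and remember its storage location."""
        self._locations[doc_id] = location
        for token in tokenize(text):
            self._postings[token].add(doc_id)

    def search(self, query):
        """Return sorted (doc_id, location) pairs for documents containing every query word."""
        tokens = tokenize(query)
        if not tokens:
            return []
        matches = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted((doc_id, self._locations[doc_id]) for doc_id in matches)
```

Because each hit comes back with its location, the searcher never needs to know in advance where or when the file was saved.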
Data should be intelligently retained
If you ask an audience of IT administrators what their corporate policy is on data retention, the vast majority of them will tell you they keep everything, forever. Data governance is a huge topic, more than we can get into in this post, but suffice it to say that best practice is not to keep everything forever, but to intelligently prune data that is no longer needed, for legal, space, cost, and other reasons. An Active Archive, especially one living in a multi-cloud environment, is one that helps an administrator set policies to enable intelligent pruning of data that no longer needs to be retained, freeing up space and cutting the cost of unneeded capacity.
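As a rough sketch of what a policy-driven pruning pass could look like, the Python below selects files whose retention period has lapsed. Everything in it is an assumption made for illustration: the tags, the retention periods, the `ArchivedFile` record, and the rule that the longest applicable period wins. Real retention periods depend on your legal and regulatory obligations.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days per tag; None means keep indefinitely.
RETENTION_DAYS = {"legal": 7 * 365, "sales": 3 * 365, "scratch": 90, "hold": None}
DEFAULT_RETENTION_DAYS = 365  # assumed fallback for files with no known tag


@dataclass(frozen=True)
class ArchivedFile:
    path: str
    tags: tuple
    last_modified: date
    size_bytes: int


def retention_days(tags):
    """Longest retention among the file's tags; None if any tag means keep forever."""
    periods = [RETENTION_DAYS.get(t, DEFAULT_RETENTION_DAYS) for t in tags]
    if not periods:
        return DEFAULT_RETENTION_DAYS
    if any(p is None for p in periods):
        return None
    return max(periods)


def select_for_pruning(files, today):
    """Return the files whose retention period ended before `today`."""
    expired = []
    for f in files:
        days = retention_days(f.tags)
        if days is not None and f.last_modified + timedelta(days=days) < today:
            expired.append(f)
    return expired
```

Summing `size_bytes` over the selected files gives an estimate of the capacity a pruning pass would free before anything is actually deleted.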
An Active Archive provides intelligent, multi-cloud data management, making the long-term storage of an organization’s most critical asset – its data – useful, today and forever.