As business and society at large undergo digital transformation, traditional data storage methods cannot accommodate the resulting deluge of data. In fact, by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
Scaling traditional storage linearly to match this rate would be unreasonably expensive, and it would take too long. Even if it could be done faster and cheaper, simply adding more servers would not meet storage demands, because vertical storage architectures contain bottlenecks that slow performance to an unacceptable level.
However, just as internet-based technologies have caused data creation to skyrocket, storage technologies have evolved to meet the challenge. One example is software-defined storage (SDS), which decouples the software that controls storage-related tasks from the physical storage hardware and thereby dramatically reduces hardware costs. Fewer, less expensive servers can be used to improve both capacity and performance, and administration becomes simpler, more flexible and more efficient.
SDS enables users to allocate and share storage assets across all workloads. It’s no wonder that the industry has embraced SDS. Gartner recently reported that by 2020, anywhere from 70 to 80 percent of unstructured data will be stored and managed on lower-cost hardware supported by software-defined storage.
File Systems and Their Features
About 80 percent of the SDS solutions on the market offer file systems, which makes sense, since 80 percent of data is unstructured. Yet while it is widely understood that unstructured data is best managed with a file system, many SDS offerings still focus primarily on block or object storage, and few of them do file systems well. Without a file system layered over unstructured data, managing that data becomes very difficult.
There are different kinds of storage because they fulfill different roles:
- Block is used for storing databases or virtual machines, but you need files as well to deal with all the unstructured data.
- Object is used for applications that require extreme scalability such as machine-to-machine and IoT transactions, but it isn’t that much better than block when it comes to managing data.
- File systems are the best choice for handling unstructured data.
File systems are critical for SDS, and the vendors know it. What’s strange, though, is the quality of the file systems they offer. These file systems are usually based on Samba and exclude some features most Windows users are accustomed to. Samba is free, open source software that implements the SMB protocol and allows end users to access and use files on the company’s intranet or network. However, providing file services through Samba often means going without needed features.
This is no light matter. In addition to needing a file system, organizations need file-related features as well to deal with unstructured data. These features include:
- Retention: Automatically creates a single folder or a hierarchy of folders on file servers, to be deleted according to assigned policies.
- Snapshot: A read-only copy of the contents of a file system or independent file set taken at a single point in time. When a snapshot of an independent file set is taken, all files and nested dependent file sets will be included in the snapshot.
- Quota: Monitors and limits the amount of storage in use. A soft limit quota warns you when part of a file system is close to reaching its storage limit but still allows data to be saved. With a hard limit quota, no new data can be saved once the quota is reached.
- Tiering: Creates a policy to designate where a chosen file will be placed and if and when the file will be migrated between file system pools. You can define both file placement and migration policies. By using a policy, you create a filter that designates a specific file type to a particular tier. Tiered storage is more efficient and boosts performance.
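To make the quota and tiering behavior described above more concrete, here is a minimal Python sketch. It is not taken from any particular SDS product: the names Quota, QuotaStatus, place_file and migrate_tier, the tier names "performance" and "capacity", the file patterns and the 90-day threshold are all hypothetical and chosen only for illustration. It also assumes that a hard limit refuses any write that would push usage past the limit.

```python
import fnmatch
from dataclasses import dataclass
from enum import Enum


class QuotaStatus(Enum):
    OK = "ok"
    SOFT_LIMIT_WARNING = "soft limit warning"
    HARD_LIMIT_REJECTED = "hard limit rejected"


@dataclass
class Quota:
    """Hypothetical quota on part of a file system, measured in bytes."""
    soft_limit: int  # reaching this triggers a warning; data is still saved
    hard_limit: int  # a write that would exceed this is refused (assumption)
    used: int = 0

    def write(self, size: int) -> QuotaStatus:
        new_total = self.used + size
        if new_total > self.hard_limit:
            # Hard limit: nothing is saved, usage stays unchanged.
            return QuotaStatus.HARD_LIMIT_REJECTED
        self.used = new_total
        if new_total >= self.soft_limit:
            # Soft limit: the data is saved, but the caller is warned.
            return QuotaStatus.SOFT_LIMIT_WARNING
        return QuotaStatus.OK


# Placement policy: a filter that assigns a file type to a tier.
# Patterns and tier names are illustrative assumptions.
PLACEMENT_RULES = [
    ("*.db", "performance"),
    ("*.log", "capacity"),
]
DEFAULT_TIER = "capacity"


def place_file(name: str) -> str:
    """Return the tier a new file is placed in, based on its name."""
    for pattern, tier in PLACEMENT_RULES:
        if fnmatch.fnmatch(name.lower(), pattern):
            return tier
    return DEFAULT_TIER


def migrate_tier(current_tier: str, days_since_access: int,
                 cold_after_days: int = 90) -> str:
    """Migration policy: move files that have gone cold to the capacity tier."""
    if current_tier == "performance" and days_since_access >= cold_after_days:
        return "capacity"
    return current_tier


if __name__ == "__main__":
    quota = Quota(soft_limit=80, hard_limit=100)
    for size in (50, 35, 20):
        print(size, quota.write(size).value, quota.used)
    print(place_file("orders.db"), migrate_tier("performance", 120))
```

In this sketch the soft limit only changes the status that is returned, while the hard limit leaves usage untouched, which mirrors the difference between being warned and being blocked. Real products typically express placement and migration rules in their own policy language rather than in application code.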
A Critical Decision
In a storage market in which vendors know what you need but are often reluctant to provide it, forewarned is forearmed. You must have a file system to manage the 80 percent of your data pool that is unstructured. Remember that many SDS solutions come with a thin excuse for a file system, and that open source solutions can offer only limited feature sets. The answer is an SDS solution that accommodates all three storage types (block, object and file), covering all your organization’s storage needs.
About the Author
Stefan Bernbo is the founder and CEO of Compuverde. For 20 years, Stefan has designed and built numerous enterprise-scale data storage solutions designed to be cost-effective for storing huge data sets. From 2004 to 2010, Stefan worked within this field for Storegate, the wide-reaching Internet-based storage solution for consumer and business markets, with the highest possible availability and scalability requirements. Previously, Stefan worked with system and software architecture on several projects with Swedish giant Ericsson, the world-leading provider of telecommunications equipment and services to mobile and fixed network operators.