– Jim Jenkins, Sales Manager with GreenBytes (www.getgreenbytes.com), says:
Why are high speed inline compression and data deduplication useful in today’s enterprise data centers?
Today’s enterprise data centers are experiencing exponential data growth with limited budgets and staff. The cost of storage has become a major component of IT budgets, often growing faster and less predictably than other areas.
Inline compression and data deduplication allow data to be stored using less physical hardware than would otherwise be required. Depending on the application and data being stored, it is possible to reduce the storage requirement by 50% to 99% or more. Performing compression and deduplication in real time, with little or no negative impact on I/O performance, can significantly reduce the cost of storage.
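To put the 50% to 99% range in concrete terms, the short sketch below converts a reduction percentage into the physical capacity required and the equivalent reduction ratio. The function names and the 100 TB figure are illustrative assumptions, not GreenBytes sizing data.

```python
def physical_capacity_needed(logical_tb, reduction_percent):
    """Physical capacity (TB) left after removing reduction_percent of the data."""
    return logical_tb * (100 - reduction_percent) / 100


def reduction_ratio(reduction_percent):
    """Express a percentage reduction as an N:1 ratio (50% -> 2:1, 99% -> 100:1)."""
    return 100 / (100 - reduction_percent)


# Hypothetical 100 TB of logical data at both ends of the quoted range.
for percent in (50, 99):
    needed = physical_capacity_needed(100, percent)
    ratio = reduction_ratio(percent)
    print(f"{percent}% reduction: {needed} TB physical, {ratio}:1")
```

The jump from 50% to 99% is larger than it looks: the first halves the hardware, while the second cuts it by a factor of one hundred.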
Description: High speed inline compression and data deduplication.
In order to describe high speed inline compression and data deduplication, it is beneficial to understand how inline deduplication is done.
With ‘inline’ deduplication, data is deduplicated before it is stored on or read from disk. The process involves breaking data into small pieces called blocks. A database is created and managed in real time to allow storing each unique data block only once. The size of this deduplication database grows proportionally to the amount and type of data stored. Given terabytes of data stored, the number of unique blocks can total in the hundreds of millions or billions.
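The process described above can be sketched in a few lines. This is a minimal illustration only, not GreenBytes' implementation: the fixed 4 KB block size, the SHA-256 fingerprint, the in-memory dictionary standing in for the deduplication database, and the `DedupStore` class name are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


class DedupStore:
    """Toy inline deduplication: each unique block is stored only once."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.index = {}   # stand-in for the deduplication database: fingerprint -> block position
        self.blocks = []  # stand-in for the disk: unique blocks only

    def write(self, data):
        """Deduplicate data before storing it; return the list of block positions."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).digest()
            position = self.index.get(fingerprint)
            if position is None:
                # First time this block is seen: store it and record it in the index.
                position = len(self.blocks)
                self.blocks.append(block)
                self.index[fingerprint] = position
            recipe.append(position)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its block positions."""
        return b"".join(self.blocks[position] for position in recipe)


store = DedupStore()
payload = b"A" * 8192 + b"B" * 4096 + b"A" * 4096  # four blocks, two of them unique
recipe = store.write(payload)
print(len(recipe), "logical blocks,", len(store.blocks), "stored blocks")
```

Every write requires an index lookup, which is why the speed of the storage holding that index matters so much once it contains hundreds of millions of entries.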
First generation inline data deduplication systems used very expensive DRAM as a cache to speed up the I/O intensive process of managing the deduplication database. The DRAM cache could hold only a small portion of the database at a time, with the majority stored on much slower hard disk. These systems provided good data deduplication but, because of their reduced I/O performance, were limited primarily to backup applications.
Second generation data deduplication systems attempted to increase backup performance by delaying the deduplication until after a backup was completed. These ‘post process’ deduplication systems initially write data to hard disk in a native or undeduplicated state, then later move the data to a ‘retention’ area where it is deduplicated. While this approach offers some speed improvement, it requires twice the hard disk capacity and is only viable for backup applications.
GreenBytes' high speed inline deduplication design is effectively a third generation design that resolves the performance and architectural problems of older technologies. Like the first generation approach, all data is deduplicated prior to writing it to hard disk. The difference is that rather than using DRAM and hard disk as a paged cache, GreenBytes systems can hold the entire deduplication database in high speed solid state disk (SSD). The resulting system provides near line speed I/O performance with compression and/or data deduplication enabled, making it a versatile storage system that combines excellent I/O performance with the efficiency of compression and data deduplication.
Benefit for data center/IT managers:
GreenBytes fits into a number of applications within a data center:
- Disk Backup with Deduplication: GreenBytes can provide data reduction equivalent to alternative disk backup products at a lower price and with enhanced features and performance. GreenBytes is certified with Symantec OpenStorage (OST), which provides enhanced operation in Backup Exec and NetBackup environments.
- Archive Applications: Archiving seldom used data from expensive Tier 1 storage to GreenBytes frees up the Tier 1 storage and stores this data efficiently and cost effectively due to the ability to compress and deduplicate the data.
- Virtualized Environments: Virtual server or desktop applications typically involve large amounts of replicated data sets with large I/O requirements. The GreenBytes SSD cache is particularly effective in these applications because it reduces the performance demands placed on hard disk. GreenBytes is an excellent solution for these applications.
- Unstructured Data Storage / User Directories: GreenBytes is particularly well suited to efficiently storing unstructured user data. GreenBytes supports both NAS and iSCSI SAN protocols, which can be used as needed for this type of data.
Bottom Line:
Data center managers need to understand that data deduplication isn’t just for backup anymore. GreenBytes makes inline compression and data deduplication viable for use in mainstream primary storage applications. GreenBytes is happy to discuss a user’s unique requirements and explain how our technology will offer benefits.