Businesses need data to fuel a wide range of new AI projects and services. On-premises object storage offers scalability, flexibility, and economy-of-scale for AI data, and provides bullet-proof cyber-resiliency for both neocloud providers and large enterprises.
Data center construction is constantly in the news as mega-developments get underway to support AI compute requirements. This is fueling the rise of a new breed of cloud providers for AI, termed neoclouds, and it amounts to a new land rush. McKinsey estimates that by 2030, almost $7 trillion in global investment will be made in data center capacity, including $5.2 trillion to support AI processing and $1.5 trillion for traditional IT applications.
Many businesses are developing AI applications while simultaneously trying to formulate an AI data storage strategy that makes sense from the perspective of cost, future business needs, scalability, and compliance requirements. And as McKinsey notes, the fluid nature of AI-related growth means businesses cannot be certain whether they have over-invested or under-invested in AI infrastructure.
This prompts a question: where will businesses store all this data so that they remain in control of their AI destiny?
AI Data Storage Choices
All types of AI applications need to collect, aggregate and prepare data before model training begins, and different classes of AI applications place differing demands on data storage in terms of type, volume and performance:
- Multimodal AI: spans computer vision, robotics, and sensor fusion, processing high-resolution images, video, LiDAR, and radar inputs. Training datasets can easily reach multi-petabyte scale, with data storage requirements for data preparation reaching into hundreds of petabytes and beyond.
- Generative AI: encompasses models that produce text, images, or multimodal outputs. Training or fine-tuning datasets are typically hundreds of gigabytes to hundreds of terabytes, but preparing them can require petabytes of aggregated data.
- Agentic AI: orchestrates multiple reasoning steps and often chains together several LLM calls while integrating external tools and APIs. These workloads keep relatively small hot datasets but can entail very large archives of logs and context data.
- Classical machine learning and predictive analytics: works mainly on structured or tabular data, usually in the gigabyte-to-low-terabyte range, but an enterprise running many of these workloads will see its aggregate storage requirements grow accordingly.
Enterprises today most commonly store this data in public cloud storage or on-premises NAS or SAN, each offering different tradeoffs. Public cloud storage offers flexibility as its key advantage. However, public clouds can have greater exposure to issues of data privacy, security, and sovereignty, since they are shared infrastructures with risks of data leakage. Using a public cloud hyperscaler also risks sensitive data crossing borders, creating a sovereignty and compliance liability that must be carefully considered in this age of new regulations. Finally, there are the well-known issues of higher cost, especially surprise fees for data egress.
In contrast to public cloud storage, legacy NAS and SAN systems were designed in a time before AI (and for many, even before the rise of cloud computing). For these new workloads, they can be limiting in scale for both capacity and performance, as well as inflexible in dealing with the new dynamics of AI applications. This gives rise to the well-known “storage silos” problem in the enterprise, which creates IT management cost burdens.
Businesses today need to consider using storage architectures that are specifically designed to manage the scale and unpredictability of AI data requirements.
Neoclouds And Large Enterprises: Who Will Need Modern AI Storage Solutions?
The largest enterprises (Fortune 500 and Global 2000 corporations) are all rapidly planning, developing and deploying their new generation of AI-driven services. For many of these organizations, the decision to keep data in-house will be driven by a desire for the best possible control, security and cost efficiency in their infrastructure. GPU resources are an extremely costly investment for enterprises, but once deployed they become a critical resource for rapid development of new services. Given the performance demands of AI workloads, keeping storage in close proximity to compute and GPU resources will be a driver for on-premises storage.
For the many other enterprises that will not want to invest in costly AI infrastructure, a new breed of service providers termed neoclouds now offers AI-centric infrastructure. Neoclouds provide GPUs and TPUs (graphics processing units and tensor processing units) to facilitate model training, with traditional cloud-style pay-as-you-go pricing. As service providers, neoclouds will aggregate hundreds to thousands of AI use cases and workloads. Given the large-scale data storage requirements described above, it is clear that neocloud providers will need modern storage infrastructures for AI data.
The Best AI Storage Solution: On-Premises Object Storage
The popularity of public cloud object storage such as AWS S3 attests to the power of object storage for solving the challenges of large-scale data. For emerging AI workloads, on-premises object storage delivers the scalability advantages of public cloud storage while answering the question: “How can I keep the agility of public cloud storage, but gain better control, security and costs?” The answer lies in on-premises object storage designed for the unique challenges of the AI data pipeline.
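To illustrate the point about keeping cloud-style agility while controlling data placement, the minimal sketch below points a standard S3 client (boto3) at an S3-compatible on-premises endpoint instead of a public cloud region. The endpoint URL, credentials, bucket name and file names are hypothetical placeholders, not the configuration of any specific product.

```python
import boto3

# Standard S3 client aimed at a hypothetical on-premises, S3-compatible endpoint.
# Endpoint, credentials and names below are placeholders for illustration only.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.internal.example.com",
    aws_access_key_id="LOCAL_ACCESS_KEY",
    aws_secret_access_key="LOCAL_SECRET_KEY",
)

bucket = "ai-training-data"
s3.create_bucket(Bucket=bucket)

# Land a raw dataset file for the AI data-preparation pipeline.
s3.upload_file(
    "sensor_capture_0001.parquet", bucket, "raw/sensor_capture_0001.parquet"
)

# List what has arrived under the raw/ prefix.
for obj in s3.list_objects_v2(Bucket=bucket, Prefix="raw/").get("Contents", []):
    print(obj["Key"], obj["Size"])
```

Because the API is the same S3 interface used in the public cloud, existing data-pipeline tooling can typically be redirected to an on-premises store by changing only the endpoint and credentials.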
On-premises object storage offers improved data privacy, security, and sovereignty. Regulated industries benefit from compliance with standards including HIPAA, GDPR and FedRAMP. In enterprises, on-premises object storage operates behind the corporate firewall, adding another level of security. It also supports sovereignty policies by enabling explicit control over where data is stored.
Object storage has an added data protection benefit due to its intrinsic immutability, and vendors are now supplementing this with multi-level cyber-resiliency capabilities that protect at the API, network, storage, admin and operating system levels. For example, vendors offer AWS-compatible object locking for data immutability, enabling IT to store data and prevent it from being inadvertently or maliciously deleted or overwritten. IT staff can also apply governance or compliance retention modes to locked objects, controlling who can override retention settings or delete data. This makes cyber-resilient object storage even more suitable as infrastructure for AI data.
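As a sketch of how AWS-compatible object locking is typically used, the example below creates a lock-enabled bucket, writes an object under COMPLIANCE-mode retention, and applies GOVERNANCE-mode retention to another object using the standard S3 Object Lock calls in boto3. The bucket, keys, dates and endpoint are illustrative assumptions; exact capabilities and defaults vary by vendor.

```python
from datetime import datetime, timezone
import boto3

# Hypothetical on-premises, S3-compatible endpoint (credentials from the environment).
s3 = boto3.client("s3", endpoint_url="https://objectstore.internal.example.com")

bucket = "ai-model-archive"

# Object Lock must be enabled at bucket-creation time.
s3.create_bucket(Bucket=bucket, ObjectLockEnabledForBucket=True)

# COMPLIANCE mode: the object cannot be deleted or overwritten by anyone,
# including administrators, until the retention date passes.
with open("model-v1.bin", "rb") as f:
    s3.put_object(
        Bucket=bucket,
        Key="checkpoints/model-v1.bin",
        Body=f,
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=datetime(2027, 1, 1, tzinfo=timezone.utc),
    )

# GOVERNANCE mode: retention can still be overridden, but only by users
# granted the s3:BypassGovernanceRetention permission.
s3.put_object_retention(
    Bucket=bucket,
    Key="datasets/train-manifest.json",
    Retention={
        "Mode": "GOVERNANCE",
        "RetainUntilDate": datetime(2026, 6, 30, tzinfo=timezone.utc),
    },
)
```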
Refinements in object storage, notably multi-dimensional scaling (MDS) and disaggregated architecture, add more flexibility and budget control.
Built on S3 object storage architecture, MDS solves a number of challenges businesses face in scaling to support compute performance, metadata handling and transaction capacity.
AI, machine learning and the increasing volume of cloud workloads intensify the need for a more cyber-resilient architecture that can provide scalability without relinquishing performance. MDS enables a business to run diverse applications across many concurrent workloads, regardless of application or cloud storage location. If needed, MDS can facilitate exabyte-level scaling, adding servers or data centers to accommodate more volume while remaining online to avoid service disruption.
Disaggregated architecture is a budget saver, further controlling storage costs by untying, or disaggregating, compute performance from storage scaling. Businesses can add performance without adding new, costly storage servers, an evolutionary advance over legacy storage systems, which cannot scale performance without adding storage. By placing services onto dedicated servers and resources, businesses can scale to boost the metadata service for higher operations per second, for example, without incurring capacity expense.
The ability to disaggregate services onto dedicated servers is particularly beneficial for workloads that handle large objects, such as medical imaging and AI datasets. These workloads may not require more storage capacity, but rather more system resources to avoid straining the existing deployment.
Evaluating AI’s Storage Future
Businesses examining their strategic plan for AI data use and storage have the option of committing to a public cloud hyperscaler provider or looking at object storage as a means of preparing for more AI volume, with cost control as a major factor.
It is an unusually difficult time to predict AI demand with any precision when weighing investment options in compute power and storage. As McKinsey notes, supply chain constraints such as chip availability, regulatory hurdles, and fluctuating tariffs and export controls all impact AI growth and investment.
Precision may not be possible. However, projecting present and future use-case needs, and then evaluating whether a public hyperscale cloud or the flexibility, control and security of on-premises object storage is the right strategic choice, is a good place to start.
# # #
About the Author
Paul Speciale is a data storage and cloud industry veteran, with over 20 years of experience with small and large companies. Paul is currently the Chief Technology Evangelist and CMO for Scality, leading the team across activities ranging from building awareness to content development and lead generation, as well as being a spokesperson for the company.