Basho Technologies


– Frank Wu, Regional Account Manager at Basho Technologies, says:

Enterprise IT is a funny space. I fell into the industry years ago and spent a good number of years at a Fortune 500 pitching traditional monolithic datacenter architectures to other Fortune 500s.

My organization initially entered the United States by acquiring an IBM mainframe plug-compatible company in the 1970s, and that “mainframe heritage” remained embedded in its DNA. We sold “trusted, proven” hardware platforms: large UNIX servers and even larger Storage Area Networks (SANs).

What did it mean to be “trusted & proven” though? Typically, it meant a highly reliable system, one less prone to failure, with mission-critical RAS (reliability, availability, serviceability) capabilities developed over many years, including:

  • ECC protection, instruction-level retry, register protection, cache degradation
  • Predictive self-healing software in UNIX operating systems
  • Hardware redundancy – hot swappable processors, memory, I/O cards, & drives

This type of hardware reliability did not come at a small price. Companies such as IBM, EMC, Fujitsu, and others have spent decades and billions of R&D dollars building redundant architectures and incredibly reliable machines, then marketing them as trusted and proven, especially for databases that reside on a single machine.

Storage Area Networks really took off with the introduction of server virtualization, and they helped drive hardware commoditization in the server space: even though processors kept getting faster, small Intel x86 1U/2U servers weren’t necessarily getting more reliable.

Server virtualization worked around these hardware failures by pooling compute resources from many “servers” into a virtualized pool, but it then put all the data in a single storage repository. Trusted and proven, remember?

These storage devices inherited mainframe-like reliability, backed by a nationwide army of technicians on standby, ready to show up at a moment’s notice and swap out a failed DIMM or hard drive. Even with all that reliable technology built in, enterprises still demand 24x7x4 SLAs.

For those who live in this world, however, the reality is much less rosy. In reality, things break, and when they do, they break hard. When these machines fail, despite the overabundance of precautions noted above, superman firefighters are called in at $500-an-hour fees: database, storage, network, virtualization, and whatever other highly paid domain consultants the situation demands.

The premise of distributed systems is fundamentally different: they acknowledge from the outset that things fail.


Rather than build reliability into a single machine, why not have multiple machines, and with them multiple copies of the data? In a true distributed peer-to-peer environment, where there are no master-slave replication relationships, applications always have a node ready to accept writes and serve reads.

If a node breaks, and nodes are often only temporarily unreachable due to network partitions, other nodes step in to take the write, then hand the data back once the node comes back alive. [1]
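This recovery pattern is commonly called hinted handoff. The sketch below illustrates the idea under simplified assumptions: a single replica per key, a toy hash-based placement scheme, and illustrative class names (`Cluster`, `Node`) that are not Riak’s actual API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}   # key -> value owned by this node
        self.hints = {}  # target node name -> [(key, value)] held on its behalf


class Cluster:
    def __init__(self, names):
        self.nodes = {n: Node(n) for n in names}

    def _owner(self, key):
        # Toy placement: hash the key onto the sorted node list.
        names = sorted(self.nodes)
        return self.nodes[names[hash(key) % len(names)]]

    def write(self, key, value):
        owner = self._owner(key)
        if owner.alive:
            owner.data[key] = value
        else:
            # Owner is unreachable: any live peer accepts the write
            # and stores a "hint" so it can replay the write later.
            fallback = next(n for n in self.nodes.values()
                            if n.alive and n is not owner)
            fallback.hints.setdefault(owner.name, []).append((key, value))

    def recover(self, name):
        # When a node rejoins, peers hand their hinted writes back to it.
        node = self.nodes[name]
        node.alive = True
        for peer in self.nodes.values():
            for key, value in peer.hints.pop(name, []):
                node.data[key] = value
```

A real system layers quorums, multiple replicas, and conflict resolution on top of this, but the core bookkeeping, takes the write, remembers whom it was for, replays it on recovery, is as simple as shown.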

Hardware sizing for an individual peer-to-peer node still remains a concern, whether on-premise or in the cloud, but it becomes a great deal simpler when you only deal with distributed nodes versus compute servers, hypervisors, switches, SANs, file systems, databases, and then your application. Operationally, the TCO pays for itself once you account for the cost of those superman-firefighter consultants.

Basho Technologies, with distributed systems in its DNA (an ex-Akamai legacy), works on problems like these and supports new platforms such as Seagate’s Kinetic Open Storage Vision. [2]

Basho has also created other major products, including Riak, a distributed NoSQL key-value database that has seen production deployments at multiple Fortune 500 enterprises, including Best Buy, Comcast, The Weather Channel, State Farm, TBS, and others.

Amazon designed distributed systems [3] so that customers could always add Kindles to their shopping carts in an increasingly 24×7 world, and because datacenter hardware at scale eventually always breaks. These technologies allow enterprises to stop worrying about hardware failures and enable 24×7 application availability.

Even though NoSQL enables faster development, the high-availability piece sometimes gets lost in the shuffle. It is the characteristic that enabled the United Kingdom’s National Health Service (NHS) to deploy a single distributed database holding patient records for all 80 million citizens without constantly worrying about hardware failures. [4]

Relational databases and large scale-up machines are not going anywhere, but the world is increasingly becoming distributed, and with it comes polyglot persistence in the database world. IBM continues to post double-digit growth in mainframe sales [5] (and with it, margin), but distributed systems are here to stay.

And hey, who is to say that decades from now, or even sooner, distributed systems won’t also be known as trusted, proven platforms? Considering that one doesn’t have to worry about hardware failures in this world, that’s not a bad thing to be thankful for this Thanksgiving.

  1. “The Network is Reliable”
  2. “The Seagate Kinetic Open Storage Vision”
  4. “NHS to benefit from agile development [and more] with selection of open source database Riak for Spine2”
  5. “IBM posts 17% drop in profit despite mainframe sales increasing”

About the Author

Frank Wu, Basho Technologies, Regional Account Manager-NYC

Frank is a recovering datacenter hardware geek and still enjoys a clean internal server system layout, especially with regard to energy efficiency. After a previous life holding multiple sales and marketing positions at Fujitsu, he is now at Basho Technologies and believes he can hang with the cool open-source, distributed crowd after many years of scale-up architectures.