– Daniel Hardman, software architect at Adaptive Computing, says:
Adaptive Computing’s announcement about Big Workflow claims that we need a fundamentally new approach to compute problems at the nexus of Big Data, Cloud, and HPC. Why? IT’s already made huge progress since the days of rack-and-stack datacenters. This is the age of software-defined everything. Hadoop’s picking up steam. HPC folks are exploring cloudbursting. So what’s the problem?
To explain, let me tell you a little story…
An Ordinary Day in the Life of a CIO
Suppose you’re the CIO of Acme Corp, manufacturer of thousands of varieties of widgets. One day the Chief Marketing Officer walks in and announces that she needs some guidance on exactly how to craft the marketing campaign for Widget9000, the latest cool gadget that’s about to roll out of R&D. She says she has data to help her make decisions, but she needs your computing skills and resources to crunch it.
“Great!” you say. “What kind of data are we talking about?”
“Well,” the CMO replies, “I bought access to a corpus of tweets so we can do sentiment analysis and segmentation. It’s about 500 billion tweets, growing by another billion each day. Right now we can access it over a web services API that Twitter exposes—or we can have them ship us hard drives by courier, if we’re willing to wait a couple of weeks. I need an answer in 60 days, so you’ll have to tell me whether that’s a good tradeoff. We have to correlate the tweets with the corpus of advertising data from DoubleClick that we already have sitting on the Hadoop cluster at our Austin datacenter, plus some real-time financial feeds; those cover pricing and competitor stock prices and dump about 1 TB per day onto our S3 storage cloud. We’ll use all this stuff to generate scenarios, and do some serious number crunching of risk/reward profiles on our HPC cluster that’s also in Austin.”
“Sounds like a challenging project,” you say. You’re already imagining the low-priority tasks that you can kill to free up a data scientist, the sorts of IT tickets your team will need to handle, the network pipes you’ll need to upgrade, and the extra servers you’ll need to buy. Good thing the CMO has her own budget…
Take a deep breath…
Three weeks later, the team and technology built around the CMO’s project are running at breakneck speed. It’s going to be tight, but you think you can have answers before Widget9000’s launch is finalized.
And then the CEO walks in, looking pale.
“We’re being targeted for a hostile takeover,” he says. “Our stock price is going crazy. I need you to drop everything and do some hard-core Monte Carlo analysis of a whole range of portfolio decisions. I really need answers today, but I know that’s unrealistic. How about end of week?”
You get a sinking feeling… You know the Monte Carlo simulations the CEO has in mind; he ran something similar before last year’s merger. Then, he gave you a month. Doing the same thing in three days is impossible with the HPC cluster you have. Cloudbursting might help, but that takes planning and human capital to pull off, and the team’s already working 20-hour days. You happen to know that the CMO’s project is not working the Hadoop cluster all that hard, but there’s no easy way to repurpose it for a few days. A wild idea about buying cycles on NCSA’s for-rent HPC cluster pops into your head, but you’re not sure whether data-transfer constraints would allow it. And you know that the aggressor in the hostile takeover bid uses that same cluster…
While you’re stuttering, one of your sysadmins runs in and makes a breathless announcement. “Amazon’s having a major disruption on the east coast. Data access speeds for the financial feeds have dropped through the floor. They’ll probably be browned out for the rest of the day.”
Where Traditional IT Projects Are Challenged
Step back from the details of this story for a minute, and look at the big picture.
The fictional CIO has lots of unsurprising challenges—but two phenomena make them particularly overwhelming: big data and shifting priorities. Big data introduces complex constraints on what is practical, where computation has to happen, and the who and when of process flow. Shifting priorities mean that choices about what’s optimal have to be reanalyzed without warning, with a realistic assessment of tradeoffs, risk, and quantifiable impact to schedule and budgets.
The way that IT has traditionally approached problems is to formalize, process-ize, and project-ize them. This tends to lead to more-or-less static thinking, and to technological silos (notice how the CIO has hand-built a technology stack and a human team around the CMO’s specific problem). When I was an employee at Symantec, IT had a mandate to deploy single sign-on across all enterprise products. The rollout took years. It had multiple phases, pilot groups, tiered and sandboxed budgets, designated hardware and software purchases, and so forth. You were either using SSO, or you weren’t. Authentication was handled either by an SSO provider or by a legacy system. The expertise and management lifecycles for each were different, and there was no casual intermingling of boundaries or resources.
In our story, the fictional CIO still has the static, semi-permanent mindset, although to his credit he’s aware of new possibilities and is beginning to think outside the box.
Big Workflow Thinks About Flux Differently
What he’s missing is a “Big Workflow” view of the world.
When you see the problem through a big workflow lens, disruptions are not anomalies—they’re expected and manageable events. Sure, you still stand up email servers and CRM in more-or-less permanent fashion—but you also see flux as a normal and happy part of your business imperatives; it’s an opportunity for competitive advantage, not a corner case to be shunned. Your mindset recognizes silo boundaries, but assumes they are at least somewhat collapsible. Each problem is an organic whole, and tradeoffs are susceptible to automation and policy. The complexity of ad-hoc, reactive firefighting is tamed.
In other words, you have big workflows, not big logjams.
Making It Real
What makes this possible is an astute combination of the following technologies:
- The knowledge that today resides in a data scientist’s head (stages in a data set’s lifecycle, constraints on movement, cost to duplicate or re-create, regulatory and privacy constraints, sequencing, and cost-to-compute) is formally modeled and thus amenable to software management.
- A robust, policy-based optimizer that can arbitrate between competing priorities makes minute-by-minute rethinking a tractable problem.
- Familiarity with all forms of virtualization and software-defined everything is a key enabler. Battle scars from building and optimizing clouds are a huge deal.
- Hadoop, HPC, and image-rendering farms are all seen as variations on a theme. The ability to speak their languages natively, and to pass workloads across their boundaries, makes these silos less problematic.
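To make the first two ideas concrete, here’s a minimal Python sketch of what formally modeled data-set constraints and a policy-based arbiter might look like. Everything here (the `DataSet` and `Job` types, the greedy placement policy) is a hypothetical illustration I’ve invented for this post, not anything Adaptive Computing ships:

```python
from dataclasses import dataclass

# Hypothetical model: capture the knowledge that usually lives in a data
# scientist's head as explicit, machine-readable constraints.

@dataclass
class DataSet:
    name: str
    size_tb: float
    location: str               # site where the data currently lives
    movable: bool               # False if regulation/privacy pins it in place
    transfer_tb_per_day: float  # effective bandwidth to other sites

@dataclass
class Job:
    name: str
    priority: int        # higher wins when jobs compete for resources
    needs: list          # names of the DataSets this job must read
    deadline_days: float

def feasible(job, datasets, site):
    """A job can run at `site` only if every input either already lives
    there or can legally be moved there before the deadline."""
    for name in job.needs:
        ds = datasets[name]
        if ds.location == site:
            continue
        if not ds.movable:
            return False
        if ds.size_tb / ds.transfer_tb_per_day > job.deadline_days:
            return False
    return True

def arbitrate(jobs, datasets, sites):
    """Toy policy: place the highest-priority jobs first, each on the
    first site where its data constraints can be satisfied."""
    placement = {}
    for job in sorted(jobs, key=lambda j: -j.priority):
        for site in sites:
            if feasible(job, datasets, site):
                placement[job.name] = site
                break
    return placement
```

With the story’s workloads plugged in, the same arbiter that placed the CMO’s sentiment job can re-place it the moment the CEO’s Monte Carlo job arrives with a higher priority; no war-room meeting required. A production optimizer would weigh far more dimensions (cost, power, SLAs), but the point is that once the constraints are modeled, re-deciding is cheap.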