Why the Big Three Can’t Win the Hybrid Multi-cloud Game

The point of data is insight, and the point of insight is progress: lower costs, higher margins, more delighted customers, and in some cases, life-saving responses to emergent threats. What distinguishes a good data strategy from a bad data strategy is, to a first approximation, time to insight. How long does it take an enterprise to get to the right answer, strategy, or policy derived from data analysis? How responsive can the enterprise be to emergent threats, disruptions, or opportunities?

Because data is the lifeblood of the modern enterprise, it is, at worst, the second most strategic asset of every enterprise, and arguably even more important than human capital. To date, modern data strategy has focused mainly on data volume, data velocity, and, to a much lesser degree, data variety. But we're now in a hybrid, multi-cloud era where the proliferation of data environments (that is, growth in the sheer number of public and private clouds) is the new, fierce driver of complexity. In the hybrid multi-cloud world, data management strategy is increasingly indistinguishable from enterprise growth strategy itself. While the Big Three (AWS, Azure, and GCP) are absolutely critical to the functioning of this new world, it's my contention that a full solution to the unique challenges of hybrid, multi-cloud data strategy will have to come from outside the Big Three.

The Game Within the Game

How do we integrate data in the hybrid multi-cloud world? Unfortunately, we do it the same way we've always done it: by moving and copying data between storage systems, from data lake to data warehouse, between databases and cloud apps and APIs, and so on. ETL jobs as far as the eye can see! Whatever the method, we can safely assume that modern data integration needs to operate equally well on on-prem and cloud-hosted data assets, since 85% of all businesses, regardless of size, have data assets in multiple clouds. Even small businesses may have data hosted in Amazon S3, say, and sales data in Salesforce.

But wait, why does on-prem data matter at all since everything is moving to the cloud anyway, and isn't all cloud-hosted data already integrated? Let's take the second question first: the cloud is just someone else's data center and there was never any magical fairy dust that could be sprinkled on all the data in a data center to make it integrated, and there isn't any in the cloud either. Physical co-location of data does not mean that data is integrated and ready for analysis. What about the first question? Well, no, actually, everything isn't moving to the cloud, and even if it were, which cloud is everything moving to? There are, after all, several clouds. Increasingly, however, people realize that even if most data moves to the multi-cloud, what matters isn't so much "where is most of our data" but, rather, "in how many different places do we have data?"

Further, there is growing concern over consolidation and Big Three vendor lock-in, so much so that "data repatriation" (bringing data back home from the cloud to on-prem) is a growing trend. Consider, for example, why the Big Three's egress and ingress fees are radically asymmetric when their underlying costs are not. It's a trick question: the Big Three don't charge ingress fees because once they have your enterprise data, they don't ever intend to let it go. Not really.

And all of this is to say nothing of industry-specific concerns; if I were the CIO of a large retail enterprise, I would be very skeptical of AWS since I'm not sure I want my competitor hosting my data. Finally, there's concern over costs, even absent worries over the power of lock-in. While everyone loves shifting CAPEX to OPEX, no amount of balance-sheet maneuvering can offset endless cost growth.

So where are we so far?

  • Data has to be integrated across the hybrid, multi-cloud world to fuel better analytics.
  • That includes data in any of several clouds as well as data still on-prem, which may never go to the cloud.
  • The number of places for data to live is increasing, not decreasing, which makes data integration harder, not easier.
  • The future will look almost exactly like the present, only more so, for several decades: data in many different environments simultaneously and increasing requirements to integrate or connect that data.
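
One rough way to see why more environments mean harder integration is to count copy pipelines. The sketch below assumes, purely for illustration, that every environment must be able to feed every other one through its own ETL job; the function names are hypothetical and the model ignores real-world details such as shared tooling.

```python
def point_to_point_pipelines(environments: int) -> int:
    """Directed copy pipelines needed if every environment feeds every other one.

    Illustrative assumption: one ETL job per ordered pair of environments.
    """
    return environments * (environments - 1)


def connectors_to_shared_layer(environments: int) -> int:
    """Connectors needed if each environment is attached once to a shared layer."""
    return environments


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(
            f"{n} environments: "
            f"{point_to_point_pipelines(n)} pairwise pipelines vs "
            f"{connectors_to_shared_layer(n)} connectors"
        )
```

Under that assumption the pairwise count grows quadratically with the number of environments, while the per-environment connector count grows only linearly.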

Vertically Integrated Stacks in a Best-of-Breed Era

We're already in a good place to understand why the Big Three cannot win the hybrid multi-cloud data integration game. The answer is that the Big Three are all operating cloud-native, vertically integrated stacks in a best-of-breed era. That means it will never be in any of the Big Three's self-interest to integrate with data that lives in a competitor's environment, yet doing exactly that is the key challenge of data integration in the hybrid multi-cloud. There is a fundamental disconnect between the interest of enterprise customers in connecting data across the hybrid multi-cloud and the interest of the Big Three in capturing and retaining data assets within, but not across, their vertically integrated stacks.

That disconnect is a big problem, and it's the real reason the Big Three can't win, but let's unpack it a bit more. Since the modern data stack is very complex, let's just consider the three core parts:

1. Where is data stored?

2. Where is data governed and cataloged?

3. Where is data analyzed?

To get insight from data, you need to store data; you also need to govern and catalog that data; and finally, you need to bring that data together in some way to connect it so that you can analyze it to generate insight. Of course, you need to do other things, too, but we already have enough detail to make the main point.

So the Big Three have vertically integrated stacks covering the core trifecta of storage, governance, and analytics, and they have massive data centers. So, everything is great, right? Well, sure, everything is great if all of your data is also in (or can be made to reside within) the storage layer of one of the Big Three. But remember, nearly every company has data in many data environments, including on-prem.

And here again is the crux of the competitive dilemma for the Big Three, which is made up of economic, technical, and regulatory hurdles. As long as there is a multi-cloud, all enterprises will have data in many places. The Big Three are incentivized to lock those data assets away; hence, the asymmetry of egress and ingress, among other tactics. But the higher up the stack we look, the more we see a growing need to span the multi-cloud rather than merely consolidate it in the storage layer. Customers need to connect data no matter where it's stored. The Big Three are economically and technically incentivized to prevent that from happening. In fact, we don't have to look very high in the stack at all to see the need for multi-cloud integration; no, we have a multi-cloud data integration problem at the storage-governance-analytics core of the stack.

Driven largely by data spread across the multi-cloud, by the desire to innovate, and by the wish to avoid some of the cloud lock-in costs and threats I mentioned earlier, most enterprises have pursued a best-of-breed IT procurement strategy. The modern CIO wants to pick the best storage solution, the best governance or catalog solution, and the best analytics solution. For example, consider an enterprise that's standardized on Databricks (or Snowflake) for storage; Collibra (or Alation) for governance and cataloging; and, finally, Tableau (or Power BI) for analytics.

For the modern CIO or CDO, the two horns of the dilemma look like this: either (1) adopt a vertically integrated stack from the Big Three and run the risks described above, including, critically, ignoring the data you've got in other environments; or (2) adopt a best-of-breed solution to the storage-governance-analytics core and carry the integration burden yourself. This dilemma is resolvable, but there's no such thing as a free lunch, here or anywhere else.

For economic and regulatory reasons, we can't expect consolidation or acquisition at this level; none of the Big Three or even the Big Five is going to acquire any of the others. That's unthinkable and would face massive regulatory hurdles, even if the economics made sense. While the Big Three are working on "bastion host" technologies to extend their cloud environments into the on-prem environments of their customers, which is a welcome trend, try to imagine them extending those technologies to each other. It's difficult to imagine a world in which AWS's stack is available in GCP or Azure, or vice versa.

Where Do We Go From Here?

The only tractable solution to the dilemma presented earlier is to embrace the best-of-breed horn and solve for the integration of storage-governance-analytics directly. That leaves a clear and obvious opening for a data management player outside the Big Three, one without a vertically integrated stack that it must avoid disrupting, to unseat the Big Three and satisfy the unmet demand for hybrid multi-cloud data integration solutions. That demand exists because this level of data connectedness is absolutely required to generate the kinds of data-driven insights necessary to be competitive in the knowledge economy.

The key insight here is that we can avoid the limitations of full-stack solutions only by moving to a data integration solution that:

1. moves data integration from storage to compute, that is, fully connects data without moving or copying it first

2. may be operated both on-prem and in any cloud simultaneously
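
To make the first requirement concrete, here is a minimal sketch of integration done in compute rather than in storage. It is an illustration under stated assumptions, not a description of any vendor's product: the `Source` class, the `federated_join` function, and the sample records are all hypothetical. Each source is read in place at query time, and the joined result is produced on demand instead of being written to a central store by an ETL job first.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


@dataclass
class Source:
    """A hypothetical data environment (cloud bucket, on-prem database, SaaS API)."""
    name: str
    scan: Callable[[], Iterable[Row]]  # reads rows where they live, on demand


def federated_join(left: Source, right: Source, key: str) -> Iterator[Row]:
    """Join two sources at query time; nothing is persisted in a central store."""
    # Build a transient, in-memory index over the right-hand source.
    index: Dict[object, List[Row]] = {}
    for row in right.scan():
        index.setdefault(row[key], []).append(row)
    # Stream the left-hand source and emit merged rows lazily.
    for row in left.scan():
        for match in index.get(row[key], []):
            yield {**match, **row}


# Stand-ins for an on-prem order system and a cloud-hosted CRM (made-up data).
on_prem_orders = Source(
    name="on_prem_orders",
    scan=lambda: [
        {"customer_id": 1, "order_total": 250.0},
        {"customer_id": 2, "order_total": 75.5},
        {"customer_id": 3, "order_total": 12.0},
    ],
)
cloud_crm = Source(
    name="cloud_crm",
    scan=lambda: [
        {"customer_id": 1, "segment": "enterprise"},
        {"customer_id": 2, "segment": "smb"},
    ],
)

results = list(federated_join(on_prem_orders, cloud_crm, "customer_id"))
for row in results:
    print(row)
```

Because `scan` is just a callable, the same join logic could in principle run next to data on-prem or in any cloud, which is the spirit of the second requirement. A real system would also need to push filters down to each source, handle authentication, and cope with sources too large to index in memory.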

The Big Three will not self-disrupt and build such a solution themselves; hence, if the problem is going to be solved, it will be solved by a vendor, startup, or coalition of vendors outside the Big Three. Physics and basic economics suggest that the winner of the data integration game for the next 20 years will have to be some market player other than AWS, Azure, or GCP. In other words, the Big Three cannot win this game.

About the Author:

Kendall Clark is the founder and CEO of Stardog, the leading Enterprise Knowledge Graph (EKG) platform provider. For more information visit www.stardog.com or follow them @StardogHQ.

Analytics Insight
www.analyticsinsight.net