Does Snowflake mean the end of open source?

The cloud-based enterprise data platform may mark the end of a decades-long run in the dominance of open source infrastructure

The Snowflake IPO was a big deal, and not merely because of the company’s enormous valuation.

In 2013 Cloudera co-founder Mike Olson confidently (and accurately) declared “a stunning and irreversible trend in enterprise infrastructure.” That trend? “No dominant platform-level software infrastructure has emerged in the last 10 years in closed-source, proprietary form.” Snowflake, a cloud-based enterprise data platform, may spell the end of that run. 

Sure, we had Splunk, but Splunk squeaked through the hypothesis police before open source had found its feet, as Lightspeed partner Gaurav Gupta told me. MySQL, Apache Hadoop, MongoDB, Apache Spark... all of them (at least initially) open source.

But now... Snowflake. Is Snowflake a snowflake? Or is the era of open source infrastructure coming to a close?

Closing up shop?

In part the answer to that question depends on just how fiercely you’re prepared to defend the underlying assumption. After all, it’s simply not the case that all “dominant platform-level software infrastructure” is open source. This isn’t really to dispute Olson’s central thesis, because it’s absolutely true that the bulk of enterprise infrastructure has trended toward open source over the past 10 to 20 years.

As Gordon Haff puts it, “You can certainly construct a narrative for the infrastructure being heavily driven by open source: Most NoSQL, Hadoop, Kafka, Spark, Ceph, Jupyter, etc. But a lot in the space isn’t as well: lots of cloud services, Tableau, Splunk, etc.” And Snowflake, of course.

Though you’d never guess it from the energetic proselytizing of yesteryear, developers have never been overly religious about open source. The reason for that “stunning” trend is simply that open source made it easier for developers to get their jobs done thanks to high-quality, easily accessible open source data infrastructure. There are, of course, other benefits, such as the communities that often accompany open source projects, coupled with a desire to have more granular control of one’s software stack. But ultimately open source has won because it enables developers to “get ---- done.”

Which is why, for example, you’ll find developers happy to use open source software like Apache Airflow to load data into their proprietary Snowflake data platform. It’s not cognitive dissonance. It’s pragmatism.

The shift to managed services

Speaking of such pragmatism, Tom Barber suggests that the shift to managed cloud services somewhat negates “people’s interest in open source... because with SaaS you’re not paying for licenses but for a service, which changes the thinking somewhat.” After all, he continues, “Open source meant you didn’t pay for licenses but you still had to pay someone internal or external to install it, tune it, run it…. Most people can apt/yum install MySQL but tuning it requires in-depth knowledge.”

Or let’s express that another way, as RedMonk analyst James Governor does: “Cloud is a better distribution and packaging mechanism than open source ever was…. Convenience is the killer app. Managed services win.” Or, as Olson himself suggested to me:

“I still believe that open source software provides strategic advantage. But ‘elimination of friction’ isn’t the differentiator it seemed a decade ago. Smart cloud folks learned that lesson; proprietary infra in the cloud is super easy to acquire and use.”

That’s not to say open source is irrelevant. Far from it. “Open source is not a business model but is a great way to build software, build trust, and foster community,” Governor continues.

That “great way to build software” also applies to SaaS vendors like Snowflake. While services like Snowflake might not be open source, they’re actively using open source under the hood, as Gordon Haff suggests. For example, Snowflake relies on open source FoundationDB as “a key part of our architecture [because it] has allowed us to build some truly amazing and differentiating features.”

A WhiteSource analysis in 2019 found that 99 percent of software includes open source. Snowflake, in this respect, is no snowflake.

Open source, in sum, still matters. A lot. 

Open source under the hood

But for would-be buyers of services like Snowflake, open source might not be the primary attraction. As Ken Horn posits, data, not source code, needs to be the “first-class citizen” for something like Snowflake. And “once on cloud, the whole open source software thing is a bit :shrug:.”

It’s not “shrug” for Snowflake and other vendors who may choose to deliver data warehousing and other such services, because open source affords them the chance to build on a rich ecosystem of open source foundational building blocks. But would-be buyers just need “to get ---- done,” and this may mean they don’t want to perform the spade work sometimes associated with open source.

So is Olson’s 2013 declaration wrong? No, but perhaps we can rephrase it: “No dominant platform-level software infrastructure has emerged in the last 20 years that is not either licensed as open source software or heavily dependent upon open source software.”

Copyright © 2020 IDG Communications, Inc.