Why no one migrates from Databricks to Snowflake
The market trend is to migrate from Snowflake to Databricks. Why?
We'll find out in this post, but to set the stage: don't you find it strange that a Google search for "Migrate from Databricks to Snowflake" returns almost nothing? I've found it quite difficult to find information on how to migrate from Databricks to Snowflake, and even harder to find companies that have done it and shared their experience.
Introduction: Databricks' Dominance (aka The New Era)
Most search results are about how to migrate from Snowflake to Databricks (even when searching for the opposite). Databricks' penetration among Snowflake users is clearly on the rise. The following image shows this: it reflects the growth rate of Databricks accounts from 2021 to 2023 (today the rate is likely considerably higher).
It is true that it would be more relevant to analyze actual usage of accounts rather than just their activation. There's also the possibility that in some cases, one product is replacing another. However, in general terms, it becomes evident that Databricks is outpacing Snowflake.
I have only come across one case of a company that migrated from Databricks to Snowflake and claimed to have reduced costs by around 90%. Intrigued by this claim, I decided to investigate it thoroughly.
After consulting various sources such as LinkedIn, Reddit, and whitepapers, and speaking with people, I reached the following conclusion:
Databricks had nothing to do with the architecture they started from; in fact, the reported costs included other components that artificially inflated the baseline. The 'savings' either do not reflect reality or are significantly exaggerated. This shows how comparative studies in the tech industry often tend to be biased and should be read with caution, although it does not negate that the switch to Snowflake was positive for this company.
Why are companies moving away from Snowflake?
The success of Snowflake can be explained by the fact that a few years ago, everyone recognized the need for a Data Warehouse but not necessarily for Data Science. In my opinion, this has been the primary driver of Snowflake's early success because as a Data Warehouse tool, Snowflake is very good—it fulfills a specific need and does it very well.
The 'issue' with Snowflake begins when it starts to move away from its initial scope and increasingly tries to mimic Databricks' platform, which has a comprehensive approach to data and artificial intelligence. It's natural for Snowflake to do this because the market has shifted, and it's no longer just about Data Warehousing. Databricks has also made efforts to popularize Databricks SQL and has introduced advancements like Photon to enhance performance.
The image below shows the traditional scope of Snowflake. As a platform, Snowflake was primarily focused on SQL-based business intelligence applications and couldn't fully address the Data & AI needs of modern enterprises. This has driven Snowflake users to seek other platforms to fulfill the rest of their needs, and that's where Databricks comes into the picture.
The market has adopted Databricks because it is the only platform that covers the entire spectrum of Data and AI. What is happening now is that many Snowflake users who turned to Databricks to cover everything that initially wasn't possible in Snowflake have realized that what Snowflake does so well (Data Warehousing) can also be done in Databricks. This has driven many companies to start decommissioning Snowflake and migrating fully to Databricks.
It is true that Snowflake is striving to replicate Databricks' capabilities and has been steadily adding features to its platform. However, it is also losing its identity by trying to become another Databricks. Databricks, for its part, has made efforts to compete with Snowflake in the warehouse domain (for example, with Photon), where Snowflake held a unique advantage, albeit only in performance (not functionality).
The price factor vs. Databricks
We can summarize this section by stating that Snowflake is expensive. To delve deeper into the topic and avoid oversimplification, let's explore further: As usage scales up, costs increase more rapidly than the performance gained. In other words, the issue lies in Snowflake's non-linear price/performance ratio.
In contrast, Databricks does not reach a point where costs increase disproportionately relative to the performance gained. This has been Snowflake's Achilles' heel. Initially, everything seems fine, but as the platform scales, it becomes costly. Additionally, the necessity in Snowflake to replicate data in a specific format also contributes to escalating costs.
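To make the idea of a non-linear price/performance ratio concrete, here is a minimal sketch in Python. Every number in it is hypothetical and chosen only for illustration; none of it is measured Snowflake or Databricks pricing, and the function name cost_per_unit is made up for this example. It compares a profile where performance grows in step with spend against one where each doubling of spend buys only 1.5x the performance.

```python
# Hypothetical numbers for illustration only.
# They are NOT measured Snowflake or Databricks prices or benchmarks.

def cost_per_unit(spend, performance):
    """Return spend divided by performance at each scale step."""
    return [s / p for s, p in zip(spend, performance)]

# Spend doubles at every step.
spend = [100, 200, 400, 800]

# "Linear" profile: performance doubles whenever spend doubles.
linear_perf = [10, 20, 40, 80]

# "Non-linear" profile: each doubling of spend yields only 1.5x performance.
nonlinear_perf = [10, 15, 22.5, 33.75]

linear_ratio = cost_per_unit(spend, linear_perf)
nonlinear_ratio = cost_per_unit(spend, nonlinear_perf)

for step, (lin, non) in enumerate(zip(linear_ratio, nonlinear_ratio), start=1):
    print(f"step {step}: linear={lin:.2f}  non-linear={non:.2f}")
```

In the linear profile the cost per unit of performance stays flat at every step, while in the non-linear profile it rises with each step. That rising ratio is the pattern this section describes: the bill grows faster than the performance it buys.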
The following image illustrates this issue: Databricks' performance remains proportional to the price paid, even as operational scale increases.
I always say that if money isn't a problem, using Snowflake as the SQL warehouse alongside Databricks (i.e., as a "Gold Layer") isn't a bad option, while Databricks handles the rest. This approach minimizes friction, especially for business users already familiar with Snowflake. It's worth noting that migrating from Snowflake to Databricks isn't difficult, and at SunnyData, we can assist you with that.
What will happen with Snowflake and what role will it play?
Rather than trying to predict the future, I'll explain what I'm seeing in the present and how I believe this trend will evolve further. Currently, I see two well-defined scenarios:
Clients using Snowflake as a Data Warehouse and for SQL/BI: These clients maintain or adopt Snowflake as a "Gold layer" (aggregated data), as discussed above. Snowflake excels in this area, and the setup makes sense if cost is not a limiting factor.
Companies beginning a decommissioning process of Snowflake: These companies are realizing that they can also have a highly efficient Data Warehouse in Databricks, all within an integrated platform with additional benefits. The introduction of tools like Unity Catalog and Photon in Databricks also plays a significant role here.
Until a few months ago, the first option was more prevalent, but recently, I've been surprised by the number of clients migrating from Snowflake to Databricks.
Final conclusions
It's not surprising that Databricks is dominating and encroaching on Snowflake's territory. This isn't just a personal opinion: analyst firms like Gartner and Forrester support it.
In the next blog release, we will discuss how to migrate from Snowflake to Databricks. Thank you for reading this far!