Databricks: the Future of Generative AI in the Enterprise Arena

Navigating the Shift: Databricks' Journey from Open Source to AI Powerhouse

Welcome to our new series, AI Infrastructure Unicorns! These companies provide the hardware, software, and services that Generative AI companies depend on. But even if GenAI someday goes extinct, these infrastructure builders won’t be out of a job, because they serve the much bigger industry of AI/ML models in general.

As Databricks positions itself at the forefront of the data analytics and AI revolution, the tech community watches with keen interest, especially amid speculation about its potential IPO. Led by CEO Ali Ghodsi, Databricks has embarked on a journey marked by strategic acquisitions and substantial investments. The recent purchase of AI data platform Einblick, along with the acquisition of MosaicML for $1.3 billion and Arcion for $100 million, reflects a deliberate strategy to strengthen its foothold in the generative AI landscape. With a valuation peaking at $43 billion after its latest funding round, Databricks' approach suggests a preference for expanding its technological capabilities and global presence over rushing toward an IPO.

Let's explore Databricks' unusual history, its contributions to generative AI for the enterprise, and what its strategies reveal about the evolving landscape of GenAI and data intelligence platforms.

  1. Starting point of Databricks

  2. We are open-source and... poor. Changes in strategy that led to $$$

  3. Turning to Generative AI (and proving again open-source is cool)

  4. Generative AI's role in the next wave of enterprise data applications

  5. Product policy: “Cannibalize yourself before someone else does”

  6. Mission

  7. Financial situation and acquisitions

  8. How does the company make money?

  9. Vision of the future of the AI industry

  10. Bonus: Seven brilliant founders behind Databricks

Starting point of Databricks

Databricks' history spans more than a decade. The company was founded in 2013 by seven enthusiasts from academia: Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia, Patrick Wendell, and Reynold Xin. But the groundwork was laid even earlier, with a research project started at the UC Berkeley AMPLab in 2009 and open-sourced one year later under the name “Spark.”

The rest of this fascinating story is available to our Premium users only.


Seven brilliant founders behind Databricks (what a miracle there was never a public fight between them!)

Ali Ghodsi

In 2009, Ali Ghodsi began working as a visiting scholar at UC Berkeley. According to the official Databricks website, Ali was one of the creators of Apache Spark, a game-changer in big data analysis.

His earlier research, however, focused on optimizing behind-the-scenes processes such as resource allocation and data storage. These ideas helped both Apache Mesos, which efficiently distributes computing tasks across a cluster, and Apache Hadoop, a framework for distributed data storage and processing, run smoother and faster.

Andy Konwinski

In 2009, Andy was a PhD student who co-created Apache Mesos and Apache Chukwa (a project that is now retired). He also worked on Spark. When the company started, he was the VP of Product AI/ML and created the Data+AI Summit, which began as the Spark Summit in 2013. Andy also co-founded Perplexity AI in 2022.

Ion Stoica

In 2009, Ion was (and still is) a co-director of the UC Berkeley AMPLab, the research group in which Apache Spark was born. Apart from Databricks, he also co-founded Conviva Networks in 2009 and Anyscale in 2019. He has been a professor in the UC Berkeley Computer Science Department since 2000, and his long list of former MS and Ph.D. students and postdocs includes people who joined Databricks, founded their own companies, or went on to other well-known tech companies. Ion joined the company as its first CEO in 2013 and handed over the company’s management to Ali in 2016.

Matei Zaharia

In 2009, Matei was a PhD student at UC Berkeley and was the one who started the Spark project. Apart from that, he contributed to many other open-source initiatives and worked as a professor at MIT, Stanford, and UC Berkeley. At Databricks, he took on the role of CTO.

Patrick Wendell

In 2009, Patrick was pursuing a Bachelor's degree at Princeton University. Apparently, he joined the Spark team as a Master's student at UC Berkeley in 2011. Now, Patrick is a VP of Engineering and a founding committer and PMC member of Apache Spark.

Reynold Xin

In 2009, Reynold was a Software Engineer at IBM, and in 2010, he started his PhD at the UC Berkeley AMPLab, where he focused on scalable data processing. He wrote the most highly cited papers in SIGMOD (the ACM Special Interest Group on Management of Data) and won the Best Demo Award at SIGMOD 2012 and VLDB (Very Large Data Bases) 2011.

He joined the company as Chief Architect, overseeing technical contributions to Apache Spark and initiating efforts such as DataFrames and Project Tungsten. To demonstrate Spark's scalability and performance, he led the company's entry in the prestigious Daytona GraySort contest, which benchmarks systems' ability to sort massive datasets. In 2014, his team smashed the previous record held by Hadoop, achieving a staggering 30x higher per-node efficiency. This accomplishment not only translated to sorting datasets 30 times faster with the same resources but also solidified Spark's position as a leading technology in big data processing.

Arsalan Tavakoli-Shiraji

Arsalan earned his Ph.D. at UC Berkeley in 2009 and joined McKinsey & Company as an Associate Principal, where he advised enterprises, vendors, and the public sector on a broad spectrum of strategic topics. He joined Databricks as SVP of Field Engineering.

Ali said in an interview: “I think a lot of the success of Databricks was getting all these seven people to really trust each other and do great innovation. Very few companies have the pleasure of having that kind of critical mass of thought leaders together.”

Thank you for reading, please feel free to share with your friends and colleagues. 🤍
