Databricks - A Unified Platform to collaborate Data, Analytics, and AI

Company Profile is an initiative by StartupTalky to publish verified information on different startups and organizations.

The modern world depends heavily on data and information. Nearly everything around us, from the things we use to the things we see, is influenced by technology in one way or another.

As the need for technology has grown, so has the significance of data. With data piling up, the need emerged for a warehouse to store, analyze, and process it for multiple purposes.

This is where Databricks introduced its platform. Databricks serves as a cloud platform for storing enormous volumes of data that can be processed and run smoothly. It is an analytics platform built on Apache Spark, the popular open-source project created by its founders. The company holds a 10.19% market share and ranks as the third-largest player in the digital analytics market.

Databricks - Company Highlights

Startup Name: Databricks
Headquarters: San Francisco, California, United States
Industry: Computer Software, Data, AI
Founders: Ali Ghodsi, Andy Konwinski, Ion Stoica, Patrick Wendell, Reynold Xin, Matei Zaharia, and Arsalan Tavakoli
Founded: 2013

Databricks - About
Databricks - Industry
Databricks - Founders
Databricks - Startup Story
Databricks - Mission
Databricks - Logo
Databricks - Business and Revenue Model
Databricks - Employees
Databricks - Funding and Investors
Databricks - Acquisitions
Databricks - Social Media Presence
Databricks - Growth and Revenue
Databricks - Products and Features
Databricks - Investment
Databricks - Partnerships
Databricks - Competitors
Databricks - Future Plans

Databricks - About

Databricks was established by the creators of Apache Spark as a Data and Artificial Intelligence (AI) company. It acts as a cloud warehouse for any structured or unstructured data. Databricks also serves as a combined platform for Data, AI, and Analytics functions, helping data engineers, analysts, and data scientists run huge workloads seamlessly. This is done through its Lakehouse Platform, powered by Apache Spark, which combines the best features of Data Lakes (low cost and flexibility) and Data Warehouses (performance efficiency).

In addition to Apache Spark, Delta Lake and MLflow are the other two open-source projects behind the Lakehouse Platform. Databricks provides its Unified Data services through multiple clouds, namely Google Cloud, AWS, Microsoft Azure, and Alibaba Cloud.

Databricks - Industry

The data industry has become large and significant in all aspects of life and business. According to Statista, the data market is expected to grow to $103 Billion by 2027, double its size in 2018. Artificial Intelligence is another rapidly growing market that has become an essential element of modern industries.

Databricks - Founders

Databricks was co-founded by a couple of professors from the University of California and five former Berkeley Ph.D. students.

  • Ali Ghodsi, co-founder and CEO of Databricks, was one of the creators of Apache Spark. He was a professor at the University of California (UC) and served on the board of UC’s RISELab. He has held primary responsibility for the growth and expansion of Databricks worldwide.
  • Ion Stoica, co-founder and Chairman of Databricks, is a professor at UC Berkeley and a co-director of AMPLab. He also co-founded Conviva, a start-up for large-scale video distribution.
  • Matei Zaharia, co-founder and Chief Technologist at Databricks, started the Spark project and now serves as Vice President of Apache Spark at the Apache Software Foundation. He received the ACM Doctoral Dissertation Award in 2014 for his research in large-scale computer systems.
  • Patrick Wendell, co-founder and Vice President of Engineering at Databricks, played a major role in Spark’s operations.
  • Reynold Xin, co-founder and Chief Architect, oversees technical operations for Apache Spark. He won the Best Demo Award at VLDB in 2011.
  • Andy Konwinski, co-founder and Vice President of management, oversees AI operations at Databricks. Earlier, he led the company’s marketing efforts around the creation of Spark Summit.
  • Arsalan Tavakoli-Shiraji, co-founder and Senior Vice President of field engineering at Databricks, earlier worked at McKinsey as an Associate Principal. He is a former Ph.D. student at UC Berkeley.

Databricks - Startup Story

Ali Ghodsi, the CEO of Databricks, has been keen on coding since the age of 8, when his parents bought him a used Commodore 64. He pursued higher education in computer engineering and earned a Ph.D. in distributed computing. Later, in 2009, he joined hands with Ion Stoica, and together they built ‘Spark’, a project that Matei Zaharia had already started.

They then worked with another team focused on Machine Learning, and together they introduced ‘Apache Spark’ to the market. At first, no companies paid any attention, as the technology seemed alien. In 2013, Ben Horowitz, co-founder of the venture capital firm Andreessen Horowitz, gave them hope by investing $14 Million and encouraging them to create a company that would serve as a platform for running Apache Spark. Thus, Databricks was established in 2013.

Databricks - Mission

Databricks’ mission is to make data unification more efficient by developing new techniques to unify Data, AI, and Analytics. The company also strives to make the customer experience more engaging.

Databricks - Logo

The Databricks logo resembles two bricks aligned perfectly, like data folders organized on a shelf. Databricks seems to have intended the logo to have a starting and an ending point without any breaks in between. This may imply that the company unifies data collection, storage, and analytics under one common platform, with no need to leave because everything is covered.

Databricks - Business and Revenue Model

Databricks’ business model centers on web-based software that provides a platform for working with Apache Spark. It offers automated cluster management and IPython-style notebooks for data engineers and data scientists.

Databricks provides its resources as Software as a Service (SaaS) and generates revenue through subscriptions. Its major services are offered through three cloud platforms.

Though prices vary for each cloud, there is one common principle: “Only pay for what you use”. Costs are calculated independently of the services opted for and require no up-front payment. Customers pay only for the resources they use, as they go.

Databricks - Employees

Databricks has between 5,001 and 10,000 employees around the world as of 2023. In November 2019, Databricks celebrated the milestone of hiring its 1,000th full-time employee. It took 6 years to reach the first 1,000 employees and less than 2 years to hire the rest.

Databricks - Funding and Investors

With its most recent funding round of $503.7 million, Databricks has raised about $4 Billion through 12 funding rounds since its formation. A total of 49 investors have invested in Databricks so far.

Date | Stage | Amount | Investors
September 14, 2023 | Series I | $503.7 Million | T. Rowe Price
July 31, 2023 | Secondary Market | - | -
March 3, 2023 | Series H | - | -
September 4, 2021 | Angel Round | $200K | -
August 31, 2021 | Series H | $1.6 Billion | Counterpoint Global (Morgan Stanley), Baillie Gifford, ClearBridge Investments, UC Investments, Andreessen Horowitz, Amazon Web Services (AWS), Microsoft, CapitalG, CPP Investment Board, Coatue Management, Fidelity Management & Research, Franklin Templeton, GIC, Greenoaks, Octahedron Capital, T. Rowe Price Associates, Tiger Global Management, Whale Rock Capital Management, Insight Partners, Gaingels, New Enterprise Associates, Alta Park Capital, a suite of BNY Mellon funds, Discovery Capital, Dragoneer Investment Group, Flucas Ventures, the House Fund, Geodesic, and Green Bay Ventures
February 1, 2021 | Series G | $1 Billion | Franklin Templeton, CPP Investment Board, Fidelity Management & Research LLC, Whale Rock, Amazon Web Services (AWS), CapitalG, Salesforce Ventures, Microsoft, Andreessen Horowitz, Alkeon Capital Management, BlackRock, Inc., Coatue Management, T. Rowe Price Associates, Tiger Global Management, New Enterprise Associates, Discovery Capital, Dragoneer Investment Group, Founders Circle Capital, Geodesic, GIC, Green Bay Ventures, Greenoaks Capital, and Octahedron Capital
October 22, 2019 | Series F | $400 Million | Andreessen Horowitz, BlackRock, Inc., T. Rowe Price Associates, Tiger Global Management, Coatue, New Enterprise Associates, Microsoft, Alkeon Capital Management, Dragoneer Investment Group, Geodesic, and Green Bay Ventures
February 5, 2019 | Series E | $250 Million | Andreessen Horowitz, Microsoft, Coatue, Battery Ventures, New Enterprise Associates, Green Bay Ventures, and Geodesic Capital
August 22, 2017 | Series D | $140 Million | New Enterprise Associates, Andreessen Horowitz, Battery Ventures, Geodesic Capital, and Green Bay Ventures
December 15, 2016 | Series C | $60 Million | New Enterprise Associates, Andreessen Horowitz, and SineWave Ventures
June 30, 2014 | Series B | $33 Million | New Enterprise Associates, Andreessen Horowitz, and DCVC
September 25, 2013 | Series A | $14 Million | Andreessen Horowitz, SV Angel, and Alfred Chuang
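
As a quick sanity check on the headline figure, the disclosed round sizes in the table above can be added up. The short Python sketch below does only that arithmetic; the dictionary and function names are illustrative, and the two rounds with no disclosed amount (the July 2023 secondary sale and the March 2023 Series H) are left out, so the result is a lower bound on the true total.

```python
# Disclosed round sizes in millions of USD, copied from the funding table above.
# Rounds listed with no amount are omitted, so this total is a lower bound.
disclosed_rounds_musd = {
    "Series I (Sep 2023)": 503.7,
    "Angel Round (Sep 2021)": 0.2,
    "Series H (Aug 2021)": 1600.0,
    "Series G (Feb 2021)": 1000.0,
    "Series F (Oct 2019)": 400.0,
    "Series E (Feb 2019)": 250.0,
    "Series D (Aug 2017)": 140.0,
    "Series C (Dec 2016)": 60.0,
    "Series B (Jun 2014)": 33.0,
    "Series A (Sep 2013)": 14.0,
}


def total_raised_busd(rounds):
    """Sum the given rounds and convert from millions to billions of USD."""
    return sum(rounds.values()) / 1000.0


total = total_raised_busd(disclosed_rounds_musd)
print(f"Rounds with a disclosed amount: {len(disclosed_rounds_musd)}")
print(f"Total disclosed funding: ${total:.2f}B")
```

The ten disclosed rounds come to roughly $4.0 billion, which is consistent with the total quoted above.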

Databricks - Acquisitions

Databricks has so far acquired seven companies. Below are the details:

Company | Date | Amount
Arcion | Oct 23, 2023 | $100M
MosaicML | Jun 26, 2023 | $1.3B
Okera | May 3, 2023 | -
DataJoy Inc. | Oct 13, 2022 | -
Cortex Labs | Apr 15, 2022 | -
8080 Labs | Oct 6, 2021 | -
Redash | Jun 24, 2020 | -

Databricks - Social Media Presence

Databricks has a strong presence on Twitter and LinkedIn and uses these platforms to promote its products and services and gain a market advantage. It also posts about its world tours and launch events featuring its latest innovations. Links to blogs and articles featuring Databricks or its products, along with information about job openings, can also be found on its social platforms.

Databricks - Growth and Revenue

Databricks was established in 2013 with Spark technology at its core. Its formation was immediately followed by a rumor that ‘Spark Technology won’t work if your data doesn’t fit in their memory’. This discouraged businesses from using Spark.

Finally, in 2015, the founders decided to end these rumors by entering a contest in which they beat the world record for processing one petabyte of data in the shortest time. As a result, they gained media attention and popularity.

By 2017, the company was valued at $500 Million, but its annual revenue was far lower at $1 Million. Participating in the sorting contest, changing how it hired employees, and deciding to build software with the features large enterprises demanded later proved fruitful.

Since then, Databricks’s growth has only accelerated. Its revenue hit the $100 Million mark for the first time in 2018 and took just one more year to reach $200 Million in 2019. The introduction of the Lakehouse feature was a primary factor in its success. The company’s valuation grew from $6.2 Billion in Q3 of 2019 to around $38 Billion in Q3 of 2021.

Databricks reported annual recurring revenue of $425 Million in 2020.

Databricks disclosed that it brought in over $1 billion in revenue during the fiscal year that ended on January 31, 2023. The company reported that it grew by more than 60% in 2022.

Databricks - Products and Features

Some of the latest prominent launches are:

Data Unity With New Delta Lake Release

Databricks announced a new version of its Delta Lake data storage format on June 28, 2023. According to the company, this version eliminates data silos. Delta Lake 3.0 is the latest addition to a group of rival open-source standards for analytic data tables in data lake systems, a group that also includes Iceberg and Hudi from the Apache Software Foundation.


Dolly

On March 24, 2023, Databricks unveiled an open-source language model that allows programmers to create their own AI-driven chatbot applications.

Lakehouse Federation

At its Data + AI Summit on June 28, 2023, Databricks launched what it calls its Lakehouse Federation function. With this new feature, businesses can discover, query, and administer their data across a wide range of platforms by bringing together their disparate, walled-off data systems.

Databricks - Partnerships

Databricks has partnered with many companies. Some of the latest prominent partnerships are:


Microsoft

In August 2023, Microsoft expanded the scope of its AI goals through a new partnership with Databricks to market AI app-development tools. Businesses will be able to create their own AI models from scratch using the Databricks software.


Kobai

On September 11, 2023, Databricks and Kobai partnered. Customers can take advantage of the power and scalability of the Databricks Lakehouse Platform, along with the simplicity and insights of knowledge graphs.

3i Infotech

3i Infotech Ltd and Databricks partnered on October 18, 2023, to generate business value by combining data and AI on a single platform.

Databricks - Investment

Databricks has invested in 24 companies. Some of the investments are listed below:

Company | Date | Amount
Perplexity AI | 2022 | -
Arcion | 2018 - Jan, 2017 | -
Catalyst | Sep, 2017 | -
Cleanlab | - | -

Databricks - Competitors

Some of the top competitors of Databricks are:

  • Snowflake
  • Cloudera
  • Datastax
  • Qubole
  • Alteryx
  • Dremio
  • Intellicus

Here are a few comparisons with some competitors:

Snowflake - Snowflake is much larger than Databricks. Both offer similar services at flexible prices, with a few differences: Databricks focuses on processing large volumes of data, while Snowflake offers elastic cloud data storage for centralized access. Databricks faces a long battle to overtake this competitor.

Cloudera - Cloudera provides a common cloud storage and management platform that stores, processes, and analyzes data for an organization. It is similar to Databricks in its data warehouse, processing, and distribution offerings.

Databricks - Future Plans

It is evident that Databricks was working on two of the fastest-growing big data domains, streaming and deep learning, in 2021. The company was building a multi-faceted Application Programming Interface (API) for these two domains. Databricks is also keen on accelerating innovation in the Data Lakehouse to gain a greater advantage among data-driven organizations.

According to their website, Databricks plans to enable the workspace's favorites feature. Notes, dashboards, experiments, and searches may all be saved to a list of favorites, which you can then access from the homepage.

Databricks - FAQs

What is Databricks?

Databricks is a cloud-based platform for storing and processing huge quantities of data, including with Machine Learning models. It is built on Apache Spark.

Who founded Databricks?

Databricks was co-founded by seven people: Ali Ghodsi, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, Andy Konwinski, and Arsalan Tavakoli-Shiraji.

How much has Databricks secured through funding?

Databricks secured around $4 Billion through 12 funding rounds.

What is the annual revenue of Databricks?

Databricks has reported an annual recurring revenue (ARR) of $1.275 Billion for the year ending 2022.

Who are the clients of Databricks?

Databricks has more than 6,000 customers worldwide. Some of its popular clients are:

  • Shell
  • CVS Health
  • Regeneron
  • T-Mobile
  • HSBC
  • Comcast
