Databricks - A Unified Platform for Data, Analytics, and AI
The modern world depends heavily on data and information. Nearly everything around us, from the things we use to the things we see, is influenced by technology in one way or another.
As the need for technology has grown, so has the significance of data. With data piling up, the need emerged for a warehouse to store, analyze, and process it for multiple purposes.
This is where Databricks comes in. Databricks offers a cloud platform for storing and processing enormous volumes of data. It is an analytics platform built on Apache Spark, the popular open-source project created by its founders. It holds a 10.19% market share and ranks third in the digital analytics market.
Databricks - Company Highlights
Startup Name | Databricks |
---|---|
Headquarters | San Francisco, California, United States |
Industry | Computer Software, Data, AI |
Founders | Ali Ghodsi, Andy Konwinski, Ion Stoica, Patrick Wendell, Reynold Xin, Matei Zaharia, and Arsalan Tavakoli |
Founded | 2013 |
Website | databricks.com |
Databricks - About
Databricks - Industry
Databricks - Founders
Databricks - Startup Story
Databricks - Mission
Databricks - Logo
Databricks - Business and Revenue Model
Databricks - Employees
Databricks - Funding and Investors
Databricks - Acquisitions
Databricks - Social Media Presence
Databricks - Growth and Revenue
Databricks - Products and Features
Databricks - Investment
Databricks - Partnerships
Databricks - Competitors
Databricks - Future Plans
Databricks - About
Databricks was established by the creators of Apache Spark as a data and artificial intelligence (AI) company. It acts as a cloud warehouse for any structured or unstructured data. Databricks also serves as a combined platform for data, AI, and analytics functions, helping data engineers, analysts, and data scientists run huge workloads seamlessly. This is done through its Lakehouse Platform, powered by Apache Spark, which combines the best features of data lakes (low cost and flexibility) and data warehouses (performance efficiency).
In addition to Apache Spark, Delta Lake and MLflow are the other two open-source projects behind the Lakehouse Platform. Databricks provides its unified data services through multiple clouds, namely Google Cloud, AWS, Microsoft Azure, and Alibaba Cloud.
Databricks - Industry
The data industry has become large and significant in all aspects of life and business. According to Statista, the data market is expected to grow to $103 billion by 2027, double its size in 2018. Artificial intelligence is another rapidly growing market that has become an essential element of modern industries.
Databricks - Founders
Databricks was co-founded by a couple of professors from the University of California and five former Berkeley Ph.D. students.
- Ali Ghodsi, co-founder and CEO of Databricks, was one of the creators of Apache Spark. He was a professor at the University of California (UC), Berkeley, and a board member of its RISELab. He has held primary responsibility for the growth and expansion of Databricks worldwide.
- Ion Stoica, co-founder and Chairman of Databricks, is also a professor at UC Berkeley and a co-director of AMPLab. In addition, he co-founded Conviva, a start-up for large-scale video distribution.
- Matei Zaharia, co-founder and Chief Technologist at Databricks, was earlier part of the Spark project and is now a Vice President at the Apache Software Foundation. He received the ACM Doctoral Dissertation Award in 2014 for his research on large-scale computing systems.
- Patrick Wendell, co-founder and Vice President of Engineering at Databricks, played a major role in Spark’s operations.
- Reynold Xin, co-founder and Chief Architect, takes care of the technical operations of Apache Spark. He won the Best Demo Award at VLDB in 2011.
- Andy Konwinski, co-founder and Vice President of management, takes care of AI operations at Databricks. Earlier, he led the company’s marketing efforts in creating the Spark Summit.
- Arsalan Tavakoli-Shiraji, co-founder and Senior Vice President of field engineering at Databricks, earlier worked at McKinsey as an Associate Principal. He is a former Ph.D. student at UC Berkeley.
Databricks - Startup Story
Ali Ghodsi, the CEO of Databricks, has been keen on coding since the age of 8, when his parents bought him a used Commodore 64. He went on to study computer engineering and earned a Ph.D. in distributed computing. In 2009, he joined hands with Ion Stoica, and together they worked on ‘Spark’, a project already started by Matei Zaharia.
They then coordinated with another team working on machine learning, and together they introduced ‘Apache Spark’ to the market. At first, no companies paid any attention, as the technology seemed alien. In 2013, Ben Horowitz (co-founder of the venture capital firm Andreessen Horowitz) gave them hope by investing $14 million and encouraged them to create a company that would serve as a platform for running Apache Spark. Thus, Databricks was established in 2013.
Databricks - Mission
Databricks operates with a mission to make data unification more efficient by developing new techniques to unify data, AI, and analytics. It strives to make the customer experience more engaging.
Databricks - Logo
The Databricks logo resembles two bricks aligned perfectly, like data folders organized on a shelf. Databricks seems to have intended the logo to have a starting and an ending point with no breaks in between. This may imply that it unifies data collection, storage, and analytics under one common platform that users never need to leave, as everything is covered there.
Databricks - Business and Revenue Model
Its business model centers on web-based software that provides a platform for working with Apache Spark. The platform offers automated cluster management and IPython-style notebooks for data engineers and scientists.
Databricks provides its resources as Software as a Service (SaaS) and generates revenue through subscriptions. Its services are offered mainly through three cloud platforms:
- Microsoft Azure
- Google Cloud
- Amazon Web Services
Though prices vary by cloud, one principle is common to all: “Only pay for what you use”. Costs are calculated separately for each service chosen and require no up-front payment. Customers pay only for the resources they consume, as they go.
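As a rough illustration of usage-based billing, the sketch below multiplies the units consumed by a per-unit rate for each service and adds up the results, with no fixed or up-front component. The function name, service names, usage figures, and rates are all hypothetical and are not actual Databricks prices.

```python
def pay_as_you_go_cost(usage_units, rate_per_unit):
    """Return the charge for one service: units consumed times the unit rate."""
    if usage_units < 0 or rate_per_unit < 0:
        raise ValueError("usage and rate must be non-negative")
    return usage_units * rate_per_unit


# Hypothetical services, usage, and rates (illustration only, not real pricing).
services = {
    "data_engineering": (120, 0.25),  # (units used, price per unit in USD)
    "sql_analytics": (40, 0.5),
}

# Each service is billed on its own usage; there is no up-front fee to add.
bill = {name: pay_as_you_go_cost(units, rate) for name, (units, rate) in services.items()}
total = sum(bill.values())

print(bill)
print(f"Total: ${total:.2f}")
```

A service that is never used contributes nothing to the total, which is the point of the "only pay for what you use" model.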
Databricks - Employees
Databricks has between 5,001 and 10,000 employees around the world as of 2023. In November 2019, Databricks celebrated the milestone of hiring its 1,000th full-time employee. It took 6 years to reach the first 1,000 employees and less than 2 years to hire the rest.
Databricks - Funding and Investors
With its recent funding of $503.7 million, Databricks has raised $4 Billion through 12 funding rounds since its formation. A total of 49 investors have so far invested in Databricks.
Date | Stage | Amount | Investors |
---|---|---|---|
September 14, 2023 | Series I | $503.7M | T. Rowe Price |
July 31, 2023 | Secondary Market | - | - |
March 3, 2023 | Series H | - | - |
September 4, 2021 | Angel Round | $200K | - |
August 31, 2021 | Series H | $1.6 Billion | Counterpoint Global (Morgan Stanley), Baillie Gifford, ClearBridge Investments, UC Investments, Andreessen Horowitz, Amazon Web Services (AWS), Microsoft, CapitalG, CPP Investment Board, Coatue Management, Fidelity Management & Research, Franklin Templeton, GIC, Greenoaks, Octahedron Capital, T. Rowe Price Associates, Tiger Global Management, Whale Rock Capital Management, Insight Partners, Gaingels, New Enterprise Associates, Alta Park Capital, a suite of BNY Mellon funds, Discovery Capital, Dragoneer Investment Group, Flucas Ventures, the House Fund, Geodesic, and Green Bay Ventures. |
February 1, 2021 | Series G | $1 Billion | Franklin Templeton, CPP Investment Board, Fidelity Management & Research LLC, Whale Rock, Amazon Web Services (AWS), CapitalG, Salesforce Ventures, Microsoft, Andreessen Horowitz, Alkeon Capital Management, BlackRock, Inc., Coatue Management, T. Rowe Price Associates, Tiger Global Management, New Enterprise Associates, Discovery Capital, Dragoneer Investment Group, Founders Circle Capital, Geodesic, GIC, Green Bay Ventures, Greenoaks Capital and Octahedron Capital. |
October 22, 2019 | Series F | $400 Million | Andreessen Horowitz, BlackRock, Inc., T. Rowe Price Associates, Tiger Global Management, Coatue, New Enterprise Associates, Microsoft, Alkeon Capital Management, Dragoneer Investment Group, Geodesic, and Green Bay Ventures. |
February 5, 2019 | Series E | $250 Million | Andreessen Horowitz, Microsoft, Coatue, Battery Ventures, New Enterprise Associates, Green Bay Ventures, and Geodesic Capital. |
August 22, 2017 | Series D | $140 Million | New Enterprise Associates, Andreessen Horowitz, Battery Ventures, Geodesic Capital, and Green Bay Ventures. |
December 15, 2016 | Series C | $60 Million | New Enterprise Associates, Andreessen Horowitz and SineWave Ventures. |
June 30, 2014 | Series B | $33 Million | New Enterprise Associates, Andreessen Horowitz and DCVC. |
September 25, 2013 | Series A | $14 Million | Andreessen Horowitz, SV Angel and Alfred Chuang. |
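The headline figure of about $4 billion across 12 rounds can be cross-checked against the table. The short sketch below sums only the amounts the table discloses, in millions of US dollars; rounds shown with "-" are stored as `None` and skipped. The variable and function names are illustrative.

```python
# Round sizes copied from the funding table, in millions of US dollars.
# Rounds listed with "-" have no disclosed amount and are stored as None.
ROUNDS = [
    ("Series I, September 2023", 503.7),
    ("Secondary Market, July 2023", None),
    ("Series H, March 2023", None),
    ("Angel Round, September 2021", 0.2),
    ("Series H, August 2021", 1600.0),
    ("Series G, February 2021", 1000.0),
    ("Series F, October 2019", 400.0),
    ("Series E, February 2019", 250.0),
    ("Series D, August 2017", 140.0),
    ("Series C, December 2016", 60.0),
    ("Series B, June 2014", 33.0),
    ("Series A, September 2013", 14.0),
]


def total_disclosed_millions(rounds):
    """Sum the disclosed amounts, ignoring rounds with no published figure."""
    return sum(amount for _, amount in rounds if amount is not None)


total_millions = total_disclosed_millions(ROUNDS)
print(f"{len(ROUNDS)} rounds, about ${total_millions / 1000:.1f} billion disclosed")
# prints "12 rounds, about $4.0 billion disclosed"
```

The disclosed amounts alone come to roughly $4,000.9 million, which is consistent with the stated total.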
Databricks - Acquisitions
Databricks has so far acquired seven companies. Below are the details:
Company | Date | Amount |
---|---|---|
Arcion | Oct 23, 2023 | $100M |
MosaicML | Jun 26, 2023 | $1.3B |
Okera | May 3, 2023 | - |
DataJoy Inc. | Oct 13, 2022 | - |
Cortex Labs | Apr 15, 2022 | - |
8080 Labs | Oct 6, 2021 | - |
Redash | Jun 24, 2020 | - |
Databricks - Social Media Presence
Databricks has a good presence on Twitter and LinkedIn, and it uses these platforms to promote its products and services and gain a market advantage. It also posts about its world tours and launch events for its latest products. Links to blogs and articles featuring Databricks or its products, as well as information on job openings, can also be found on its social platforms.
Databricks - Growth and Revenue
Databricks was established in 2013 with Spark technology at its core. Its formation was immediately followed by a rumor that ‘Spark won’t work if your data doesn’t fit in memory’. This discouraged businesses from using Spark.
Finally, in 2015, the founders decided to end these rumors by entering a contest in which they set a world record for the fastest time to process one petabyte of data. As a result, they gained media attention and popularity.
By 2017, the company was valued at $500 million, but its annual revenue was far lower, at $1 million. Taking part in the sorting contest, changing how it hired, and deciding to build software with the features large enterprises demanded all proved fruitful.
Since then, Databricks has grown steadily. Its revenue hit the $100 million mark for the first time in 2018 and took just another year to reach $200 million in 2019. The introduction of the Lakehouse was a primary factor in its success. The company’s valuation grew from $6.2 billion in Q3 of 2019 to around $38 billion in Q3 of 2021.
Databricks reported annual recurring revenue of $425 Million in 2020.
Databricks disclosed that it brought in over $1 billion in revenue during the fiscal year that ended on January 31, 2023. The company reported that it grew by more than 60% during 2022.
Databricks - Products and Features
Some of the latest prominent launches are:
Data Unity With New Delta Lake Release
Databricks announced a new version of its Delta Lake data storage format on June 28, 2023. According to the company, this version eliminates data silos. Delta Lake 3.0 is the latest release of one of several rival open-source standards for analytic data tables in data lake systems; the others include Iceberg and Hudi from the Apache Software Foundation.
Dolly
On March 24, 2023, Databricks unveiled an open-source language model that allows programmers to create their own AI-driven chatbot applications.
Lakehouse Federation
At its Data + AI Summit on June 28, 2023, Databricks launched what it calls its Lakehouse Federation capability. With this new feature, businesses can discover, query, and administer their data across a wide range of platforms by bringing together their disparate, siloed data systems.
Databricks - Partnerships
Databricks has partnered with many companies. Some of the latest prominent partnerships are:
Microsoft
Microsoft expanded the scope of its AI goals in August 2023 through a new partnership with Databricks to market AI app-development tools. Businesses will be able to create their own AI models from scratch using the Databricks software.
Kobai
On September 11, 2023, Databricks and Kobai partnered. Customers can make use of the power and scalability of the Databricks Lakehouse Platform, along with the simplicity and insights of knowledge graphs.
3i Infotech
3i Infotech Ltd and Databricks partnered on October 18, 2023, to generate business value by combining data and AI on a single platform.
Databricks - Investment
Databricks has invested in 24 companies. Some of the investments are listed below:
Company | Date | Amount |
---|---|---|
Perplexity AI | 2022 | - |
Arcion | 2018 | - |
Prophecy.io | Jan, 2017 | - |
Catalyst | Sep, 2017 | - |
Cleanlab | - | - |
Databricks - Competitors
Some of the top competitors of Databricks are:
- Snowflake
- Cloudera
- Datastax
- Qubole
- MATLAB
- Alteryx
- Dremio
- Intellicus
Here are a few comparisons with some competitors:
Snowflake - Snowflake is much larger than Databricks. Both offer similar services at flexible prices, with a few differences: Databricks focuses on processing large volumes of data, while Snowflake offers elastic cloud data storage for centralized access. Databricks faces a long battle to overtake this competitor.
Cloudera - Cloudera provides a common cloud storage and management platform that stores, processes, and analyzes data for an organization. It resembles Databricks in its data warehousing, processing, and distribution capabilities.
Databricks - Future Plans
In 2021, Databricks was working on two of the fastest-growing big data domains: streaming and deep learning. It was building a multi-faceted application programming interface (API) to handle both. Databricks is also keen to accelerate innovation in the data lakehouse and win over more data-driven organizations.
According to its website, Databricks plans to enable a favorites feature in the workspace. Notes, dashboards, experiments, and searches can all be saved to a list of favorites, which can then be accessed from the homepage.
Databricks - FAQs
What is Databricks?
Databricks is a cloud-based platform for storing and processing huge quantities of data, including with machine learning models. It is built on Apache Spark.
Who founded Databricks?
Databricks was co-founded by seven people namely, Ali Ghodsi, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, Andy Konwinski, and Arsalan Tavakoli-Shiraji.
How much has Databricks secured through funding?
Databricks secured around $4 Billion through 12 funding rounds.
What is the annual revenue of Databricks?
Databricks has reported an annual recurring revenue (ARR) of $1.275 Billion for the year ending 2022.
Who are the clients of Databricks?
Databricks has more than 6,000 customers worldwide. Some of its popular clients are:
- Shell
- CVS Health
- Regeneron
- T-Mobile
- HSBC
- Comcast