The TL;DR
Databricks sells a data science and analytics platform – i.e. a place to query and share data – built on top of an open source package called Apache Spark.
- Apache Spark is an open source engine for running analytics and machine learning across giant, distributed datasets
- Spark is notoriously hard to run on your own infrastructure, and companies often don’t have the in-house expertise to do it
- Databricks provides a managed service for running Spark clusters, as well as notebooks for visualization and exploration, plus the ability to schedule pipelines
- More recently, Databricks has been expanding its product portfolio to include ML and data warehousing
Databricks is one of the largest private companies on the planet: its most recent valuation was $62B.