Technically
AI Reference
Your dictionary for AI terms like LLM and RLHF
Company Breakdowns
What technical products actually do and why the companies that make them are valuable
Learning Tracks
In-depth, networked guides to learning specific concepts
Posts Archive
All Technically posts on software concepts since the dawn of time
Terms Universe
The dictionary of software terms you've always wanted

Explore learning tracks

AI, it's not that ComplicatedAnalyzing Software CompaniesBuilding Software ProductsWorking with Data Teams
Loading...
I'm feeling luckyPricing
Log In

The Details: Data Warehouses

A deep dive into data warehouses, what they do, and how different ones stack up.

Last updated Mar 23, 2026analytics
Justin Gage
Justin Gage
Read within learning track:Working With Data Teams

This post is going to dive deeper into one of Technically’s most requested topics: data warehouses. We’re going to look at what they’re used for, popular options (Snowflake, BigQuery, Redshift: what’s the difference?), how data lakes fit into the picture, and how vendors are positioning themselves as platforms going forward. 

Refresher: what’s a Data Warehouse?

A data warehouse is a specially designed database that holds analytical data. It’s built to handle long, complicated queries written by data scientists, analysts, and machine learning engineers. 

If you’re iffy on what data warehouses do, now would be a good time to read the original post here. It covers:

  • Why we use separate databases for our apps vs. analytics
  • How data gets into a warehouse via ETL or ELT
  • Typical types of analytical queries

And then come back here.

🚨 Confusion Alert

Part of where data warehouses get confusing is what they actually are. And the answer is that a data warehouse involves both:A different place to store your data (a separate database server)A different way of storing data (new data sources, structures, etc.)Most companies with data teams will have at least two separate places where they work with data: their production database, and a data warehouse. The production database only has data that’s relevant to the core operations of their app (users, business concepts, etc.). The data warehouse, on the other hand, might have copies of their production data, payment data, website traffic data, and lots of other stuff - whatever the data team wants to analyze.

All good? Cool – now let’s dive into the details. 

Terms Mentioned

UI

Benchmark

Open Source

Production database

Schema

Server

Reverse ETL

Cloud

Infrastructure

Integration

API

Data lake

Data warehouse

ETL

Database

Query

Companies Mentioned

AWS logo

AWS

AMZN
MongoDB logo

MongoDB

MDB
Snowflake logo

Snowflake

SNOW

Different types of warehouses, home rolled to managed

There are basically 3 levels of data warehouse – and companies can be at any of these levels. The first two options are home rolled – meaning the company pays for and maintains their own infrastructure – while the third option is a managed service, and outsources infrastructure to someone like AWS.     

1) Home rolled: basic SQL

The most basic type of data warehouse is just a vanilla setup of a typical SQL database like Postgres or MySQL. It might be sitting on a bigger server than your production database so it can handle more complex queries, but under the hood it’s the same technology. 

What makes this a data warehouse, instead of just a regular database, is that (a) it’s more powerful if you put in on bigger servers, and (b) it has analytical data in it instead of just production app data.

🖇 Workplace Example

At one company I worked at, we had Segment automatically pipe out event data into Postgres tables, and queried them as our “warehouse” – even though it was just a basic Postgres database.

This setup is not very popular these days. An even rarer combination would be using a NoSQL database like MongoDB or DynamoDB. 

2) Home rolled: big data

Option number 2 is setting up your own infrastructure and using more specialized, analytics-focused open source like Hadoop. This is what we did at DigitalOcean before moving onto Snowflake (see section #3):

  • Our data was stored in HDFS (Hadoop), set up and running on our own servers
  • We used Apache Hive to query it using a SQL-like interface
  • Our queries ran on Presto, which distributed them across powerful servers

This was a decently complicated setup, but not uncommon. The benefit to rolling your own systems tends to be cost – we didn’t pay per query, and were able to store and query a lot of data. But the downside was maintenance: setting up and maintaining the warehouse was tough work, and there was constant downtime. 

And that’s why we eventually migrated to Snowflake!

3) Managed services: Snowflake, BigQuery, Redshift

Finally, we arrive at what’s all the rage these days: cloud native, fully managed data warehouses. When you buy something like Snowflake, they take complete care of infrastructure, and you don’t need to manage any servers whatsoever. The basic highlights:

Continue reading with an all-access subscription

Continue reading with all-access

In this post

  • 3) Managed services: Snowflake, BigQuery, Redshift
  • Snowflake, BigQuery, Redshift, oh my!
  • Snowflake
  • BigQuery
  • Redshift

More in this track

How do product analytics work?

Product analytics is how teams instrument and analyze data about their product usage.

What your data team is using: the analytics stack

A deep dive into all of the tools that data teams use to do their work.

$15/month

30-day money-back guarantee

Or use
Up Next
What's a Data Lake?

A Data Lake is an unstructured place to put data.

What does Snowflake do?

Snowflake sells a powerful cloud data warehouse for analytics and data science teams.

What does Elastic do?

Elasticsearch is a popular open source database for storing and searching unstructured data.

Content
  • All Posts
  • Learning Tracks
  • AI Reference
  • Companies
  • Terms Universe
Company
  • Pricing
  • Sponsorships
  • Contribute
  • Contact
Connect
SubscribeSubstackYouTubeXLinkedIn
Legal
  • Privacy Policy
  • Terms of Service

© 2026 Technically.