What's Kafka and what does Confluent do?

Apache Kafka is a framework for streaming real-time data, and Confluent offers Kafka as a managed service.

analytics

Last updated: July 4, 2025

The TL;DR

Apache Kafka is a framework for streaming data between internal systems, and Confluent offers Kafka as a managed service.

We’re dealing with a lot of data these days – Big Data™ – and recording, storing, and moving it around is hard and expensive
Kafka helps stream that data throughout your company and distribute it to the systems that want to use it
The Kafka architecture works through a publish-subscribe pattern
Kafka 101 terminology: producers, consumers, messages, and topics

Kafka is relatively new, but it's getting pretty popular: managed service provider Confluent, founded by the original creators of Kafka, has been a public company since 2021 and continues to grow rapidly.

Terms Mentioned

Open Source

Server

Cloud

Kafka

Framework

Infrastructure

Networking

Analytics

Data warehouse

Deploy

Companies Mentioned

Elastic

AWS

MongoDB

Confluent

Segment

The core Confluent product: data streaming

Kafka, and thus Confluent, exists to solve two fundamental problems facing almost every data infrastructure team at every company.

There’s a lot of data, and it’s all happening very quickly

As storage has gotten cheaper, we’ve been collecting more and more data. Most software companies record every single website visit and click, and some go even deeper. Once you have more than a few users interacting with your product, you’re talking about millions of different events per day. Storing and managing data at that size and velocity is hard.

Data needs to move around to be valuable

Even if you’re a wiz at collecting and storing your data, there’s a problem: you’re going to need to move it around for it to be valuable. Where data gets initially collected and stored is rarely where it’s going to be useful.

Kafka solves these problems by creating a central registry for all of this data – you can think of it like one of those conveyor belt sushi places. Any consumers that need to use the data (like apps, databases, or ML models) can just take the plate they need (although really, they’re just taking a copy of it). This is sometimes called a publish-subscribe model, often shortened to pub-sub.
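To make the conveyor belt idea a bit more concrete, here is a minimal sketch of the pub-sub pattern in plain Python. It is a toy, not Kafka's actual API: the `MiniBroker` class, its `publish` and `poll` methods, and the `page_views` topic are all made-up names for illustration, and everything lives in memory on one machine. The point it tries to show is that producers append messages to a topic, and each consumer reads its own copy at its own pace without removing anything for the others.

```python
from collections import defaultdict


class MiniBroker:
    """Toy in-memory stand-in for a pub-sub broker (hypothetical, not Kafka's API)."""

    def __init__(self):
        # Each topic is an append-only list of messages.
        self.topics = defaultdict(list)
        # Each (consumer, topic) pair remembers how far it has read.
        self.offsets = defaultdict(int)

    def publish(self, topic, message):
        """A producer appends a message to the end of a topic."""
        self.topics[topic].append(message)

    def poll(self, consumer, topic):
        """Return copies of the messages this consumer has not seen yet."""
        log = self.topics[topic]
        start = self.offsets[(consumer, topic)]
        self.offsets[(consumer, topic)] = len(log)
        # Messages stay in the log, so other consumers can still read them.
        return [dict(m) for m in log[start:]]


broker = MiniBroker()

# A producer (say, the website) publishes two events.
broker.publish("page_views", {"user": "a", "url": "/pricing"})
broker.publish("page_views", {"user": "b", "url": "/docs"})

# Two independent consumers each get their own copy of both events.
warehouse_batch = broker.poll("warehouse", "page_views")
ml_batch = broker.poll("ml_model", "page_views")

# A later event only shows up as "new" for a consumer that polls again.
broker.publish("page_views", {"user": "a", "url": "/signup"})
warehouse_next = broker.poll("warehouse", "page_views")
```

The design choice worth noticing is that reading never deletes: the broker only tracks each consumer's position, which is why a new system can be plugged in later without the producers knowing or caring.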

