TL;DR
A vector database is a place where developers store data in a special numeric format (vectors) so that machine learning and AI models can use it.
- To make large language models more accurate for your use case, you need to power them with your own unique data
- But models have a very specific data diet: they only consume vectors, which are lists of numbers
- Embedding is the process of turning your data (images, text, videos) into vector representations (numbers)
- Vector databases are specialized places to store these embeddings, then search through and retrieve them when you need them
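
To make those bullets concrete, here is a minimal Python sketch of the whole loop: embed some text, store the vectors, then search them. Everything in it is made up for illustration. `VOCAB`, `embed`, and `ToyVectorStore` are hypothetical names, and the "embedding" is just a word count over a tiny fixed vocabulary. Real embedding models produce learned, dense vectors, and real vector databases use much smarter indexing than a sorted list.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary: one vector dimension per word.
# Assumption for illustration only; real models learn their dimensions.
VOCAB = ["cat", "dog", "pet", "car", "engine", "road"]


def embed(text: str) -> list[float]:
    """Toy embedding: count how often each vocabulary word appears."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two vectors by angle; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """In-memory stand-in for a vector database (brute-force search)."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        # Store the original text next to its vector.
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Embed the query, then rank stored items by similarity to it.
        query_vector = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("my cat is a lovely pet")
store.add("the car engine stalled on the road")

results = store.search("which pet is a dog or a cat", k=1)
print(results)
```

The query shares no full sentence with either stored item, but its vector points in roughly the same direction as the first one, so that is what comes back. That "closest vector wins" idea is the core of what a vector database does.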
Vector databases themselves are actually pretty simple, but the context for why they exist is not. So let’s start with that.