Pre-training
When training an LLM, pre-training is the stage that gives the model the basic, foundational knowledge it needs to answer your prompts. It does this by showing the model lots and lots of examples of existing text from the internet.
You can think of this process as a fill-in-the-blank game. Take any sentence that exists in text, remove a word, and boom! You've got labeled training data. This is how LLMs are pre-trained: by taking tons and tons of publicly available sentences, removing words, and teaching the model to fill them back in correctly, as in the examples below (a toy sketch of how this turns plain text into training pairs follows them):
- I bought a stereo system to play my ______.
- I bought a ______ to play my music.
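To make the idea concrete, here is a minimal sketch in Python. The function name `make_training_pairs` and the underscore-style blank are made up for illustration, and real pre-training pipelines operate on tokens at enormous scale rather than whole words in single sentences, but the principle is the same: one sentence yields many (masked sentence, missing word) pairs for free.

```python
def make_training_pairs(sentence: str, blank: str = "______"):
    """Turn one sentence into several fill-in-the-blank training examples.

    Each example pairs the sentence with one word removed (the input)
    against the word that was removed (the label).
    """
    words = sentence.split()
    pairs = []
    for i, word in enumerate(words):
        masked = words[:i] + [blank] + words[i + 1:]
        pairs.append((" ".join(masked), word))
    return pairs

# The same sentence produces one labeled example per word.
for prompt, label in make_training_pairs("I bought a stereo system to play my music."):
    print(f"{prompt}  ->  {label}")
```

Because the labels come from the text itself, no human has to annotate anything, which is what makes it practical to train on internet-scale corpora.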
After pre-training, a model is still not quite ready for prime time: its responses tend to be long and rambly, and may not actually answer your question. This is why pre-training is followed by post-training, which refines the model into something we'd recognize as usable.