Pre-training

ai · intermediate

When training an LLM, pre-training gives the model all the basic, foundational knowledge it needs to answer your prompts, by showing it lots and lots of examples of existing text from the internet.

We call this process the sentence re-arranging game. Take any sentence that exists in text, remove a word, and boom! You've got labeled training data. This is how LLMs are trained: by taking tons and tons of publicly available sentences, removing words, and teaching the model to replace them correctly:

  • I bought a stereo system to play my ______.
  • I bought a ______ to play my music.
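
To make the game concrete, here is a minimal sketch of how a single sentence can be turned into labeled examples by blanking out one word at a time. The function name `make_fill_in_examples` and the `______` placeholder are made up for illustration. Real pre-training pipelines work on tokens rather than whole words and at a vastly larger scale, so treat this as a toy version of the idea, not how any particular model is built.

```python
import string


def make_fill_in_examples(sentence, blank="______"):
    """Build (prompt, answer) pairs by blanking one word at a time.

    Hypothetical helper for illustration only: it splits on whitespace,
    whereas real systems operate on tokens.
    """
    words = sentence.split()
    examples = []
    for i, word in enumerate(words):
        # Keep punctuation (like the final period) outside the blank.
        core = word.strip(string.punctuation)
        if not core:
            continue
        masked = word.replace(core, blank, 1)
        prompt = " ".join(words[:i] + [masked] + words[i + 1:])
        # The removed word is the label the model should learn to predict.
        examples.append((prompt, core))
    return examples


if __name__ == "__main__":
    pairs = make_fill_in_examples("I bought a stereo system to play my music.")
    for prompt, answer in pairs:
        print(f"{prompt}  ->  {answer}")
```

Each pair is one piece of labeled training data: the prompt with a blank is the input, and the removed word is the correct answer. This sketch blanks a single word at a time, so it would not produce the two-word "stereo system" blank from the second bullet above.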

After pre-training, a model is still not quite ready for prime time: responses will be long and rambly, and may not actually answer your question. This is why pre-training is followed by post-training, to refine the model into something we'd recognize as usable.


Related terms

ChatGPT

Context Window

Inference

LLM

Loss Function

Machine Learning
