Technically

The beginner’s guide to AI model architectures

Unlike an onion, hopefully these neural network layers won't make you cry.

Published Aug 12, 2025 · ai
Justin Gage
  • Architectures are the blueprints for AI models: they dictate how models are designed and built
  • Most AI today is made up of computing units called neurons linked together in complex networks
  • There are a million ways to build these networks: different algorithms, structures, and sizes
  • Researchers match different architectures to the specific problems and data constraints they face

Have you ever wondered how AI models get designed? Or what makes a Large Language Model like the one behind ChatGPT different from a Computer Vision model used for self-driving cars? Isn’t it all just AI under the hood?

The answer boils down to model architecture. Architectures are the blueprints of AI models: they are the sum of all the decisions a model’s builders make about which algorithms, data, sizes, and other ingredients go into it. There are tons and tons of ways to build an AI model; a particular architecture just chooses one (or multiple, but more on that later).

Picking the right architecture for your domain is really important. The basic 101 tagline for how AI models get built goes something like, “AI uses really complicated math to learn patterns from really large quantities of data.” This explanation isn’t wrong, but it’s only half the story. The other half? Smart architecture design.

To understand model architectures and how they work, we have to start with the neuron – the building block of every advanced AI model out there today. This post will explain what a neuron is and how researchers and engineers piece these neurons together to build complex systems capable of incredibly challenging tasks. We’ll touch on some of the most popular architecture types, exploring what makes them really good at some tasks (and not so good at others).

Neurons: the building blocks of AI

Neurons are the basic building blocks of AI architectures, loosely modeled after the biological neurons that transmit signals throughout the human brain. Remember, AI models are essentially pattern investigators: they find the underlying patterns in the data. You can think of these neurons as the mathematical functions that do this hard investigative work, getting into the weeds of the data and figuring out what’s going on.

The math performed by individual neurons is actually pretty simple – it’s usually just basic multiplication and addition that you could do with a calculator. So how are AI models able to capture such complex patterns, like the ones involved in language and vision? The trick is to string together a lot of neurons – like hundreds of millions of them.
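
To make the "basic multiplication and addition" concrete, here is a minimal sketch of a single neuron in Python. The names `weights` and `bias` and the numbers used are illustrative assumptions, not something from this post, and real neurons usually also pass the result through an extra function (an "activation") that is left out here for simplicity.

```python
def neuron(inputs, weights, bias):
    """Multiply each input by its weight, add the results up, then add a bias."""
    total = bias
    for x, w in zip(inputs, weights):
        total += x * w
    return total


# Hypothetical values, chosen only for illustration.
output = neuron(inputs=[1.0, 2.0, 3.0], weights=[0.5, -1.0, 2.0], bias=1.0)
print(output)  # 1.0 + (1.0 * 0.5) + (2.0 * -1.0) + (3.0 * 2.0) = 5.5
```

In a trained model, the weights and bias are the numbers that get adjusted until the neuron's outputs become useful.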

This stringing together is where our first “decision” – and thus the early stages of an architecture – starts to come into play. Researchers can combine neurons in two basic ways.

First, neurons can be lined up in a sequence, so the output of one becomes the input of the next.
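
As a rough sketch of that sequential arrangement, the Python below passes a value through a chain of very simple neurons, each one consuming the previous one's output. The one-input `neuron` function, the `chain` helper, and the `params` values are all hypothetical and only for illustration.

```python
def neuron(x, weight, bias):
    """A one-input neuron: multiply by a weight, then add a bias."""
    return x * weight + bias


def chain(x, params):
    """Feed x through neurons in order; each output becomes the next input."""
    for weight, bias in params:
        x = neuron(x, weight, bias)
    return x


# Three neurons in a row, each described by a hypothetical (weight, bias) pair.
params = [(2.0, 1.0), (0.5, 0.0), (3.0, -1.0)]
result = chain(4.0, params)
print(result)  # 4.0 -> 9.0 -> 4.5 -> 12.5
```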

Second, neurons can be stacked in layers, where they don’t interact with each other directly but all take the same input values.
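
A layer can be sketched in a few lines of Python: every neuron receives the same inputs, computes its own result independently, and the layer returns one output per neuron. The function names and the weights below are assumptions made up for this example.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def layer(inputs, neurons):
    """Give the same inputs to every neuron and collect one output per neuron."""
    return [neuron(inputs, weights, bias) for weights, bias in neurons]


inputs = [1.0, 2.0]
# Three hypothetical neurons, each described as (weights, bias).
neurons = [
    ([1.0, 0.0], 0.0),  # only looks at the first input
    ([0.0, 1.0], 0.0),  # only looks at the second input
    ([1.0, 1.0], 0.5),  # combines both inputs
]
outputs = layer(inputs, neurons)
print(outputs)  # [1.0, 2.0, 3.5]
```

Because the neurons in a layer never talk to each other, each one is free to pick up on a different aspect of the same input.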

Some special neurons can even accept their own output as an input and use it to update their internal state, in a kind of simulated memory. This is helpful when you’re handling a sequence of data inputs, like a bunch of frames from the same video, and you want your model to use knowledge from earlier frames to contextualize what’s happening in later frames.
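
One simplified way to picture this kind of memory is a neuron that stores its last output and mixes it into the next calculation. The `RecurrentNeuron` class, its parameter names, and the numbers below are hypothetical, and real recurrent units are more elaborate than this sketch.

```python
class RecurrentNeuron:
    """A neuron that feeds its previous output back in as extra input."""

    def __init__(self, input_weight, state_weight):
        self.input_weight = input_weight
        self.state_weight = state_weight
        self.state = 0.0  # the "memory" of what came before

    def step(self, x):
        # Combine the new input with what is remembered from earlier steps.
        self.state = x * self.input_weight + self.state * self.state_weight
        return self.state


cell = RecurrentNeuron(input_weight=1.0, state_weight=0.5)
# A signal appears in the first "frame" only, then the inputs go quiet.
outputs = [cell.step(x) for x in [1.0, 0.0, 0.0]]
print(outputs)  # [1.0, 0.5, 0.25]
```

Even though the later inputs are zero, the outputs are not: the first input keeps echoing through the stored state, which is how earlier frames can influence how later ones are read.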

Building even more complex models
