Frequently Asked Questions About Parameters#
What's the difference between parameters and hyperparameters?#
- Parameters: Learned during training (the model's "knowledge")
- Hyperparameters: Set before training (learning rate, model size, training duration)
Think of parameters as what the student learns, and hyperparameters as how you teach them.
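To make the distinction concrete, here is a minimal sketch that fits a one-weight model with gradient descent. The names `LEARNING_RATE`, `NUM_STEPS`, and `train` are illustrative assumptions, not part of any library. The weight `w` is the parameter (learned from data), while the learning rate and step count are hyperparameters (chosen up front).

```python
# Hyperparameters: chosen by the person running training (illustrative values).
LEARNING_RATE = 0.1
NUM_STEPS = 50


def train(xs, ys, learning_rate, num_steps):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # Parameter: starts at an arbitrary value and is learned from data.
    n = len(xs)
    for _ in range(num_steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # toy data generated by y = 2x
learned_w = train(xs, ys, LEARNING_RATE, NUM_STEPS)
print(round(learned_w, 3))  # close to 2.0, the slope used to generate ys
```

Changing `LEARNING_RATE` or `NUM_STEPS` changes how training proceeds, but the value of `w` is never set by hand; it comes out of the training loop.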
How much memory do parameters require?#
Roughly 2-4 bytes per parameter for storage, depending on numeric precision (2 bytes at 16-bit, 4 bytes at 32-bit), plus additional memory for processing. A 7B-parameter model needs about 14-28 GB just to load, before doing any calculations.
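The arithmetic behind that estimate can be sketched in a few lines. The helper `weight_memory_gb` and the `BYTES_PER_PARAM` table are hypothetical names for illustration; the sketch counts only the weights themselves and assumes 1 GB = 10^9 bytes.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gb(num_params, dtype):
    """Memory to hold the weights alone, in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("fp16", "fp32"):
    # Prints 14.0 GB for fp16 and 28.0 GB for fp32.
    print(dtype, weight_memory_gb(7_000_000_000, dtype), "GB")
```

This covers loading only. Memory used while actually running or training the model comes on top of it.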
Can you modify parameters after training?#
Yes, through techniques like fine-tuning (adjusting parameters for new tasks) or pruning (removing less important parameters). But major changes usually require retraining.
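As an illustration of pruning, the sketch below uses magnitude pruning, one simple criterion that treats the weights with the smallest absolute values as "less important" and sets them to zero. The function name `prune_by_magnitude` and the toy weights are assumptions for this example; real models have many more weights, and other pruning criteria exist.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    num_to_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:num_to_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The three largest weights survive unchanged, while the three smallest are removed from the computation.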
Why don't models just keep getting bigger?#
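Diminishing returns and practical limits. Training costs climb steeply as models grow, because a larger model needs more compute per training token and is usually trained on more tokens, and beyond a certain point, better data and training methods matter more than raw parameter count.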
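For a rough sense of how cost scales, a widely used rule of thumb (an assumption brought in here, not a claim from this FAQ) estimates training compute as about 6 floating-point operations per parameter per training token. The function `training_flops` and the example sizes are hypothetical.

```python
def training_flops(num_params, num_tokens):
    """Approximate training compute using the 6 * N * D rule of thumb."""
    return 6 * num_params * num_tokens


base = training_flops(7_000_000_000, 1_000_000_000_000)
bigger = training_flops(14_000_000_000, 2_000_000_000_000)

print(f"{base:.1e} FLOPs")  # 4.2e+22 FLOPs
print(bigger // base)       # 4
```

Under this estimate, doubling both the parameter count and the amount of training data quadruples the compute bill, which is one reason scaling up is not free.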
What determines how many parameters a model needs?#
The complexity of the task, available training data, and computational budget. Simple tasks might need millions of parameters, while general intelligence might require trillions.