"how numbers are stored and used in computers"
When a model's size is expressed in number of parameters, it refers to the total number of learnable weights in the model — the values that are adjusted during training to minimize error and improve performance. In neural networks, parameters are the weights between neurons and the biases for each neuron. Each parameter is a floating-point number, and together they define the behavior of the model. The more parameters a model has, the more complex the patterns it can potentially learn.
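To make this concrete, here is a minimal sketch, assuming PyTorch, of how a model's parameter count is typically tallied; the small two-layer network is a hypothetical example, not a model from the text:

```python
# A minimal sketch of counting a model's parameters, assuming PyTorch.
# The two-layer network below is a hypothetical example.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128),  # 784*128 weights + 128 biases
    nn.ReLU(),            # activations add no parameters
    nn.Linear(128, 10),   # 128*10 weights + 10 biases
)

# Every learnable weight and bias tensor is exposed via .parameters();
# numel() counts the individual floating-point values in each tensor.
total = sum(p.numel() for p in model.parameters())
print(total)  # 784*128 + 128 + 128*10 + 10 = 101,770
```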
A simple linear regression model with one input and one output has two parameters (a weight and a bias).
A simple linear regression model tries to model the relationship between a single input variable $x$ and a single output variable $y$ as a straight line:

$$\hat{y} = wx + b$$

The model is trained by adjusting the weight $w$ and the bias $b$ to minimize the mean squared error (MSE) over the $N$ training examples:

$$\text{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the true output for the $i$-th example and $\hat{y}_i$ is the model's prediction for it.
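As an illustration, the sketch below fits these two parameters by gradient descent on the MSE. It assumes NumPy; the synthetic data, learning rate, and iteration count are illustrative choices:

```python
# A minimal sketch of fitting y = w*x + b by gradient descent on the MSE,
# assuming NumPy. Data is synthetic: true w = 3.0, true b = 0.5.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0   # the model's two parameters
lr = 0.1          # learning rate (illustrative choice)
for _ in range(500):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of MSE = mean((y_hat - y)^2) with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches 3.0 and 0.5
```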
A multiple linear regression model attempts to capture the relationship between an input vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and an output variable $y$:

$$\hat{y} = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$$

Using vector notation, this might be expressed as:

$$\hat{y} = \mathbf{w}^\top \mathbf{x} + b$$

Just like simple regression, we want to minimize the Mean Squared Error (MSE) across all $N$ training examples. There are $n + 1$ parameters in this model: the $n$ weights $w_1, \ldots, w_n$ and one bias $b$.
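The vector form translates directly into code. The following sketch, assuming NumPy, evaluates $\hat{y} = \mathbf{w}^\top \mathbf{x} + b$ for a small batch of inputs; the dimensions are illustrative:

```python
# A minimal sketch of the vectorized prediction y_hat = w^T x + b,
# assuming NumPy; the dimensions are illustrative.
import numpy as np

n_features = 4
rng = np.random.default_rng(1)
w = rng.normal(size=n_features)   # n weights
b = 0.1                           # one bias -> n + 1 parameters in total

X = rng.normal(size=(8, n_features))  # a batch of 8 input vectors
y_hat = X @ w + b                     # one dot product per example
print(y_hat.shape)  # (8,)
```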
A fully-connected layer of a neural network with $n$ inputs and $m$ outputs generalizes this further: each of the $m$ outputs is its own linear model of all $n$ inputs, so the layer has $m \times n$ weights and $m$ biases, for a total of $m(n + 1)$ parameters.
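As a quick check of that count, the sketch below, assuming PyTorch's nn.Linear, tallies the weights and biases of one such layer; the sizes $n = 256$ and $m = 64$ are illustrative:

```python
# A minimal sketch verifying the m*(n+1) parameter count of a
# fully-connected layer, assuming PyTorch's nn.Linear.
import torch.nn as nn

n, m = 256, 64                    # n inputs, m outputs (illustrative sizes)
layer = nn.Linear(n, m)

weights = layer.weight.numel()    # m * n = 16,384
biases = layer.bias.numel()       # m     = 64
print(weights + biases)           # m * (n + 1) = 16,448
```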
More parameters typically allow a model to represent more complex functions or patterns in data, but they come at a price: higher computational cost for training and inference, higher memory usage, and a greater risk of overfitting if the model is not trained carefully.
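For a sense of the memory cost alone, the back-of-the-envelope sketch below assumes each parameter is stored as a 32-bit float; the 7-billion-parameter figure is an illustrative size, not a specific model:

```python
# A back-of-the-envelope sketch of the memory implied by parameter count,
# assuming each parameter is a 32-bit (4-byte) float. The 7-billion figure
# is an illustrative model size.
params = 7_000_000_000
bytes_per_param = 4               # float32
gib = params * bytes_per_param / 2**30
print(f"{gib:.1f} GiB")           # ~26.1 GiB just to hold the weights
```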