intermediate
35 min
150 XP

Linear Regression

Understand and implement linear regression — the foundation of supervised learning — with gradient descent and scikit-learn

Linear regression is one of the simplest and most interpretable machine-learning algorithms. It assumes a linear relationship between the input features and the target variable.

The Model

ŷ = w₁x₁ + w₂x₂ + ... + wₙxₙ + b

Where:

  • ŷ = predicted value
  • w = weights (coefficients) — learned from data
  • x = input features
  • b = bias (intercept)
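As a minimal sketch of the model equation (the weights, bias, and input values below are made up for illustration), the prediction is just a dot product of weights and features, plus the bias:

```python
import numpy as np

# Hypothetical weights and bias for a 3-feature model
w = np.array([2.0, -1.0, 0.5])   # w1, w2, w3 (learned from data)
b = 4.0                          # intercept

x = np.array([1.0, 2.0, 3.0])    # one input sample

# y_hat = w1*x1 + w2*x2 + ... + wn*xn + b
y_hat = np.dot(w, x) + b
print(y_hat)  # 2*1 + (-1)*2 + 0.5*3 + 4 = 5.5
```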

Cost Function: Mean Squared Error

We want to minimise the difference between predictions and actual values:

MSE = (1/n) × Σ(yᵢ - ŷᵢ)²
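The MSE formula translates directly into code. A small sketch with toy values:

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0])      # actual values
y_hat = np.array([2.5, 5.0, 8.0])  # predictions

# MSE = (1/n) * sum((y_i - y_hat_i)^2)
mse = np.mean((y - y_hat) ** 2)
print(mse)  # (0.25 + 0 + 1) / 3 ≈ 0.4167
```

Squaring the errors penalises large misses more heavily than small ones, which is why a single bad prediction can dominate the MSE.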

How Learning Works: Gradient Descent

  1. Start with random weights
  2. Calculate the error (MSE)
  3. Compute gradient (direction of steepest increase)
  4. Update weights in the opposite direction
  5. Repeat until convergence

w = w - α × ∂MSE/∂w

Where α (alpha) is the learning rate — how big each step is.
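The five steps above can be sketched for simple linear regression (one feature). The data here is synthetic, generated from y = 3x + 2, so we can check that gradient descent recovers those values:

```python
import numpy as np

# Toy data generated from y = 3x + 2 (no noise, for illustration)
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * X + 2.0

w, b = 0.0, 0.0        # step 1: start from arbitrary weights
alpha = 0.05           # learning rate (step size)

for _ in range(2000):  # step 5: repeat until (approximate) convergence
    y_hat = w * X + b
    error = y_hat - y                  # step 2: prediction error
    grad_w = 2 * np.mean(error * X)    # step 3: dMSE/dw
    grad_b = 2 * np.mean(error)        #         dMSE/db
    w -= alpha * grad_w                # step 4: move against the gradient
    b -= alpha * grad_b

print(round(w, 2), round(b, 2))  # 3.0 2.0
```

If α is too large the updates overshoot and diverge; too small and convergence takes far more iterations.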

Assumptions of Linear Regression

  1. Linearity — relationship between X and y is linear
  2. Independence — observations are independent
  3. Homoscedasticity — constant variance of residuals
  4. Normality — residuals are normally distributed
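A rough sketch of checking two of these assumptions on synthetic data (the data-generating process and thresholds here are illustrative, not a formal test): for a fit that includes an intercept, least-squares residuals average to zero, and their spread should look similar across the feature range if homoscedasticity holds.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 200)
y = 2.0 * X + 1.0 + rng.normal(0, 1.0, 200)  # linear signal + Gaussian noise

w, b = np.polyfit(X, y, deg=1)   # ordinary least-squares fit
residuals = y - (w * X + b)

# Residuals center on zero by construction when an intercept is fitted
print(round(residuals.mean(), 3))

# Homoscedasticity check: similar residual spread in low-X vs high-X halves
order = np.argsort(X)
low, high = residuals[order[:100]], residuals[order[100:]]
print(round(low.std(), 2), round(high.std(), 2))
```

For a formal normality check of the residuals, something like `scipy.stats.shapiro` is commonly used.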

Evaluation Metrics

Metric | Formula              | Interpretation
MSE    | mean((y - ŷ)²)       | Average squared error
RMSE   | √MSE                 | Same units as target
MAE    | mean(|y - ŷ|)        | Average absolute error
R²     | 1 - SS_res/SS_tot    | % variance explained (0-1)
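All four metrics are available via scikit-learn (the toy values below are made up to keep the arithmetic checkable by hand):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y = np.array([3.0, 5.0, 7.0, 9.0])      # actual values
y_hat = np.array([2.5, 5.0, 7.5, 9.0])  # predictions

mse = mean_squared_error(y, y_hat)   # 0.125
rmse = np.sqrt(mse)                  # ≈ 0.354
mae = mean_absolute_error(y, y_hat)  # 0.25
r2 = r2_score(y, y_hat)              # 0.975
print(mse, rmse, mae, r2)
```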

Multiple vs Simple Linear Regression

  • Simple: one feature (y = wx + b)
  • Multiple: many features (y = w₁x₁ + w₂x₂ + ... + b)
  • Polynomial: adds x², x³ terms for non-linear patterns
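A sketch of the polynomial case, using scikit-learn's `PolynomialFeatures` on made-up quadratic data: adding an x² column lets an ordinary linear model fit a curve, because the model is still linear in the (expanded) features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Quadratic data: y = x^2 (no noise, for illustration)
X = np.arange(-3, 4).reshape(-1, 1).astype(float)
y = (X ** 2).ravel()

# A straight line cannot capture the curve...
linear = LinearRegression().fit(X, y)

# ...but adding an x^2 feature makes the relationship linear in the new columns
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
poly = LinearRegression().fit(X_poly, y)

print(round(linear.score(X, y), 2))     # ≈ 0 (poor fit)
print(round(poly.score(X_poly, y), 2))  # 1.0 (perfect fit)
```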

Try It Yourself

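A starting point you can run and modify, putting the pieces together with scikit-learn (the data here is synthetic, generated from known weights so you can verify the model recovers them):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic data: y = 4*x1 - 2*x2 + 3 + noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))
y = 4 * X[:, 0] - 2 * X[:, 1] + 3 + rng.normal(0, 0.5, 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print("weights:", model.coef_.round(2))     # ≈ [4.0, -2.0]
print("bias:", round(model.intercept_, 2))  # ≈ 3.0
print("RMSE:", round(np.sqrt(mean_squared_error(y_test, y_pred)), 2))
print("R²:", round(r2_score(y_test, y_pred), 3))
```

Try changing the noise level or the number of samples and watch how the recovered weights and R² respond.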