# Chapter 2: Linear Regression

Building a Linear Regression model is straightforward. However, many tweaks can be added to it, so we will start with the simplest version.

    import numpy as np

    class LinearRegression:

        def __init__(self, fit_intercept=True):
            self.fit_intercept = fit_intercept


The fit_intercept argument defines whether or not we want to use an intercept. If no intercept is used, the data is assumed to already be centred. Mathematically, the intercept is the offset added to a linear combination: it is $$b$$ in $$y = mx + b$$.

Making predictions with Linear Regression is simple. Once the weights are known, we can do a dot product between them and the features of the data points whose target we’d like to predict.

In case there is an intercept, we need to add it to the features. This intercept is simply a feature that is always equal to 1, regardless of the data point.
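Adding this constant feature takes one line with NumPy. The sketch below (not yet part of the class) prepends a column of ones to a toy feature matrix:

```python
import numpy as np

X = np.array([[2.0], [3.0], [5.0]])  # three data points, one feature

# Prepend the always-one intercept feature as a new first column
X_with_intercept = np.hstack([np.ones((X.shape[0], 1)), X])
print(X_with_intercept)
# [[1. 2.]
#  [1. 3.]
#  [1. 5.]]
```

With this layout, the first entry of the weight vector plays the role of the intercept.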

    def predict(self, X, weights=None):

        if self.fit_intercept is True:
            # Prepend the always-one intercept feature
            X = np.hstack([np.ones((X.shape[0], 1)), X])

        if weights is None:
            weights = self.coef_

        return X.dot(weights)
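Even before the model can be fit, predict can be exercised by passing an explicit weight vector. A minimal sketch, restating the class as defined so far:

```python
import numpy as np

class LinearRegression:
    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept

    def predict(self, X, weights=None):
        if self.fit_intercept is True:
            # Prepend the always-one intercept feature
            X = np.hstack([np.ones((X.shape[0], 1)), X])
        if weights is None:
            weights = self.coef_
        return X.dot(weights)

model = LinearRegression(fit_intercept=True)
X = np.array([[1.0], [2.0]])
# weights = [intercept, slope], i.e. y = 0.5 + 2x
preds = model.predict(X, weights=np.array([0.5, 2.0]))
print(preds)  # [2.5 4.5]
```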


Now that we covered what Linear Regression needs to learn, we’ll focus on how. The goal of the model is to find a set of weights which minimizes a loss function. This loss is typically defined as the Mean Squared Error of the model.
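As a quick worked example of the loss itself (independent of the class), here is the halved Mean Squared Error of a small set of predictions. The factor of 0.5 is a common convention that cancels the 2 produced when differentiating the squared term:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

# Halved mean squared error: errors are -0.5, 0.0, 1.0
loss = 0.5 * np.mean((y_true - y_pred) ** 2)
print(loss)  # 0.2083333...
```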

    def _loss_function(self, X, y):

        # Halved mean squared error of the predictions, as a function of the weights
        return lambda weights: 0.5 * np.mean((y - self.predict(X, weights)) ** 2)


Now, we could just try a lot of weights and see which work best. But there is a smarter way: the weights that solve $$y = X \cdot w$$ in the least-squares sense also minimize this loss. Setting the gradient of the loss to zero and rearranging yields the normal equation: $$w = (X^TX)^{-1}X^{T}y$$
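The closed-form solution can be sketched directly in NumPy. Note that in practice we solve the linear system rather than explicitly inverting $$X^TX$$, which is cheaper and numerically safer; the data below is synthetic, made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 points: an intercept column of ones plus 2 random features
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
true_w = np.array([1.0, 2.0, -3.0])
y = X.dot(true_w)

# Normal equation w = (X^T X)^{-1} X^T y, via a linear solve
w = np.linalg.solve(X.T.dot(X), X.T.dot(y))
print(w)  # recovers [1., 2., -3.] up to floating-point error
```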

    def fit(self, X, y):

        if self.fit_intercept is True: