
# Chapter 4: LASSO regularization

One of machine learning's big challenges is avoiding overfitting. An overfitted model captures a function that is more complex than the one underlying the data. This is often due to noise in the data, tricking the model into "thinking" that the problem is more complex than it really is. It also happens when the model has more capacity than it needs.

With linear regression, it is possible to fit polynomials of arbitrarily high degree, but that doesn't mean it is always a good idea. Let's take a look at the example below.

The data comes from a one-dimensional linear function, a simple "line". However, the model fitted a fifth-degree polynomial. While it seems to fit the regular points correctly, it is evident that the model has been pulled off course by the two noisy points.
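To make this concrete, here is a minimal sketch (with made-up data, not the chapter's figure) of a fifth-degree polynomial chasing two outliers in otherwise linear data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: a noisy line with two outliers.
x = np.linspace(0, 1, 10)
y = 2 * x + 1 + rng.normal(0, 0.05, size=x.shape)
y[[3, 7]] += 1.5  # two "noisy" points

# A linear fit versus a fifth-degree polynomial fit.
line_fit = np.polyfit(x, y, deg=1)
poly_fit = np.polyfit(x, y, deg=5)

# The degree-5 fit bends toward the outliers and achieves a lower
# training error, while the linear fit stays close to the true trend.
```

The higher-degree model always matches the training points at least as well, which is exactly why training error alone cannot reveal overfitting.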

One way to deal with this is to use regularization. Regularization is the process of adding constraints on the weights of a model. In practice, this means adding a penalty term to the loss function and to its gradient.

```
def _loss_function(self, X, y):
    # Squared prediction error plus the regularization penalty.
    prediction_loss = lambda weights: 0.5 * (y - self.predict(X, weights)) ** 2
    regularization_loss = lambda weights: self.regularizer(weights)
    return lambda weights: np.mean(prediction_loss(weights) + regularization_loss(weights))

def _loss_gradient(self, X, y):  # method name inferred from context
    features = add_dummy_feature(X) if self.fit_intercept else X
    # Gradient of the prediction loss; the regularizer's (sub)gradient
    # is added to this term in the same way.
    prediction_loss_gradient = lambda weights: (self.predict(X, weights) - y).dot(features) / len(features)
```
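The same pattern can be sketched in standalone form (the names here are illustrative, not the chapter's class): a penalty term is added to both the loss and its gradient, which can then be sanity-checked with finite differences:

```python
import numpy as np

# A standalone sketch: squared loss with an added penalty term,
# and the matching gradient.
def loss(weights, X, y, penalty):
    residual = X.dot(weights) - y
    return 0.5 * np.mean(residual ** 2) + penalty(weights)

def loss_gradient(weights, X, y, penalty_grad):
    residual = X.dot(weights) - y
    return residual.dot(X) / len(X) + penalty_grad(weights)

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
w = rng.normal(size=3)

# A smooth L2 penalty keeps the gradient well-defined everywhere.
penalty = lambda weights: 0.01 * np.sum(weights ** 2)
penalty_grad = lambda weights: 0.02 * weights

g = loss_gradient(w, X, y, penalty_grad)
```

With a smooth penalty such as squared weights, the analytic gradient matches a finite-difference estimate; LASSO's absolute-value penalty is not differentiable at zero, so solvers use a subgradient there instead.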

The ultimate goal is to "discourage" a model from becoming too complex. One way to define this constraint is LASSO, which stands for Least Absolute Shrinkage and Selection Operator. It regularizes the model by adding the sum of the absolute values of the weights to the loss function.
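The "shrinkage and selection" in the name can be seen in the soft-thresholding operator that coordinate-descent LASSO solvers apply to each weight; this is a sketch of that effect, not the chapter's implementation:

```python
import numpy as np

def soft_threshold(theta, lam):
    # Proximal step for lam * sum(|theta|): weights with magnitude below
    # lam become exactly zero; larger ones shrink toward zero by lam.
    return np.sign(theta) * np.maximum(np.abs(theta) - lam, 0.0)

weights = np.array([0.05, -0.8, 1.5, -0.02])
shrunk = soft_threshold(weights, 0.1)  # the two small weights are zeroed
```

This is why LASSO performs feature selection: sufficiently small weights are driven exactly to zero rather than merely made smaller.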

```
class LASSO:

    def __init__(self, _lambda):
        self._lambda = _lambda

    def __call__(self, theta):
        # L1 penalty: lambda times the sum of the absolute weights.
        return self._lambda * np.sum(np.abs(theta))
```