Machine Learning from Skratch

Learn how machine learning models work by building them from scratch!

Everything you need to know to get started with Machine Learning

Skratch is for anyone who is interested in machine learning, and especially for those who like to build things themselves. Skratch tries to fill the gap between “too technical” and “too high-level”. This means that we won’t delve deeply into statistics; where necessary, we will link to other resources. On the other hand, we do not want to stay too high-level, and we believe that showing each step of an algorithm through code strikes that balance.

Introduction to Machine Learning

Learn the difference between Machine Learning and traditional computer science

K-Nearest Neighbours

Learn about one of the simplest and most elegant models for classification and regression.

Linear Regression

Learn about Linear Regression, an important algorithm that introduces concepts used by neural networks.

Naïve Bayes Classifiers

Learn about one of the most popular models, often used in text mining.

Model Selection

Learn how to evaluate models for different tasks, as well as guiding principles for building models.

K-Means

Learn about clustering by learning about K-Means, an intuitive algorithm to group data.

Logistic Regression

Learn about Logistic Regression, an important classification algorithm related to Linear Regression.

Mean Shift

Learn about Mean Shift, a clustering algorithm which finds applications in image segmentation.

Who am I?

My name is Valentin and I originally come from Belgium. I lived in the United States for a bit and then I went to Maastricht University to study data science. I now work as a data scientist, which means that I am able to fully embrace my passion for machine learning on a daily basis.

I am passionate about teaching and really believe that you cannot master a topic until you are able to explain it to others. This is one of the many reasons I decided to start this website. On the one hand, I wanted to spread good resources about machine learning, and on the other, I wanted to further my own education on the topic.

This website is and will remain a work in progress. I welcome any criticism, suggestions, and even contributions to the project with open arms.

Learn through code

Skratch offers implementations of many machine learning models in Python.

Even though the code is written in Python, the implementations are not language-specific. Python was chosen because of readability and its popularity in the machine learning community.

The code can be found on GitHub.

Step-by-step explanations

Each piece of code that we write comes with step-by-step explanations of what it does.

Everyone writes code differently, and reading someone else’s code can sometimes be daunting. That is why we made sure to create learning units around each piece of code so that you don’t have to decipher the code alone.

Pick the format that suits you!

Everyone likes to learn differently. This is why we offer our content in multiple formats!

On top of the Python code, we also created learning units that you can consult directly in your browser. Each learning unit is also available as a Jupyter Notebook so that you can run the code as you go. Videos going through the material are available as well, and certain content is also offered as blog posts for a more casual read.

“To truly understand something, you must first break it down into atomic parts”

Skratch’s philosophy is that to truly understand how something works, you must be able to explain every detail of it. This is why we decided to implement machine learning algorithms from scratch and to use the code as a learning tool.

We often don’t realize how many things we don’t fully understand about a topic until we have to explain it ourselves. Implementing algorithms from scratch is a great way to identify gaps in understanding and then fix them.

FAQ

Skratch is and will always be a work in progress. Suggestions, remarks, criticism, and even contributions that help improve it are more than welcome!

Can I reuse the code?

Yes, the project is 100% open source. Feel free to use, modify, or even contribute to the codebase. Do remember, though, that the code was not written to be robust or fast.

How did you create the images/GIFs?

I used Matplotlib, a popular Python plotting library. If you click on an image or GIF, you will be taken to the Python file used to generate it! The code can also be found directly on GitHub.

How do I get in touch with you if I have questions?

You can make use of the contact page, send an e-mail to skratch@valentincalomme.com, or even find me on LinkedIn!

How do I know the implementations are correct?

Skratch aims to be as transparent as possible. This is why we wrote tests for the various machine learning models, ensuring that they perform similarly to their sklearn counterparts.
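To give a rough idea of what such a test can look like, here is a minimal sketch. It is not taken from the Skratch codebase: the function names are hypothetical, and a closed-form least-squares solution stands in for the sklearn model so that the example runs with only the Python standard library.

```python
import math


def fit_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=5000):
    """Fit y = slope * x + intercept by batch gradient descent on the mean squared error.

    Hypothetical from-scratch model, used here only for illustration.
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        # Prediction errors for the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


def fit_closed_form(xs, ys):
    """Reference solution: ordinary least squares in closed form.

    Assumption: this plays the role that an sklearn model plays in the real tests.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def test_gradient_descent_matches_reference():
    # Small, made-up dataset that roughly follows y = 2x + 1.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1]
    slope, intercept = fit_gradient_descent(xs, ys)
    ref_slope, ref_intercept = fit_closed_form(xs, ys)
    # The two models should agree up to a small numerical tolerance.
    assert math.isclose(slope, ref_slope, abs_tol=1e-6)
    assert math.isclose(intercept, ref_intercept, abs_tol=1e-6)


if __name__ == "__main__":
    test_gradient_descent_matches_reference()
```

The idea is the same whatever the reference is: fit both models on the same data and check that their results agree within a tolerance, rather than expecting them to be exactly equal.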

I don’t know Python. Is that a problem?

No! Even though the code was written in Python, it is not language-specific. You might need a basic understanding of Python syntax, but that’s it!

How much statistics do I need to know?

Statistics is a vital part of machine learning, and I can’t recommend learning it enough. However, this course does not focus on statistics, so basic knowledge should be plenty!