1 Introduction

Another big chapter in the area of supervised machine learning comes to an end. Over the past four months I have written in detail about how the most common classification algorithms in data science work and how to use them.

As in my post “Roadmap for Regression Analysis”, I will again give an overview of how to handle classification tasks.

2 Roadmap for Classification Tasks

2.1 Data pre-processing

Here are the links to the individual topics:

2.2 Feature Selection Methods

Here are the links to the individual topics:

Filter methods:

Wrapper methods:

2.3 Algorithms

2.3.1 Classification Algorithms

Here are the links to the individual topics:

Notes on the special classifiers:

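As described in the SGD Classifier post, the SGD Classifier is not a classification algorithm in its own right. It is a linear classifier (for example a linear SVM or a logistic regression) whose parameters are optimized with Stochastic Gradient Descent.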

With the One-vs-One and One-vs-Rest methods it is possible to extend binary classifiers to multi-class problems.
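The difference between the two strategies can be sketched with scikit-learn's `OneVsRestClassifier` and `OneVsOneClassifier`. The four-class synthetic dataset and the choice of `LogisticRegression` as the binary base model are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic data with 4 classes (illustrative assumption)
X, y = make_classification(
    n_samples=400,
    n_features=8,
    n_informative=4,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# One-vs-Rest: one binary classifier per class (class k vs. all others)
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# One-vs-One: one binary classifier per pair of classes
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

n_ovr = len(ovr.estimators_)  # 4 classes -> 4 classifiers
n_ovo = len(ovo.estimators_)  # 4 * 3 / 2 -> 6 classifiers
print("One-vs-Rest classifiers:", n_ovr)
print("One-vs-One classifiers:", n_ovo)
```

With K classes, One-vs-Rest trains K binary models, while One-vs-One trains K(K-1)/2 models, each on the subset of data belonging to its two classes.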

Notes on ensemble methods:

Depending on the underlying problem with the predictions, I choose one of the following ensemble methods:

Bagging: Decrease Variance
Boosting: Decrease Bias
Stacking: Improve Predictions (by combining the outputs of several different models)
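The three ensemble families can be tried side by side in a few lines. This is only a sketch under stated assumptions: the synthetic dataset, the base models and all hyperparameters are illustrative choices, and `GradientBoostingClassifier` stands in here as one example of a boosting method.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (illustrative assumption)
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

ensembles = {
    # Bagging: average many deep (high-variance) trees fit on bootstrap samples
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(), n_estimators=50, random_state=0
    ),
    # Boosting: add shallow (high-bias) trees sequentially, each one
    # correcting the errors of the ensemble built so far
    "boosting": GradientBoostingClassifier(
        n_estimators=100, max_depth=1, random_state=0
    ),
    # Stacking: a final model learns how to combine different base models
    "stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(),
    ),
}

results = {}
for name, model in ensembles.items():
    # Mean accuracy over 5-fold cross-validation
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {results[name]:.3f}")
```

Which of the three performs best depends on the data. The scores are meant as a starting point for comparison, not as a ranking of the methods.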

2.3.2 Classification with Neural Networks

Of course, in addition to traditional classification algorithms, neural networks can be used to solve classification problems.
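As a small illustration, a feed-forward network can be fit with scikit-learn's `MLPClassifier`. This is an assumed stand-in chosen to keep the sketch short. The linked publications may use a different library, and the dataset and network size below are illustrative only.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Two interleaving half circles: not linearly separable (illustrative data)
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer with 32 units; inputs are standardized first
net = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)
net.fit(X_train, y_train)

accuracy = net.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```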

Here again are the links to the respective publications:

2.3.3 AutoML

The use of automated machine learning libraries is becoming increasingly popular. Here is a guide on how classification problems can be solved with PyCaret:

AutoML using PyCaret - Classification

3 Conclusion

The methods and algorithms listed in the overviews above are described in detail, in both theory and practical application, in the respective posts. Just click on the respective link.