1 Introduction
Another big chapter in supervised machine learning comes to an end. Over the past four months I have written in detail about how the most common classification algorithms in data science work and how to use them.
Analogous to my post “Roadmap for Regression Analysis”, I will again give an overview of how to handle classification tasks.
2 Roadmap for Classification Tasks
2.1 Data pre-processing
Here are the links to the individual topics:
2.2 Feature Selection Methods
Here are the links to the individual topics:
Filter methods:
- Dealing with highly correlated features
- Dealing with constant features
- Dealing with duplicate features
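The three filter methods listed above can be sketched in a few lines of pandas. This is only a minimal illustration, not the code from the linked posts: the helper names, the toy DataFrame and the correlation threshold of 0.9 are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def drop_constant_features(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns that contain only a single unique value."""
    constant = [col for col in df.columns if df[col].nunique(dropna=False) <= 1]
    return df.drop(columns=constant)


def drop_duplicate_features(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose values repeat an earlier column exactly."""
    duplicated = df.T.duplicated()  # True for every repeat after the first
    return df.loc[:, ~duplicated.to_numpy()]


def drop_correlated_features(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of each pair whose absolute correlation exceeds the threshold."""
    corr = df.corr().abs()
    # Keep only the upper triangle so that each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Hypothetical toy data, made up for this sketch.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.0, 2.0, 3.0, 4.0, 5.0],  # exact duplicate of "a"
    "c": [7.0, 7.0, 7.0, 7.0, 7.0],  # constant
    "d": [2.1, 3.9, 6.2, 8.0, 9.9],  # almost perfectly correlated with "a"
    "e": [5.0, 1.0, 4.0, 2.0, 3.0],  # only weakly related to "a"
})

reduced = drop_correlated_features(drop_duplicate_features(drop_constant_features(toy)))
selected = list(reduced.columns)
print(selected)
```

On real data the threshold is a judgment call, and which column of a correlated pair gets dropped depends on the column order.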
Wrapper methods:
2.3 Algorithms
2.3.1 Classification Algorithms
Here are the links to the individual topics:
- Logistic Regression
- Support Vector Machines
- Perceptron
- SGD Classifier
- OvO and OvR Classifier
- Softmax Regression
- Decision Trees
- Naive Bayes Classifier
- K Nearest Neighbor Classifier
- Bagging
- Boosting
- XGBoost
- Stacking
- Stacking with scikit-learn
- Voting
Notes on the special classifiers:
As described in the SGD Classifier post, the SGD Classifier is not a classification algorithm in its own right. It is a linear classifier whose parameters are optimized with stochastic gradient descent.
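To make this concrete, here is a small sketch with scikit-learn's `SGDClassifier`. The same estimator class yields different linear models depending on the `loss` parameter. The synthetic dataset and all parameter values are assumptions picked for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (made up for this sketch).
X, y = make_classification(n_samples=500, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# loss="hinge" trains a linear SVM, loss="perceptron" uses the perceptron loss.
# In both cases the weights are fitted with stochastic gradient descent.
# SGD is sensitive to feature scaling, hence the StandardScaler in front.
svm_sgd = make_pipeline(StandardScaler(), SGDClassifier(loss="hinge", random_state=0))
perceptron_sgd = make_pipeline(StandardScaler(), SGDClassifier(loss="perceptron", random_state=0))

svm_sgd.fit(X_train, y_train)
perceptron_sgd.fit(X_train, y_train)

svm_accuracy = svm_sgd.score(X_test, y_test)
perceptron_accuracy = perceptron_sgd.score(X_test, y_test)
print(f"hinge: {svm_accuracy:.3f}, perceptron: {perceptron_accuracy:.3f}")
```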
With the One-vs-One and One-vs-Rest methods, binary classifiers can be extended to multi-class problems.
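A minimal sketch of both strategies with scikit-learn's `OneVsRestClassifier` and `OneVsOneClassifier` could look like this. The four-class synthetic dataset and the choice of logistic regression as the binary base classifier are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic data with four classes (made up for this sketch).
X, y = make_classification(
    n_samples=400,
    n_features=8,
    n_informative=4,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# One-vs-Rest: one binary classifier per class (class k against all others).
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# One-vs-One: one binary classifier per pair of classes.
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

n_ovr_models = len(ovr.estimators_)  # number of classes
n_ovo_models = len(ovo.estimators_)  # n_classes * (n_classes - 1) / 2
print(n_ovr_models, n_ovo_models)
```

With four classes, One-vs-Rest fits four binary models and One-vs-One fits six, so One-vs-One gets more expensive as the number of classes grows, while each of its models sees only the data of two classes.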
Notes on ensemble methods:
Depending on the underlying problem with the predictions, I choose one of the following ensemble methods:
- Bagging: Decrease Variance
- Boosting: Decrease Bias
- Stacking: Improve Predictions
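As a rough sketch, the three ensemble types can be set up side by side in scikit-learn. The base learners, hyperparameters and synthetic data below are assumptions for illustration, and the resulting scores say nothing about which method is better in general.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (made up for this sketch).
X, y = make_classification(n_samples=400, n_features=10, n_informative=5, random_state=0)

models = {
    # Bagging: average many fully grown (high-variance) trees.
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=0),
    # Boosting: add shallow (high-bias) trees one after another.
    "boosting": GradientBoostingClassifier(n_estimators=50, max_depth=1, random_state=0),
    # Stacking: a meta-model learns how to combine different base models.
    "stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
            ("knn", KNeighborsClassifier()),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
    ),
}

# Mean accuracy over 5-fold cross-validation for each ensemble.
scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in models.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```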
2.3.2 Classification with Neural Networks
Of course, in addition to traditional classification algorithms, neural networks can be used to solve classification problems.
Here again are the links to the respective publications:
2.3.3 AutoML
The use of automated machine learning libraries is becoming increasingly popular. Here is a guide on how classification problems can be solved with PyCaret:
3 Conclusion
The methods and algorithms shown in the overviews are described in detail in the respective publications with regard to theory and practical application. Just click on the respective link.