1 Introduction
We have seen how to train and use different types of machine learning models. But how do we proceed once we have developed and trained a model with the desired performance? Because training large machine learning models can take many hours, it is a good idea to save your trained models regularly so that you can load them again later instead of retraining.
This post uses the Iris dataset from the data science platform “Kaggle”. You can download it from my “GitHub Repository”.
2 Loading the libraries and the data
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
# joblib is used to save and load our trained model
import joblib
iris = pd.read_csv("Iris_Data.csv")
iris = iris[['sepal_length', 'sepal_width', 'species']]
iris
3 Visualization of the data
g = sns.pairplot(iris, hue='species', markers='+')
plt.show()
4 Model training
x = iris.drop('species', axis=1)
y = iris['species']
trainX, testX, trainY, testY = train_test_split(x, y, test_size = 0.2)
clf = SVC(kernel='linear')
clf.fit(trainX, trainY)
y_pred = clf.predict(testX)
print('Accuracy: {:.2f}'.format(accuracy_score(testY, y_pred)))
print('Error rate: {:.2f}'.format(1 - accuracy_score(testY, y_pred)))
5 Save a model to disk
We have now trained our model (here a linear SVM). It’s time to save the trained model to disk.
# save the model to disk
filename = 'final_svm_model.sav'
joblib.dump(clf, filename)
The value of filename can be any path, so you can store the model in a different location by passing a relative or absolute path instead of a bare file name.
6 Load a model from disk
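As a minimal sketch of saving to a different location: the snippet below writes the model into a subfolder instead of the working directory. The folder name "models" and the use of a temporary directory are assumptions made for illustration, and the model is a stand-in trained on scikit-learn's bundled copy of Iris so that the example runs without Iris_Data.csv.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Stand-in for the model trained above: a linear SVM on the first two
# features (sepal length and sepal width) of scikit-learn's bundled Iris data.
X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear").fit(X[:, :2], y)

# Hypothetical target folder; a temporary directory keeps the sketch tidy.
model_dir = Path(tempfile.mkdtemp()) / "models"
model_dir.mkdir(parents=True, exist_ok=True)  # joblib does not create folders

model_path = model_dir / "final_svm_model.sav"
written = joblib.dump(clf, model_path)  # returns a list of the file names written
print(written)
```

Creating the folder first matters because joblib.dump expects the target directory to exist already.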
The trained model can be reloaded at any later time.
# load the model from disk
filename = 'final_svm_model.sav'
loaded_model = joblib.load(filename)
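Let’s test the loaded model. Earlier we trained it to distinguish between three species of flowers. Let’s see which species it predicts for a flower whose sepal_length and sepal_width are both 4.0.
# use the same column names as during training to avoid a feature-name warning
my_flower = pd.DataFrame([[4.0, 4.0]], columns=['sepal_length', 'sepal_width'])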
my_pred = loaded_model.predict(my_flower)
my_pred
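To check that nothing was lost on the way to disk and back, you can compare the predictions of the original and the reloaded model. The following is a sketch under stated assumptions: it uses scikit-learn's bundled Iris data rather than Iris_Data.csv, and writes to a temporary folder so that no files are left behind.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Stand-in data: first two features (sepal length and sepal width) only.
X, y = load_iris(return_X_y=True)
X = X[:, :2]
clf = SVC(kernel="linear").fit(X, y)

# Save and reload inside a temporary folder that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "final_svm_model.sav"
    joblib.dump(clf, path)
    loaded_model = joblib.load(path)

# The reloaded model is a separate object that should predict identically.
same_predictions = np.array_equal(clf.predict(X), loaded_model.predict(X))
print("Predictions identical after reload:", same_predictions)
```

A check like this is cheap and worth running once after saving, before you rely on the stored file.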
7 Conclusion
A Jupyter notebook, or whatever IDE you like to use, can crash and take the trained model in memory with it. It is therefore advisable to save your trained model regularly, so that you can reload it instead of training it again.