Okay, let’s say you have completed your machine learning project and it’s deployed and working well for you. But after some time you run into a situation where you need that machine learning model again in some other project. Or let’s assume you want to compare two machine learning models, measure their performance and use the better one in your project. In both cases, you need to save your machine learning model somewhere and use it later. In this post, we will save a machine learning model in Python with Scikit Learn.
In Python, we have two libraries to get our job done, pickle and Joblib. I have used both and both of them work fine; you can use either of them.
- Saving a machine learning model using pickle or Joblib is called serialization.
- Loading a saved machine learning model back into memory is called deserialization.
Here we will use a logistic regression model and save it using both pickle and Joblib. In this post, we won’t talk about logistic regression itself; I have written a separate article on that topic.
Saving Machine learning model using Joblib
The main difference between Joblib and pickle is that Joblib is more efficient on objects that carry large NumPy arrays, because it handles NumPy array buffers very well. So I would suggest using it when your model contains large NumPy arrays.
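To make the NumPy point concrete, here is a minimal sketch, assuming Joblib and NumPy are installed. The `payload` dictionary, the file name `weights.joblib` and the compression level are illustrative assumptions and not part of the project in this post.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load

# Hypothetical object holding a large NumPy array, standing in for a fitted model.
payload = {"name": "demo", "weights": np.arange(1_000_000, dtype=np.float64)}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.joblib")
    # compress is optional; higher levels give smaller files but slower writes.
    dump(payload, path, compress=3)
    restored = load(path)

# The array read back from disk should be identical to the one we saved.
arrays_match = np.array_equal(payload["weights"], restored["weights"])
print(arrays_match)
```

Leaving out `compress` writes the file uncompressed, which is usually faster but takes more disk space.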
In the source code below, we save our logistic regression model using the Joblib library. Later, the saved model can be loaded to predict results on new data.
```python
# -*- coding: utf-8 -*-
"""
Created on Sat Feb 21 23:10:15 2018
Saving a machine learning Model in python with Scikit Learn
@author: SHASHANK
"""

# Importing the libraries
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from joblib import dump, load
import numpy as np


class Model:
    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('supermall.csv')
        self.X = dataset.iloc[:, [2, 3]].values
        self.Y = dataset.iloc[:, 4].values

    def saveModel(self):
        self.importData()
        # Chaining the scaler and the classifier so that the saved file
        # applies the same scaling to new inputs at prediction time
        classifier = make_pipeline(StandardScaler(),
                                   LogisticRegression(random_state=0))
        # Fitting the Logistic Regression to the Training set
        classifier.fit(self.X, self.Y)
        # Saving Model into a File
        dump(classifier, 'salaryPredictor.joblib')
        print(classifier)

    def isBuying(self):
        userAge = float(input("Enter the user's age? "))
        userSalary = float(input("What is the salary of user? "))
        # Loading the model from the file
        logisticRegression = load('salaryPredictor.joblib')
        # Predicting the results (the loaded pipeline scales the input itself)
        prediction = logisticRegression.predict(np.array([[userAge, userSalary]]))
        print('This user is most likely to buy the product'
              if prediction[0] == 1
              else 'This user is not gonna buy your product.')


model = Model()
model.saveModel()
# Call it later whenever you want
model.isBuying()
```
Explanation:
In the above code, there are three lines worth talking about.
- First, we import the `dump` and `load` functions from the `joblib` library.
- After that, in `saveModel`, we use `dump` to save the machine learning model by passing two parameters to it.
- The first parameter is the machine learning model object and the second parameter is the name of the file in which you want to store the model.
- In `isBuying`, we load the saved model from the `.joblib` file using the `load` function.
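If you don't have `supermall.csv` at hand, the same `dump` and `load` round trip can be tried on made-up data. This is only a sketch, assuming scikit-learn, Joblib and NumPy are installed; the synthetic features and the file name `demo_model.joblib` are assumptions for illustration and not the dataset used in this post.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the two feature columns (not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

classifier = LogisticRegression(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "demo_model.joblib")
    dump(classifier, model_path)   # serialization
    restored = load(model_path)    # deserialization

# The restored model should predict exactly what the original predicts.
same_predictions = np.array_equal(classifier.predict(X), restored.predict(X))
print(same_predictions)
```

Comparing predictions before and after loading is a quick sanity check that the saved file really holds the model you trained.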
Saving Machine learning model using Pickle
Pickle is one of the most used libraries for serializing and deserializing Python objects. Here a Python object can be almost anything; for example, pickle can serialize and deserialize a Python list or a Python dictionary.
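As a quick illustration of that idea, here is a small sketch that round-trips a dictionary containing a list through pickle in memory. The `settings` dictionary and its values are made up for this example.

```python
import pickle

# Illustrative data; the keys and values are made up for this example.
settings = {"features": ["age", "salary"], "threshold": 0.5}

# dumps() returns the serialized object as bytes; loads() rebuilds it.
serialized = pickle.dumps(settings)
restored = pickle.loads(serialized)

print(type(serialized).__name__)  # bytes
print(restored == settings)       # True
print(restored is settings)       # False, it is a new but equal object
```

`pickle.dump` and `pickle.load` do the same thing but write to and read from a file object instead of a bytes value.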
In the code below, we will see how to use pickle to save and load machine learning models. Again, we will use logistic regression to demonstrate this.
```python
# -*- coding: utf-8 -*-
"""
Created on Sat Feb 21 23:10:15 2018
Saving a machine learning Model in python with Scikit Learn
@author: SHASHANK
"""

# Importing the libraries
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import pickle
import numpy as np


class Model:
    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('supermall.csv')
        self.X = dataset.iloc[:, [2, 3]].values
        self.Y = dataset.iloc[:, 4].values

    def saveModel(self):
        self.importData()
        # Chaining the scaler and the classifier so that the saved file
        # applies the same scaling to new inputs at prediction time
        classifier = make_pipeline(StandardScaler(),
                                   LogisticRegression(random_state=0))
        # Fitting the Logistic Regression to the Training set
        classifier.fit(self.X, self.Y)
        # Saving Model into a File (the with block closes the file for us)
        with open('pickle.sav', 'wb') as modelFile:
            pickle.dump(classifier, modelFile)

    def isBuying(self):
        userAge = float(input("Enter the user's age? "))
        userSalary = float(input("What is the salary of user? "))
        # Loading the model from the file
        with open('pickle.sav', 'rb') as modelFile:
            logisticRegression = pickle.load(modelFile)
        # Predicting the results (the loaded pipeline scales the input itself)
        prediction = logisticRegression.predict(np.array([[userAge, userSalary]]))
        print('This user is most likely to buy the product'
              if prediction[0] == 1
              else 'This user is not gonna buy your product.')


model = Model()
model.saveModel()
# Call it later whenever you want
model.isBuying()
```
Explanation:
In the above code, there are three lines worth talking about.
- First, we import the `pickle` library.
- After that, in `saveModel`, we use `pickle.dump` to save the machine learning model by passing two parameters to it.
- The first parameter is the machine learning model object and the second parameter is the file `pickle.sav`, opened in binary write mode (`'wb'`), into which the model is written.
- In `isBuying`, we load the saved model from the `pickle.sav` file using the `pickle.load()` method.
- For this, the `pickle.sav` file is opened in binary read mode (`'rb'`) so that the pickle library can deserialize the machine learning object.
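The write mode and read mode steps can be tried on their own with nothing but the standard library. This is a minimal sketch in which the `model_state` dictionary is a hypothetical stand-in for a fitted model, and the file lives in a temporary directory so nothing is left behind.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: plain Python data is enough
# to show the file handling.
model_state = {"coefficients": [0.8, 1.2], "intercept": -0.5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pickle.sav")

    # 'wb' opens the file for writing bytes; it is closed when the block ends.
    with open(path, "wb") as model_file:
        pickle.dump(model_state, model_file)

    # 'rb' opens the same file for reading bytes.
    with open(path, "rb") as model_file:
        restored_state = pickle.load(model_file)

print(restored_state == model_state)  # True
```

Using `with open(...)` instead of a bare `open(...)` makes sure the file is flushed and closed even if saving or loading raises an error.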
Conclusion
In this post, we studied how to save a machine learning model in Python using the Joblib and pickle libraries. Again, I would suggest using the Joblib library for models that hold large NumPy arrays; for other small models you can use the pickle library.
Do share how you would like to use these two awesome libraries in the comment section below. If you want to read more machine learning articles, subscribe to the Newsletter and I’ll see you in the next post.