Saving a Machine Learning Model in Python with Scikit-Learn

Okay, let’s say you have completed your machine learning project and it’s deployed and working well for you. After some time, you run into a situation where you need that machine learning model again in some other project. Or let’s assume you want to compare two machine learning models, measure their performance and pick one for your project. In both cases, you need to save your machine learning model somewhere and use it later. In this post, we will save a machine learning model in Python with Scikit-Learn.

In Python, we have two libraries to get the job done: pickle and Joblib. I have used both and both of them work fine, so you can use either.

Saving a machine learning model using pickle or Joblib is called serialization.
Loading a saved machine learning model back is called deserialization.

Here we will use a logistic regression model and save it using both pickle and Joblib. We won’t talk about logistic regression itself in this post; I have written a separate article on that topic.

Saving Machine learning model using Joblib

The main difference between Joblib and pickle is that Joblib works faster on objects that carry large NumPy arrays, because it handles the array buffers efficiently. So I would suggest using it when your model contains large or nested NumPy arrays.
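
To make this concrete, here is a minimal sketch of saving and restoring an object that holds nested NumPy arrays with Joblib. The `weights` dictionary, the file name and the `compress=3` setting are assumptions chosen for illustration; they are not part of the example used in the rest of this post.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load

# Hypothetical "model-like" object: a dict holding nested NumPy arrays
weights = {
    "layer1": np.random.rand(500, 200),
    "layer2": [np.random.rand(200, 50), np.arange(50)],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.joblib")

    # compress is optional; it trades some speed for a smaller file
    dump(weights, path, compress=3)
    restored = load(path)

# Check that every array came back unchanged
arrays_match = (
    np.array_equal(weights["layer1"], restored["layer1"])
    and np.array_equal(weights["layer2"][0], restored["layer2"][0])
    and np.array_equal(weights["layer2"][1], restored["layer2"][1])
)
print("Arrays restored intact:", arrays_match)
```

The same `dump` and `load` calls work on a fitted Scikit-Learn estimator, since its learned parameters are stored as NumPy arrays.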

In the source code below, we save our logistic regression model using the Joblib library. Later it can be loaded to predict results on your datasets.

# -*- coding: utf-8 -*-
"""
Created on Sat Feb 21 23:10:15 2018
Saving a machine learning Model in python with Scikit Learn
@author: SHASHANK
"""

# Importing the libraries
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from joblib import dump, load
import numpy as np


class Model:
    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('supermall.csv')
        self.X = dataset.iloc[:, [2, 3]].values
        self.Y = dataset.iloc[:, 4].values

    def saveModel(self):
        self.importData()

        # Chain the scaler and the classifier so the saved file also
        # contains the scaling fitted on the training data
        classifier = make_pipeline(StandardScaler(),
                                   LogisticRegression(random_state=0))

        # Fitting the logistic regression pipeline to the training set
        classifier.fit(self.X, self.Y)

        # Saving the model into a file
        dump(classifier, 'salaryPredictor.joblib')
        print(classifier)

    def isBuying(self):
        userAge = float(input("Enter the user's age? "))
        userSalary = float(input("What is the salary of user? "))

        # Loading the model from the file
        logisticRegression = load('salaryPredictor.joblib')

        # Predicting the result; the pipeline scales the raw input first
        prediction = logisticRegression.predict(np.array([[userAge, userSalary]]))

        print('This user is most likely to buy the product' if prediction[0] == 1 else 'This user is not gonna buy your product.')

model = Model()
model.saveModel()



# Call it later whenever you want
model.isBuying()

Explanation:

In the above code, there are three lines worth discussing.

First, we import the dump and load functions from the Joblib library.
After that, inside saveModel we call dump to save the machine learning model, passing it two parameters.
The first parameter is the machine learning model object and the second parameter is the name of the file in which you want to store the model.
Finally, inside isBuying we call load to read the saved model back from the .joblib file.
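
If you want to try the dump and load round trip without the supermall.csv file, the following sketch uses a synthetic dataset instead. The generated data, the temporary directory and the file name `demo_model.joblib` are assumptions made for this illustration only.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real dataset (an assumption for this sketch)
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

classifier = LogisticRegression(random_state=0)
classifier.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "demo_model.joblib")

    # Serialize the fitted model, then read it back from disk
    dump(classifier, model_path)
    restored = load(model_path)

# The loaded model should give exactly the same predictions
same_predictions = np.array_equal(classifier.predict(X), restored.predict(X))
print("Loaded model reproduces the original predictions:", same_predictions)
```

Comparing predictions before and after loading is a quick sanity check that the file really contains the fitted model.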

Saving Machine learning model using Pickle

pickle is one of the most widely used libraries for serializing and deserializing Python objects. It handles most Python objects; for example, pickle can serialize and deserialize a Python list or a Python dictionary.
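
As a quick illustration before moving to models, here is a small sketch that pickles a list and a dictionary into one file and reads them back. The variable names and the file name `objects.pkl` are made up for this example.

```python
import os
import pickle
import tempfile

# Hypothetical sample objects
shopping_list = ["milk", "eggs", "bread"]
user_profile = {"age": 32, "salary": 54000.0, "tags": ["new", "mobile"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "objects.pkl")

    # Serialization: write both objects to the file in binary mode
    with open(file_path, "wb") as f:
        pickle.dump(shopping_list, f)
        pickle.dump(user_profile, f)

    # Deserialization: objects come back in the order they were written
    with open(file_path, "rb") as f:
        restored_list = pickle.load(f)
        restored_profile = pickle.load(f)

print("List restored:", restored_list == shopping_list)
print("Dictionary restored:", restored_profile == user_profile)
```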

In the code below, we will see how to use pickle to save and load machine learning models. Again, we will use logistic regression for the demonstration.

# -*- coding: utf-8 -*-
"""
Created on Sat Feb 21 23:10:15 2018
Saving a machine learning Model in python with Scikit Learn
@author: SHASHANK
"""

# Importing the libraries
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import pickle
import numpy as np


class Model:
    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('supermall.csv')
        self.X = dataset.iloc[:, [2, 3]].values
        self.Y = dataset.iloc[:, 4].values

    def saveModel(self):
        self.importData()

        # Chain the scaler and the classifier so the saved file also
        # contains the scaling fitted on the training data
        classifier = make_pipeline(StandardScaler(),
                                   LogisticRegression(random_state=0))

        # Fitting the logistic regression pipeline to the training set
        classifier.fit(self.X, self.Y)

        # Saving the model into a file; "with" closes the file afterwards
        with open('pickle.sav', 'wb') as modelFile:
            pickle.dump(classifier, modelFile)

    def isBuying(self):
        userAge = float(input("Enter the user's age? "))
        userSalary = float(input("What is the salary of user? "))

        # Loading the model from the file
        with open('pickle.sav', 'rb') as modelFile:
            logisticRegression = pickle.load(modelFile)

        # Predicting the result; the pipeline scales the raw input first
        prediction = logisticRegression.predict(np.array([[userAge, userSalary]]))

        print('This user is most likely to buy the product' if prediction[0] == 1 else 'This user is not gonna buy your product.')

model = Model()
model.saveModel()



# Call it later whenever you want
model.isBuying()

Explanation:

In the above code, there are three lines worth discussing.

First, we import the pickle library.
After that, inside saveModel we call pickle.dump to save the machine learning model, passing it two parameters.
The first parameter is the machine learning model object and the second parameter is the file pickle.sav, opened in binary write mode, into which the model is saved.
Finally, inside isBuying we read the saved model back from the pickle.sav file using the pickle.load() method.
Here the pickle.sav file is opened in binary read mode so that the pickle library can deserialize the machine learning object.
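
One of the use cases mentioned at the start of this post is comparing two saved models. The sketch below shows one way that could look with pickle. The synthetic dataset, the choice of a K-NN classifier as the second model and the file names are all assumptions for illustration.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data standing in for a real dataset (an assumption)
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Two hypothetical candidate models to compare
candidates = {
    "logistic_regression": LogisticRegression(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    # Train each model and save it to its own file
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        with open(os.path.join(tmp_dir, name + ".sav"), "wb") as f:
            pickle.dump(model, f)

    # Later: load each saved model and score it on the same test data
    for name in candidates:
        with open(os.path.join(tmp_dir, name + ".sav"), "rb") as f:
            loaded_model = pickle.load(f)
        scores[name] = loaded_model.score(X_test, y_test)

best_model = max(scores, key=scores.get)
print("Accuracy per model:", scores)
print("Best model on this test set:", best_model)
```

Scoring every loaded model on the same held-out data keeps the comparison fair.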

Conclusion

In this post, we studied how to save a machine learning model in Python using the Joblib and pickle libraries. Again, I would suggest using the Joblib library for models that hold large NumPy arrays; for other, smaller models you can use the pickle library.

Do share how you would like to use these two awesome libraries in the comment section below. If you want to read more machine learning articles, subscribe to the Newsletter and I’ll see you in the next post.
