Artificial Intelligence: like it or not, you can't ignore it. For the last few days I have been working on a project with a few requirements that called for a prediction mechanism, so I decided to use machine learning for it. The project itself is a bit complicated, so I will share only a small piece of the code in this blog post. In this post, we will create a machine learning prediction model using the Simple Linear Regression algorithm. We will create a death age calculator model based on the number of cigarettes consumed in a day.
What is Simple linear regression?
Yes, before going any further we should understand what simple linear regression is. Linear regression is a technique from statistics. In machine learning, we use it to fit a straight line that minimizes the error between the predicted and the actual results.
Honestly, I don't want to bore you with theory and equations, so I will keep this very short and easy to understand.
In the above graph, the x-axis represents the number of cigarettes consumed in a day, and the y-axis represents the death age. Each dot on the graph represents the death age observed for a given number of cigarettes consumed in a day. The equation of linear regression is shown below:
y = B0 + B1*x
where,
y is the dependent variable.
B0 is the Y intercept, where the best-fitted line intersects the Y axis.
B1 is the slope coefficient.
x is the independent variable.
So in this graph, these dots are our data and based on these data we will train our model to predict the results. The red line is the best-fitted line for the given data.
The best-fitted line is a straight line that best represents the data on a scatter plot. The best-fitted line may pass through all of the points, some of the points or none of the points in the graph.
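To make B0 and B1 less abstract, here is a minimal sketch of how the best-fitted line can be computed by hand with the ordinary least squares formulas. The helper names fit_line and predict, and the sample numbers, are made up for illustration and are not part of the project code; scikit-learn does this work for us later in the post.

```python
# Ordinary least squares for a single feature, written out by hand.
# The function names and the data points are invented for illustration.

def fit_line(xs, ys):
    """Return (b0, b1) for the line y = b0 + b1 * x that minimizes squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line always passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(b0, b1, x):
    """Evaluate the fitted line at x."""
    return b0 + b1 * x


# Hypothetical sample: cigarettes per day vs. death age.
cigarettes = [0, 5, 10, 15, 20]
death_age = [80, 75, 70, 65, 60]

b0, b1 = fit_line(cigarettes, death_age)
print("B0 =", b0, "B1 =", b1)
print("Prediction for 12 cigarettes per day:", predict(b0, b1, 12))
```

Because these sample points lie exactly on a straight line, the fitted line passes through all of them. With real, noisy data it would usually pass through only some of the points, or none.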
Preparing the data for training
Preparing the data set is an essential and critical step in building a machine learning model. To predict accurate results, the training data itself must be accurate; only then will your model be useful. In our case, the data is made up and serves demonstration purposes only. We have a CSV file that lists the number of cigarettes smoked in a day against the death age, as shown in the image below.
To import this file and use the data inside it, we will use the pandas Python library. To implement the simple linear regression model, we will use the scikit-learn library.
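As a rough illustration of the loading step, the sketch below reads a small CSV with pandas and splits it into a feature matrix and a target vector. The column names and rows are assumptions made up for this example, and the data is read from an in-memory string rather than from the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file: the column names and values
# below are invented for illustration only.
csv_text = """cigarettes_per_day,death_age
0,80
5,75
10,70
20,60
"""

# pd.read_csv accepts a path or any file-like object, so StringIO lets the
# sketch run without a file on disk.
dataset = pd.read_csv(io.StringIO(csv_text))

# Every column except the last becomes the feature matrix (2-D),
# and the column at position 1 becomes the target vector (1-D).
X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, 1].values

print(X.shape)
print(Y.shape)
```

scikit-learn expects the features as a 2-D array (rows by columns) and the target as a 1-D array, which is why the two slices are written differently.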
Implementing the model to predict the results
Now let's create a model to predict the death age based on the number of cigarettes consumed in a day. The first step to construct a model is to create a smoke.py Python file and import the required libraries.
=>Let's import the libraries in our smoke.py file.
smoke.py:
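# -*- coding: utf-8 -*-
"""
linear regression algorithm machine learning
@author: SHASHANK
"""
import pandas as pd
# sklearn.cross_validation was removed in scikit-learn 0.20;
# train_test_split now lives in sklearn.model_selection.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression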
=>Now we will create a class called Model, shown below. In this class, we will create two methods. The first method will import the data and the second method will predict the results.
smoke.py:
# -*- coding: utf-8 -*-
"""
linear regression algorithm machine learning
@author: SHASHANK
"""

class Model:

    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        # Placeholder so the skeleton is valid Python; filled in next.
        pass

    def predictAge(self):
        # We call importData() to load the training data.
        self.importData()
=>Now let's import the data set in our Model class. Under the importData() method, add the code shown below:
smoke.py:
# -*- coding: utf-8 -*-
"""
linear regression algorithm machine learning
@author: SHASHANK
"""

class Model:

    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('smoke_data.csv')
        # Features: every column except the last (cigarettes per day).
        self.X = dataset.iloc[:, :-1].values
        # Target: the second column (death age).
        self.Y = dataset.iloc[:, 1].values

    def predictAge(self):
        # We call importData() to load the training data.
        self.importData()
=>Let's add the code under the predictAge() method. In this method, we will add the code to fit the model to the training data that we already have. We will also take input from the user, and based on that input our model will predict the result. So in the end, your model should look like this:
smoke.py:
# -*- coding: utf-8 -*-
"""
linear regression algorithm machine learning
@author: SHASHANK
"""

class Model:

    X = None
    Y = None

    # Importing the dataset
    def importData(self):
        dataset = pd.read_csv('smoke_data.csv')
        self.X = dataset.iloc[:, :-1].values
        self.Y = dataset.iloc[:, 1].values

    def predictAge(self):
        # We call importData() to load the training data.
        self.importData()

        # Fitting the Simple Linear Regression to the Training set
        regressor = LinearRegression()
        regressor.fit(self.X, self.Y)

        # Python 3: input() replaces raw_input() and print is a function.
        smokePerDay = float(input("How many cigarettes do you smoke in a day? "))
        if smokePerDay > 30:
            print("You don't need ML to predict your death age, you will die very soon.")
        else:
            # predict() expects a 2-D array: one row with one feature.
            age = regressor.predict([[smokePerDay]])
            print("Your predicted age is", int(round(age[0])), "years, if you start smoking from day one.")
Executing the Model
Now your model is complete and ready to predict the result. To execute the model, we will call the predictAge() method of the Model class, as shown below:
# -*- coding: utf-8 -*-
"""
linear regression algorithm machine learning
@author: SHASHANK
"""

Model().predictAge()
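The model in this post is fitted on every row of the CSV, and train_test_split is imported but never used. If you want a rough idea of how well the line fits, one option is to hold back part of the data and score the model on it. The sketch below is only an illustration under stated assumptions: the data is synthetic and generated on the spot, and the 80/20 split and the noise level are arbitrary choices, not values from the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic, made-up data: death age falls roughly linearly with
# cigarettes per day, plus a little random noise.
rng = np.random.RandomState(0)
X = rng.randint(0, 31, size=(50, 1)).astype(float)
Y = 80.0 - 1.5 * X[:, 0] + rng.normal(0.0, 2.0, size=50)

# Hold out 20% of the rows so the model is scored on data it never saw.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

regressor = LinearRegression()
regressor.fit(X_train, Y_train)

# score() returns R^2 on the given data; closer to 1.0 means a better fit.
r2 = regressor.score(X_test, Y_test)
print("Intercept (B0):", regressor.intercept_)
print("Slope (B1):", regressor.coef_[0])
print("R^2 on held-out rows:", r2)
```

The intercept_ and coef_ attributes are the B0 and B1 from the equation earlier in the post, so printing them is a quick way to see which line the model actually learned.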
Conclusion
The very first step to learn machine learning is to create a basic regression model. In this post, we understood how to create a basic machine learning model using simple linear regression.
If you are an absolute beginner, you should find this article easy to follow. I would also urge you to read more about linear regression on Google. By doing so, you will understand where and when to apply simple linear regression in your project.
If you like this article, share it on social media and spread the word about it.
Till then, happy machine learning.