Introduction to Machine Learning Train/Test

January 7, 2022

In this article, you will learn how to split a machine learning data set into training and testing sets and use them to evaluate a model.

Machine Learning Scale – Before moving ahead, let’s take a look at Machine Learning Scale.

Evaluate Model

In machine learning, we create models to predict the outcome of certain events. In the previous lesson, for instance, we predicted the ID of a student from the student's weight and Roll_No.

To determine whether the model is accurate enough, we could use an approach called Train/Test.

Train/Test is an approach to test how accurate your models are.

It is called Train/Test because you split the data set into two parts: a training set and a testing set.

* You train the model using the training set.

* You test the model using the testing set.

* Training the model means creating the model.

* Testing the model means measuring the accuracy of the model.

Assume Data Set

Start with a data set you want to test.

Example: Assume a data set of 50 students in a class.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)  # fixed seed so the same 50 points are drawn on every run

# 50 synthetic data points, one per student
x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

plt.scatter(x, y)
plt.show()

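Each point in the plot stands for one of the 50 students.

The x values are drawn from a normal distribution with a mean of 10 and a standard deviation of 20. The y values are drawn from a normal distribution with a mean of 25 and a standard deviation of 30, then divided by x.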

Split Into Train/Test

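The training set should consist of 80 percent of the data.

The testing set is the remaining 20 percent. With 50 students, that is 40 students for training and 10 for testing.

train_x = x[:40]  # first 40 students (80 percent)
train_y = y[:40]

test_x = x[40:]   # remaining 10 students (20 percent)
test_y = y[40:]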
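Slicing off the first 80 percent assumes the rows are already in random order. If the data were sorted, for example by roll number, the testing set would contain only the last students. The following is a minimal sketch of one way to shuffle before splitting. The helper name `shuffled_split`, its parameters, and the toy data are assumptions made for illustration; they are not part of the article's example or of NumPy.

```python
import numpy as np

def shuffled_split(x, y, train_fraction=0.8, seed=2):
    """Hypothetical helper: shuffle the rows, then cut at train_fraction."""
    x = np.asarray(x)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))               # random ordering of the row indices
    cut = int(round(len(x) * train_fraction))     # 40 when there are 50 rows
    train_idx, test_idx = order[:cut], order[cut:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]

# Toy data with 50 rows, the same size as the class in the example
x = np.arange(50)
y = x * 2

train_x, train_y, test_x, test_y = shuffled_split(x, y)
print(len(train_x), len(test_x))  # 40 10
```

Because x and y are indexed with the same shuffled indices, each x value stays paired with its own y value.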

Display the Training Set

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)

x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

# 80/20 split of the 50 data points
train_x = x[:40]
train_y = y[:40]

test_x = x[40:]
test_y = y[40:]

# Plot only the training set
plt.scatter(train_x, train_y)
plt.show()

Display the Testing Set

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)

x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

# 80/20 split of the 50 data points
train_x = x[:40]
train_y = y[:40]

test_x = x[40:]
test_y = y[40:]

# Plot only the testing set
plt.scatter(test_x, test_y)
plt.show()

Fit the Data Set

What does the data set look like? Let's try to fit it with a polynomial regression and draw the regression line.

To draw the line through the data points, we use the plot() method of the Matplotlib module.

Example: Draw a polynomial regression line through the data points.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)

x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

# 80/20 split of the 50 data points
train_x = x[:40]
train_y = y[:40]

test_x = x[40:]
test_y = y[40:]

# Fit a degree-2 polynomial to the training set
model = np.poly1d(np.polyfit(train_x, train_y, 2))

# Evenly spaced x values spanning the training data, used to draw the curve
line = np.linspace(train_x.min(), train_x.max(), 100)

plt.scatter(train_x, train_y)
plt.plot(line, model(line))
plt.show()

The result can support the idea of fitting a polynomial regression to the data set, but the model may give strange results if we try to predict values outside the range of the data set.
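To see why predictions outside the data range are risky, here is a small sketch on toy data. It does not use the article's data set: the sine curve, the degree of 3, and the x values 3 and 20 are assumptions chosen only to make the effect easy to see.

```python
import numpy as np

# Toy data: roughly one period of a sine wave, so y always stays within [-1, 1]
x = np.linspace(0, 6, 50)
y = np.sin(x)

# Fit a degree-3 polynomial to the toy data
model = np.poly1d(np.polyfit(x, y, 3))

inside = model(3.0)    # x value inside the fitted range [0, 6]
outside = model(20.0)  # x value far outside the fitted range

print("inside the range: ", inside, "true value:", np.sin(3.0))
print("outside the range:", outside, "true value:", np.sin(20.0))
```

Inside the fitted range the prediction stays close to the true value. Far outside it, the polynomial keeps growing and leaves the interval [-1, 1] that the real values never leave.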

R2

R2 is also known as R-squared.

The R-squared score measures how well the model fits the data. On the data the model was fitted to, it ranges from 0 to 1, where 0 means the model explains none of the variation in y and 1 means a perfect fit.
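The score can also be computed by hand as 1 minus the ratio of the squared prediction errors to the squared deviations from the mean, which is the definition that r2_score uses for a single output. The sketch below is for illustration only: the function name `r_squared` and the four sample values are assumptions, and the function assumes the true values are not all identical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R-squared = 1 - (sum of squared errors) / (sum of squared deviations from the mean)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # error left over after the prediction
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # spread of the true values
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r_squared(y_true, y_true)                        # every prediction is exact
mean_only = r_squared(y_true, np.full(4, y_true.mean()))   # always predict the mean
worse = r_squared(y_true, y_true[::-1])                    # predictions in reverse order

print(perfect, mean_only, worse)  # 1.0 0.0 -3.0
```

The third value shows that the score can drop below 0 when the predictions are worse than always predicting the mean. This can happen when a model is scored on data it was not fitted to.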

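Example: Check how well the training data fits the polynomial regression.

import numpy as np
from sklearn.metrics import r2_score

np.random.seed(2)

x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

# 80/20 split of the 50 data points
train_x = x[:40]
train_y = y[:40]

test_x = x[40:]
test_y = y[40:]

# Fit a degree-2 polynomial to the training set
model = np.poly1d(np.polyfit(train_x, train_y, 2))

# R-squared of the model on the data it was fitted to
r2 = r2_score(train_y, model(train_x))
print(r2)

Output -

The script prints the R-squared score of the training set, a number between 0 and 1. For reference, the same degree-2 fit on all 50 data points scores 0.03958675473193263, which means the polynomial explains only about 4 percent of the variation in y.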

Start with Testing Set

We now have a model and know how it scores on the training data.

Next, we check the model against the testing data to see whether it gives a similar result.

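Example: Find the R2 score of the model on the testing data.

import numpy as np
from sklearn.metrics import r2_score

np.random.seed(2)

# This example draws 99 data points
x = np.random.normal(10, 20, 99)
y = np.random.normal(25, 30, 99) / x

train_x = x[:80]  # first 80 points for training
train_y = y[:80]

test_x = x[80:]   # remaining 19 points for testing
test_y = y[80:]

# Fit a degree-4 polynomial to the training set
model = np.poly1d(np.polyfit(train_x, train_y, 4))

# Score the model on data it has not seen
r2 = r2_score(test_y, model(test_x))
print(r2)

Output -

-0.2399553815802038

A negative score means the model predicts the testing data worse than simply predicting the mean of test_y every time, so this model does not carry over well to new data.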
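Comparing the training score and the testing score side by side shows whether a model carries over to unseen data. The sketch below does this for several polynomial degrees. It uses its own synthetic data (a quadratic trend plus noise), not the article's data set, and the `r_squared` helper, the seed, and the list of degrees are assumptions made for illustration.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R-squared computed by hand, so the sketch needs only NumPy."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 3 * x**2 - 5 * x + rng.normal(0, 5, 100)  # quadratic trend plus noise

# 80/20 split; the rows are already in random order
train_x, test_x = x[:80], x[80:]
train_y, test_y = y[:80], y[80:]

scores = {}
for degree in (1, 2, 3, 4):
    model = np.poly1d(np.polyfit(train_x, train_y, degree))
    train_score = r_squared(train_y, model(train_x))
    test_score = r_squared(test_y, model(test_x))
    scores[degree] = (train_score, test_score)
    print(degree, train_score, test_score)
```

A model is worth keeping when both scores are high and close to each other. A high training score paired with a much lower testing score suggests the model has fitted the noise in the training set.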

Predict Values

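Once the scores on the training and testing sets show that the model is acceptable, we can use it to predict new values.

Example: Predict the y value for x = 10.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2)

x = np.random.normal(10, 20, 50)
y = np.random.normal(25, 30, 50) / x

# 80/20 split of the 50 data points
train_x = x[:40]
train_y = y[:40]

test_x = x[40:]
test_y = y[40:]

# Fit a degree-2 polynomial to the training set
model = np.poly1d(np.polyfit(train_x, train_y, 2))

# Evenly spaced x values spanning the data, used to draw the curve
line = np.linspace(x.min(), x.max(), 100)

plt.scatter(test_x, test_y)
plt.plot(line, model(line))
plt.show()

# Predict y for x = 10
print(model(10))

Output -

The script shows the plot and prints the predicted y value for x = 10. For reference, the same degree-2 fit on all 50 data points predicts 2.1378158872036197.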
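The model returned by poly1d is simply a polynomial formula, and calling it evaluates that formula. The sketch below makes this visible on toy data that follows y = 2x² + 3x + 1 exactly. The toy data is an assumption chosen so that the fitted coefficients are easy to recognize; it is not the article's data set.

```python
import numpy as np

# Toy data that follows y = 2x^2 + 3x + 1 exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x**2 + 3 * x + 1

model = np.poly1d(np.polyfit(x, y, 2))

# Coefficients are stored from the highest power down: approximately [2, 3, 1]
a, b, c = model.coefficients
print(model.coefficients)

# Calling the model evaluates the polynomial at the given x
prediction = model(10)
by_hand = a * 10**2 + b * 10 + c
print(prediction, by_hand)  # both are close to 2*100 + 3*10 + 1 = 231

# The model also accepts an array and predicts every value at once
many = model(np.array([5, 10, 15]))
print(many)
```

Printing the coefficients is a quick check that the fitted formula looks reasonable before using it for predictions.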

If you find anything incorrect in the above-discussed topic and have any further questions, please comment below.
