Introduction to Machine Learning Decision Tree

January 11, 2022

In this article, you will learn about decision trees in machine learning.

Before moving ahead, you may want to take a look at Introduction to Machine Learning Train/Test.

Decision Tree

A decision tree is a way of making decisions based on previous information. It works like a flowchart: each step asks a question about the data, and the answers lead to a final decision.
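Example: To picture the idea, a decision tree can be written by hand as a chain of if/else questions. This is only a rough sketch: the function name decide_present and the thresholds 40 and 6 are hypothetical and are not learned from DATA.csv.

```python
def decide_present(marks, class_number):
    """Toy, hand-written tree; the thresholds are made up for illustration."""
    if marks >= 40:              # first question (root node)
        if class_number >= 6:    # second question (inner node)
            return "Y"           # leaf: decision reached
        return "N"               # leaf
    return "N"                   # leaf


print(decide_present(45, 6))
print(decide_present(30, 6))
```

A trained decision tree has the same shape, but the questions and thresholds are chosen automatically from the data instead of being written by hand.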

How Does It Work?

Example: Read the file and print the information.

import pandas as pd

info = pd.read_csv("DATA.csv")  # read file

print(info)

Click to Download DATA.csv file.

As a result, the dataset is read and printed.

Note: Every value in the dataset must be numerical before a decision tree can be built.

Some columns in this dataset are not numerical; therefore, convert them to numerical values first.

To convert a column to numerical values, use the Pandas map() function, which accepts a dictionary (key: value pairs) as a parameter.

For example, for the “class” column of the “DATA.csv” file, the dictionary

{'I': 0, 'II': 1, 'III': 2}

means: convert the value 'I' to 0, 'II' to 1, and 'III' to 2.
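Example: A minimal sketch of map() on a single column. The sample values below are hypothetical stand-ins, not the contents of DATA.csv. It also shows that any value missing from the dictionary becomes NaN, which is worth checking before training.

```python
import pandas as pd

# Hypothetical sample values standing in for one column of DATA.csv
classes = pd.Series(["I", "II", "III", "IV"])

mapping = {"I": 0, "II": 1, "III": 2}
converted = classes.map(mapping)

print(converted)

# Values that are missing from the dictionary (here "IV") become NaN
print(converted.isna().sum())
```

If the count of NaN values is not zero, the dictionary is probably missing a key that appears in the column.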

Example: Change the values of the “name”, “class”, and “present” columns to numerical values.

import pandas as pd

info = pd.read_csv("DATA.csv")

# convert dataset to the numerical value
Dictionary = {'A': 0, 'B': 1, 'C': 2, 'D': 3,
              'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9, 'K': 10, 'L': 11}
info['name'] = info['name'].map(Dictionary)

Dictionary = {'I': 0, 'II': 1, 'III': 2, 'IV': 3, 'V': 4, 'VI': 5, 'VII': 6,
              'VIII': 7, 'IX': 8, 'X': 9, 'XI': 10, 'XII': 11}
info['class'] = info['class'].map(Dictionary)

# a dictionary keeps only one value per key, so map 'Y' to 0 and 'N' to 1
Dictionary = {'Y': 0, 'N': 1}
info['present'] = info['present'].map(Dictionary)

print(info)

As a result, the “name”, “class”, and “present” columns now contain numerical values.

Now we have to separate the feature columns from the target column.

The feature columns are the columns we predict from. The target column is the column that contains the values we try to predict.

Example: Separate the columns as “features” and “target.”

# x = feature columns; y = target column
features = ['name', 'class', 'rollno', 'marks']

x = info[features]
y = info['present']

print(x)
print(y)

Note: This code depends on the previous example. Place it after that code (leaving out “print(info)”) and then run it.

Now, we are ready to make a decision tree.

Example: Create a Decision Tree, save it as an image, and show it.

# import the necessary modules

import matplotlib.image as pltimg
import matplotlib.pyplot as plt
import pandas as pd
import pydotplus
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

info = pd.read_csv("DATA.csv")

# convert dataset to the numerical value
Dictionary = {'A': 0, 'B': 1, 'C': 2, 'D': 3,
              'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9, 'K': 10, 'L': 11}

info['name'] = info['name'].map(Dictionary)

Dictionary = {'I': 0, 'II': 1, 'III': 2, 'IV': 3, 'V': 4, 'VI': 5, 'VII': 6,
              'VIII': 7, 'IX': 8, 'X': 9, 'XI': 10, 'XII': 11}
info['class'] = info['class'].map(Dictionary)

# a dictionary keeps only one value per key, so map 'Y' to 0 and 'N' to 1
Dictionary = {'Y': 0, 'N': 1}
info['present'] = info['present'].map(Dictionary)

features = ['name', 'class', 'rollno', 'marks']

x = info[features]
y = info['present']

decision_tree = DecisionTreeClassifier()
decision_tree = decision_tree.fit(x, y)
data = tree.export_graphviz(
    decision_tree, out_file=None, feature_names=features)
graph = pydotplus.graph_from_dot_data(data)
graph.write_png('first_decision_tree.png')

Image = pltimg.imread('first_decision_tree.png')
Image_plot = plt.imshow(Image)
plt.show()
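Example: Saving the tree as a PNG through pydotplus needs Graphviz to be installed. As a possible alternative, scikit-learn's export_text() prints the learned rules as plain text. The sketch below assumes a small, made-up table that is already numeric; it is not the real DATA.csv.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical rows, already converted to numbers (not the real DATA.csv)
info = pd.DataFrame({
    "name": [0, 1, 2, 3, 4, 5],
    "class": [0, 1, 2, 3, 4, 5],
    "rollno": [1, 2, 3, 4, 5, 6],
    "marks": [80, 35, 60, 20, 75, 30],
    "present": [0, 1, 0, 1, 0, 1],  # 0 = 'Y', 1 = 'N'
})

features = ["name", "class", "rollno", "marks"]

decision_tree = DecisionTreeClassifier(random_state=0)
decision_tree.fit(info[features], info["present"])

# Print the learned questions and leaves as indented text
rules = export_text(decision_tree, feature_names=features)
print(rules)
```

Each indented line is one question in the tree, and each "class:" line is a leaf holding the predicted target value.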

Predict Values

Now we can use the Decision Tree to predict new values.

Example: Use the predict() method to predict new values.

import pandas as pd
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

info = pd.read_csv("DATA.csv")

# convert dataset to the numerical value
Dictionary = {'A': 0, 'B': 1, 'C': 2, 'D': 3,
              'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9, 'K': 10, 'L': 11}

info['name'] = info['name'].map(Dictionary)

Dictionary = {'I': 0, 'II': 1, 'III': 2, 'IV': 3, 'V': 4, 'VI': 5, 'VII': 6,
              'VIII': 7, 'IX': 8, 'X': 9, 'XI': 10, 'XII': 11}
info['class'] = info['class'].map(Dictionary)

# a dictionary keeps only one value per key, so map 'Y' to 0 and 'N' to 1
Dictionary = {'Y': 0, 'N': 1}
info['present'] = info['present'].map(Dictionary)

features = ['name', 'class', 'rollno', 'marks']

x = info[features]
y = info['present']

decision_tree = DecisionTreeClassifier()
decision_tree = decision_tree.fit(x, y)

print(decision_tree.predict([[6, 6, 14, 45]]))  # name 'G' = 6, class 'VII' = 6, rollno 14, marks 45

print("[0] means 'Y'")
print("[1] means 'N'")

Lines 1 to 3:

Import the modules needed to build the decision tree and make predictions from the dataset.

Line 5:

Use the Pandas function read_csv() to read the data file.

Lines 7 to 19:

Convert non-numeric values into numeric values using dictionaries. Each non-numeric value is used as a key and is assigned a number as its value, in the format key: value.

Lines 21 to 24:

Specify the dataset columns as features (x) and target (y).

Features: the columns whose data is used to make the prediction.

Target: the column whose value will be predicted.

Lines 26 to 27:

Create the decision tree and fit it to the features and the target.

Line 29:

Predict whether a student will be present, based on the given values.

Will a student be present if the name is “G”, the class is VII, the rollno is 14, and the marks are 45?
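Example: One possible way to make the prediction easier to read is to pass the new values as a DataFrame with the same column names used for training, and then translate the numeric result back to “Y” or “N”. This is a sketch on made-up rows, not the real DATA.csv. The names new_student, labels, and decoded are illustrative only.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical rows, already converted to numbers (not the real DATA.csv)
info = pd.DataFrame({
    "name": [0, 1, 2, 3, 4, 5],
    "class": [0, 1, 2, 3, 4, 5],
    "rollno": [1, 2, 3, 4, 5, 6],
    "marks": [80, 35, 60, 20, 75, 30],
    "present": [0, 1, 0, 1, 0, 1],  # 0 = 'Y', 1 = 'N'
})

features = ["name", "class", "rollno", "marks"]

decision_tree = DecisionTreeClassifier(random_state=0)
decision_tree.fit(info[features], info["present"])

# Name 'G' -> 6, class 'VII' -> 6, rollno 14, marks 45
new_student = pd.DataFrame([[6, 6, 14, 45]], columns=features)
prediction = decision_tree.predict(new_student)

# Translate the numeric prediction back to the original label
labels = {0: "Y", 1: "N"}
decoded = labels[int(prediction[0])]
print(decoded)
```

Keeping the column names on the new row means the values cannot silently end up in the wrong feature order.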

As shown, the decision tree built from the dataset returns a prediction for the new values.

If you find anything incorrect in the above-discussed topic and have any further questions, please comment below.
