# Decision Tree For Image Classification

This course provides an introduction to decision trees for image classification. It explains how to model, train, and evaluate a classifier. You will learn about the split criteria used to build decision trees, such as the Gini index, the chi-squared statistic, and entropy-based measures. The course focuses on demonstrating how these criteria are used to train a variety of models with different hyperparameters, such as max_depth and min_samples_leaf (and, for tree ensembles, n_estimators).
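For orientation, the sketch below shows what such criteria and hyperparameters look like in practice. It uses scikit-learn, which is an assumption on my part (the course itself works in the EnMAP-Box), and synthetic feature values standing in for per-pixel spectral features:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# 200 synthetic "pixels" with 3 "bands"; the class label depends on band 0
X = rng.normal(0.3, 0.05, size=(200, 3))
y = (X[:, 0] > 0.3).astype(int)

clf = DecisionTreeClassifier(
    criterion="gini",    # impurity measure: "gini" or "entropy"
    max_depth=3,         # limit tree depth to control complexity
    min_samples_leaf=5,  # require at least 5 samples in every leaf
)
clf.fit(X, y)
acc = clf.score(X, y)  # training accuracy of the fitted tree
```

Because the labels here are a simple threshold on one feature, even a shallow tree separates them almost perfectly; on real image data, these hyperparameters trade accuracy against overfitting.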

The decision tree is a useful algorithm for classifying images. Its advantage over alternatives such as the Bayes classifier or the Support Vector Machine is that it is simple, easy to understand, and easy to implement.

The decision tree is a powerful supervised learning algorithm for classification and regression tasks. It makes decisions based on features by splitting the data at certain points in the feature space and assigning observations to different groups. The process stops when no further useful splits can be made, or when the tree has reached the specified number of nodes (levels) needed for the task.
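The splitting idea can be made concrete with a short, self-contained sketch in plain NumPy (not the course's imageMath environment): for a single feature, evaluate every candidate threshold and keep the one that minimizes the weighted Gini impurity of the two resulting groups.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector: 1 - sum of squared class proportions."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Search all candidate thresholds t for the split x <= t
    that minimizes the weighted Gini impurity of the two groups."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# toy data: low feature values belong to class 0, high values to class 1
x = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 1, 1])
t, s = best_split(x, y)  # t = 0.3 separates the classes perfectly (impurity 0)
```

A full tree-building algorithm simply applies this search recursively to each resulting group until a stopping criterion (depth, node count, minimum group size) is reached.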

## Learning objectives

• Understand principles of decision trees
• Develop a rule set for classifying broad land cover categories
• Implement a decision tree classification in the EnMAP Box

# Background

Image classification is a key component of many remote sensing applications. The choice of classification algorithm depends, among other things, on the nature of the classification problem (i.e. the number and type of classes), the (statistical) properties and number of input features (e.g. the number of spectral bands), and computational performance.

Decision trees (DT) are arguably the most intuitive classification algorithms and provide a good entry point into the applied side of image classification. In principle, a DT is nothing more than a set of decision rules which converts continuous information, such as spectral information from an image, into discrete thematic information, such as a land cover class. Each pixel is assigned to a land cover class if its spectral information (or spectral transformations like vegetation indices) meets certain criteria. The name "decision tree" comes from its structure, which is organized hierarchically as simple binary (yes/no) decisions.

Take a look at the above DT as an example, starting at the top. The boxes are referred to as ‘nodes’. Each node contains a binary criterion which evaluates a statement to either TRUE or FALSE. Depending on the decision for a given data point, we proceed to the next node. Depending on the classes of interest and the spectral information available, DTs can become very complex. In the above example, we reach a final decision after only two nodes. According to this DT, every pixel having an NDVI of <= 0.6 and a swIR reflectance of <= 5% belongs to the class ‘water’. Applying this very simple DT to the Sentinel-2 image of Berlin already produces meaningful classification results for the water class.

The resulting map shows us ‘water’ pixels against all other ‘unclassified’ pixels. More decision rules are needed to discriminate the unclassified pixels and thereby produce a complete land cover map from the image.
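Outside the EnMAP-Box, the same two-node water rule can be sketched in plain NumPy. The band values below are synthetic and purely illustrative, and reflectance is expressed here as fractions between 0 and 1 (not scaled by 10,000 as in the session image):

```python
import numpy as np

# synthetic 2x2 "image": reflectance as fractions (0-1), values illustrative
red   = np.array([[0.03, 0.10], [0.04, 0.20]])
nir   = np.array([[0.02, 0.40], [0.03, 0.35]])
swir1 = np.array([[0.01, 0.15], [0.30, 0.12]])

# node 1: NDVI <= 0.6 (non-vegetated)
ndvi = (nir - red) / (nir + red)

# node 2: swIR reflectance <= 5% -> class 'water'
water = (ndvi <= 0.6) & (swir1 <= 0.05)
# only the top-left pixel satisfies both conditions
```

Every pixel where `water` is True would be labelled 'water'; all remaining pixels stay 'unclassified' until further rules are added.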

# Session materials

Please use the Sentinel-2 summer image (acquisition date 26.07.2019, 20 m, 9 spectral bands) you prepared in session 06 and used in session 07. If you do not have it available anymore, you can find it in the materials for session 07.

# Exercise

In the following exercise, we want to enhance the above DT model by adding new rules to produce a more detailed classification containing the following classes:

## Implement a DT

• Visualize the Sentinel-2 summer image in the EnMAP-Box.
• Open the imageMath calculator to classify water bodies according to the DT presented above. Please remember: reflectance is usually expressed as a fraction between 0.0 and 1.0, but the values of the image are scaled by 10,000. Accordingly, a reflectance of 0.05 (or 5%) corresponds to the pixel value 500.
• The code below illustrates how the DT is implemented in the imageMath calculator. Please get familiar with the code and underlying commands by carefully reading through the #explanations.
• Copy and paste the code into the imageMath calculator and execute the DT as shown in the video below.
```python
# In the EnMAP-Box, specify
# an input object S2_20m that holds the S2 image and
# an output object 'LC' corresponding to the land cover map.

# You can use indexing with [] to select individual bands of the image.
# Note that in Python, we start counting from 0: the first band of an image
# is accessed with image[0], the second band with image[1], etc.

# Define new objects for each band that you need in the DT.
# The indices below assume the 9-band 20 m stack is ordered
# B2, B3, B4, B5, B6, B7, B8A, B11, B12; adjust them if your band order differs.
red = float32(S2_20m[2])    # B4 (red)
nir = float32(S2_20m[6])    # B8A (near infrared)
swir1 = float32(S2_20m[7])  # B11 (shortwave infrared 1)

# create derivatives (e.g. NDVI)
NDVI = float32((nir-red)/(nir+red))

# create an empty raster using properties (size, projection, etc.) of another raster
LC = zeros_like(NDVI) # instead of NDVI, this could be "red", "nir" or "swir1" as well

# set rules for nodes
N1_T = NDVI <= 0.6 # results in TRUE for every pixel meeting the condition
N1_F = logical_not(N1_T) # invert above statement
N2A_T = swir1 <= 500 # remember that reflectance is scaled by 10000

# Create the decision tree for classification
# we here use * as a logical operator AND: all positions in the classification layer where
# condition 1 AND condition 2 are TRUE will be set to the defined value (here 1)
LC[N1_T*N2A_T] = 1

# Create metadata for the output image
setCategoryNames(LC,  ['unclassified', 'water']) # specify class names, starting with 'unclassified'
setCategoryColors(LC, [(0,0,0), (50,150,250)]) # specify class colors in RGB
```

## Discuss the DT result

• Visualize the resulting classification and establish a link with the Sentinel-2 image.
• Critically discuss the classification result:
• Are the decision rules and values for the water class well selected?
• Would you recommend an alternative rule set for classifying water?

# Assignment

## Expand the DT

• Expand the existing DT to classify the remaining classes according to the following structure.
• Use the code snippet below to develop new decision rules for the empty nodes (‘…’). Please get familiar with the code and underlying commands by carefully reading through the #explanations.
• Update the code sections “set rules for nodes” and “Create the decision tree for classification” with your modifications.
```python
# set rules for nodes
N1_T = NDVI <= 0.6 # results in TRUE for every pixel meeting the condition
N1_F = logical_not(N1_T) # invert above statement
N2A_T = swir1 <= 500 # remember that reflectance is scaled by 10000
N2A_F = logical_not(N2A_T)
N2B_T = ... # fill this up with your decision rule
N2B_F = logical_not(N2B_T)
N3A_T = ... # fill this up with your decision rule
N3A_F = logical_not(N3A_T)

# Create the decision tree for classification
# we here use * as a logical operator AND:
# all positions in the classification layer where
# condition 1 AND condition 2 are TRUE will be set to the class value we define
LC[N1_T*N2A_T] = 1
LC[N1_T*N2A_F*N3A_T] = 2
LC[N1_T*N2A_F*N3A_F] = 3
LC[N1_F*N2B_T] = 4
LC[N1_F*N2B_F] = 5
```
• Update the code section “Create metadata for the output image” to include 6 classes (1 ‘unclassified’ class plus 5 land cover classes) with their names and colors, and execute the DT.
• Discuss the resulting classification critically and list pros and cons of the method and the decisions used, respectively.
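For reference, the branch-by-branch class assignment can also be reproduced (and tested) outside the EnMAP-Box with NumPy's `np.select`, where each entry pairs one root-to-leaf path with a class value. The rules and thresholds below are hypothetical placeholders for illustration only, not the solution to the assignment:

```python
import numpy as np

# synthetic per-pixel derivatives, reflectance as fractions (illustrative)
ndvi  = np.array([0.2, 0.8, 0.2, 0.8])
swir1 = np.array([0.03, 0.10, 0.20, 0.30])
nir   = np.array([0.10, 0.45, 0.12, 0.20])

# one condition per root-to-leaf path; rules below are hypothetical
conditions = [
    (ndvi <= 0.6) & (swir1 <= 0.05),  # water
    (ndvi <= 0.6) & (swir1 > 0.05),   # other non-vegetated (placeholder rule)
    (ndvi > 0.6) & (nir > 0.3),       # bright vegetation (placeholder rule)
    (ndvi > 0.6) & (nir <= 0.3),      # dark vegetation (placeholder rule)
]
# the first matching condition wins; unmatched pixels stay 'unclassified' (0)
LC = np.select(conditions, [1, 2, 3, 4], default=0)
```

Compared with chained boolean-mask assignments, `np.select` makes the rule precedence explicit and guarantees that pixels matching no rule remain in the default class.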

## Submission

• Please upload a screenshot of your imageMath script, the classification (map + legend), and the discussion in bullet points as a single PDF to Moodle.
• General submission notes: the submission deadline for the weekly assignment is always the following Monday at 10 am. Please use the naming convention indicating the session number and the family names of all students in the team, e.g. ‘s01_surname1_surname2_surname3_surname4.pdf’. Each team member has to upload the assignment individually. Provide a single-file submission; if you have to submit multiple files, create a *.zip archive.