Take me to pcigeomatics.com

PCI Geomatics Help Center

How can we help you today?

Haze Removal and Cloud Detection

PCI Geomatics -



This tutorial outlines how to create masks for cloud, water and haze and then remove haze from a Landsat-8 image. There is also an optional step to pansharpen the haze removed image. The haze removed or pansharpened images can then be used in further analyses.



Geomatica provides powerful haze removal and cloud detection algorithms, which can be automated to some extent. This tutorial demonstrates the power and simplicity of these algorithms.



  • Geomatica 2015 or later installed
  • Python 101 (basic understanding of python coding concepts)
  • Some experience programming with Geomatica (recommended)

Data Package


  • Click Here to download the data required for this tutorial



We will be running two algorithms, masking and hazerem. The masking algorithm is used to calculate the cloud, water and haze masks. These masks are then used in the hazerem algorithm to remove haze from multispectral and panchromatic imagery. The third optional step is to pansharpen the haze removed image. The dataset used in this tutorial is a Landsat-8 image from Thailand.

Raw Landsat-8 Image: You will notice areas of haze. This will be corrected with the haze removal.

You will notice in this tutorial that many of the parameters for these algorithms are left as their defaults. You can click on any of the algorithm names in this tutorial to go to the Geomatica Help and see all of the algorithm’s parameters and default values.

1. Import necessary modules & setup inputs/outputs


The first step is to import the Geomatica modules and setting up the variables that will point to our input and output directories and files.

import os
from pci.masking import masking
from pci.hazerem import hazerem
from pci.pansharp2 import pansharp2

working_dir = r'I:\Data\Tutorial\Python\Haze_Removal'
ms_image = os.path.join(working_dir, 'LC81290492013334LGN00', 'LC81290492013334LGN00_MTL.txt-MS')
pan_image = os.path.join(working_dir, 'LC81290492013334LGN00', 'LC81290492013334LGN00_MTL.txt-PAN')
masks = os.path.join(working_dir, "masks.pix")
haze_ms = os.path.join(working_dir, 'HazeRemovedMS.pix')
haze_pan = os.path.join(working_dir, 'HazeRemovedPAN.pix')
pansharpened = os.path.join(working_dir, 'Pansharpened.pix')


The os module, which we will use to access the operating system’s file structure, is imported on line 1. Lines 2 to 4 import the three required algorithms, as outlined above. On line 6, working_dir is a variable that contains the path to the directory where your tutorial data is stored. You will need to change this to the directory on your computer where your tutorial data is saved. Two input files are set on lines 7 and 8; one for the multispectral image and one for the panchromatic image. Finally, four output images are set on lines 9 to 12. The variable masks will be the output from masking while haze_ms and haze_pan will be the outputs from hazerem. The last output image is optional, depending on whether you wish to pansharpen the images.

2. Create masks


The first algorithm we will run is masking. In this step we create bitmap masks for cloud, water and haze.

# Create cloud, water and haze masks


The masks are created using the blue, red, NIR and SWIR bands from the multispectral image. On line 15 input file (fili) to ms_image. There is channel metadata for each channel from the Landsat-8 image, so the algorithm can determine which bands are required. We then set the haze coverage percentage (hazecov) to 70. For different imagery you will likely need to experiment with this value to produce the best haze mask. This parameter should be as low as possible while still allowing for haze removal. We also need to set the cloud reflectance threshold (clthresh). These values are used to identify cloud in the image. The order of the threshold values is: Low Threshold, Seed Threshold and Dilation Factor. For more explanation on these values please see the Geomatica Help. The default values for Landsat imagery is 17,22,2. However, for this specific image the values 18,22,1 create a more accurate cloud mask. Finally, on line 18 the output file parameter (filo) is set to the output image variable masks.

3. Remove Haze


The final step is to run hazerem to remove the haze from the image.

# Remove the haze from both the multispectral and panchromatic bands


The input images (MS and PAN) are set on lines 21 and 22. You will also need to specify the input masks you will be using. On line 23 we will input the masks created with the masking algorithm. Supplying these masks will improve the haze removal results. Haze is removed over water but clouds will be ignored. If no cloud mask is provided the clouds are considered haze and will result in overcorrecting. We will then set the hazecov value to 70 again. Finally, the two output file parameters are defined on lines 25 and 26. We will set these parameters to the two variables, haze_ms and haze_pan, which we defined in the first step.

4. Optional: Pansharpen


Optionally, you can now pansharpen the haze removed images using the pansharp2 algorithm.

# Optional: Pansharpen haze removed MS image


For this algorithm you only need to set the input and output parameters. All other parameters will be left as default. In the haze removal step we generated both an MS and PAN images with haze removed. These two images can now be used for two input images (fili and fili_pan) on lines 29 and 30. The output image will be set to the variable pansharpened that was defined earlier.

You will now have a haze removed, pansharpened image that you can use in further analyses. You can integrate this haze removal workflow into many other workflows, including a batch workflow as outlined in the Batch Processing tutorial.

Have more questions? Submit a request


Powered by Zendesk