This tutorial outlines how to search through folders of images and apply a batch process. In this case each image will be pansharpened; however, the same batch processing method can be used in many other scripts.
A workflow often requires running the same functions on many images. A batch workflow script automates this process and saves you time.
- Geomatica 2015 or later installed
- Python 101 (basic understanding of Python coding concepts)
- try, except, finally statements
- For loops
- How to call a function
- Some experience programming with Geomatica (recommended)
- Download the data required for this tutorial. Extract the files and folders from the zip file to your computer, keeping the folder structure as it is.
To demonstrate how to create a batch workflow script, we will iterate through a few folders that contain Landsat-8 data. Each of these images will be pansharpened and then saved into a new folder.
1. Import necessary modules & set up inputs/outputs
The first step in this workflow will be to import the required modules and set our working and output directories.
In the above block of code, lines 1 and 2 import the pansharp2 Geomatica function and the PCI exceptions. Line 3 imports the built-in Python os module, which we will use to access the operating system's file structure in order to create new directories and iterate through existing ones. The final module, fnmatch, is used to find the files within our input directory that are Landsat-8 images.
We then need to set our working directory working_dir and output directory output_dir. The working directory will be set to the folder which includes all of our Landsat-8 images. This folder can also include subfolders. The zip file data pack that you downloaded has the following file structure.
You will need to set working_dir to the location of the Landsat8 folder. output_dir is set to the location where the pansharpened images will be saved. In this variable definition, os.path.join() creates a new pathname string that combines working_dir with a new folder name. On line 9, os.path.isdir() checks whether output_dir already exists as a directory. If it does, line 10 is skipped. If it does not, line 10 is executed and the os.mkdir() function creates the output_dir directory. The first time that you run the script, the folder Pan_Images will be added to the Landsat8 folder. If you then rerun the script, line 10 will be skipped since the folder Pan_Images already exists.
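The directory setup described above can be sketched as follows. This is a minimal illustration rather than the tutorial's original listing, so the line numbers cited in the text do not apply to it. The helper name ensure_output_dir is hypothetical, and a temporary folder stands in for the Landsat8 folder so the sketch runs without the data pack or Geomatica.

```python
import os
import tempfile


def ensure_output_dir(working_dir, folder_name="Pan_Images"):
    """Return the output folder inside working_dir, creating it only if missing."""
    output_dir = os.path.join(working_dir, folder_name)
    if not os.path.isdir(output_dir):
        # Only reached the first time the script is run
        os.mkdir(output_dir)
    return output_dir


# A temporary folder stands in for the Landsat8 folder (assumption for the demo)
working_dir = tempfile.mkdtemp(prefix="Landsat8_")

output_dir = ensure_output_dir(working_dir)
# Calling it again mimics rerunning the script: the folder exists, so mkdir is skipped
second_call = ensure_output_dir(working_dir)
```

Checking with os.path.isdir() before creating the folder is what makes the script safe to rerun, since os.mkdir() raises an error if the directory already exists.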
2. Iterate through folders to search for Landsat-8 datasets
In order to navigate through all of the files within working_dir a list of all valid files needs to be generated.
Before the list of files can be populated, an empty list needs to be created. On line 12 an empty list input_files is created. This list will then be populated with all of the valid filenames. Line 14 starts the for loop, which searches through a directory and its subdirectories for any files. On each iteration os.walk yields three values: r, d and f. r is the directory currently being searched, d lists the subdirectories of r and f lists the file names directly within r. During the first iteration of the loop r is equal to working_dir (Landsat8), d includes the three subfolders (Markham, Ottawa, Toronto) and f is any file that is directly within working_dir (i.e. not in the subfolders). On the next iteration the first subfolder (Markham) is searched, so r becomes that subfolder and f becomes the files within it.
On line 15 a nested for loop is created. In this loop the file names in f are filtered by a filename pattern (*_MTL.txt). Only files that end in _MTL.txt will be added to our input_files list. On line 16 the filename that matches the filter is joined with the directory (r) that is currently being searched. This new pathname is appended to the input_files list.
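The search described above can be sketched with os.walk and fnmatch.filter. This is an illustrative sketch, not the tutorial's original listing. The function name find_mtl_files and the scene names are hypothetical, and a small temporary folder tree stands in for the data pack.

```python
import fnmatch
import os
import tempfile


def find_mtl_files(working_dir, pattern="*_MTL.txt"):
    """Collect the full path of every file under working_dir that matches pattern."""
    input_files = []
    # r: directory being searched, d: its subdirectories, f: file names directly in r
    for r, d, f in os.walk(working_dir):
        for name in fnmatch.filter(f, pattern):
            input_files.append(os.path.join(r, name))
    return input_files


# Build a stand-in folder tree with made-up scene names (assumption for the demo)
root = tempfile.mkdtemp(prefix="Landsat8_")
layout = {"Markham": ["SCENE_A"], "Ottawa": ["SCENE_B", "SCENE_C"], "Toronto": ["SCENE_D"]}
for city, scenes in layout.items():
    os.mkdir(os.path.join(root, city))
    for scene in scenes:
        # One metadata file and one band file per scene; only the former should match
        for suffix in ("_MTL.txt", "_B8.TIF"):
            open(os.path.join(root, city, scene + suffix), "w").close()

input_files = sorted(find_mtl_files(root))
```

Because the paths are built from r rather than from working_dir, each entry in input_files points to the subfolder where the file was actually found.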
3. Run pansharp2 on each valid image
Now that we have the full pathnames of all the valid _MTL.txt files in a list, we can iterate through the list and run the pansharp2 algorithm.
You can iterate through the input_files list using a for loop. The pansharp2 algorithm will be run individually on each image within input_files. When you run pansharp2, make sure that it is within a try and except statement. This ensures that if one iteration fails, the entire script will not stop running. Instead, an error will be printed for the failed iteration and the program will continue on to the next image in the list. In this case either a PCI error message or a Python error message will be printed if pansharp2 fails to run on an image.
On line 21, '-'.join is used to join the string 'MS' to the filename image (from input_files) using a dash (-). In this case a file name of Landsat8\Markham\LC80180302014247LGN00_MTL.txt becomes Landsat8\Markham\LC80180302014247LGN00_MTL.txt-MS. The MS bands set on line 21 are used as the input file (fili), and on line 22 the PAN band is used as the panchromatic input file (fili_pan). Finally, the output file is set on line 23. An example of creating an output pathname and filename is outlined below, assuming that we are currently running pansharp2 on the image Landsat8\Markham\LC80180302014247LGN00_MTL.txt.
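The error-handling pattern described above can be sketched without Geomatica installed. In this minimal sketch the names run_batch and fake_pansharp are hypothetical: fake_pansharp is a stand-in for the real pansharp2 call, and a single generic except clause replaces the separate PCI and Python error handling that the tutorial describes.

```python
def run_batch(input_files, process):
    """Run process on every file, reporting failures without stopping the batch."""
    succeeded, failed = [], []
    for image in input_files:
        try:
            process(image)
            succeeded.append(image)
        except Exception as e:
            # A Geomatica script would typically catch the PCI exception type
            # in its own except clause before falling back to this generic one
            print("Failed to process {}: {}".format(image, e))
            failed.append(image)
    return succeeded, failed


def fake_pansharp(image):
    """Hypothetical stand-in for pansharp2 that fails on one input."""
    if "bad" in image:
        raise ValueError("cannot read scene")


succeeded, failed = run_batch(
    ["a_MTL.txt", "bad_MTL.txt", "c_MTL.txt"], fake_pansharp
)
```

The failure on the second file is printed and recorded, and the third file is still processed, which is the behaviour the try and except statement is there to guarantee.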
The code for the output file in line 23 is broken up into four parts:
- First we need to extract the basename of the file: LC80180302014247LGN00_MTL.txt
- The basename is then split at the _MTL section and the first section of the split is kept: LC80180302014247LGN00
- A period (.) then joins the split basename to the string ‘pix’: LC80180302014247LGN00.pix
- Finally, the new filename is joined to output_dir: Landsat8\Pan_Images\LC80180302014247LGN00.pix
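The four steps above, together with the input names, can be sketched as follows. This is an illustration under stated assumptions rather than the tutorial's original listing: the function name build_names is hypothetical, the PAN suffix is assumed by analogy with the MS suffix, and output_dir is taken to be Landsat8\Pan_Images as defined in step 1. The ntpath module is used here only so that the Windows-style example paths behave the same on any operating system; in a script run on Windows, os.path does the same job.

```python
import ntpath  # Windows flavour of os.path, so backslash paths split correctly anywhere


def build_names(image, output_dir):
    """Return the MS input, PAN input and output file names for one MTL file."""
    fili = "-".join([image, "MS"])        # multispectral bands
    fili_pan = "-".join([image, "PAN"])   # panchromatic band (suffix is an assumption)

    basename = ntpath.basename(image)     # part 1: file name without its folder
    stem = basename.split("_MTL")[0]      # part 2: keep the text before _MTL
    filename = ".".join([stem, "pix"])    # part 3: add the pix extension
    filo = ntpath.join(output_dir, filename)  # part 4: place it in output_dir
    return fili, fili_pan, filo


image = r"Landsat8\Markham\LC80180302014247LGN00_MTL.txt"
fili, fili_pan, filo = build_names(image, r"Landsat8\Pan_Images")
```

Splitting on _MTL rather than on the period keeps only the scene identifier, so the output file is named after the scene instead of the metadata file.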
You can now run the script. If you wish to see which image is currently being processed, you can add a print statement between lines 19 and 20, for example print(image). This script will iterate through all of the folders and save each pansharpened image to the Pan_Images folder. In total there will be four pansharpened images: one from Markham, two from Ottawa and one from Toronto.