SEES 2022: Sentinel-2 Satellite Image Processing to Classify Water and Algae Presence

Guest blog: Ishaan V.

Selecting the right satellite images and products can be challenging! In this blog, I am sharing my learning and experience of working with satellite images for image classification and visualization.

The use of remote sensing satellite images to measure water quality is a viable option for predicting and controlling vector-borne diseases. Space agencies, such as NASA and ESA, provide open access to both acquired images and curated data/products for scientific research. Data is available both in raw format and as products (e.g. Top-of-Atmosphere (TOA) reflectance, Bottom-of-Atmosphere (BOA) reflectance, etc.). There are also various toolsets (Earth Explorer, LandsatLook, Copernicus Open Access Hub, Sentinel Application Platform (SNAP), Google Earth Engine, etc.) available for downloading satellite images and products.

Our project focused on identifying certain water features, such as algae presence, to develop our predictive model for vector-borne diseases.
For our project, we used Sentinel-2 Level-2A products for the Sacramento, CA, region for water classification and algae detection.


 

My learning can be summarized in the following 4 simple steps:

  1. Select the right satellite data based on the Area of Interest (AOI) and scope of the project
  2. Select the product (generally TOA or BOA reflectance) based on the complexity of the project
  3. Access and download the satellite product dataset using a JSON file (for AOI) and Python script
  4. Implement an algorithm for indices for scene classification using Python

Accomplishing this with remote sensing required us to explore different satellites before settling on the Sentinel-2 satellite. These are my findings from using and processing Sentinel-2's products:

Satellites

While working with remote sensing data, selecting the right product source is essential. After all, there are various missions with either geostationary satellites or polar-orbiting satellites. The selection of the right satellite data and product will have a direct bearing on the outcome of the research and prediction model. 

Landsat 8/9

Landsat 8 and Landsat 9 work as a satellite constellation. These two nearly identical satellites work in tandem as a system. Each satellite covers the Earth completely every 16 days, and their orbits are offset so that together they cover the Earth completely every 8 days. The products provided by these satellites are split into 11 bands from two different sensors, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), as listed below.


Courtesy usgs.gov

Landsat 8 and 9 data come in 3 levels: Level-1 is the least processed, while Level-2 and Level-3 are processed, "cleaner" products better suited to most use cases.

The products provided by the satellites are additionally given in a .TIF format that is easy to use and process in Python.

Sentinel-3

Primarily created for marine observation, the Sentinel-3 satellites study sea-surface topography, sea and land surface temperature, and ocean and land color. The primary instrument on the Sentinel-3 satellites is a radar altimeter, and the satellites also carry optical imagers. The Sentinel-3 satellites are most useful for measuring sea, ice, and land surface temperature. Also of note are Sentinel-3's superior capabilities in detecting and observing fires.

Sentinel-3's bands from the European Space Agency, with 6 new bands in addition to those it shares with the MERIS mission.

 

Similar to the Landsat 8/9 products, the products from the Sentinel-3 satellites are provided in a .TIF format that makes them easy to process using Python. However, the focus of the Sentinel-3 mission is not well suited to observing algae and water presence in inland water bodies.

Sentinel-2

The Sentinel-2 mission's main objective is land monitoring, including inland water bodies, making it a prime candidate for observing algae coverage and water presence. The mission consists of two satellites, Sentinel-2A and Sentinel-2B, operating in a twin configuration that gives the mission a short revisit time of 5 days. The Sentinel-2 mission outputs 3 sets of products in 10m, 20m, and 60m resolutions. The swath of the satellites is extremely wide at 290 km, and the products offer 13 bands (listed below):


 

Sentinel-2's product bands from the European Space Agency. Note: 4 bands at 10m, 10 bands at 20m, and 11 bands at 60m.


 

Level-2A products are the most useful because atmospheric interference has already been processed out of them and their quality is high. Of note is that each product covers a large 100 km x 100 km tile and is usually larger than 1 GB in size. In addition, the metadata provided with the Sentinel-2 data set includes information such as cloud coverage and water presence.

Due to their large size, Sentinel-2 products are not delivered in a .TIF format but in a .JP2 format that must be converted into a .TIF file to be more easily analyzed. In general, Sentinel-2 products are more difficult to retrieve and process than Landsat 8/9 and Sentinel-3 products, but they offer better resolution and revisit time, and their data is more relevant to detecting algae presence in inland water bodies.

Methodology: Processing Sentinel-2 Products

Locating and Retrieving your Area of Interest

The first step in using the Sentinel-2 products is retrieving the data from Sentinel-2's Copernicus database. Based on your Area of Interest (AOI), go to https://geojson.io/#map and highlight the area you want to analyze. Then export it as a GeoJSON file (a format for encoding geographic data structures). This file holds the boundary information of the AOI selected on the map.

There are two prerequisites before downloading Sentinel-2 L2A products and data.

  1. Credentials for accessing the Copernicus Open Access Hub - You will need to create an account at https://scihub.copernicus.eu to access the Sentinel-2 products. Take note of your account name and password, as your code will need them to access the Sentinel-2 data.
  2. Essential Python libraries for working with satellite images - You will need geopandas, sentinelsat, rasterio, GDAL, pyproj, Fiona, and NumPy. You can use conda or pip to install packages, libraries, and dependencies. Searching on Google can provide the right installation steps for your Python version. I recommend using Python 3.8.13 or higher.

Use geopandas to read the GeoJSON file created earlier and extract the boundary information from it. Call geopandas.read_file() with the path to your GeoJSON file as input and store the result in a variable, which I will refer to as boundary. Using a for loop, you can then build a list with the coordinates of your area of interest. The next step is to access the satellite data.
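To make the coordinate-extraction step concrete, here is a minimal sketch. It is not the code from our project: it swaps geopandas for Python's built-in json module so that it runs without extra installs, and the rectangle around Sacramento is a made-up AOI for illustration. With geopandas, geopandas.read_file() would give you the same polygon in the geometry column of the returned table.

```python
import json

# Hypothetical AOI: a rough rectangle around Sacramento, CA, in the shape
# geojson.io exports (a FeatureCollection holding one Polygon feature).
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-121.7, 38.4], [-121.3, 38.4], [-121.3, 38.7],
          [-121.7, 38.7], [-121.7, 38.4]
        ]]
      }
    }
  ]
}
"""


def boundary_coordinates(geojson_text):
    """Return the outer ring of the first feature as (lon, lat) tuples."""
    collection = json.loads(geojson_text)
    outer_ring = collection["features"][0]["geometry"]["coordinates"][0]
    coords = []
    for lon, lat in outer_ring:  # the "for loop" step described above
        coords.append((lon, lat))
    return coords


def to_wkt_polygon(coords):
    """Format the ring as a WKT polygon string, a common footprint format."""
    points = ", ".join(f"{lon} {lat}" for lon, lat in coords)
    return f"POLYGON(({points}))"


boundary = boundary_coordinates(SAMPLE_GEOJSON)
footprint = to_wkt_polygon(boundary)
```

The ring repeats its first point at the end, so a rectangle drawn on the map yields five coordinate pairs rather than four.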

Sentinelsat provides SentinelAPI for easy access to satellite data.

Create an instance of SentinelAPI with your username, your password, and 'https://scihub.copernicus.eu/dhus' as inputs. This allows us to search for Sentinel-2 products directly by passing our four coordinates, date range, satellite, processing level, and cloud cover percentage to the .query method. An example of this is listed below:
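As a rough sketch of such a query (not the exact code from our project), the snippet below collects the search inputs in one place and then hands them to sentinelsat. The footprint, date range, and cloud limit are placeholder values I made up, and the helper names build_query and search_products are my own. sentinelsat is imported inside the function so the rest of the snippet runs even where the library is not installed.

```python
def build_query(footprint_wkt, start_date, end_date, max_cloud_percent):
    """Collect the search inputs as keyword arguments for SentinelAPI.query."""
    return {
        "area": footprint_wkt,                # AOI footprint as a WKT polygon
        "date": (start_date, end_date),       # "YYYYMMDD" strings
        "platformname": "Sentinel-2",         # satellite
        "processinglevel": "Level-2A",        # atmospherically corrected product
        "cloudcoverpercentage": (0, max_cloud_percent),
    }


def search_products(username, password, query):
    """Run the search; needs sentinelsat, valid credentials, and a connection."""
    from sentinelsat import SentinelAPI  # imported here so the sketch loads without it

    api = SentinelAPI(username, password, "https://scihub.copernicus.eu/dhus")
    return api.query(**query)


# Placeholder inputs for illustration only.
query = build_query(
    "POLYGON((-121.7 38.4, -121.3 38.4, -121.3 38.7, -121.7 38.7, -121.7 38.4))",
    "20220601",
    "20220630",
    30,
)
# products = search_products("your_username", "your_password", query)
```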


For a smaller range of dates, use a higher range of cloud percentages (shown above). If you are searching a longer range of dates (1 year+), you can lower that percentage significantly, as there is a higher likelihood of finding a product that meets those requirements.

Running this will output a list of products available to download. The list provides a wide range of metadata, such as dates, orbit direction, and which of the Sentinel-2 satellites captured the product. You will notice that certain products have SEH or SFH in their granuleidentifier. As your AOI boundaries may not exactly match how tiles are laid out for Sentinel-2 products, it’s important to find the “best fit” tile for the project. In our case, 10SFH provides maximum coverage for our AOI, with a thin strip falling within 10SEH.

TIPS - To determine which tile best fits your area of interest, you will need to install Google Earth Pro and download the Sentinel-2 orbit path (link here). There you can go to your area of interest and determine the best-fit tile for your AOI.
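Once you know your best-fit tile, you can narrow the search results down to it. The sketch below assumes the results are a dictionary keyed by UUID whose entries carry a "title" field, and that the tile code appears in the title as "_T10SFH_", which is the usual Sentinel-2 naming pattern. The UUIDs and titles are invented for illustration, and filter_by_tile is my own helper name.

```python
def filter_by_tile(products, tile_id):
    """Keep only the products whose title contains the given tile code."""
    tag = f"_T{tile_id}_"
    return {
        uuid: info
        for uuid, info in products.items()
        if tag in info["title"]
    }


# Invented search results: same date, two neighbouring tiles.
sample_products = {
    "uuid-0001": {"title": "S2A_MSIL2A_20220615T184931_N0400_R113_T10SEH_20220616T000000"},
    "uuid-0002": {"title": "S2A_MSIL2A_20220615T184931_N0400_R113_T10SFH_20220616T000000"},
}

best_fit = filter_by_tile(sample_products, "10SFH")
```

The keys of best_fit are the UUIDs you would pass on to the download step.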

To download your desired product, call .download() on your SentinelAPI instance with the UUID from the table as your input. Once the product has downloaded as a zip file, extract all sub-directories and files into your data folder.  File structure information is available here.
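The download and extraction steps might look something like the sketch below. The extraction part uses only Python's built-in zipfile module. The download part assumes an existing SentinelAPI instance and, to the best of my understanding, that the dictionary returned by .download() includes the saved file's location under "path"; treat that detail as an assumption and check it against the sentinelsat documentation. Both helper names are my own.

```python
import zipfile
from pathlib import Path


def download_product(api, uuid, download_dir):
    """Download one product by UUID using an existing SentinelAPI instance."""
    info = api.download(uuid, directory_path=str(download_dir))
    return Path(info["path"])  # assumed key holding the zip file location


def extract_product(zip_path, data_dir):
    """Unzip a product into data_dir and return its top-level folders."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return [data_dir / name for name in top_level]


# Example usage (requires credentials and a UUID from your search results):
# zip_path = download_product(api, "your-product-uuid", "downloads")
# product_dirs = extract_product(zip_path, "data")
```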

Images for each band are provided in JPEG 2000 (.JP2) format. Files are located in the folder “GRANULE”, with a subfolder for each spatial resolution (10m, 20m, and 60m). For our project, we worked with a 20-meter spatial resolution.
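Finding a band file inside the extracted product can be done with a simple search. This sketch assumes the extracted folder contains GRANULE, then one granule folder, then IMG_DATA, then a folder per resolution (R10m, R20m, R60m), and that band file names end in something like "_B05_20m.jp2". The function name find_band_file is my own.

```python
from pathlib import Path


def find_band_file(product_dir, band, resolution="20m"):
    """Return the path of one band's .jp2 file at the given resolution."""
    pattern = f"GRANULE/*/IMG_DATA/R{resolution}/*_{band}_{resolution}.jp2"
    matches = sorted(Path(product_dir).glob(pattern))
    if not matches:
        raise FileNotFoundError(
            f"No {band} file at {resolution} found under {product_dir}"
        )
    return matches[0]


# Example usage with a placeholder folder name:
# band5_path = find_band_file("data/YOUR_PRODUCT.SAFE", "B05")
```

Raising an error when nothing matches makes a wrong folder or band name obvious right away instead of failing later during processing.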

It is recommended that you use the 20-meter resolution for analysis, as it provides the most relevant bands while still offering a high resolution. The 10-meter resolution provides very few bands, and the 60-meter resolution is far less precise, offers only a single band more than the 20-meter resolution, and is recommended only for specialized cases. The path to the data is: GRANULE>>L2A FILE>>IMG_DATA>>R20m>>the specific band you want to access (B01, B02, B03...). After opening a band with rasterio, you can get its metadata with .meta and use .read() to read the data into a variable. Then write the data into a new file using rasterio and the metadata that was saved (example shown below). Notice that band 5's data must be changed to a 32-bit float.
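A minimal sketch of this conversion, under the assumption that rasterio is installed with JPEG 2000 support, is given below. It is not the exact code from our project: the helper names are my own, and rasterio is imported inside the function so the metadata helper can be tried on its own.

```python
def geotiff_profile(jp2_meta, dtype="float32"):
    """Copy the band's metadata and switch it to GeoTIFF with a new data type."""
    profile = dict(jp2_meta)  # copy, so the original metadata stays untouched
    profile.update(driver="GTiff", dtype=dtype)
    return profile


def jp2_to_geotiff(jp2_path, tif_path, dtype="float32"):
    """Read one band from a .jp2 file and write it out as a .tif file."""
    import rasterio  # imported here so the sketch loads without rasterio

    with rasterio.open(jp2_path) as src:
        meta = src.meta                      # size, CRS, transform, dtype, ...
        data = src.read(1).astype(dtype)     # first band, e.g. as 32-bit float

    with rasterio.open(tif_path, "w", **geotiff_profile(meta, dtype)) as dst:
        dst.write(data, 1)
    return tif_path


# Example usage with placeholder file names:
# jp2_to_geotiff("T10SFH_..._B05_20m.jp2", "B05_20m.tif")
```

Converting to a 32-bit float before writing means later band math, such as ratios between bands, is not cut off by integer arithmetic.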


I will be sharing the code on GitHub after we finish the project. I hope this blog helps demystify working with satellite images, and more specifically with Sentinel-2 products.


About the author: Ishaan is a high school student in New Jersey. Besides being a robotics enthusiast, he loves playing tennis, following Formula 1 races, and painting. His virtual internship is part of a collaboration between the Institute for Global Environmental Strategies (IGES) and the NASA Texas Space Grant Consortium (TSGC) to extend the TSGC Summer Enhancement in Earth Science (SEES) internship for US high school students (http://www.tsgc.utexas.edu/sees-internship/). Ishaan shared his experience this summer in this blog post.
