10  Processing imagery

This chapter is in early draft form and may be incomplete.

10.1 Overview

After selecting appropriate satellite imagery (as discussed in Chapter 9), the next step is accessing and processing that imagery for use with MOSAIKS. This chapter covers key considerations and practical approaches for working with satellite data.

10.2 Accessing satellite imagery

10.2.1 Cloud vs local

There are several cloud-based platforms that provide access to a wide range of satellite imagery and geospatial datasets. These platforms offer scalable computing resources, pre-configured environments, and often include common datasets. Some popular cloud platforms for satellite imagery processing include:

These platforms offer several advantages, including:

  • No need to download raw imagery
  • Scalable computing resources
  • Pre-configured environments
  • Often include common datasets
  • Can download to local or cloud storage

NASA Earthdata Cloud Cookbook

10.2.1.1 Google Earth Engine

Google Earth Engine provides a vast catalog of satellite imagery and geospatial datasets that can be accessed through a Python API. The GEE API is based on the Earth Engine Python API, which allows you to interact with the Earth Engine servers and run geospatial analyses in the cloud.

10.2.1.2 Microsoft Planetary Computer

Microsoft Planetary Computer provides a multi-petabyte catalog of global environmental data that can easily be accessed through APIs. The MPC API is based on the STAC (SpatioTemporal Asset Catalog) specification, which is an emerging open standard for geospatial data that aims to increase the interoperability of geospatial data, particularly satellite imagery.

For more information on using the Microsoft Planetary Computer API, see Reading Data from the STAC API.

For practical guidance on accessing the Microsoft Planetary Computer Data Catalog through their API,see the Master of Environmental Data Science (MEDS) course EDS 220 section on the STAC specification

10.2.2 Planet API

  • NICFI is free
    • Sign up at NICFI
    • Access to Visual (RGB) and Analytic (RGB+NIR) basemaps
  • Planet is paid
    • Sign up at Planet
    • API access to PlanetScope, RapidEye, SkySat, Dove, and basemap imagery
    • Institutional licensing available

10.2.2.1 Sentinel Hub

Sentinel Hub is a cloud-based platform that provides access to a wide range of satellite imagery, including data from the Sentinel-1, Sentinel-2, and Landsat missions. The Sentinel Hub API allows you to access and process satellite imagery using a simple Python interface.

10.2.3 Sentinelsat python API

The sentinelsat Python API allows you to search and download satellite imagery from the Copernicus Open Access Hub. The API provides a simple interface for searching and downloading Sentinel-1 and Sentinel-2 imagery, as well as other Copernicus data products.

10.3 Understanding the data

Once you have decided how to acquire satellite imagery, the next step is to make sure that you understand the data so you can make sure to process it correctly. Often the best resource is to consult the data provider’s documentation. Aside from that, the best way to understand the data is to open it up and explore it.

For hands on experience with satellite imagery, skip to the next chapter where you can access NICFI imagery in a Google Colab notebook (Chapter 11).

10.3.1 Storage formats

Satellite imagery can be stored in various formats, including:

  • GeoTIFF
  • NetCDF
  • HDF5
  • Zarr

There are several python libraries that can be used to read and write these formats, such as rasterio, xarray, rioxarray, and h5py.

10.3.2 Metadata

  • Data types (INT8, UINT16, FLOAT32)
  • Coordinate reference system (CRS)
  • Spectral bands (Red, Green, Blue, Near-Infrared, etc.)
  • Spatial resolution (scene size, pixel size)
  • Acquisition date (single or multiple dates)
  • Cloud cover
  • Data quality

10.3.3 Visualization

There are many considerations for visualizing satellite imagery. We will cover some of the most common ones here.

10.3.3.1 Normalization

10.3.3.2 True color

10.3.3.3 False color

10.3.3.4 Sharpening

10.3.4 Calculate Indices

Depending on your application, a good place to start is calculating vegetation or other indices. There are many to choose from and it takes some domain knowledge to know if one would be useful for your application. In agriculture, there are several indices that are commonly used to monitor crop health, such as the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), and the Soil Adjusted Vegetation Index (SAVI).

Normalized Difference Vegetation Index (NDVI) is a simple and common vegetation index that uses the difference between near-infrared (NIR) and red bands to quantify vegetation health.

Figure 10.1: NDVI values animated over Africa. Made with Google Earth Engine Python API.

10.4 Processing satellite imagery

There are two main approaches to accessing and processing satellite imagery:

10.4.1 Cloud-based platforms

Modern cloud platforms offer several advantages for satellite image processing:

  • No need to download raw imagery
  • Scalable computing resources
  • Pre-configured environments
  • Often include common datasets
  • Pay only for what you use

Popular platforms include:

  • Google Earth Engine
  • Microsoft Planetary Computer
  • Amazon Web Services
  • Planet Platform
  • Euro Data Cube

10.4.2 Local processing

Traditional local processing may be preferred when:

  • Internet connectivity is limited
  • Data privacy is paramount
  • Working with proprietary algorithms
  • Computing needs are modest
  • Frequent reuse of same imagery

Consider these factors when choosing:

  • Data volume
  • Processing complexity
  • Budget constraints
  • Time requirements
  • Security needs

10.5 Processing steps

10.5.1 Quality assessment

  • Cloud detection
  • Shadow masking
  • Bad pixel identification
  • Sensor artifacts removal

10.5.2 Atmospheric correction

  • Convert to surface reflectance
  • Account for atmospheric effects
  • Standardize across images

10.5.3 Geometric correction

  • Orthorectification
  • Co-registration
  • Projection alignment

10.5.4 Mosaicking

  • Image stitching
  • Feathering/blending
  • Gap filling
  • Color balancing

10.5.5 Temporal compositing

  • Best-pixel selection
  • Weighted averaging
  • Gap filling
  • Smoothing

10.6 Best practices

  1. Document everything
    • Processing steps
    • Parameter choices
    • Quality control decisions
    • Software versions
  2. Validate outputs
    • Visual inspection
    • Statistical checks
    • Ground truth comparison
    • Cross-platform verification
  3. Optimize resources
    • Batch processing
    • Parallel computing
    • Memory management
    • Storage efficiency
  4. Version control
    • Track code changes
    • Archive key datasets
    • Document dependencies
    • Maintain reproducibility

10.7 Common challenges

10.7.1 Storage requirements

  • Raw imagery can be massive
  • Multiple processing steps multiply storage needs
  • Intermediate products management
  • Backup considerations

10.7.2 Computing resources

  • Processing can be CPU/GPU intensive
  • Memory limitations
  • I/O bottlenecks
  • Network bandwidth

10.7.3 Quality issues

  • Cloud contamination
  • Atmospheric effects
  • Sensor artifacts
  • Geometric distortions

10.7.4 Time management

  • Processing can be slow
  • Download times
  • Quality checking
  • Iteration cycles

10.8 Future considerations

As you develop your processing pipeline, consider:

  • Scalability needs
  • Automation opportunities
  • Quality control requirements
  • Resource constraints
  • Time limitations
Looking forward

In the next section, we will load, visualize, and explore NICFI satellite imagery in Google Colab.