2 What is MOSAIKS? – MOSAIKS Training Manual

2.1 The challenge

Right now, numerous public satellite systems collect huge amounts of data about the world every day. But there is so much imagery (terabytes per day) that it’s overwhelming to sort through by hand, and it’s too complex and unstructured to be usable in its raw form for most applications.

Figure 2.1: Visual representation of satellites orbiting earth.

That is why linking satellite imaging to machine learning (sometimes referred to as SIML or SatML) is incredibly powerful. It enables vast amounts of unstructured image data to be transformed into structured information that can be used for planning, research, and decision-making.

Our hope is that people all over the world can access and use SIML technologies, but we recognize that many who would benefit from these tools don’t have the time or resources to manage enormous satellite imagery data sets and learn how to apply machine learning to them.

2.2 The solution

That’s why we developed MOSAIKS. MOSAIKS aims to lower the barriers to entry into SIML, diversifying the users of this powerful technology and the problems we solve with it.

MOSAIKS is designed to work “out of the box” for a wide array of SIML applications, for people with no SIML expertise who work on normal desktop or laptop computers. For many applications, MOSAIKS users never have to touch satellite imagery themselves and only need to have basic statistical training.

If you can run a regression, you can use MOSAIKS!

MOSAIKS empowers users to create their own new datasets from satellite imagery. We don’t control what variables users look at, and we never need to know. MOSAIKS is a system that allows users to quickly transform vast amounts of imagery into maps of new variables, using their own training data.

If you’ve ever been curious about trying machine learning with satellite imagery, but don’t know anything about machine learning or satellite imagery, MOSAIKS is for you.

And if you know a lot about machine learning and satellite imagery, MOSAIKS might still be for you, since it performs competitively with deep learning methods but is much simpler and cheaper to use.

Figure 2.2: Traditional framework of deep learning models (i.e., machine learning with artificial neural networks) applied to imagery. In this example, the model is attempting to classify what vehicle the image contains.

2.3 How MOSAIKS works

2.3.1 Separating users from imagery

The basic idea of MOSAIKS is to separate users from the costly and difficult process of transforming imagery into inputs (called “features”) to a downstream machine learning algorithm (images → X). The MOSAIKS team has computed these features globally, so in many use cases users never have to download or manage imagery themselves. Instead, users download a table of MOSAIKS features (X), link them to their own geocoded data on the outcome (Y) they are interested in predicting from satellite imagery (we call these data “labels”), and then run a linear regression (or something fancier if desired!) to predict their labels using MOSAIKS features (Y = Xβ). Importantly, this prediction can be performed in locations, time periods, and at spatial resolutions for which labels are not available.
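The linking step described above can be sketched in a few lines. This is a minimal illustration, not the MOSAIKS tooling itself: the coordinates, feature values, label values, and the `merge_by_location` helper are all hypothetical, and the sketch assumes both tables use exactly the same grid coordinates.

```python
# Hypothetical feature table: one row per grid cell, keyed by (lat, lon).
# Real feature tables have thousands of columns; three are shown here.
features = {
    (10.005, 20.005): [0.12, 0.40, 0.33],
    (10.005, 20.015): [0.10, 0.45, 0.31],
    (10.015, 20.005): [0.52, 0.08, 0.77],
    (10.015, 20.015): [0.49, 0.11, 0.80],
}

# Hypothetical user labels (for example, forest cover fraction),
# available for only some of the grid cells.
labels = {
    (10.005, 20.005): 0.81,
    (10.015, 20.005): 0.07,
}


def merge_by_location(features, labels):
    """Split feature rows into labeled rows (for training) and unlabeled rows (to predict)."""
    train = {"locations": [], "X": [], "y": []}
    unlabeled = {"locations": [], "X": []}
    for loc in sorted(features):
        if loc in labels:
            train["locations"].append(loc)
            train["X"].append(features[loc])
            train["y"].append(labels[loc])
        else:
            unlabeled["locations"].append(loc)
            unlabeled["X"].append(features[loc])
    return train, unlabeled


train, unlabeled = merge_by_location(features, labels)
print(len(train["X"]), "labeled cells;", len(unlabeled["X"]), "cells to predict")
```

The labeled rows are what the regression is trained on; the unlabeled rows are where predictions can later be made from features alone. A table join in any statistical software achieves the same result.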

Figure 2.3: This false-color image shows snow-capped peaks and ridges of the eastern Himalayas between major rivers in southwest China. The Himalayas are made up of three parallel mountain ranges that together stretch for more than 1,800 miles (2,900 kilometers). This particular image was taken by NASA’s Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), flying aboard the Terra satellite, on February 27, 2002. The picture is a composite made by combining near-infrared, red and green wavelengths. (Source: NASA)

2.3.2 Generalizability of MOSAIKS

Because MOSAIKS features summarize the information contained in raw imagery without being tailored to any specific outcome (e.g., biodiversity, household income, land use), many users can use the same MOSAIKS features and simply match them to their own labels based on location. Users can run their analysis in any statistical software they are comfortable with. For most applications, the computing demands do not require specialized machines; an ordinary desktop or laptop is sufficient.

Figure 2.4: MOSAIKS is designed to solve an unlimited number of tasks at planet-scale quickly. After a one-time unsupervised image featurization using random convolutional features, MOSAIKS centrally stores and distributes task-agnostic features to users, each of whom generates predictions in a new context. A Satellite imagery is shared across multiple potential tasks. B Schematic of the MOSAIKS process. N images are transformed using random convolutional features into a compressed and highly descriptive K-dimensional feature vector before labels are known. Once features are computed, they can be stored in tabular form (matrix X) and used for unlimited tasks without recomputation. Users interested in a new task (s) merge their own labels (y^s) to features for training. Here, user 1 has forest cover labels for locations p + 1 to N and user 2 has population density labels for locations 1 to q. Each user then solves a single linear regression for β^s. Linear prediction using β^s and the full sample of MOSAIKS features X then generates SIML estimates for label values at all locations. Generalizability allows different users to solve different tasks using an identical procedure and the same table of features—differing only in the user-supplied label data for training. Each task can be solved by a user on a desktop computer in minutes without users ever manipulating the imagery. C Illustration of the one-time unsupervised computation of random convolutional features. K patches are randomly sampled from across the N images. Each patch is convolved over each image, generating a nonlinear activation map for each patch. Activation maps are averaged over pixels to generate a single K-dimensional feature vector for each image. (Source: Rolf et al. 2021 Figure 1)
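To make panel C of the figure concrete, here is a rough sketch of random convolutional features in NumPy. It is an illustration under stated assumptions rather than the MOSAIKS implementation: the function name, the tiny image and patch sizes, and the choice of ReLU as the nonlinearity are assumptions of this sketch, and any preprocessing used in the real pipeline is omitted.

```python
import numpy as np


def random_conv_features(images, num_features=8, patch_size=3, seed=0):
    """Turn N images of shape (N, H, W, C) into an (N, K) feature matrix."""
    rng = np.random.default_rng(seed)
    n, h, w, c = images.shape
    out_h, out_w = h - patch_size + 1, w - patch_size + 1

    # Sample K patches at random positions from randomly chosen images.
    img_idx = rng.integers(0, n, size=num_features)
    rows = rng.integers(0, out_h, size=num_features)
    cols = rng.integers(0, out_w, size=num_features)
    filters = np.stack([
        images[i, r:r + patch_size, col:col + patch_size, :]
        for i, r, col in zip(img_idx, rows, cols)
    ])  # shape (K, patch, patch, C)

    # Slide every patch over every image (cross-correlation, no padding),
    # accumulating one patch offset at a time.
    activations = np.zeros((n, out_h, out_w, num_features))
    for di in range(patch_size):
        for dj in range(patch_size):
            shifted = images[:, di:di + out_h, dj:dj + out_w, :]  # (N, out_h, out_w, C)
            activations += shifted @ filters[:, di, dj, :].T      # (C, K) weights

    # Nonlinear activation (ReLU, an assumption of this sketch),
    # then average each activation map over pixels.
    activations = np.maximum(activations, 0.0)
    return activations.mean(axis=(1, 2))


# Five hypothetical 16 x 16 three-band tiles with centered pixel values.
rng = np.random.default_rng(42)
images = rng.random((5, 16, 16, 3)) - 0.5
X = random_conv_features(images, num_features=8)
print(X.shape)  # (5, 8)
```

No labels are used anywhere in this computation, which is why the resulting table X can be computed once and shared across tasks.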

2.3.3 Why it works

MOSAIKS works because MOSAIKS features capture a huge amount of information about the colors, patterns and textures that show up in satellite imagery. We don’t know what patterns/colors/textures will be important for the application that users have (since we don’t know what applications users will try), so we just try to capture all of them. The purpose of the regression step is to teach the model which patterns/colors/textures predict the labels, and then to use that understanding to make predictions in locations where users don’t have labels. In addition, MOSAIKS encodes image information in a way that allows for nonlinear relationships between labels and imagery data, even though the regression that users generally implement is a linear regression.

Figure 2.5: Example of 4 (of 4,000) MOSAIKS feature maps (right) computed from satellite imagery (left). These features were chosen at random from what is available on the MOSAIKS API.

To learn more about these features, see Chapter 12, where we attempt to provide intuition for what a feature is and how it is made.

2.3.4 Five steps to using MOSAIKS

This section is a very broad overview of the steps to use MOSAIKS. Later chapters will provide more detailed guidance on each step.

In many cases, users aiming to predict an outcome from satellite imagery can do so using pre-computed imagery features (X) in a simple linear regression framework. Later in this training course, we will detail more customized workflows that remain tractable but allow for more flexibility. In the standard case, however, the procedure for using MOSAIKS has five steps (corresponding figure from Rolf et al. is below):

Download pre-computed MOSAIKS features (X) corresponding to the locations where you have labels (Chapter 3).
Merge the features with your labels (Y) based on location (so features at position P are linked to labels at position P) (Chapter 7).
Run a cross-validated ridge regression of your labels on the MOSAIKS features (Y = Xβ + e; or any other model you choose! See the section on model choice).
Evaluate performance.
Use the results of the regression model (β) to make predictions (Xβ) in a new region of interest where you do not have labels, using only the MOSAIKS features (X) that correspond with those new locations.
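Steps 3 to 5 above can be sketched with NumPy alone. This is a minimal sketch, not the workflow used in later chapters: the data are synthetic stand-ins for the merged features and labels from steps 1 and 2, and the helper names, the penalty grid, and the 80/20 hold-out split are assumptions made for illustration.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta


def predict(X, beta, intercept):
    return X @ beta + intercept


def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def choose_lambda(X, y, lambdas, n_folds=5, seed=0):
    """Return the penalty with the best mean held-out R² across folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    mean_scores = []
    for lam in lambdas:
        scores = []
        for k in range(n_folds):
            val = folds[k]
            train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            beta, b0 = fit_ridge(X[train], y[train], lam)
            scores.append(r2_score(y[val], predict(X[val], beta, b0)))
        mean_scores.append(np.mean(scores))
    return lambdas[int(np.argmax(mean_scores))]


# Synthetic stand-ins for the output of steps 1 and 2 (not real MOSAIKS data).
rng = np.random.default_rng(0)
X_labeled = rng.random((200, 20))            # features where labels exist
true_beta = rng.normal(size=20)
y_labeled = X_labeled @ true_beta + 0.1 * rng.normal(size=200)
X_new = rng.random((50, 20))                 # features where no labels exist

# Hold out 20% of the labeled data for the final evaluation.
X_train, y_train = X_labeled[:160], y_labeled[:160]
X_test, y_test = X_labeled[160:], y_labeled[160:]

# Step 3: cross-validated ridge regression.
lambdas = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
best_lam = choose_lambda(X_train, y_train, lambdas)
beta, intercept = fit_ridge(X_train, y_train, best_lam)

# Step 4: evaluate performance on data the model has not seen.
test_r2 = r2_score(y_test, predict(X_test, beta, intercept))

# Step 5: predict in new locations using features only.
y_new_hat = predict(X_new, beta, intercept)
print(f"lambda={best_lam}, test R2={test_r2:.3f}, predictions={y_new_hat.shape}")
```

The penalty is chosen only from the training folds, so the held-out R² is an honest estimate of how well the model predicts where labels are missing.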

2.4 What can MOSAIKS predict?

This question is answered in greater detail in Chapter 5.

MOSAIKS has been successfully used to predict a wide range of outcomes including:

Environmental conditions (forest cover, elevation)
Population patterns (density, nighttime lights)
Economic indicators (income, house prices)
Infrastructure (road networks)

The figure below is from the original MOSAIKS publication (Rolf et al. 2021). The left maps show the input labels. The right maps show the modeled predictions. The scatter plots show the modeled predictions against the true labels and report the coefficient of determination (R²) as a measure of performance.
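For reference, the coefficient of determination is usually defined as follows. The symbols are introduced here for illustration: y_i is the true label at location i, ŷ_i is the model’s prediction, and ȳ is the mean of the true labels. We assume the standard definition applies.

```latex
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
```

A value of 1 means the predictions match the labels exactly, while a value of 0 means the model does no better than always predicting the mean label.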

Figure 2.6: 100,000 daytime images were each converted to 8,192 features and stored. Seven tasks were then selected based on coverage and diversity. Predictions were generated for each task using the same procedure. Left maps: 80,000 observations used for training and validation, aggregated up to 20 km × 20 km cells for display. Right maps: concatenated validation set estimates from 5-fold cross-validation for the same 80,000 grid cells (observations are never used to generate their own prediction), identically aggregated for display. Scatters: Validation set estimates (vertical axis) vs. ground truth (horizontal axis); each point is a ~1 km × 1 km grid cell. Black line is at 45°. Test-set and validation set performance are essentially identical; validation set values are shown for display purposes only, as there are more observations. The tasks in the top three rows are uniformly sampled across space; the tasks in the bottom four rows are sampled using population weights; grey areas were not sampled in the experiment. (Source: Rolf et al. 2021 Figure 2)

Importantly, all these predictions use the same set of satellite features; there’s no need to reprocess the imagery for different tasks. MOSAIKS achieves accuracy comparable to more complex deep learning methods, but at a fraction of the computational cost. This is the power of MOSAIKS: once the imagery has been encoded, it never has to be processed again.

2.5 Is MOSAIKS always the best choice?

No! MOSAIKS is a powerful tool, but it is not the best choice for every application. In fact, it is usually not the “best” choice for any single application. We aim to be competitive with leading models, so the true benefit of MOSAIKS is in its simplicity, accessibility, and scalability for the average user.

We recommend that you start by searching for existing methods developed for your application, before investing time and resources into MOSAIKS. An excellent place to begin this search is at satellite-image-deep-learning where you can find a list of deep learning methods that have been developed for satellite imagery, as well as existing datasets, tools, and tutorials.

The world of SIML is vast and rapidly evolving. This means there is a good chance you do not have to make global-scale predictions yourself. Instead, you might be able to use or build off the hard work of many others in the field.

If you have a specific context where you want tailored information or a variable/outcome no one else has predicted before, then you want MOSAIKS. Not only will MOSAIKS allow you to make predictions in a new context, but it will also allow you to do so quickly and with minimal computational resources.

2.6 Lecture materials

Lecture slides.

2.7 Summary

MOSAIKS is a powerful tool that allows users to predict a wide range of outcomes from satellite imagery using pre-computed features. The system is designed to be accessible to users with no prior experience in machine learning or satellite imagery. The MOSAIKS framework involves five simple steps:

Download features
Merge with labels
Run a regression
Evaluate performance
Make predictions

In this book we will explore all the ways in which this is an oversimplification. You will learn to adapt this framework to your own needs, and to understand the limitations and assumptions of the MOSAIKS system. Many skills presented in this training manual will be applicable to other satellite imagery and machine learning workflows.

Looking forward

In the next chapter, you will be introduced to the MOSAIKS API, which is a free and open resource for accessing pre-computed MOSAIKS features.