3 Access MOSAIKS
This chapter is under review and may need revisions.
3.1 Introduction
At its core, MOSAIKS requires two main inputs: satellite features and ground truth data. Our aim is to make these features as accessible as possible so that the majority of users do not have to worry about the technical details of satellite imagery processing.
To this end, we have worked to develop multiple ways to access MOSAIKS features:
Option | Imagery Source | Spatial Coverage | Spatial Resolution | Temporal Resolution | Weighting |
---|---|---|---|---|---|
MOSAIKS API | Planet Labs Visual Basemap | Global land areas | 0.01° | 2019 Q3 | Unweighted |
MOSAIKS API | Planet Labs Visual Basemap | Global land areas | 0.1°, 1° | 2019 Q3 | Area & population |
MOSAIKS API | Planet Labs Visual Basemap | Global land areas | ADM0, ADM1, ADM2 | 2019 Q3 | Area & population |
Rolf et al. 2021 | Google Static Maps | Continental United States (~100k locations) | 0.01° | 2019 | Unweighted |
Chapter 14 | Any - see Chapter 9 | User-defined | User-defined | User-defined | User-defined |
The MOSAIKS API should be considered the primary way to access features. It is a user-friendly interface that allows you to download features for any land location on Earth. The API is designed to be accessible to users with a range of technical backgrounds, from beginners to experts. For more details on what features are available on the API, see Chapter 13.
However, there are many settings where users will want or need to compute their own customized features. Chapter 14 guides readers through this process.
3.2 MOSAIKS API
The MOSAIKS API lets you download precomputed features for any land location on Earth. To use the API, you will need to register for an account.
3.2.1 Register for an account
Visit api.mosaiks.org.

Select Register to create an account. You will need to provide a username, an email, and a password.

Once registered, you can log in to begin downloading MOSAIKS features.

3.2.2 API resources
From the landing page, you can read additional information about MOSAIKS and access resources to help you get started.
This book is developed to provide you with all the information you need to use MOSAIKS.
The API contains the following pages:
Page | Description |
---|---|
Home | Landing page for the API. Contains general information about using MOSAIKS and the API |
Precomputed Files | Precomputed features at administrative boundary scales |
HDI | Global Human Development Index (HDI) estimates at the municipality and grid levels |
Global Grids | Precomputed area- or population-weighted features at 0.1° and 1° resolution |
Map Query | Precomputed features at 0.01 degree resolution, user defines bounding box over area of interest |
File Query | Precomputed features at 0.01 degree resolution, user uploads file with latitude and longitude coordinates |
My Files | Files you have queried from the API, available to download |
Resources | Example Python and R notebooks for using the MOSAIKS framework |
3.2.3 API features
Currently the MOSAIKS API has a single set of global features (with several aggregation levels). The features are freely available to the public for download; this is the fastest and easiest way to begin using MOSAIKS.
The API features use input imagery from the Planet Labs, Inc. Visual Basemap Global Quarterly 2019 (quarter 3) product. Image quality, and therefore feature quality, may be affected by local conditions. For example, an area undergoing a rainy season in the third quarter (July to September) is likely to contain image artifacts from cloud cover, which will in turn affect the feature calculations. For more details on the input imagery, see Chapter 9.
Given the static nature of the API, the easiest way to get started with MOSAIKS is to have label data for a recent time period (ideally from 2019 for fast changing labels, or a close year for more steady labels).

Using MOSAIKS for time series data is possible and can work well; however, it currently requires computing your own custom features. See Chapter 14 and Chapter 18 for more information.
The MOSAIKS features are created using a 0.01 x 0.01 degree latitude-longitude global land grid. These are the native features available for download from the API, but it also offers pre-aggregated features at 0.1 and 1 degree resolution, as well as administrative boundaries (ADM0, ADM1, and ADM2).
3.2.4 High resolution features
The file query and map query are the two methods to obtain the high resolution (0.01 degree) features through the API. For simplicity, we store these features in a tabular format with latitude and longitude coordinates. These coordinates are the center of each grid cell.
When you download the high resolution features (0.01 degree), you will receive them in a tabular .csv format where:
- Each row (N) represents a unique grid cell
- The first two columns contain latitude and longitude coordinates (grid centroids)
- The remaining columns represent K features (currently K = 4000 features)
Note: There is a limit of N = 100,000 records per query
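The layout described above can be read with a few lines of Python. This is a minimal sketch using only the standard library; the sample rows and the column names (`lat`, `lon`, `X_0`, ...) are assumptions made for illustration, so check the header of your own download before relying on them.

```python
import csv
import io

# Hypothetical sample in the layout described above:
# latitude, longitude, then K feature columns (K = 3 here instead of 4000).
# The column names are assumptions, not the API's documented header.
SAMPLE_CSV = """lat,lon,X_0,X_1,X_2
10.005,20.005,0.12,0.50,0.03
10.005,20.015,0.10,0.47,0.08
10.015,20.005,0.15,0.52,0.01
"""

MAX_RECORDS_PER_QUERY = 100_000  # per-query limit stated in the text


def read_features(handle):
    """Split a feature table into grid-cell centroids and feature rows."""
    reader = csv.reader(handle)
    header = next(reader)
    coords, features = [], []
    for row in reader:
        coords.append((float(row[0]), float(row[1])))  # first two columns: centroid
        features.append([float(v) for v in row[2:]])   # remaining columns: features
    return header, coords, features


header, coords, features = read_features(io.StringIO(SAMPLE_CSV))
n_cells, n_features = len(features), len(features[0])
print(f"N = {n_cells} grid cells, K = {n_features} features")
```

For a real download, replace `io.StringIO(SAMPLE_CSV)` with an open file handle. With K = 4000 columns, a dataframe library is usually more convenient than the `csv` module.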
3.2.4.1 Map query
- Create rectangular boxes by specifying latitude and longitude coordinates
- Multiple boxes can be created
- The system displays an estimated number of records for each box
- Note that estimates are based on box area and may not reflect actual record numbers, especially for areas containing seas and oceans

Use geojson.io to find the bounding box coordinates for your area of interest.
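Before drawing boxes, it can help to check roughly how many 0.01 degree cells a box covers and whether it needs to be split to stay under the 100,000 record limit. The sketch below is an illustration only: the functions `estimate_records` and `split_box` are hypothetical helpers, not part of the API, and the area-based count ignores the fact that ocean cells return no records.

```python
import math

CELL_SIZE_DEG = 0.01             # native grid resolution
MAX_RECORDS_PER_QUERY = 100_000  # per-query limit stated in the text


def estimate_records(min_lat, max_lat, min_lon, max_lon):
    """Area-based upper bound on the number of grid cells in a box."""
    n_rows = round((max_lat - min_lat) / CELL_SIZE_DEG)
    n_cols = round((max_lon - min_lon) / CELL_SIZE_DEG)
    return n_rows * n_cols


def split_box(min_lat, max_lat, min_lon, max_lon):
    """Split a box into equal-width longitude strips of roughly limit size or less."""
    total = estimate_records(min_lat, max_lat, min_lon, max_lon)
    n_strips = max(1, math.ceil(total / MAX_RECORDS_PER_QUERY))
    width = (max_lon - min_lon) / n_strips
    return [
        (min_lat, max_lat, min_lon + i * width, min_lon + (i + 1) * width)
        for i in range(n_strips)
    ]


box = (0.0, 5.0, 30.0, 35.0)    # a 5 x 5 degree box: 500 x 500 = 250,000 cells
strips = split_box(*box)
for strip in strips:
    print(strip, estimate_records(*strip))
```

Each resulting strip can then be entered as a separate box in the map query.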
3.2.4.2 File query
- Submit a file with custom latitude and longitude coordinates
- The API returns features for grid cells closest to your input coordinates
- Points are allocated to the nearest grid point if they don’t exactly match
- The output file may have a different number of rows than your input
- Point ordering may change in the output
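The behaviour described above, where several input points can collapse onto one grid cell, can be reproduced locally. The following is a sketch under one stated assumption: that cell edges fall on multiples of 0.01 degrees, so centroids end in `.005`. The function `snap_to_centroid` is a hypothetical helper; the API's own allocation logic may differ in detail, especially for points that sit exactly on a cell edge.

```python
import math

CELL_SIZE_DEG = 0.01  # native grid resolution


def snap_to_centroid(lat, lon):
    """Return the centroid of the 0.01 degree cell containing (lat, lon).

    Assumes cell edges lie on multiples of 0.01 degrees (an assumption,
    not something the API documentation in this chapter confirms).
    """
    def snap(value):
        cell_index = math.floor(value / CELL_SIZE_DEG)
        return round((cell_index + 0.5) * CELL_SIZE_DEG, 3)

    return snap(lat), snap(lon)


# Three hypothetical survey points; the first two fall in the same cell.
points = [(-1.2834, 36.8172), (-1.2871, 36.8139), (-1.2951, 36.8204)]

# Keep the mapping so labels can later be merged with the returned features.
point_to_cell = {p: snap_to_centroid(*p) for p in points}
unique_cells = sorted(set(point_to_cell.values()))

print(f"{len(points)} input points -> {len(unique_cells)} grid cells")
for point, cell in point_to_cell.items():
    print(point, "->", cell)
```

Because the output may have fewer rows and a different order than the input, merging on the snapped centroid rather than on row position is the safer choice.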

3.2.5 Aggregated features
Many users may find it easier to work with aggregated features, and the MOSAIKS API offers pre-aggregated features to accommodate this. Features are available either aggregated to larger grid cells (0.1 and 1 degree) or summarized to administrative boundaries (ADM0, ADM1, and ADM2). These files are available for download as either single or chunked files depending on the resolution.

3.2.5.1 Weighting schemes
At each level of aggregation, we offer area weighted features and population weighted features. Population weights are from the Gridded Population of the World (GPWv4) population density dataset. The area weighting scheme is based on the area of the high resolution grid cells.
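To make the difference between the two schemes concrete, the sketch below computes a weighted mean of features over the cells in one aggregation unit. It is an illustration, not the API's implementation: the cell values and population counts are made up, and approximating relative cell area by the cosine of latitude is an assumption that holds for equal-angle latitude-longitude grids.

```python
import math


def weighted_mean(feature_rows, weights):
    """Weighted mean of each feature column across grid cells."""
    total = sum(weights)
    n_features = len(feature_rows[0])
    return [
        sum(w * row[k] for row, w in zip(feature_rows, weights)) / total
        for k in range(n_features)
    ]


# Two hypothetical cells in the same unit, with made-up features and population.
cells = [
    {"lat": 0.005, "features": [1.0, 0.0], "population": 900.0},
    {"lat": 60.005, "features": [0.0, 1.0], "population": 100.0},
]
feature_rows = [c["features"] for c in cells]

# Assumption: a 0.01 x 0.01 degree cell's area is proportional to cos(latitude).
area_weights = [math.cos(math.radians(c["lat"])) for c in cells]
population_weights = [c["population"] for c in cells]

area_mean = weighted_mean(feature_rows, area_weights)
population_mean = weighted_mean(feature_rows, population_weights)

print("area weighted:      ", [round(v, 3) for v in area_mean])
print("population weighted:", [round(v, 3) for v in population_mean])
```

In this toy case the population-weighted result is dominated by the densely populated cell, while the area-weighted result leans toward the physically larger cell near the equator. Population weighting tends to suit labels that describe people, and area weighting tends to suit labels that describe land.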
3.2.5.2 When to use aggregated features
The aggregated features are particularly useful in a few scenarios:
- You have data at a scale larger than the 0.01 degree grid cells. Many datasets come at the country, state, or county level.
- Your data has a lot of noise that can be smoothed out by aggregating to a larger scale. A common example of this might be household survey data that is noisy at the individual level but smooths out when aggregated to the village or district level.
- You want to do global analysis and don’t have the computational resources to work with the full 0.01 degree grid cells.
In all cases, using the pre-aggregated features can save you time and computational resources.
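A quick back-of-the-envelope calculation shows why resolution matters for computational resources. The sketch below assumes features are held in memory as 8-byte floating point values, which is an assumption about your own workflow rather than a property of the downloaded files; the helper names are hypothetical.

```python
NATIVE_CELL_SIZE_DEG = 0.01  # native grid resolution
N_FEATURES = 4000            # K, as stated in the text


def feature_table_bytes(n_cells, n_features=N_FEATURES, bytes_per_value=8):
    """In-memory size of a dense N x K feature table (assumes 8-byte floats)."""
    return n_cells * n_features * bytes_per_value


def native_cells_per_aggregate(aggregate_size_deg):
    """Number of native 0.01 degree cells inside one aggregated grid cell."""
    per_side = round(aggregate_size_deg / NATIVE_CELL_SIZE_DEG)
    return per_side * per_side


# One maximum-size query of 100,000 records:
print(feature_table_bytes(100_000) / 1e9, "GB")  # 3.2 GB

# Row reduction when moving to the aggregated grids:
for size in (0.1, 1.0):
    print(f"{size} degree cell = {native_cells_per_aggregate(size)} native cells")
```

A single 0.1 degree cell replaces up to 100 native rows and a 1 degree cell replaces up to 10,000, so the aggregated grids shrink a global table by roughly those factors.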
Scenario: You are working with a Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) dataset, which contains survey data with geographic coordinates at the household level. To protect the privacy of the respondents, the coordinates are jittered within a 5 km radius, but they always remain within local administrative boundaries. You can therefore summarize your labels to the administrative units and build a model with the aggregated features.
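Summarizing household labels to administrative units, as in the scenario above, amounts to a grouped mean. The sketch below uses made-up records; the field names `adm2` and `consumption` are hypothetical and do not correspond to actual LSMS-ISA variable names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical household records: an administrative unit and one label value.
households = [
    {"adm2": "District A", "consumption": 2.0},
    {"adm2": "District A", "consumption": 4.0},
    {"adm2": "District B", "consumption": 1.0},
    {"adm2": "District B", "consumption": 2.0},
    {"adm2": "District B", "consumption": 3.0},
]


def aggregate_labels(records, unit_key, label_key):
    """Average a household-level label within each administrative unit."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[unit_key]].append(record[label_key])
    return {unit: mean(values) for unit, values in grouped.items()}


labels_by_unit = aggregate_labels(households, "adm2", "consumption")
for unit, value in labels_by_unit.items():
    print(unit, value)
```

The resulting one-row-per-unit table can then be merged with the pre-aggregated features for the same administrative level. If your survey provides sampling weights, a weighted mean would be the more appropriate summary.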
If you want to make high resolution predictions with low resolution label data, you can build your model with aggregated features and then apply it to the high resolution features to make predictions. This will be covered in Chapter 17.
3.3 Using MOSAIKS features for prediction
This is a brief overview. Detailed instructions appear later in the manual (Chapter 16).
Basic workflow:
- Obtain ground truth measurements (“labels”; see Chapter 5)
- Download matching features (see Chapter 13)
- Spatially merge labels and features (see Chapter 7)
- Use regression to model relationship between imagery and outcome (see Chapter 16)
- Use regression results to predict outcome in new locations (see Chapter 16)
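The merge, regression, and prediction steps above can be sketched end to end on toy data. This is a deliberately small illustration written with the standard library only, so that it runs anywhere; it is not the notebook from Chapter 4, the data are made up, and the intercept is omitted for brevity. In practice a library such as scikit-learn would handle the regression.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ridge(X, y, penalty):
    """Minimise ||y - Xw||^2 + penalty * ||w||^2 (no intercept, for brevity)."""
    k = len(X[0])
    gram = [
        [sum(row[i] * row[j] for row in X) + (penalty if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(k)]
    return solve_linear_system(gram, xty)


def predict(X, weights):
    return [sum(w * v for w, v in zip(weights, row)) for row in X]


# Made-up features keyed by grid-cell centroid (K = 2 instead of 4000).
features_by_cell = {
    (10.005, 20.005): [1.0, 0.0],
    (10.005, 20.015): [0.0, 1.0],
    (10.015, 20.005): [1.0, 1.0],
    (10.015, 20.015): [2.0, 1.0],  # no label here: a "new location"
}
# Made-up labels, available for only some cells.
labels_by_cell = {
    (10.005, 20.005): 2.0,
    (10.005, 20.015): 1.0,
    (10.015, 20.005): 3.0,
}

# Spatial merge: keep cells that have both features and a label.
labelled = sorted(set(features_by_cell) & set(labels_by_cell))
X_train = [features_by_cell[cell] for cell in labelled]
y_train = [labels_by_cell[cell] for cell in labelled]

weights = fit_ridge(X_train, y_train, penalty=1e-6)

# Predict in cells that have features but no label.
unlabelled = sorted(set(features_by_cell) - set(labels_by_cell))
predictions = predict([features_by_cell[cell] for cell in unlabelled], weights)
print(dict(zip(unlabelled, predictions)))
```

With real data, the penalty would be chosen by cross-validation rather than fixed, and model performance would be evaluated on held-out labels.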
You can experiment with various machine learning approaches in the regression step. For beginners, we recommend starting with our example Jupyter notebook (Chapter 4) that demonstrates a simple ridge regression approach (suitable for both R and Python users).

This topic will be covered in greater depth in later chapters (see Chapter 16). In the next chapter, you will see a simple MOSAIKS workflow which replicates the results of Rolf et al. 2021.
3.4 Citation requirements
When referring to the MOSAIKS methodology or when generating MOSAIKS features, please reference: Rolf et al. “A generalizable and accessible approach to machine learning with global satellite imagery.” Nature Communications (2021).
You can use the following BibTeX:
@article{article,
author = {Rolf, Esther and Proctor, Jonathan and Carleton, Tamma and Bolliger, Ian and Shankar, Vaishaal and Ishihara, Miyabi and Recht, Benjamin and Hsiang, Solomon},
year = {2021},
month = {07},
pages = {},
title = {A generalizable and accessible approach to machine learning with global satellite imagery},
volume = {12},
journal = {Nature Communications},
doi = {10.1038/s41467-021-24638-z}
}
If using features downloaded from the API, please reference, in addition to the publication above, the MOSAIKS API.
You can cite the API using the following BibTeX:
@misc{mosaiks_api,
author = {{Carleton, Tamma and Chong, Trinetta and Druckenmiller, Hannah and Noda, Eugenio and Proctor, Jonathan and Rolf, Esther and Hsiang, Solomon}},
title = {{Multi-Task Observation Using Satellite Imagery and Kitchen Sinks (MOSAIKS) API}},
howpublished = "\url{https://api.mosaiks.org}",
version = {1.0},
year = {2022},
}
In the next chapter, you will have a chance to try MOSAIKS on Google Colab with the data from Rolf et al. 2021.