MOSAIKS Training Manual

Authors

Affiliations

Cullen Molitor

Center for Effective Global Action (CEGA)

Tamma Carleton

UC Berkeley Department of Agricultural and Resource Economics

Esther Rolf

CU Boulder College of Engineering and Applied Science

Welcome

This is the first edition of the MOSAIKS Training Manual! The manual serves as a comprehensive reference for understanding MOSAIKS and its capabilities, with guidance on practical implementation. You will learn what MOSAIKS is, what can be done with it, and how to use it effectively in a variety of applications.

The skills and knowledge you gain from this manual will enable you to leverage satellite imagery and machine learning to address complex socioeconomic and environmental challenges. The manual can be used for self-study or in conjunction with a training course.

Many of the concepts and examples are broadly applicable to the world of remote sensing and machine learning, so even if you are not using MOSAIKS, you may find the content useful.

What is MOSAIKS?

MOSAIKS stands for Multi-task Observation using SAtellite Imagery & Kitchen Sinks. It is a framework designed to simplify the use of satellite imagery and machine learning for predicting socioeconomic and environmental outcomes across different geographic contexts and time periods. MOSAIKS relies on random convolutions applied to satellite imagery, an approach that builds on the random kitchen sinks of Rahimi and Recht (2008). These convolutions extract task-agnostic features, which enable researchers and practitioners to easily and flexibly predict a diversity of outcomes from raw imagery.

Figure 1: MOSAIKS spelled out with imagery from the Landsat Satellite Constellation data catalog. Made with: Your Name in Landsat
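
The idea of extracting task-agnostic features with random convolutions can be illustrated with a minimal sketch. This is not the MOSAIKS implementation: the function name `random_conv_features`, its parameters, and the use of Gaussian-distributed filters are assumptions made for illustration only. The sketch convolves an image with a bank of random filters, applies a ReLU nonlinearity, and averages each response over the image, so that every image is summarized by one number per filter.

```python
import numpy as np


def random_conv_features(image, num_filters=8, patch_size=3, bias=0.0, seed=0):
    """Summarize one image as a vector of random convolutional features.

    image: float array of shape (height, width, channels).
    Returns an array of shape (num_filters,).
    """
    rng = np.random.default_rng(seed)
    channels = image.shape[2]

    # Random filters (assumption: drawn from a standard normal distribution).
    filters = rng.standard_normal((num_filters, patch_size, patch_size, channels))

    # Every patch_size x patch_size window of the image.
    # Shape: (height - p + 1, width - p + 1, channels, p, p).
    windows = np.lib.stride_tricks.sliding_window_view(
        image, (patch_size, patch_size), axis=(0, 1)
    )

    # Convolution: dot product of each window with each filter.
    responses = np.einsum("hwcij,kijc->hwk", windows, filters)

    # ReLU nonlinearity, then average pooling over the whole image.
    activated = np.maximum(responses - bias, 0.0)
    return activated.mean(axis=(0, 1))


# Usage on a synthetic 16 x 16 image with 3 bands (not real satellite data).
image = np.random.default_rng(1).random((16, 16, 3))
features = random_conv_features(image)
print(features.shape)  # (8,)
```

Because the filters do not depend on any particular outcome, the same feature vector can be reused as the input to many different prediction tasks, which is the sense in which the features are task-agnostic.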

Who is this for?

This comprehensive two-week program is designed for academics, professionals, and practitioners interested in leveraging MOSAIKS to better understand socioeconomic and environmental challenges. The course is particularly valuable for those working in:

Remote sensing and satellite imagery analysis
Machine learning applications with geospatial data
Agricultural and environmental monitoring and assessment
Development research and policy making

What will I learn?

Throughout this course, you’ll learn practical applications through a combination of:

Theoretical foundations and conceptual understanding
Hands-on exercises with real-world data
Best practices for implementation
Strategies for analyzing and interpreting results

The curriculum covers the complete MOSAIKS workflow, including:

Accessing and processing satellite imagery
Understanding MOSAIKS feature extraction
Working with the MOSAIKS API
Implementing machine learning models
Quantifying and communicating uncertainty
Applying models to various contexts
Working with survey data

Whether you’re new to MOSAIKS or looking to deepen your expertise, this course provides the tools and knowledge needed to effectively utilize this framework.

Course structure

This course is designed as an intensive two-week program that combines lectures, demonstrations, and hands-on sessions. Each day is structured as follows:

Time	Activity
9:00 - 10:30	Morning Session 1
10:30 - 11:00	Break
11:00 - 12:30	Morning Session 2
12:30 - 1:30	Lunch
1:30 - 3:00	Afternoon Session 1
3:00 - 3:30	Break
3:30 - 4:30	Afternoon Session 2
4:30 - 5:00	Feedback and Development

Table 1: Daily time divisions showing two morning sessions, two afternoon sessions, and a final time slot allotted for developing the course further.

Each day concludes with a Q&A and feedback session from 4:30-5:00, providing opportunities to clarify concepts and share ideas. We expect this first course to spur many new ideas and concepts that should be included in future trainings. Please take notes throughout each day, with particular emphasis on areas that could be explained better or that need additional topics covered. These can be areas that you struggled with or that you anticipate could be difficult for others.

While the in-person version of this course follows the schedule outlined above, other users can navigate the course content in any order they like. Note, however, that the chapters are designed for sequential learning.

Course schedule

Week 1

Day 1: MOSAIKS framework, accessing features, basic workflow
Day 2: Understanding ground truth data, data cleaning, spatial resolution
Day 3: Agricultural yield prediction, downloading features via API
Day 4: Types of satellite imagery, processing considerations, quality control
Day 5: Feature computation, theory behind RCFs, hands-on implementation

Week 2

Day 1: Martin Luther King Jr. Day - No Class
Day 2: Model selection, hyperparameter tuning, cross-validation
Day 3: Error analysis, confidence intervals, reporting uncertainty
Day 4: Survey design principles, geolocation methods, sampling strategies
Day 5: Advanced topics, emerging applications, future development

Training expectations

What you will learn

Understanding of MOSAIKS framework and capabilities
Practical skills in satellite imagery processing
Experience with machine learning applications
Hands-on practice with real-world datasets
Knowledge of survey data integration
Best practices for model implementation

Prerequisites

There are no explicit prerequisites, though this course does cover some advanced topics in:

The Python programming language
Machine learning
Geospatial data

Participant expectations

Active participation in discussions and hands-on sessions
Completion of assigned homework (particularly the Week 1 Friday assignment)
Engagement in Q&A sessions
Contribution to feedback sessions for course improvement

Computing requirements

The course includes hands-on computing sessions. You will need:

A computer with access to the internet
A Google account
Access to Google Colaboratory
Access to necessary data (details to be provided)

Homework and presentations

There will be a homework assignment at the end of Week 1, which participants will present on Tuesday of Week 2. This assignment is designed to reinforce learning and provide practical experience with MOSAIKS tools.

Book structure and content

This manual is organized into six main parts, each focusing on a critical aspect of MOSAIKS. We begin with foundational concepts and gradually progress to more advanced topics in modeling and uncertainty quantification.

Part	Description
Introduction	Computing setup, MOSAIKS overview, API access, initial demonstration
Label data	Understanding suitable labels, survey integration, data preparation
Satellite imagery	Selecting appropriate imagery, processing considerations
Features	Random convolutional features, API access, feature computation
Task modeling	Model selection, spatial analysis, temporal considerations
Model uncertainty	Uncertainty quantification, ethical considerations

Table 2: Overview of the MOSAIKS Training Manual contents

The content is designed to be both comprehensive and practical, with each part building upon previous concepts while remaining relatively self-contained. This structure allows readers to either progress through the manual sequentially or focus on specific topics of interest. Throughout each section, we provide practical examples, code demonstrations, and best practices drawn from real-world applications.

Acknowledgements

MOSAIKS was developed and is supported by a large team of researchers across multiple partner organizations:

Development Team:

Benjamin Recht, Cullen Molitor, Darin Christensen, Esther Rolf, Eugenio Noda, Grace Lewin, Graeme Blair, Hannah Druckenmiller, Hikari Murayama, Ian Bolliger, Jean Tseng, Jessica Katz, Jonathan Proctor, Juliet Cohen, Karena Yan, Luke Sherman, Miyabi Ishihara, Shopnavo Biswas, Simon Greenhill, Solomon Hsiang, Steven Cognac, Tamma Carleton, Taryn Fransen, Trinetta Chong, Vaishaal Shankar

MOSAIKS Training Manual Team: Cullen Molitor, Tamma Carleton, Esther Rolf, Sean Luna McAdams, Heather Lahr

Partner Organizations:

Center for Effective Global Action (CEGA; UCB)
Environmental Markets Lab (emLab; UCSB)
Global Policy Lab (GPL; Stanford University)
Project on Resources and Governance (PRG; UCLA)
Master of Environmental Data Science Program (MEDS; UCSB)

Funding Support:

We are grateful for the support and contributions of all team members and partner organizations in making MOSAIKS a reality. We hope to continue expanding the framework and its applications to address pressing global challenges.

Looking forward

In the first part of this book, we will cover the basics of MOSAIKS, including its framework, capabilities, and practical applications. This section is focused on exploring the original MOSAIKS publication (Rolf et al. 2021) and understanding the core concepts behind the framework.