MOSAIKS Training Manual

Auteur·rice·s
Affiliations

Center for Effective Global Action (CEGA)

UC Berkeley Department of Agricultural and Resource Economics

UC Boulder College of Engineering and Applied Science

Welcome to Our Multilingual Website

Please select your preferred language:

Welcome

This is the first edition of the MOSAIKS Training Manual! The manual serves as a comprehensive reference for understanding MOSAIKS, its capabilities, and guidance on practical implementation. You will learn what MOSAIKS is, what can be done with it, and how to use it effectively in a variety of applications.

The skills and knowledge you gain from this manual will enable you to leverage satellite imagery and machine learning to address complex socioeconomic and environmental challenges. This can be a self taught manual or used in conjunction with a training course.

Many of the concepts and examples are broadly applicable to the world of remote sensing and machine learning, so even if you are not using MOSAIKS, you may find the content useful.

What is MOSAIKS?

MOSAIKS stands for Multi-task Observation using SAtellite Imagery & Kitchen Sinks. It is a framework designed to simplify the use of satellite imagery and machine learning for predicting socioeconomic and environmental outcomes across different geographic contexts and time periods. MOSAIKS relis on random convolutions (developed in Rahimi and Recht (2008)) applied to satellite imagery, which extract task-agnostic features and enable researchers and practitioners to easily and flexibly predict a diversity of outcomes from raw imagery.

Figure 1: MOSAIKS spelled out with imagery from the Landsat Satellite Constellation data catalog. Made with: Your Name in Landsat

Who is this for?

This comprehensive two-week program is designed for academics, professionals, and practitioners interested in leveraging MOSAIKS to better understand socioeconomic and environmental challenges. The course is particularly valuable for those working in:

  • Remote sensing and satellite imagery analysis
  • Machine learning applications with geospatial data
  • Agricultural and environmental monitoring and assessment
  • Development research and policy making

What will I learn?

Throughout this course, you’ll learn practical applications through a combination of:

  • Theoretical foundations and conceptual understanding
  • Hands-on exercises with real-world data
  • Best practices for implementation
  • Strategies for analyzing and interpreting results

The curriculum covers the complete MOSAIKS workflow, including:

  • Accessing and processing satellite imagery
  • Understanding MOSAIKS feature extraction
  • Working with the MOSAIKS API
  • Implementing machine learning models
  • Quantifying and communicating uncertainty
  • Applying models to various contexts
  • Working with survey data

Whether you’re new to MOSAIKS or looking to deepen your expertise, this course provides the tools and knowledge needed to effectively utilize this framework.

Course structure

This course is designed as an intensive two-week program that combines lectures, demonstrations, and hands-on sessions. Each day is structured as follows:

Time Activity
9:00 - 10:30 Morning Session 1
10:30 - 11:00 Break
11:00 - 12:30 Morning Session 2
12:30 - 1:30 Lunch
1:30 - 3:00 Afternoon Session 1
3:00 - 3:30 Break
3:30 - 4:30 Afternoon Session 2
4:30 - 5:00 Feedback and Development
Table 1: Daily time divisions showing 2 morning sessions, 2 afternoon sessions, and a final time slot alloted for developing the course further.

Each day concludes with a Q&A and feedback session from 4:30-5:00, providing opportunities to clarify concepts and share ideas. It is expected that this first course will spur many new ideas and concepts which should be included in the future trainings. Please remember to take notes throughout each day with particular emphasis in areas you think could be explained better or that need additional topics covered. These can be areas that you struggled with or that you would anticipate could be difficult for others.

While the in-person version of this course was designed with the schedule outlined above, other users can easily navigate throughout the course content as desired, noting that the chapter content is designed for sequential learning.

Course schedule

Week 1

  • Day 1: MOSAIKS framework, accessing features, basic workflow
  • Day 2: Understanding ground truth data, data cleaning, spatial resolution
  • Day 3: Agricultural yield prediction, downloading features via API
  • Day 4: Types of satellite imagery, processing considerations, quality control
  • Day 5: Feature computation, theory behind RCFs, hands-on implementation

Week 2

  • Day 1: Martin Luther King Jr. Day - No Class
  • Day 2: Model selection, hyperparameter tuning, cross-validation
  • Day 3: Error analysis, confidence intervals, reporting uncertainty
  • Day 4: Survey design principles, geolocation methods, sampling strategies
  • Day 5: Advanced topics, emerging applications, future development

Training expectations

What you will learn

  • Understanding of MOSAIKS framework and capabilities
  • Practical skills in satellite imagery processing
  • Experience with machine learning applications
  • Hands-on practice with real-world datasets
  • Knowledge of survey data integration
  • Best practices for model implementation

Prerequisites

There are no explicit prerequisites, though this course does cover some advanced topics in:

  • The Python programming language
  • Machine learning
  • Geospatial data

Participant expectations

  • Active participation in discussions and hands-on sessions
  • Completion of assigned homework (particularly the Week 1 Friday assignment)
  • Engagement in Q&A sessions
  • Contribution to feedback sessions for course improvement

Computing requirements

The course includes hands-on computing sessions. You will need:

  • A computer with access to the internet
  • A Google account
  • Access to Google Colaboratory
  • Access to necessary data (details to be provided)

Homework and presentations

There will be a homework assignment at the end of Week 1, which participants will present on Tuesday of Week 2. This assignment is designed to reinforce learning and provide practical experience with MOSAIKS tools.

Book structure and content

This manual is organized into six main parts, each focusing on a critical aspect of MOSAIKS. We begin with foundational concepts and gradually progress to more advanced topics in modeling and uncertainty quantification.

Part Description
Introduction Computing setup, MOSAIKS overview, API access, initial demonstration
Label data Understanding suitable labels, survey integration, data preparation
Satellite imagery Selecting appropriate imagery, processing considerations
Features Random convolutional features, API access, feature computation
Task modeling Model selection, spatial analysis, temporal considerations
Model uncertainty Uncertainty quantification, ethical considerations
Table 2: Overview of the MOSAIKS Training Manual contents

The content is designed to be both comprehensive and practical, with each part building upon previous concepts while remaining relatively self-contained. This structure allows readers to either progress through the manual sequentially or focus on specific topics of interest. Throughout each section, we provide practical examples, code demonstrations, and best practices drawn from real-world applications.

Acknowledgements

MOSAIKS was developed and is supported by a large team of researchers across multiple partner organizations:

Development Team:

Benjamin Recht, Cullen Molitor, Darin Christensen, Esther Rolf, Eugenio Noda, Grace Lewin, Graeme Blair, Hannah Druckenmiller, Hikari Murayama, Ian Bolliger, Jean Tseng, Jessica Katz, Jonathan Proctor, Juliet Cohen, Karena Yan, Luke Sherman, Miyabi Ishihara, Shopnavo Biswas, Simon Greenhill, Solomon Hsiang, Steven Cognac, Tamma Carleton, Taryn Fransen, Trinetta Chong, Vaishaal Shankar

MOSAIKS Training Manual Team: Cullen Molitor, Tamma Carleton, Esther Rolf, Sean Luna McAdams, Heather Lahr

Partner Organizations:

  • Center for Effective Global Action (CEGA; UCB)
  • Environmental Markets Lab (emLab; UCSB)
  • Global Policy Lab (GPL; Stanford University)
  • Project on Resources and Governance (PRG; UCLA)
  • Master of Environmental Data Science Program (MEDS; UCSB)

Funding Support:

We are grateful for the support and contributions of all team members and partner organizations in making MOSAIKS a reality. We hope to continue expanding the framework and its applications to address pressing global challenges.

Looking forward

In the first part of this book, we will cover the basics of MOSAIKS, including its framework, capabilities, and practical applications. This section is focused on exploring the original MOSAIKS publication (Rolf et al. 2021) and understanding the core concepts behind the framework.