ECHOSAT provides the first global, temporally consistent tree height map at 10m resolution, enabling accurate monitoring of forest growth and disturbances from 2018 to 2024.

  • Spatial Resolution: 10m
  • Years: 7 (2018–2024)
  • Coverage: Global
  • Mean Absolute Error: ~5m

Abstract

Forest monitoring is critical for climate change mitigation. However, existing global tree height maps provide only static snapshots and do not capture temporal forest dynamics, which are essential for accurate carbon accounting. We introduce ECHOSAT, a global and temporally consistent tree height map at 10m resolution spanning multiple years. To build it, we use multi-sensor satellite data to train a specialized vision transformer model that performs pixel-level temporal regression. A self-supervised growth loss regularizes the predictions to follow growth curves consistent with natural tree development: gradual height increases over time, but also abrupt declines caused by forest loss events such as fires. Our experimental evaluation shows that the model improves on state-of-the-art accuracy for single-year predictions. We also provide the first global-scale height map that accurately quantifies tree growth and disturbances over time.


Key Contributions

Why does this matter?

Forests absorb ~3.5 Pg of carbon per year — almost half of anthropogenic fossil fuel emissions [[5]](#ref-pan). Accurate monitoring of forest carbon dynamics requires understanding not just where trees are, but how they change over time. ECHOSAT enables this for the first time at global scale.

  • First Global Spatio-Temporal Tree Height Map: We provide the first high-resolution (10m) tree height map covering the entire globe across seven years (2018–2024), enabling reliable monitoring of forest dynamics and disturbances at unprecedented scale.
  • Novel Growth Loss Framework: We introduce a self-supervised loss specifically designed for training temporal regression models with sparsely distributed and temporally irregular ground truth labels, enforcing physically realistic forest growth patterns.
  • Inherent Temporal Consistency: Our model learns realistic temporal forest height dynamics without post-processing, accurately capturing both natural growth and abrupt disturbances like fires or logging.
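
The exact form of the growth loss is not given on this page. As an illustration only, the idea of "gradual growth is fine, abrupt loss is fine, slow shrinkage is not" can be sketched as a penalty on year-to-year height changes for a single pixel. The function name `growth_penalty` and the thresholds `max_growth` and `disturbance_drop` are assumptions made for this sketch, not values from the paper.

```python
def growth_penalty(heights, max_growth=1.5, disturbance_drop=5.0):
    """Hypothetical growth-consistency penalty for one pixel.

    heights: predicted tree heights in metres, one value per year.
    max_growth: assumed upper bound on plausible yearly growth (m).
    disturbance_drop: assumed minimum drop (m) that counts as a disturbance.
    """
    penalty = 0.0
    for prev, curr in zip(heights, heights[1:]):
        delta = curr - prev
        if delta > max_growth:
            # Faster growth than the assumed plausible rate.
            penalty += delta - max_growth
        elif -disturbance_drop < delta < 0:
            # Small decrease: trees do not slowly get shorter.
            penalty += -delta
        # Drops of at least `disturbance_drop` are treated as
        # disturbances (fire, logging) and are not penalized.
    return penalty / max(len(heights) - 1, 1)


steady_growth = [10.0, 10.5, 11.0, 11.5]
clear_cut = [20.0, 20.5, 2.0, 2.5]
jittery = [10.0, 9.0, 10.0]
print(growth_penalty(steady_growth), growth_penalty(clear_cut), growth_penalty(jittery))
```

Because the penalty depends only on the predictions, it needs no labels, which is what makes a term of this kind usable in a self-supervised setting with sparse ground truth.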

Method Overview

Model architecture: ECHOSAT uses a Swin Transformer-based encoder-decoder architecture that processes multi-temporal SAR imagery (Sentinel-1 and ALOS PALSAR-2) and optical imagery (Sentinel-2), together with a TanDEM-X-derived DEM and Forest/Non-Forest masks, to predict pixel-level tree heights across time. The growth loss enforces temporal consistency by regularizing predictions to follow realistic growth curves.

Technical Details:

  • Architecture: Hierarchical Swin Transformer with spatio-temporal attention
  • Input: Monthly best-image Sentinel-2 (optical), quarterly composite Sentinel-1 (SAR), yearly composite ALOS PALSAR-2 (SAR), TanDEM-X-derived DEM, and Forest/Non-Forest masks
  • Labels: GEDI LiDAR measurements for supervision
  • Training: Two-stage approach with Huber loss pretraining followed by Growth Loss fine-tuning
  • Output: Per-pixel tree height predictions at 10m resolution for each month from 2018 to 2024
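
Since GEDI footprints cover only a small fraction of pixels, a supervised loss for the first training stage has to skip unlabeled pixels. A minimal sketch of a masked Huber loss follows, assuming missing labels are encoded as `None` and a transition point `delta` of 1.0; both choices are illustrative and not taken from the paper.

```python
def huber(residual, delta=1.0):
    """Huber loss for one residual: quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def masked_huber_loss(predictions, labels, delta=1.0):
    """Mean Huber loss over labeled pixels only.

    predictions: predicted heights in metres, one per pixel.
    labels: reference heights in metres, or None where no label exists
            (hypothetical encoding for this sketch).
    """
    losses = [
        huber(p - y, delta)
        for p, y in zip(predictions, labels)
        if y is not None
    ]
    # With no labeled pixel there is nothing to supervise.
    return sum(losses) / len(losses) if losses else 0.0


preds = [10.0, 20.0, 30.0]
gedi = [10.5, None, 27.0]  # only two of three pixels carry a label
print(masked_huber_loss(preds, gedi))
```

The linear tail keeps a few very wrong pixels from dominating the gradient, which is a common reason to prefer Huber over squared error when labels are noisy.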

Results

We evaluate ECHOSAT against existing global tree height maps on single-year predictions using GEDI LiDAR measurements as ground truth. Our model not only achieves state-of-the-art accuracy for static height estimation, but uniquely provides temporally consistent predictions that capture realistic forest dynamics — something no existing global product can offer.

Comparison with existing global tree height maps [[1]](#ref-potapov), [[2]](#ref-lang), [[3]](#ref-tolan) on single-year predictions.

ECHOSAT captures realistic temporal dynamics: gradual tree growth and abrupt height decreases.

Quantitative Comparison (2020)

Evaluation on global GEDI test set for tree heights ≥ 5m. Lower MAE/RMSE and higher R² are better.

| Method | MAE (m) ↓ | RMSE (m) ↓ | R² ↑ |
|---|---|---|---|
| Potapov et al. [1] | 7.20 | 11.35 | 0.56 |
| Lang et al. [2] | 6.02 | 9.66 | 0.69 |
| Tolan et al. [3] | 6.57 | 10.31 | 0.59 |
| Pauls et al. [4] | 5.93 | 9.40 | 0.71 |
| ECHOSAT (Ours) | 5.85 | 10.87 | 0.77 |
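
For readers who want to reproduce this kind of comparison on their own data, the three metrics can be computed as in the sketch below. It assumes the ≥ 5m threshold is applied to the reference (GEDI) heights and that R² is the standard coefficient of determination; the page does not state either detail, and the function name `evaluate` is hypothetical.

```python
import math


def evaluate(predictions, references, min_height=5.0):
    """Return MAE, RMSE and R² over samples whose reference height >= min_height."""
    pairs = [(p, r) for p, r in zip(predictions, references) if r >= min_height]
    if not pairs:
        raise ValueError("no reference heights at or above min_height")
    n = len(pairs)
    errors = [p - r for p, r in pairs]
    ss_res = sum(e * e for e in errors)
    mean_ref = sum(r for _, r in pairs) / n
    ss_tot = sum((r - mean_ref) ** 2 for _, r in pairs)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R² is undefined when all reference heights are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Toy values, not data from the paper; the last sample is below 5 m and is dropped.
metrics = evaluate([12.0, 18.0, 31.0, 3.0], [10.0, 20.0, 30.0, 2.0])
print(metrics)
```

MAE and RMSE weight large errors differently, so a method can rank first on one and lower on the other.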

Visualizations

Example tree height predictions across diverse forest types.


Temporal Dynamics: Growth & Disturbances

Top: Tree growth signal and harvested trees in the forest plantation area around Les Landes, France.
Bottom: Stable tree heights in the Amazon rainforest.


BibTeX

@misc{pauls2026echosatestimatingcanopyheight,
      title={ECHOSAT: Estimating Canopy Height Over Space And Time}, 
      author={Jan Pauls and Karsten Schrödter and Sven Ligensa and Martin Schwartz and Berkant Turan and Max Zimmer and Sassan Saatchi and Sebastian Pokutta and Philippe Ciais and Fabian Gieseke},
      year={2026},
      eprint={2602.21421},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.21421}, 
}

References

  1. P. Potapov et al., "Mapping global forest canopy height through integration of GEDI and Landsat data," Remote Sensing of Environment, vol. 253, 2021.
  2. N. Lang et al., "A high-resolution canopy height model of the Earth," Nature Ecology & Evolution, vol. 7, no. 11, pp. 1778–1789, 2023.
  3. J. Tolan et al., "Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on aerial lidar," Remote Sensing of Environment, vol. 300, 2024.
  4. J. Pauls et al., "Estimating Canopy Height at Scale," in ICML, 2024.
  5. Y. Pan et al., "The enduring world forest carbon sink," Nature, vol. 631, no. 7723, pp. 563–569, 2024.

Acknowledgements

This work was supported by the AI4Forest project, which is funded by the German Federal Ministry of Education and Research (BMBF; grant number 01IS23025A) and the French National Research Agency (ANR). We also acknowledge the computational resources provided by the PALMA II cluster at the University of Münster (subsidized by the DFG; INST 211/667-1) and by the Zuse Institute Berlin. We are grateful for the donation of an A100 Tensor Core GPU from NVIDIA and thank Google for providing compute resources through Google Earth Engine. Our work was further supported by the DFG Cluster of Excellence MATH+ (EXC-2046/2, project id 390685689), by the German Federal Ministry of Research, Technology and Space (research campus Modal, fund number 05M14ZAM, 05M20ZBM), and by the VDI/VDE Innovation + Technik GmbH (fund number 16IS23025B).