Interactive Index
- Interpreting Accuracy Information
- Our Accuracy Metrics
- Interpreting Original Uncertainty
- Specifics of the Uncertainty
At Sustax, providing high-quality, reliable climate data is paramount. We understand that the utility of climate projections relies on their predictive accuracy and the transparency of their validation. This section details our commitment to data quality and transparency, and the robust statistical metrics we employ to evaluate our models.
Sustax is built on a foundation of cutting-edge climate science and rigorous data processing. We strive to provide the most accurate and reliable climate projections possible by:
- Utilizing best-in-class foundational datasets (ERA5 reanalysis and CMIP6 model simulations).
- Employing proprietary bias-correction and harmonization algorithms developed by Geoskop Climate Intelligence; see [14] for validation of our proprietary bias-correction algorithm.
- Evaluating the performance of our model outputs against observational data.
- Providing users with clear metrics to understand the accuracy and uncertainties associated with our data.
Our commitment to transparency is therefore absolute: we offer users two tools, (1) accuracy metrics and (2) original uncertainty (Model Spread).
Interpreting Accuracy Information
It is important to understand that no climate model is perfect, and all projections carry some degree of uncertainty. The accuracy metrics provided by Sustax are designed to give you more than just a transparent understanding of our model performance relative to historical observations; they are tools to help you improve your analysis and interpretation:
- Understand Overall Model Performance: The metrics offer a clear baseline of how well our models reproduced historical climate conditions against ERA5 during the 1979–2022 period.
- Evaluate Comparative Reliability by Region and Variable: Use the metrics to compare expected model performance across locations and across variables. For instance, if the Mean Absolute Error (MAE) for daily temperature is lower in Region A than in Region B, the model had higher historical accuracy for temperature in Region A on average. This can inform your confidence levels when analyzing projections for different areas and/or different climate variables.
- Optimize Model Selection: Leverage Sustax’s metrics to select the most suitable SSP-RCP scenarios for your needs, targeting lower biases, higher correlation, or a closer match between distributions.
- Inform Scenario Interpretation and Risk Assessment: Knowledge of historical model performance for specific variables in your region of interest can guide how you interpret future projections under various SSP-RCP scenarios. If a variable critical to your assessment showed higher historical error metrics (e.g., a larger MAE or lower Pearson R value) in your specific region, you might:
  - Approach projections for that variable with an appropriate degree of caution.
  - Consider a wider range for sensitivity analyses.
  - Place greater emphasis on understanding the model spread for that variable.
We encourage users to actively consider these metrics when interpreting results, in light of the specific needs of their use case. Note that the metrics should be viewed from a relative/comparative standpoint (between regions and between scenarios), not in absolute terms, as the years used for validation coincide with those used for bias correction.
Our Accuracy Metrics
Sustax employs a suite of robust statistical metrics to evaluate the accuracy of our climate model projections against observational data (primarily ERA5 for the historical period, 1979–2022). These metrics quantify model predictive skill during the validation period, giving users a measure of confidence in the projections and helping them assess the associated uncertainties; they are available alongside the daily and monthly data. To compute the accuracy metrics, the daily data from the Sustax models (i.e., Sustax’s harmonized model outputs) and from ERA5 are compared under the same conditions, at a global scale, for each main climate variable.
Energy Distance (EngD)
Quantifies the difference between the entire probability distributions (PDFs) of two datasets (in this case, Sustax model predictions and observed/ERA5 data). It considers all statistical moments (mean, variance, skewness, etc.), providing a comprehensive measure of how well the overall shape of the predicted distribution matches the observed one.
- How to Interpret:
  - A lower EngD value indicates a better match between the predicted and observed probability distributions, signifying higher model skill.
  - A value of zero would mean the distributions are identical.
- Why it’s Important: Unlike metrics that only compare means or point values, EngD assesses the overall similarity of the data distributions, which is crucial for understanding the likelihood of various outcomes across the whole probability distribution function.
- How to use it: EngD helps us validate that our models are not just getting the average conditions right but are also realistically capturing the variability and range of climate parameters.
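For users who want to reproduce this kind of check on their own data extracts, the sketch below computes energy distance with SciPy’s scipy.stats.energy_distance. The series are randomly generated stand-ins, not Sustax data, and the sizes and distribution parameters are illustrative assumptions.

```python
import numpy as np
from scipy.stats import energy_distance

rng = np.random.default_rng(42)

# Synthetic stand-ins for ~44 years of daily temperatures (deg C):
# "model" for a harmonized model output, "era5" for the reference.
model = rng.normal(loc=15.2, scale=8.0, size=16_071)
era5 = rng.normal(loc=15.0, scale=7.5, size=16_071)

# Energy distance compares the full empirical distributions;
# 0.0 would mean the two distributions are identical.
print(f"EngD: {energy_distance(model, era5):.4f}")
```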
Wasserstein Distance (WassD)
Also known as Earth Mover’s Distance, Wasserstein Distance measures the “distance” or “work” required to transform one probability distribution into another (e.g., the predicted distribution into the observed/ERA5 distribution). It is based on Optimal Transport theory and behaves very similarly to Energy Distance.
- How to Interpret:
  - A lower WassD value indicates a better match between the predicted and observed distributions, signifying higher model skill.
  - A value of zero would mean the distributions are identical.
- Why it’s Important: WassD is particularly useful for comparing distributions and is less sensitive to outliers than some other metrics. It provides a robust measure of similarity between the overall pattern of predicted and observed climate data.
- How to use it: WassD complements EngD in providing a comprehensive assessment of how well our model distributions align with observations.
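A minimal sketch using SciPy’s scipy.stats.wasserstein_distance on synthetic, precipitation-like series; the gamma parameters are assumptions made for illustration, not properties of any Sustax variable.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Synthetic daily precipitation series (mm/day) for one location.
model = rng.gamma(shape=2.0, scale=3.0, size=10_000)
era5 = rng.gamma(shape=2.1, scale=2.9, size=10_000)

# 1-D Wasserstein ("Earth Mover's") distance: the minimal "work"
# needed to reshape one empirical distribution into the other.
print(f"WassD: {wasserstein_distance(model, era5):.4f} mm/day")
```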
Mean Bias Error (MBE)
Assesses the average bias of a forecasting model. It calculates the average difference between the forecasted values and the actual (observed/ERA5) values over their respective time series.
- How to Interpret:
  - A positive MBE indicates that the model tends to overestimate the observational values on average (forecasts exceed observations).
  - A negative MBE indicates that the model tends to underestimate the observational values on average.
  - An MBE closer to zero suggests less systematic bias in the model’s predictions.
- Why it’s Important: MBE helps identify if there’s a consistent directional error in the model, which is important for users to be aware of when interpreting absolute values.
- How to use it: We use MBE to understand and potentially refine systematic tendencies in our model outputs for different variables and regions.
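MBE is a one-line reduction. The sketch below follows the forecast-minus-observation convention defined above; the toy numbers are invented purely for illustration.

```python
import numpy as np

def mean_bias_error(predicted: np.ndarray, observed: np.ndarray) -> float:
    """MBE = mean(predicted - observed).

    Positive -> the model overestimates on average;
    negative -> it underestimates; ~0 -> little systematic bias.
    """
    return float(np.mean(predicted - observed))

# Toy series in which the model runs 0.5 units warm on average.
observed = np.array([10.0, 12.0, 11.5, 13.0])
predicted = observed + 0.5
print(mean_bias_error(predicted, observed))  # 0.5 -> overestimation
```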
Mean Absolute Error (MAE)
Measures the average magnitude of the errors in a set of predictions, without considering their direction. It is the average, over the verification sample, of the absolute differences between prediction and observation.
- How to Interpret:
  - A lower MAE value indicates better model accuracy, meaning the predictions are, on average, closer to the observed values.
  - MAE is expressed in the same units as the variable being forecast.
- Why it’s Important: MAE provides a straightforward measure of the average prediction error magnitude, giving a clear indication of how far off predictions typically are.
- How to use it: MAE is a key indicator we track to ensure the overall accuracy of our projections for different climate variables.
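A matching one-line sketch for MAE, again with invented toy numbers; in practice you would pass your extracted Sustax series and the corresponding ERA5 series.

```python
import numpy as np

def mean_absolute_error(predicted: np.ndarray, observed: np.ndarray) -> float:
    """MAE = mean(|predicted - observed|), in the variable's own units."""
    return float(np.mean(np.abs(predicted - observed)))

observed = np.array([10.0, 12.0, 11.5, 13.0])
predicted = np.array([10.5, 11.0, 12.0, 13.5])
print(mean_absolute_error(predicted, observed))  # 0.625
```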
Pearson Correlation Coefficient (R)
The Pearson Correlation Coefficient (R) measures the strength and direction of the linear relationship between two continuous variables over their time series (in this case, the time series of Sustax model predictions and of observed/ERA5 data). The coefficient R ranges from -1 to +1; a higher absolute value of R indicates a stronger linear relationship.
- How to Interpret:
  - An R value closer to +1 indicates a strong positive linear relationship (as one increases, the other tends to increase).
  - An R value closer to -1 indicates a strong negative linear relationship (as one increases, the other tends to decrease).
  - An R value close to 0 indicates a weak or no linear relationship.
- Why it’s Important: R helps assess how well the model captures the patterns and co-variability present in the observed data over time.
- How to use it: We use R to ensure our models effectively reproduce the temporal dynamics and interannual variability seen in the historical climate record.
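A short sketch with scipy.stats.pearsonr on synthetic, deliberately correlated series; the anomaly construction is an assumption made only so the example exhibits a strong positive R.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

# Synthetic reference series and a correlated "model" series
# (e.g., 44 years of monthly-mean anomalies).
era5 = rng.normal(size=528)
model = 0.9 * era5 + rng.normal(scale=0.3, size=528)

r, p_value = pearsonr(model, era5)
print(f"Pearson R = {r:.3f} (p = {p_value:.1e})")
```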
Interpreting Original Uncertainty
All climate projections, regardless of the source, inherently involve a degree of uncertainty and error. These stem from:
- Natural Climate Variability: The Earth’s climate system has natural, chaotic fluctuations that are difficult to predict perfectly.
- Model Uncertainty: Different climate models, even when using the same input scenarios, may produce varying results due to differences in their mathematical representations of climate processes.
- Scientific Limitations: Ongoing research in climate science highlights remaining uncertainties, such as how phenomena like green light-driven water evaporation might influence model accuracy and projections [22].
- Scenario Uncertainty: The future pathway of greenhouse gas emissions and societal development (represented by SSPs/RCPs) is itself uncertain.
The original uncertainty is estimated from the complete set of raw CMIP6 climate simulations and ensembles that constitute each SSP-RCP scenario in Sustax. This calculation is performed before the data is processed with Geoskop’s proprietary algorithms.
The variable in Sustax containing the original uncertainty information is Model Spread, available at daily resolution for both the historical and projection periods. Note that this measure of uncertainty is calculated by applying the interquartile range (IQR) [23] across the original ensembles for each SSP-RCP scenario.
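As a rough illustration of an IQR-based spread (not Geoskop’s internal implementation), the sketch below applies the IQR across the member axis of a hypothetical ensemble array; the array shape, axis layout, and distribution parameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical raw ensemble: 25 members x 365 days for one location.
ensemble = rng.normal(loc=15.0, scale=2.0, size=(25, 365))

# IQR across the member axis: the gap between the 75th and 25th
# percentiles of the ensemble, evaluated independently for each day.
q75, q25 = np.percentile(ensemble, [75, 25], axis=0)
model_spread = q75 - q25
print(model_spread.shape)  # (365,) -> one spread value per day
```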
How Sustax Addresses and Helps You Navigate Uncertainty
Sustax SSP-RCP scenarios / projections begin with state-of-the-art CMIP6 model ensembles and are meticulously bias-corrected against the ERA5 historical “ground truth” using Geoskop’s proprietary algorithms. This foundational process is designed to minimize systematic model errors from the outset.
- Transparent Accuracy Metrics: We provide a suite of accuracy metrics (i.e., Energy Distance, MBE, MAE, Wasserstein Distance, and Pearson R) that quantify how well our models performed against historical observations (ERA5) during the validation period (1979–2022). By reviewing these metrics, you gain a transparent understanding of our data’s historical reliability and can better contextualize future projections.
- Model Spread for Projection Range: Sustax also provides “model spread” (i.e., an IQR-based uncertainty quantification of the original ensembles) for its daily projections. This indicates the agreement or disagreement among the underlying CMIP6 models for a given projection. A larger spread implies greater variability or uncertainty in that specific projection from the model ensemble.
- Multiple Scenarios (SSP-RCPs): By offering seven SSP-RCP scenarios, Sustax lets you explore different plausible futures driven by varying socioeconomic and emissions pathways. Analyzing across scenarios helps you understand the range of future uncertainty supported by today’s scientific evidence.
- Access to ERA5 Reanalysis: To enable independent validation and analysis, Sustax provides access to the ERA5 reanalysis dataset covering 1979–2022. This allows you to perform your own metric calculations or to compare model uncertainty against a trusted historical ground truth.
Specifics of the Uncertainty
The original Model Spread is a powerful tool for understanding projection uncertainty, as it represents the disagreement among the raw CMIP6 model ensembles that form the basis of each Sustax SSP-RCP scenario. Overall, a wider daily Model Spread indicates greater divergence among the climate models, signaling higher uncertainty for a specific projection. However, when comparing uncertainty between different SSP-RCP scenarios, it is crucial to understand a key statistical nuance:
The number of available CMIP6 climate simulations / ensembles is not the same for all SSP-RCP scenarios. Scenarios like SSP-2.45 and SSP-5.85 are built from a large number of simulations from many different global modeling centers. In contrast, scenarios like SSP-4.34 and SSP-4.60 were run by fewer centers, resulting in a smaller pool of simulations [24]. This difference creates a statistical artifact: a scenario with more contributing models will inherently tend to have a larger “Model Spread” simply because it incorporates a wider range of modeling assumptions and methodologies. Therefore, comparing the absolute spread value of SSP-5.85 directly against that of SSP-4.34 can be misleading.
To correctly interpret uncertainty, it is best to evaluate the trend of the model spread within a single Sustax SSP-RCP scenario over time. Focus on the relative change or the slope of the spread as it evolves over the coming decades (see the sketch after the list below). This approach provides a more meaningful assessment of uncertainty than comparing absolute spread values between scenarios with different numbers of underlying simulations.
- An increasing spread over time for a specific scenario indicates that model projections are diverging more as they look further into the future.
- A stable or decreasing spread suggests growing consensus among the models for that scenario’s pathway.
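As referenced above, here is a minimal sketch of estimating that trend: a least-squares slope fitted to a hypothetical annual-mean spread series. The synthetic series, its growth rate, and the 2025–2100 horizon are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical annual-mean Model Spread for one scenario, 2025-2100.
years = np.arange(2025, 2101)
spread = 1.0 + 0.02 * (years - 2025) + rng.normal(scale=0.05, size=years.size)

# Slope of a least-squares linear fit: positive -> models diverge
# further into the future; ~zero or negative -> growing consensus.
slope = np.polyfit(years, spread, deg=1)[0]
print(f"Spread trend: {slope:+.4f} units per year")
```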
For users who still wish to perform comparative analyses of model spread between different SSP-RCP scenarios, it is essential to compare only scenarios that are built from a similar number of underlying simulations. To assist with this, the table below provides a qualitative guide to the relative number of model ensembles used in Sustax SSP-RCP scenarios within the CMIP6 project.
| SSP-RCP scenario | Number of simulations / ensembles per scenario (varies by variable) |
| --- | --- |
| SSP-1.19 | ~10 |
| SSP-1.26 | ~25 |
| SSP-2.45 | ~25 |
| SSP-3.70 | ~20 |
| SSP-4.34 | ~5 |
| SSP-4.60 | ~5 |
| SSP-5.85 | ~30 |