Model Diagnostics Task Force (MDTF)
Diagnostic Package

Package Overview

The MDTF diagnostics package is a portable framework for running process-oriented diagnostics (PODs) on weather and climate model data. Each POD targets a specific physical process or emergent behavior, with the goals of determining how accurately the model represents that process, ensuring that models produce the right answers for the right reasons, and identifying gaps in the understanding of phenomena.

The package provides an extensible, portable and reproducible means for running these diagnostics as part of the model development process. The framework handles software dependency and data management tasks, meaning that POD developers can focus on science instead of “reinventing the wheel”. Development is community-driven and built on open-source technologies.

The current version of the package is available on GitHub. Documentation for users and contributors is hosted on

Current and Planned Diagnostics

The following section outlines current, in-progress and planned diagnostic modules from coordinated NOAA-MAPP research funding efforts. In addition to diagnostics developed by the Model Diagnostics Task Force, planned diagnostics are contributed by investigators funded under Climate Sensitivity, Land Climate Process Teams, and Marine Ecosystems funding efforts. Within each category, diagnostics are organized by Realm — Atmosphere, Land, and Ocean and Cryosphere.

  • Atmosphere — Existing PODs

    • The convective transition diagnostic package computes statistics that relate precipitation to measures of tropospheric temperature and moisture, as an evaluation of the interaction of parameterized convective processes with the large-scale environment. Here the basic statistics include the conditional average and probability of precipitation, PDF of column water vapor (CWV) for all events and precipitating events, evaluated over tropical oceans. The critical values at which the conditionally averaged precipitation sharply increases as CWV exceeds the critical threshold are also computed (provided the model exhibits such an increase).

    • The diurnal cycle package generates a simple representation of the phase (in local time) and amplitude (in mm/day) of total precipitation, comparing lat-lon model output of total precipitation with observed precipitation from the satellite-derived Tropical Rainfall Measuring Mission (TRMM) 3B42 product.

    • This diagnostic computes the climatological anomalies of 500 hPa geopotential height, then calculates the EOFs using NCL’s eofunc.

    • This MJO propagation and amplitude diagnostic is motivated mainly by recent multi-model studies showing that model skill in representing eastward propagation of the MJO is closely related to the model’s winter-mean low-level moisture pattern over the Indo-Pacific region, and that model MJO amplitude tends to be tightly associated with the moisture convective adjustment time scale. The package is designed to provide further independent verification of these relationships based on new GCM simulations.

    • This module computes many of the diagnostics described by the US-CLIVAR Madden-Julian Oscillation (MJO) working group and developed by NCAR’s Dennis Shea for observational data. Using daily precipitation, outgoing longwave radiation, zonal wind at 850 and 200 hPa and meridional wind at 200 hPa, the module computes anomalies, bandpass-filters them for the 20-100 day period, calculates the MJO Index, defined as the running variance of the bandpass-filtered data, performs an EOF analysis, and calculates lag cross-correlations, wavenumber-frequency spectra and composite life cycles of MJO events.

    • The teleconnection diagnostics first generate maps of MJO phase composites of 250 hPa geopotential height and precipitation for observations and several CMIP5 models, placing the behavior of the candidate model within this cloud of models and observations. Then, average teleconnection performance across all MJO phases, defined using a pattern correlation of geopotential height anomalies, is assessed relative to 1) MJO simulation skill and 2) biases in the North Pacific jet zonal winds to determine reasons for possible poor teleconnections. Performance of the candidate model is assessed relative to a cloud of observations and CMIP5 simulations.

    • Produces wavenumber-frequency spectra for OLR, precipitation, 500 hPa omega, 200 hPa wind and 850 hPa wind.
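To illustrate the convective transition statistics described at the top of this list, the following is a minimal NumPy sketch of binning precipitation by column water vapor. It is not the POD's own code: the function name, the bin edges, the units, and the precipitation threshold are all assumptions made for illustration, and the real package works on gridded tropical-ocean data rather than flat sample arrays.

```python
import numpy as np

def convective_transition_stats(cwv, precip, bin_edges, precip_threshold=0.25):
    """Bin precipitation by column water vapor (CWV).

    cwv, precip      : 1-D arrays of co-located samples (assumed mm and mm/hr)
    bin_edges        : increasing CWV bin edges
    precip_threshold : rate above which a sample counts as precipitating
                       (hypothetical default, not taken from the POD)
    """
    cwv = np.asarray(cwv, dtype=float)
    precip = np.asarray(precip, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    n_bins = bin_edges.size - 1

    # Bin index of each sample; samples outside the edges are discarded.
    idx = np.digitize(cwv, bin_edges) - 1
    valid = (idx >= 0) & (idx < n_bins)
    idx, precip = idx[valid], precip[valid]
    raining = precip > precip_threshold

    count_all = np.bincount(idx, minlength=n_bins)
    count_rain = np.bincount(idx[raining], minlength=n_bins)
    precip_sum = np.bincount(idx, weights=precip, minlength=n_bins)
    width = np.diff(bin_edges)

    # Empty bins give 0/0; let them become NaN without warnings.
    with np.errstate(invalid="ignore", divide="ignore"):
        cond_mean = precip_sum / count_all        # conditional average precipitation
        prob_precip = count_rain / count_all      # probability of precipitation
        pdf_all = count_all / (count_all.sum() * width)
        # Normalised by the total sample count (a choice made here), so that
        # pdf_rain equals pdf_all * prob_precip in every populated bin.
        pdf_rain = count_rain / (count_all.sum() * width)
    return cond_mean, prob_precip, pdf_all, pdf_rain
```

A critical CWV value could then be estimated from where `cond_mean` begins its sharp increase; how the POD fits that pickup is not specified here, so it is left out of the sketch.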

  • Atmosphere — In Progress

    • In climate models, besides background basic-state conditions, realistic representation of tropical-extratropical teleconnections depends on the models’ ability to represent Rossby wave sources (RWS) in the upper atmosphere. Following Sardeshmukh and Hoskins (1988), we have developed a process-oriented diagnostic (POD) based on the barotropic vorticity budget equation. The POD consists of 4 levels of diagnostics:
      Level 1: identifies ENSO events and performs composite analysis;
      Level 2: calculates the basic-state meridional gradient in absolute vorticity and wave number;
      Level 3: estimates all individual terms that contribute to anomalous RWS;
      Level 4: displays results as scatter plots (metrics).

    • The ENSO moist static energy (MSE) diagnostic package consists of four levels. With a focus on identifying the leading processes that determine ENSO-related precipitation anomalies, the main module of the POD estimates the vertically integrated MSE budget and performs a variance analysis to account for the relative contribution of each MSE term to column MSE. To that end, the POD is applied to monthly data (climate model or reanalysis products), and budget terms are estimated for “composite” El Niño or La Niña events. Estimating the MSE budget requires 3-dimensional atmospheric variables along with surface and radiation fluxes. Hence, ERA-Interim is treated as “observations” here, and diagnostics obtained from ERA-Interim are used for model validation.

    • This diagnostic uses the cyclone tracking algorithm MCMS (developed by Mike Bauer) to generate Lagrangian cyclone tracks for model data. The tracks are then used for two metrics: (1) cyclone track analysis, including track location density and track strength histograms, and (2) cyclone-centered composites of water vapor, cloud fraction, and precipitation. The track characteristics are compared against reanalysis, and when possible, satellite data.

    • Using a weather-type-based approach, daily atmospheric circulation patterns are computed for both the model under study and the reference reanalysis product. The temporal and spatial characteristics of these weather types are then compared to help diagnose the source of flow-dependent (i.e., pattern-dependent) model biases. Diagnostics include frequency of occurrence of the atmospheric circulation patterns at multiple timescales (e.g., daily, sub-seasonal, seasonal, inter-annual, decadal, …), persistence, transition probability matrices, a Procrustes decomposition (i.e., scaling, rotation and translation matrices) of the spatial biases, and several teleconnection metrics.

    • This diagnostic examines key processes associated with tropical cyclones (TCs) in global models and compares them to observations, such as best-track datasets, satellite measurements and reanalysis products. Azimuthally averaged structures of wind and thermodynamic variables are computed around the TC center – which is obtained from an external TC tracking algorithm – using either 3-hourly or 6-hourly datasets. It produces a series of composite diagrams of azimuthally averaged variables as a function of TC intensity. The diagnostic helps identify process-level errors in model representation of TCs.

    • Significant biases exist in model calculations of radiative forcing that remain largely undocumented since radiative forcing is rarely calculated or archived in model integrations. The Radiative Forcing and Feedback POD addresses this diagnostic gap by developing software to derive the critical radiative forcing and feedback metrics from standard CMIP model output. The Python-based software enables users to decompose surface and top-of-atmosphere radiative fluxes into contributions from the individual state variables and can be used to compute both instantaneous and adjusted forcing, as well as feedbacks.

    • The surface temperature extremes and distribution shape package computes statistics that relate to the shape of the two-meter temperature distribution and its influence on extreme temperature exceedances. These metrics evaluate model fidelity in capturing moments of the temperature distribution and distribution tail properties, as well as the large-scale meteorological patterns associated with extreme temperature exceedance days.

    • TBD

  • Atmosphere — Proposed

    • Extend the current POD to include precipitation contribution as a function of the water vapor-temperature environment. Results for CMIP6 models will be added for reference. The dependence of the precipitation-water vapor relationship on moisture convergence and mesoscale organization may also be added.

    • Moist static energy variance budget, including radiative and surface flux feedbacks, computed in a 10-degree box following tropical cyclone tracks. Two types of composites over many tropical cyclones are included: life cycle composites relative to the time of lifetime maximum intensity of each storm, and intensity-bin composites. The observational reference is computed from reanalysis data. Tropical cyclone tracks are a prerequisite for the diagnostic.

    • The precipitation-buoyancy diagnostics module relates precipitation to a measure of lower-tropospheric averaged buoyancy and its two components: an empirical measure of Convective Available Potential Energy (CAPE) and lower-tropospheric subsaturation (SUBSAT). The module evaluates the model precipitation sensitivity to thermodynamic variations by conditionally averaging tropical oceanic precipitation by CAPE and SUBSAT, and visualizing the result in 3D as a precipitation surface. A metric that captures the CAPE vs. SUBSAT precipitation sensitivity is used to assess model performance compared to observations and a suite of CMIP6 models.

    • Measures of the precipitation probability distribution are evaluated in current climate simulations as a function of region and season in comparison to observations, informed by process understanding. In observed probability density functions (PDFs), a percent change in a cutoff scale dominates changes in probability of extremes. The shape of simulated PDFs through the high-precipitation range relative to the cutoff scale and the simulation of changes in the cutoff scale across regions and seasons provide information relevant to projections of precipitation extreme changes.

    • In climate models, realistic representation of tropical-extratropical teleconnections depends on the models’ ability to represent Rossby wave sources (RWS) in the upper atmosphere. Version 1 of our POD is based on the barotropic vorticity budget equation. In version 2, a POD based on the full primitive equations as described in Ji, X., D. Neelin and C.R. Mechoso (2016) will be developed, and will consist of 4 levels of diagnostics:
      Level 1: identifies ENSO events;
      Level 2: calculates the basic-state meridional gradient in absolute vorticity;
      Level 3: estimates all individual terms that contribute to anomalous RWS;
      Level 4: displays results as scatter plots (metrics).

    • Conditional sampling (i.e., averaging one variable when conditioned by another) of surface fluxes and rainfall is used to diagnose model surface fluxes and surface flux feedbacks to diabatic heating (i.e., rainfall). Fluxes and rainfall are sampled by 1) low-level wind speed and humidity or temperature vertical gradients to characterize model bulk flux algorithms; 2) 500 hPa pressure velocity to identify large-scale circulation regimes with strong surface flux feedbacks; and 3) ISCCP cloud state to illustrate how surface flux feedbacks relate to cloud radiative effects. Statistical descriptors of flux distributions and their relationship to large-scale circulation regimes are quantified.

    • The vertical profiles of diabatic heating have important implications for large-scale dynamics, especially for the coupling between the large-scale atmospheric circulation and precipitation processes. We adopt an objective approach to examine the top-heaviness of vertical motion, which is closely related to the heating profiles and a commonly available model output variable. The diagnostic/metric can also be used to evaluate the top-heaviness of diabatic heating.

    • This metric evaluates the processes governing TC variability in a global model. Early studies focused primarily on tropical processes in regulating TC activity, while recent studies have also suggested some long-range impacts of extratropical processes. Our analysis showed that tropical upper-tropospheric troughs (TUTTs) integrate tropical and extratropical impacts on TC activity. TUTTs are subject to the modulation of diabatic heating in various regions and are the preferred locations for extratropical Rossby wave breaking. The TUTT index represents the variability of vertical wind shear and tropospheric humidity and is significantly correlated with TC activity in the tropical Atlantic and Pacific.

    • In climate models, it is necessary to investigate the inter-related formulations that affect convective triggering, precipitation partitioning and vertical profiles related to cloud and convection schemes across different convective regimes and under different large-scale forcing conditions. The POD aims to assess model parameterizations that account for these “inter-related” formulations. Correct precipitation partitioning, for example, affects the tendencies from the moist processes, net radiative flux divergence, the vertical gradient in Q1, and ENSO teleconnections. A set of metrics that physically link the skill in modeled ENSO teleconnections to the inter-related formulations will be developed.
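The life cycle composites mentioned for the tropical cyclone moist static energy diagnostic in this list (averaging over many storms relative to each storm's time of lifetime maximum intensity) can be sketched as follows. This is only an illustration under stated assumptions: the function name and input layout are hypothetical, every storm is assumed to share one time step, and the intensity-bin composites are not covered.

```python
import numpy as np

def lmi_lag_composite(storms, max_lag):
    """Average a storm-following variable at fixed lags from peak intensity.

    storms  : iterable of (intensity, values) pairs, each a 1-D sequence
              along one storm track (hypothetical input layout)
    max_lag : number of time steps kept on either side of the peak
    Returns (lags, composite, counts).
    """
    lags = np.arange(-max_lag, max_lag + 1)
    total = np.zeros(lags.size)
    counts = np.zeros(lags.size, dtype=int)
    for intensity, values in storms:
        intensity = np.asarray(intensity, dtype=float)
        values = np.asarray(values, dtype=float)
        # First time step at which the storm reaches its lifetime maximum.
        peak = int(np.nanargmax(intensity))
        for k, lag in enumerate(lags):
            t = peak + lag
            # Skip lags that fall outside this storm's lifetime or are missing.
            if 0 <= t < values.size and np.isfinite(values[t]):
                total[k] += values[t]
                counts[k] += 1
    with np.errstate(invalid="ignore", divide="ignore"):
        composite = total / counts    # NaN where no storm contributes
    return lags, composite, counts
```

Returning `counts` alongside the composite makes it easy to mask lags that are supported by only a few storms.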

  • Land — Existing PODs

    • The Soil moisture-Evapotranspiration (SM-ET) Coupling Diagnostic Package evaluates the relationship between SM and ET in the summertime. It computes the correlation between surface (top-10cm) SM and ET, at the interannual timescale, using summertime-mean values. Positive correlation values indicate that, at the interannual time scale (from one summer to the next), soil moisture variability controls ET variability. This can generally be expected to occur when soil moisture availability is the limiting factor for ET. Conversely, negative values indicate that ET variations drive variations in soil moisture levels, which can be expected to occur in regions where soil moisture is plentiful and the limiting factor for ET becomes atmospheric evaporative demand (radiation, temperature); it also reflects the anticorrelation between precipitation, which drives soil moisture, and radiation, which drives ET. In addition to its sign, the correlation value quantifies how much of ET interannual variability is explained by soil moisture variations (if the correlation is positive; vice versa if it is negative)—in other words, the tightness of the SM–ET relationship. Considering seasonal means removes issues associated with the coseasonality of soil moisture and ET, while still reflecting the overall (i.e., seasonally integrated) dependence of ET on soil moisture throughout the whole season. See Berg and Sheffield (2018) for further details.
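The core calculation described above, an interannual correlation of summertime-mean soil moisture and ET at each grid point, can be sketched in a few lines of NumPy. This is a minimal illustration rather than the package's implementation; the function name and the `(year, lat, lon)` array layout are assumptions, and the seasonal averaging is taken to have been done beforehand.

```python
import numpy as np

def sm_et_coupling(sm, et):
    """Pearson correlation across years at each grid point.

    sm, et : arrays of shape (n_years, n_lat, n_lon) holding summertime-mean
             surface soil moisture and evapotranspiration (assumed layout)
    Returns an (n_lat, n_lon) map; NaN where either series has no variance.
    """
    sm = np.asarray(sm, dtype=float)
    et = np.asarray(et, dtype=float)
    # Interannual anomalies relative to the multi-year summertime mean.
    sm_anom = sm - sm.mean(axis=0)
    et_anom = et - et.mean(axis=0)
    cov = (sm_anom * et_anom).sum(axis=0)
    norm = np.sqrt((sm_anom ** 2).sum(axis=0) * (et_anom ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        return cov / norm
```

Positive values in the returned map correspond to the soil-moisture-limited regime and negative values to the energy-limited regime, as described in the text.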

  • Ocean and Cryosphere — In Progress

    • The AMOC 3D structure diagnostic currently consists of the following functionalities:
      – Calculation of volume transport from velocity;
      – Calculation of yearly mean from monthly data;
      – Calculation of long-term mean from yearly data;
      – Calculation of stream function on depth coordinate;
      – Interpolation of tracer grid to velocity grid;
      – Projecting transport (V) onto temperature/salinity (T/S) plane;
      – Calculation of stream function on density coordinate;
      – Transport of weighted temperature and salinity;
      – Meridional Heat Transport (MHT) and Freshwater Transport (MFWT);
      – Plotting results;
      – Plotting observation or high-resolution HYCOM results.

    • Arctic sea ice concentration and thickness mean, variability and persistence.

    • Sea level in the tropical Pacific is an important indicator of heat storage for the entire Earth system. There is a high correlation between the sea level in the tropical Pacific and the global mean surface temperature at the inter-annual time scale. Therefore, any bias in tropical Pacific sea level could imply a bias in the heat exchange between atmosphere and ocean. A misrepresented heat distribution could lead to over- or underestimation of future temperature changes in the atmosphere. The diagnostic tool helps to identify the cause of the sea level bias in the model.
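Three of the AMOC steps listed above (volume transport from velocity, yearly means from monthly data, and the stream function on the depth coordinate) can be sketched as follows. This is a simplified illustration, not the diagnostic's code: the function names and array layouts are hypothetical, months are weighted equally, and the sign convention of the stream function is a choice made here.

```python
import numpy as np

def yearly_mean(monthly):
    """Average consecutive groups of 12 months (equal weights, an assumption)."""
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.shape[0] // 12
    trimmed = monthly[: n_years * 12]          # drop any incomplete final year
    return trimmed.reshape(n_years, 12, *monthly.shape[1:]).mean(axis=1)

def overturning_streamfunction(v, dx, dz):
    """Depth-coordinate overturning stream function in Sverdrups.

    v  : meridional velocity, shape (n_depth, n_lat, n_lon), m/s; land cells NaN
    dx : zonal cell width, shape (n_lat, n_lon), m
    dz : layer thickness, shape (n_depth,), m
    """
    v = np.nan_to_num(np.asarray(v, dtype=float))   # land contributes no transport
    dx = np.asarray(dx, dtype=float)
    dz = np.asarray(dz, dtype=float)
    # Volume transport through each cell face, m^3/s.
    transport = v * dx[np.newaxis, :, :] * dz[:, np.newaxis, np.newaxis]
    zonal_sum = transport.sum(axis=2)               # shape (n_depth, n_lat)
    # Accumulate from the surface downward.
    psi = np.cumsum(zonal_sum, axis=0)
    return psi / 1.0e6                              # 1 Sv = 1e6 m^3/s
```

The density-coordinate version would instead bin the same cell transports by density class before accumulating, which is not shown here.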

  • Ocean and Cryosphere — Proposed

  • Atmosphere — Proposed

    • Diagnostics of boundary layer and clouds. These are still in planning stage, but we envision including boundary layer depth, estimated entrainment flux, and a measure of decoupling. We also want to include diagnostics of the cloud physics, including adiabaticity of boundary layer clouds, warm rain processes, and warm rain fraction.

    • This process-oriented diagnostics package will calculate so-called cloud controlling factors (CCFs) from model output and compare CCFs for each model to calculated cloud feedbacks, Equilibrium Climate Sensitivity, and observational constraints.

    • This process-oriented diagnostics package will compute a range of emergent constraints deemed to have robust skill in predicting Equilibrium Climate Sensitivity and compare to a range of observable quantities.

    • The humidity cloud circulation diagnostics present the spatial distributions of water vapor and clouds in relation to large-scale dynamic and thermodynamic conditions. The model representation of these spatial structures is closely related to climate sensitivity. Observations of these spatial structures serve as emergent constraints on models’ equilibrium climate sensitivity.

    • We propose a process-oriented diagnostics package that uses six existing radiative kernels to calculate global radiative feedbacks (Planck, lapse rate, water vapor, surface albedo, and cloud feedbacks) from model output. The use of six kernels will allow an assessment of feedback dependence on the mean state radiative properties of models.

    • Standardized metrics for characterizing the variability of low clouds and the relevant meteorological conditions governing their formation and variability. These can be used to assess the ability of models to capture low cloud variability compared to observations, to help identify the sources of discrepancies between model and observational low cloud fields and to develop emergent constraints based on low cloud variability.

    • The diagnostic uses a new observational constraint derived from CloudSat and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) data, which discriminate stratocumulus (Sc) from cumulus (Cu) clouds, to evaluate the interannual variation of low clouds in response to surface temperature forcings, in regions dominated by Sc and Cu (including the vertical change), for present-day simulations of climate models.

  • Land — Proposed

    • Convective Triggering Potential (CTP) is the area on a skew-T log-P diagram between the atmospheric temperature profile and the moist adiabat that intersects the temperature profile 100 hPa above the ground – the integral is performed between 100 hPa and 300 hPa above the ground.
      Humidity index (HI) is the sum of dew point depressions at 50 and 150 hPa above the surface.
      The combination of the two indices is meant to capture both the thermal and moisture contributions to cloud formation and convection, and is typically applied to early morning profiles before a daytime boundary layer has penetrated the surface inversion and begun to grow.

    • This diagnostic evaluates the potential for convection and the potential for the land surface, via fluxes of heat and moisture, to act as a control on the initiation of convection. Surface heating mixes dry adiabatically up to intersect the temperature profile at the potential mixed level (PML). Evaporation adds humidity that is mixed to the same depth.
      Metrics include:
      1) Buoyant condensation level (BCL): level at which the well-mixed atmosphere heated from below reaches saturation.
      2) Buoyant mixing potential temperature, at which BCL is reached.
      3) Temperature deficit and specific humidity deficit – amounts by which a well-mixed boundary layer needs to increase in temperature or humidity to reach LCL.

    • Employing a suite of in situ, remote sensing, and reanalysis datasets, the ILAMB package performs comprehensive model assessment across a wide range of land and atmospheric forcing variables. We plan to extend the ILAMB benchmarks to assess land-atmosphere coupling metrics by incorporating novel datasets and developing promising new evaluation metrics. ILAMB currently generates annual diagnostics; in this effort we will extend it to seasonal diagnostics to assess the performance of model improvements. In addition, new metrics and corresponding datasets will be contributed back to ILAMB and the wider research community, providing a lasting imprint of our project.

    • The difference between the lifting condensation level (LCL) and the height of the planetary boundary layer (PBLH) represents the shortfall of a growing boundary layer in reaching the level where clouds can form, due to lack of buoyancy from insufficient heating and/or insufficient moisture content. It provides a continuously varying metric that leads up to a threshold event: cloud formation. It can be calculated in meters or millibars, where a zero or negative deficit means the PBL has penetrated the LCL. Hourly data are preferable to resolve the diurnal cycle.

    • Boundary layer evolution, including entrainment rates at the top of the PBL, can be diagnosed from hourly surface flux measurements and the evolution of near-surface temperature and humidity, assuming a well-mixed boundary layer using sub-diurnal data spanning the daylight hours.
      The change in PBL heat or moisture content during the day is estimated from 2 m temperature and humidity over the period of boundary layer growth (nominally 12 hours, but could be less), assuming thorough mixing. This change is the sum of:
      1) Surface fluxes into PBL air mass;
      2) Lateral advection;
      3) Entrainment (estimated as a residual).
      This is also applicable to model output.

    • Rate of change of relative humidity at the top of a growing boundary layer determines time to cloud formation, and depends on properties of the boundary layer itself, the free atmosphere and the surface (namely EF and “non-evaporative terms”). From an initial RH, the tendency can be integrated to determine if/when clouds will form.
      The key metric is the critical EF above which the PBL will moisten instead of dry. Increasing specific humidity will not lead to cloud if RH at PBL top is not increasing. Dry vs. wet soil advantage regime transitions occur when d(RH_PBLH) / d(ET)=0.
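Two of the simpler metrics in this list, the humidity index (HI) and the LCL deficit, follow directly from their definitions above and can be sketched as below. The function names are hypothetical, and linear interpolation in pressure is an assumption made here for brevity; note also that `numpy.interp` holds the end values constant for targets outside the profile, so the profile should span both target levels.

```python
import numpy as np

def humidity_index(pressure, temperature, dewpoint, surface_pressure):
    """Sum of dew point depressions 50 and 150 hPa above the surface.

    pressure    : 1-D profile levels in hPa (any order)
    temperature : air temperature on those levels (K or degC)
    dewpoint    : dew point on those levels, same unit as temperature
    """
    pressure = np.asarray(pressure, dtype=float)
    depression = (np.asarray(temperature, dtype=float)
                  - np.asarray(dewpoint, dtype=float))
    order = np.argsort(pressure)                 # np.interp needs increasing x
    targets = surface_pressure - np.array([50.0, 150.0])
    values = np.interp(targets, pressure[order], depression[order])
    return float(values.sum())

def lcl_deficit(lcl_height, pbl_height):
    """LCL minus PBL height; zero or negative means the PBL has reached the LCL."""
    return np.asarray(lcl_height, dtype=float) - np.asarray(pbl_height, dtype=float)
```

The same `lcl_deficit` function applies whether both inputs are in meters or both are expressed as pressure thickness, provided the sign convention is kept consistent.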

Examples of Package Output

Downloading and Running

Step-by-step installation instructions are provided on GitHub; see also the Getting Started Guide (PDF). To summarize, installation requires:

  • Downloading the code from GitHub;
  • Downloading supporting observational data and (optionally) sample model data, currently available via anonymous FTP from UCAR;
  • Using an included script to install required version-controlled third-party libraries and dependencies via the conda package manager;
  • Configuring paths to the above data in a settings file and (optionally) conducting a test run of the framework on the sample data to verify the installation.

For GFDL users: The Diagnostics and Evaluations Team maintains an up-to-date, site-wide installation of the package for GFDL users, accessible from the post-processing/analysis cluster and workstations. See the GFDL-specific documentation for using this installation interactively or as part of GFDL’s workflow.

Developer’s Information

In addition to the funded diagnostics listed above, the MDTF welcomes any contributed diagnostics that meet the science goals and coding requirements. Adapting an existing script for use in the package requires no code changes, only the addition of a configuration file and output template.

To get started developing a new diagnostic or adapting existing code, consult the developer information section of the documentation site or the Developer’s Walkthrough (PDF).


Development of this code framework for process-oriented diagnostics was supported by the National Oceanic and Atmospheric Administration (NOAA) Climate Program Office Modeling, Analysis, Predictions and Projections (MAPP) Program (grant # NA18OAR4310280). Additional support was provided by University of California Los Angeles, the Geophysical Fluid Dynamics Laboratory, the National Center for Atmospheric Research, Colorado State University, Lawrence Livermore National Laboratory and the US Department of Energy.

Many of the process-oriented diagnostics modules (PODs) were contributed by members of the NOAA Model Diagnostics Task Force under MAPP support. Statements, findings or recommendations in these documents do not necessarily reflect the views of NOAA or the US Department of Commerce.

Data Citations

Guo, Huan; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna; Rand, Kristopher; Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas; Underwood, Seth; Vahlenkamp, Hans; Bushuk, Mitchell; Dunne, Krista A.; Dussin, Raphael; Gauthier, Paul PG; Ginoux, Paul; Griffies, Stephen M.; Hallberg, Robert; Harrison, Matthew; Hurlin, William; Lin, Pu; Malyshev, Sergey; Naik, Vaishali; Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Reichl, Brandon G; Schwarzkopf, Daniel M; Seman, Charles J; Shao, Andrew; Silvers, Levi; Wyman, Bruce; Yan, Xiaoqin; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.; Held, Isaac M; Krasting, John P.; Horowitz, Larry W.; Milly, P.C.D; Shevliakova, Elena; Winton, Michael; Zhao, Ming; Zhang, Rong (2018). NOAA-GFDL GFDL-CM4 model output historical. Version YYYYMMDD[1].Earth System Grid Federation.

Krasting, John P.; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna; Rand, Kristopher; Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas; Underwood, Seth; Vahlenkamp, Hans; Dunne, Krista A.; Gauthier, Paul PG; Ginoux, Paul; Griffies, Stephen M.; Hallberg, Robert; Harrison, Matthew; Hurlin, William; Malyshev, Sergey; Naik, Vaishali; Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Schwarzkopf, Daniel M; Seman, Charles J; Silvers, Levi; Wyman, Bruce; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.; Dussin, Raphael; Guo, Huan; He, Jian; Held, Isaac M; Horowitz, Larry W.; Lin, Pu; Milly, P.C.D; Shevliakova, Elena; Stock, Charles; Winton, Michael; Xie, Yuanyu; Zhao, Ming (2018). NOAA-GFDL GFDL-ESM4 model output prepared for CMIP6 CMIP historical. Version YYYYMMDD[1].Earth System Grid Federation.