GFDL - Geophysical Fluid Dynamics Laboratory

Reanalysis Ideas — New Ideas (Jan 2012)

Some Ideas For Future Reanalysis Efforts

John Lanzante



Minimal Steps That I Believe Will Be Necessary


  1. Select which raw input datasets to use (e.g., radiosonde, satellite,
    sfc T, SLP, cloud, etc.). This might be based on what is known about
    the quality and homogeneity of each dataset, its temporal and spatial
    extent, and model-based experiments conducted to see how useful or
    influential each dataset is. For example, one might conduct experiments
    like those the Hadley Centre conducted for HadAT2, in which climate model
    output from historical runs is used to create artificial data with the
    characteristics of each data type (radiosonde, satellite, ISCCP, OWS,
    etc.). These data would be sampled spatially and temporally in the same
    way as their real counterparts, and would have inhomogeneities introduced
    that mimic reality. One could then ingest these data into the reanalysis
    system, withholding different types one at a time, and see how this
    affects the ability to perform steps 4 and 5 (below), and ultimately
    how well trends or other low frequency variability can be recovered
    from the final product. It may turn out that some data types do more
    harm than good and should just be excluded. In addition to selecting
    particular datasets, here one would also choose which station records
    or satellites, etc. would be used. For example, some stations have very
    short records, while others have longer records that do not span the
    full period and lie close to data-rich areas. Likewise, some satellites
    have short records or little overlap with adjacent satellites in the
    sequence. It may turn out to be better simply to exclude such
    incomplete records. Model/assimilation experiments can be used
    to determine which to keep and which to discard.

  2. Determine the best way to use multiple versions of each input dataset.
    For example, for temperature there are 5 homogenized radiosonde, 3
    satellite, and 3 surface datasets. There are also unhomogenized
    versions of each.
    Should all available datasets be ingested, just some, or the one “best”?
    Should each of these inputs be homogenized by the reanalysis system
    and then either ingested or compared to see which is “best”?

  3. Use the reanalysis system to homogenize each input data type
    (radiosonde, satellite, surface, etc.), even if the input has been
    homogenized via other means. In this step, one particular data type
    (e.g., radiosonde T) will be homogenized by excluding this type and
    ingesting other types of data (e.g., sfc T, SLP and radiosonde winds)
    into the reanalysis system. The purpose is to create a reference
    series, which will be used to homogenize the excluded data type. This
    process will be repeated in turn for each data type (i.e., exclude
    only the type for which a reference series is desired). At the
    conclusion of this step, each input data type will have been
    homogenized using reanalysis-generated reference series.

  4. Perform “final” reanalyses using the homogenized datasets created in
    step 3. There will necessarily be several “final” versions, varying
    by the types of input. One may contain all data inputs (determined
    from step 1). Other versions may include or exclude certain types.
    For example, one version might be based on sfc and radiosonde data,
    but no satellite data at all. Another version might use just satellite
    data along with sfc data (and of course would be limited to the
    satellite era). Other versions might exclude or include cloud,
    humidity, OWS, etc. The notion here is that just as we currently have
    multiple homogenized radiosonde and satellite datasets, none of which
    can be unambiguously declared the “best”, we might have multiple
    climate reanalysis products. Depending on the application, the user
    might have to use and compare results from several of these products,
    although some users might just be interested in the one version based
    on the maximal amount of data.

  5. For output datasets from step 4, it will be necessary to perform an
    additional form of homogenization related to datasets that do not
    span the entire period of record. For example, since satellite T
    starts in 1979, there is a potential discontinuity in 1979 from the
    sudden introduction of these data into the input stream. One way to
    deal with this would be to examine a version of the reanalysis based
    only on radiosonde data, and use it to derive adjustments that need
    to be applied at 1979. There will be multiple corrections, to account
    for different datasets (OWS, MSU, etc.) that don’t span the entire
    period of record.
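
As a rough illustration of the bookkeeping behind steps 3 and 5, the
sketch below lists the "exclude one data type" experiments and then
removes a single step change from a series by comparing it with a
reference series. Everything in it is an assumption made for
illustration: the data type labels, the function names, the made-up
numbers, the use of a simple difference of segment means, and the choice
to shift the earlier segment so that it agrees with the later one. A real
system would obtain the reference from the reanalysis itself and would
need to detect break points rather than be handed them.

```python
"""Sketch only: names, data types, and numbers are illustrative assumptions."""
from statistics import mean

# Hypothetical labels for input data types; the real list would come from step 1.
DATA_TYPES = ["radiosonde_T", "radiosonde_wind", "satellite_T", "sfc_T", "SLP"]


def leave_one_out_experiments(data_types):
    """Map each withheld data type to the list of types still ingested.

    Each entry describes one reanalysis run whose output could serve as
    the reference series for the withheld type (step 3).
    """
    return {
        withheld: [t for t in data_types if t != withheld]
        for withheld in data_types
    }


def step_adjustment(candidate, reference, break_index):
    """Estimate and remove one step change in `candidate` at `break_index`.

    The step is estimated as the change in the mean of
    (candidate - reference) across the break. As an assumed convention,
    the earlier segment is shifted to agree with the later one.
    Returns (estimated shift, adjusted series).
    """
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have the same length")
    if not 0 < break_index < len(candidate):
        raise ValueError("break_index must leave data on both sides of the break")

    diff = [c - r for c, r in zip(candidate, reference)]
    shift = mean(diff[break_index:]) - mean(diff[:break_index])
    adjusted = [c + shift for c in candidate[:break_index]]
    adjusted += list(candidate[break_index:])
    return shift, adjusted


if __name__ == "__main__":
    # One run per withheld data type (step 3).
    for withheld, ingested in leave_one_out_experiments(DATA_TYPES).items():
        print(f"reference for {withheld}: ingest {ingested}")

    # Made-up values with an artificial jump at index 3, standing in for
    # a discontinuity such as the one expected at 1979 (step 5).
    candidate = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
    reference = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    shift, adjusted = step_adjustment(candidate, reference, break_index=3)
    print(shift, adjusted)
```

For step 5, the candidate would be a series from the version of the
reanalysis that ingests satellite data and the reference would be the
same series from a radiosonde-only version, with the break placed at
1979. Multiple datasets that start or stop at different times would each
need their own break point and adjustment.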


The Final Output Will Consist Of


  1. The one “best” version of climate reanalysis (based on the most
    complete set of suitable inputs) generated in steps 4/5.

  2. Several alternate versions of climate reanalysis, based on more
    limited inputs, from steps 4/5.

  3. The homogenized versions of the inputs created in step 3. These will
    potentially represent the successors to GISS/NCDC/CRU for sfc T,
    RATPAC/HadAT2/IUK/RAOBCORE/RICH for radiosonde T, and UAH/RSS/STAR
    for satellite T. For many other data types the benefits will be
    even greater, as no homogenized products currently exist.


What Will It Take To Get This Accomplished?


  • In my opinion all of this is feasible. It will require considerable
    resources, human and computing, and cross-collaboration amongst
    disparate communities with different areas of expertise. Three main
    areas of expertise are needed:

    1. Analysis/Assimilation/Modelling
      Since the initial NCEP/NCAR effort, much has been learned about how
      to do this, both at NCEP and at other institutions around the
      country and the world.

    2. Handling/Processing Of Multiple Large Data Sets
      Likewise, expertise such as that provided by NCAR for the first
      US reanalysis would be required.

    3. Data Homogenization
      Unlike in 1st and 2nd generation reanalyses, this community
      would play a central role. There would need to be much
      back-and-forth interaction between these folks and those from (2)
      and especially (1).

  • There would be many new types of hurdles to overcome that were not
    pertinent to 1st and 2nd generation reanalyses, but I believe they
    could be overcome. The biggest impediment is obtaining large,
    long-term funding, not an easy task given the current economic and
    budgetary situation in the US and around the world.

  • I would liken the completion of the 1st NCEP/NCAR reanalysis to
    landing humans on the moon. It was a wondrous accomplishment, with
    incredible benefits that seemed like fantasy only a generation earlier.
    At the time of the first landing on the moon, many people probably
    envisioned colonization of the moon and landing humans on Mars as
    almost certainly occurring in the next generation. Unfortunately,
    human space exploration has not advanced much since then, and further
    progress does not seem likely anytime soon. The question is, will the
    realization of a true Climate Reanalysis suffer the same fate?
