Data Quality & Intercomparison

This is an endeavor that I did not enter by design, and I sometimes get the feeling that I stumbled into a patch of quicksand. Although I had done a considerable amount of data preparation and processing in the past, it was not until my arrival at GFDL that I engaged in “hardcore” data quality control. I started by updating and rewriting a series of Fortran programs which were the basis for the generation of a variety of atmospheric circulation statistics over the past several decades (Oort, 1977; Oort, 1983). I also assisted Ameet Raval, a former member of my group, and Bram Oort in updating the ANAL68 analysis scheme to ANAL95. In addition, I began a collaboration with Roy Jenne of NCAR in the preparation of a data set (TD54) containing older (1940s-1960s) global radiosonde observations. This data set is one source of radiosonde data for the NCEP/NCAR Reanalysis Project, which is described in a BAMS paper. From 1994 to 1997 I served on the CDAS/Reanalysis Advisory Committee.

Because of my experience assisting Bram Oort in the preparation of his gridded radiosonde data set, I was asked by Duane Waliser (SUNY, Stony Brook) to help him in his work involving Bram’s data. This work (Waliser et al. 1999) was a collaborative effort led by Duane, and also involved Zhixiong Shi (SUNY, Stony Brook) along with my retired GFDL colleague Bram Oort. It presents a comparison of the Hadley circulation in the NCEP/NCAR Reanalysis vs. an Oort (radiosonde-only) analysis. In it we find that the radiosonde-only product is biased in the areas most lacking in radiosonde observations. The NCEP product is less biased overall, but does appear to overestimate the strength of the Hadley circulation in some seasons.

An unrelated study (Bauer et al. 2002) followed a similar vein in considering differences in spatial sampling between different datasets (in this case observations and a GCM). It found that the temperature-humidity relationship, used as a measure of water-vapor feedback, is not nearly as disparate between observations and model when one matches the spatial sampling. This counters some of the conclusions reached in earlier studies conducted at GFDL (Sun and Oort, 1995; Sun and Held, 1996).

In the course of examining TD54, managing the GFDL historical radiosonde data collection, and examining the climate record from these data, I became motivated to explore alternative statistical methods (as described above in STATISTICS & DATA ANALYSIS TECHNIQUES and in the Lanzante (1996) Int. J. Climatol. manuscript) which could be used in quality control as well as in climate studies. Many of these have proven valuable in follow-up projects, such as Gaffen et al. (2000), Lanzante et al. (2003a), and Lanzante et al. (2003b).

Much effort has been devoted to the reclamation of radiosonde temperature data that have been tainted by artificial discontinuities induced by historical changes in instruments and practices. After a number of years of effort, our initial formal attempt at producing a remedy had limited success (Gaffen et al. 2000). A comparison of attempts by several different research groups did not yield any reason for optimism either (Free et al. 2002). Finally, as reported in a two-part manuscript (Lanzante et al. 2003a; Lanzante et al. 2003b), our efforts produced more promising results, in the form of an improved radiosonde dataset as well as a better understanding of the problems that afflict the data. Subsequently we extended the original dataset using a different technique (Free et al. 2004) to create an operational product (Free et al. 2005), known as RATPAC, that is now available online from NOAA’s National Climatic Data Center.

The issue of data quality has also been examined in atmospheric water vapor datasets. Soden and Lanzante (1996) provides a comparison of the satellite and radiosonde climatologies of upper tropospheric humidity (UTH). It also demonstrates the utility of the satellite data for identifying and diagnosing some of the systematic deficiencies of the radiosonde humidity data. For example, a map of the radiosonde minus satellite difference in UTH for summer (JJA) 1989 shows that for most of the world the satellite is more moist than the radiosonde (red-orange-yellow). The opposite occurs in regions under the control of the former Soviet Union or China, where an older, more slowly responding radiosonde humidity sensor was used (purple-blue-green). Since humidity generally decreases with altitude, as the balloon rises the measurement from a more slowly responding sensor will not decrease quickly enough, thereby yielding a moister reading than that from a more quickly responding one.
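The sensor-lag argument above can be illustrated with a minimal sketch. It is not taken from any of the cited papers: it assumes a sensor that behaves as a simple first-order lag, an idealized humidity profile that decays exponentially with height, and a constant balloon ascent rate. The function name and every numerical value (time constants, ascent rate, scale height) are hypothetical choices made only for illustration.

```python
import math


def simulate_lagged_sensor(tau_s, ascent_rate_m_s=5.0, scale_height_m=2000.0,
                           surface_humidity=80.0, top_m=10000.0, dt_s=1.0):
    """Simulate a humidity sensor with first-order lag on a rising balloon.

    All parameters are illustrative assumptions, not values from the text.
    Returns a list of (altitude_m, true_humidity, measured_humidity).
    """
    n_steps = int(top_m / (ascent_rate_m_s * dt_s))
    # Fraction of the gap to the true value that the sensor closes per step
    # (exact solution of a first-order lag over one time step).
    alpha = 1.0 - math.exp(-dt_s / tau_s)
    measured = surface_humidity  # sensor starts in equilibrium at the surface
    profile = []
    for step in range(n_steps + 1):
        altitude = step * ascent_rate_m_s * dt_s
        # Idealized true profile: humidity decreases with altitude.
        true_value = surface_humidity * math.exp(-altitude / scale_height_m)
        measured += alpha * (true_value - measured)
        profile.append((altitude, true_value, measured))
    return profile


# Compare a hypothetical fast sensor with a hypothetical slow one.
fast = simulate_lagged_sensor(tau_s=5.0)
slow = simulate_lagged_sensor(tau_s=60.0)

for (z, truth, m_fast), (_, _, m_slow) in zip(fast[::400], slow[::400]):
    print(f"z={z:7.0f} m  true={truth:6.2f}  fast={m_fast:6.2f}  slow={m_slow:6.2f}")
```

Under these assumptions both simulated sensors read moister than the true profile during ascent, and the slower sensor reads moister than the faster one, which is the sense of the relative bias described above.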

Another project (Lanzante and Gahrs, 1997) involved Gregory Gahrs, who was a participant in the Princeton summer student program during 1995. This study took a more in-depth look at the relative biases between satellite and radiosonde measures of humidity (UTH). One of the issues addressed is the satellite “clear sky bias”, the study of which has been spun off into another manuscript (Lanzante and Gahrs, 2000).

Plans
While we feel that we have made progress in quantifying the severity of the problems, as well as in coming up with some remedies for the radiosonde temperature data (Lanzante et al. 2003a; Lanzante et al. 2003b; Free et al. 2005), our current approach, as well as competing approaches offered by other investigators, has its limitations. I have raised the possibility of using data assimilation/analysis in the context of future Reanalysis to come up with a more general and robust method to adjust radiosonde temperature time series. I’ve formulated some preliminary ideas which may further the discussion.

Satellite data are an alternative to radiosonde data for studying long-term climate change. However, these data have problems analogous to those afflicting radiosonde data, although distinctly different in nature, that limit their utility in this regard as well. Some time ago I started a collaborative project with Carl Mears of Remote Sensing Systems (RSS), a private company based in Santa Rosa, California, aimed at diagnosing problems in satellite MSU (Microwave Sounding Unit) temperatures. The goal of this project is to use radiosonde data in a novel fashion to determine which of two MSU datasets is more realistic. The initial results are quite interesting; however, further analyses are required to establish their robustness. Due to other commitments I’ve unfortunately “back-burnered” it and hope to revive it some time down the road.

Return to John Lanzante’s Home Page.