# Geophysical Fluid Dynamics Laboratory


# Toward a standard description of grids

What are grids, and why would we want a standard description?

As stated in the Draft Gridspec,

> The comparative analysis of output from multiple models, and against observational data analysis archives, has become a key methodology in reducing uncertainty in climate projections, and in improving forecast skill of medium- and long-term forecasts. There is considerable momentum toward simplifying such analyses by applying comprehensive community-standard metadata to observational and model output data archives.

The principal motivating factor is to be able to examine and analyze gridded data, of which there is an already petabyte-scale and rapidly growing public archive.

What do we mean by a grid? It's the underlying discrete representation of physical space that you'll find in a numerical model code or model output dataset. In the context of Earth system models, it can be further specialized to the representations commonly used in climate and weather models of the atmosphere, ocean and land surface. Observations use grids as well, especially observational analysis datasets, in which in-situ and remote observations have been assimilated into a set of gridded fields.

To be somewhat (but not very much) more formal, I quote my friend Bob Numrich, of Co-Array Fortran fame:

> A grid is a finite set of cells in R^n that span R^n in some sense. In [the] particular case [of Earth system models and data], R^n is most likely R^4 = R^2 × R × R for the 2D horizontal part, the 1D vertical part, and the 1D time part. Usually, the 2D horizontal part is the one that has the largest number of different approaches.

at which point of course, you might well ask, "Well then, what is a cell?"

I'll not attempt here to become any more formal, and instead take the Potter Stewart approach, and claim that we all more or less know what a grid is, and what a cell is, or at least, we will when we've looked at a few examples. As you see, there is quite a variety of approaches not only for horizontal grids, as stated in the quote above, but for vertical ones as well.

Assuming you know a grid when you see one, why would you be interested in a standard description of a grid?

#### First use case

A good starting point to answer this question would be the IPCC Model Documentation Table. If you click through any of the documentation links of any model in the table, you will find some description of the model grids, under item IV. The main point to make here is that it isn't straightforward, even for an initiated user of the data, to discover what the model grid is. This example, based on Guilyardi (2006), shows how this is done at present. Some grid may be described by the string `T106L31`, which, to someone in the know, means that the horizontal grid is a Gaussian grid corresponding to a triangular truncation of spherical harmonics with a maximum wavenumber of 106, and that there are 31 vertical levels. If we were to formalize this, we would develop a controlled domain vocabulary to state it. The vocabulary would probably be somewhat less cryptic to the uninitiated user than `T106L31`, and would use terms such as `gaussian_grid`, `tripolar_grid`, and so on, and similarly `sigma_coordinate` and the like to describe the vertical discretization.
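To make the shorthand concrete, here is a minimal Python sketch of how a string like `T106L31` might be unpacked into explicit terms. It is only an illustration: the function names and every key other than `horizontal_grid_descriptor` are hypothetical, and it handles only the `T<wavenumber>L<levels>` pattern described above, not other private shorthands.

```python
import re

# Matches only the "T<max wavenumber>L<vertical levels>" pattern, e.g. "T106L31".
_SHORTHAND = re.compile(r"^T(\d+)L(\d+)$")


def parse_shorthand(label):
    """Return (max_wavenumber, vertical_levels) for a label like 'T106L31'."""
    match = _SHORTHAND.match(label)
    if match is None:
        raise ValueError(f"unrecognized grid shorthand: {label!r}")
    return int(match.group(1)), int(match.group(2))


def describe(label):
    """Expand the shorthand into an explicit description.

    All key names except 'horizontal_grid_descriptor' are hypothetical,
    chosen for illustration only.
    """
    max_wavenumber, levels = parse_shorthand(label)
    return {
        "shorthand": label,                             # free string, kept as given
        "horizontal_grid_descriptor": "gaussian_grid",
        "spectral_truncation": "triangular",            # the leading "T"
        "max_wavenumber": max_wavenumber,
        "vertical_levels": levels,
    }


if __name__ == "__main__":
    for key, value in describe("T106L31").items():
        print(f"{key}: {value}")
```

The point of the sketch is that the expanded form can be read without insider knowledge, while the original string survives as a free-text annotation.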

The controlled vocabulary has two parts: keys that state what is being described (e.g. `horizontal_grid_descriptor`) and values belonging to another vocabulary that contains terms such as `cubed_sphere_grid`. The vocabulary would have to be designed to be extensible, to allow for innovation and development in grids and algorithms. Terms like `T106L31`, on the other hand, would remain free strings, part of a private shorthand.

The idea of the controlled vocabulary is the beginning of standardization. The vocabulary of the keys would belong to a common schema for describing grids. The vocabulary of the values would be part of a convention of agreed-upon standard terms for the description.
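One way to picture the split between a schema of keys and a convention of values is a small validator. The following Python sketch is an assumption-laden illustration, not part of any published specification: the key `vertical_coordinate_descriptor`, the free-text key `shorthand`, and the helper names are all hypothetical; only `horizontal_grid_descriptor` and the grid terms come from the text above.

```python
# Schema: the keys that may be described. Convention: the agreed values per key.
# 'vertical_coordinate_descriptor' is a hypothetical key name.
CONTROLLED_VOCABULARY = {
    "horizontal_grid_descriptor": {
        "gaussian_grid",
        "tripolar_grid",
        "cubed_sphere_grid",
    },
    "vertical_coordinate_descriptor": {
        "sigma_coordinate",
    },
}

# Keys whose values stay free strings (private shorthand such as "T106L31").
FREE_TEXT_KEYS = {"shorthand"}


def validate(description):
    """Return a list of problems; an empty list means the description conforms."""
    problems = []
    for key, value in description.items():
        if key in FREE_TEXT_KEYS:
            continue
        if key not in CONTROLLED_VOCABULARY:
            problems.append(f"unknown key: {key}")
        elif value not in CONTROLLED_VOCABULARY[key]:
            problems.append(f"unknown value for {key}: {value}")
    return problems


def register_value(key, value):
    """Extend the convention with a new agreed term for an existing key."""
    if key not in CONTROLLED_VOCABULARY:
        raise KeyError(f"unknown key: {key}")
    CONTROLLED_VOCABULARY[key].add(value)


if __name__ == "__main__":
    grid = {
        "horizontal_grid_descriptor": "gaussian_grid",
        "vertical_coordinate_descriptor": "sigma_coordinate",
        "shorthand": "T106L31",
    }
    print(validate(grid) or "description conforms")
```

Extensibility here amounts to adding a term through `register_value` once the community has agreed on it, leaving the set of keys untouched.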

#### Second use case

How much further would we like to take the idea of controlled vocabularies and standardization? As it turns out, quite a lot. For many applications, a general description of the grid (such as `tripolar_grid`) is not enough; actual geospatial information is needed as well: grid point locations, bounding boxes, cell edges, and so on. This is the information that allows you to perform operations such as computing volume integrals.

The CF Conventions permit the description of grids to some extent: certain operations can find necessary information in an `attribute` of a variable describing some aspect of the data. For instance, the taking of global integrals (volumetric, or even 4D: space and time) can take advantage of a `cell_methods` attribute of a variable, that might tell you whether the data represents point or instantaneous data, or instead a finite volume time average, and if the latter, what the bounds of the cell are, for which this value is representative.
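To illustrate why cell bounds matter for integrals, here is a minimal Python sketch that integrates a cell-mean field over the sphere using only latitude and longitude bounds. It assumes a regular latitude-longitude grid on a perfect sphere and a field of cell averages; the function names are hypothetical, and the bounds are built in memory rather than read from a file (in a CF dataset they would come from the boundary variable named by a coordinate's `bounds` attribute).

```python
import math

EARTH_RADIUS_M = 6.371e6  # assumed mean Earth radius in metres


def regular_bounds(n_cells, start_deg, stop_deg):
    """Return n_cells contiguous (lower, upper) bounds spanning [start, stop]."""
    step = (stop_deg - start_deg) / n_cells
    return [(start_deg + k * step, start_deg + (k + 1) * step)
            for k in range(n_cells)]


def cell_area(lat_bounds_deg, lon_bounds_deg, radius=EARTH_RADIUS_M):
    """Area of a latitude-longitude cell on a sphere, from its bounds."""
    lat_s, lat_n = (math.radians(v) for v in lat_bounds_deg)
    lon_w, lon_e = (math.radians(v) for v in lon_bounds_deg)
    return radius ** 2 * (lon_e - lon_w) * (math.sin(lat_n) - math.sin(lat_s))


def area_integral(field, lat_bounds, lon_bounds, radius=EARTH_RADIUS_M):
    """Sum field[j][i] * cell area, treating each value as a cell mean."""
    total = 0.0
    for j, lat_b in enumerate(lat_bounds):
        for i, lon_b in enumerate(lon_bounds):
            total += field[j][i] * cell_area(lat_b, lon_b, radius)
    return total


if __name__ == "__main__":
    lat_b = regular_bounds(18, -90.0, 90.0)
    lon_b = regular_bounds(36, 0.0, 360.0)
    ones = [[1.0] * len(lon_b) for _ in lat_b]
    # Integrating a field of ones should recover the area of the sphere.
    print(area_integral(ones, lat_b, lon_b), 4 * math.pi * EARTH_RADIUS_M ** 2)
```

Without the bounds, the same data would offer only grid point locations, and the weight of each value in the integral would have to be guessed.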

The cell bounds and related information represent the beginnings, in CF, of providing a comprehensive set of grid metrics: quantities that permit taking the integral of a physical field, or performing similar analytic operations on it. A complete set of grid metrics would permit operations not currently possible on anonymous netCDF data. By anonymous data, we mean

In addition, the CF Conventions also provide a `grid_mapping` attribute, that adds some information permitting

created by v. balaji (balajiprinceton.edu) in emacs using the emacs-muse mode.