Gridspec: A standard for the description of grids used in Earth System models

3.   Representing the grid vocabulary in the CF conventions

The CF conventions have been developed in the context of the netCDF data format. The current momentum is toward using technologies such as OpenDAP to achieve format neutrality for data, and toward developing the conventions themselves into a standard through a mechanism such as OGC. As the standardization process continues, it is likely that much of the CF metadata will be stored in databases in a readily harvested form such as XML. For the purposes of this paper, however, we will continue to represent the contents of the grid standard using netCDF terminology.

The current CF standard covers data fields for single grid tiles very well. As there are considerable data archives already storing data in this form, we have tried to do the least violence to existing data representations of variables on single grid tiles. The proposed extensions serve as enhancements to CF that will allow a full expression for data discretized on grid mosaics. Features to highlight include:

The general approach is as follows. Datasets are generally archived so that one reaches a dataset through the metadata describing the experiment to which it belongs. The gridspec forms part of the experiment metadata. For Earth System models, comprehensive model metadata is under development. A gridspec describing the complete grid mosaic of an entire coupled model (shown schematically in Figure 13) will be stored under the experiment, and we expect software processing any dataset associated with the experiment to have access to the gridspec.[6]

Datasets holding physical variables will not themselves refer to the gridspec; the connection is made at the metadata level above.

Physical variables discretized on a mosaic of more than one grid tile may be stored in multiple files, where each file contains one or more grid tiles.

3.1.   Linkages between files

We propose that links be directed and acyclic: e.g., grid mosaic files point to constituent grid tile files, but the “leaf” files do not point back.
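The directed, acyclic requirement can be checked mechanically. The following is a minimal sketch in Python, assuming the links have already been read into a dictionary that maps each file to the files it points to. The function name `find_cycle` and the file names are hypothetical and not part of the proposal.

```python
def find_cycle(links):
    """Return a list of files forming a cycle, or None if the links are acyclic.

    links maps each file name to the list of files it points to.
    """
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for child in links.get(node, []):
            if state.get(child, WHITE) == GREY:
                # child is already on the current path: a link pointing back
                return path[path.index(child):] + [child]
            if state.get(child, WHITE) == WHITE:
                cycle = visit(child)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in list(links):
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical file names: a mosaic pointing to two leaf tile files.
good = {"mosaic.nc": ["tile1.nc", "tile2.nc"]}
# A leaf that points back to its mosaic violates the proposal.
bad = {"mosaic.nc": ["tile1.nc"], "tile1.nc": ["mosaic.nc"]}

print(find_cycle(good))  # None
print(find_cycle(bad))   # ['mosaic.nc', 'tile1.nc', 'mosaic.nc']
```

A depth-first search is used here because it reports the offending chain of files, which is more useful in an error message than a bare yes or no.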

Files may be described using local pathnames or remote URIs (URLs, OpenDAP IDs). File descriptors may be absolute or relative to a base address, as in HTML.

When pointing to an external file, attributes holding the timestamp and MD5 checksum[7] may optionally be specified. If the checksum of an external file does not match, it is an error. The timestamp is not definitive, but may be used to decide whether or not to trigger a checksum.
dimensions:
     string = 255;
variables:
     char base(string);
     char external(string);
     char local(string);
base =
(1)
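The verification rule for external files can be sketched as follows. This is an illustration under stated assumptions, not the proposal's own implementation: the function names are hypothetical, the timestamp is assumed to be stored as seconds since the epoch, and a matching timestamp is assumed to mean the checksum can be skipped.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_link(path, md5_checksum=None, timestamp=None):
    """Check an external file against the optional link attributes.

    Assumed policy: a matching timestamp is taken as a hint that the file is
    unchanged, so the checksum is skipped; otherwise the checksum decides.
    Returns "unchecked", "timestamp_ok" or "checksum_ok"; raises on mismatch.
    """
    if md5_checksum is None:
        return "unchecked"
    if timestamp is not None and int(os.path.getmtime(path)) == int(timestamp):
        return "timestamp_ok"
    if md5_of_file(path) != md5_checksum:
        raise ValueError("checksum mismatch for " + path)
    return "checksum_ok"


# Demonstration on a temporary stand-in for a grid tile file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"grid tile data")
tile_path = tmp.name
recorded_md5 = md5_of_file(tile_path)
recorded_time = os.path.getmtime(tile_path)

fast = verify_link(tile_path, recorded_md5, recorded_time)       # timestamp matches
full = verify_link(tile_path, recorded_md5, recorded_time - 60)  # stale timestamp
try:
    verify_link(tile_path, "0" * 32, recorded_time - 60)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
os.remove(tile_path)

print(fast, full, mismatch_detected)  # timestamp_ok checksum_ok True
```

Reading the file in chunks keeps memory use flat for large datasets, which matters because the checksum is the expensive step that the timestamp is meant to avoid.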

Encoding pathnames, checksums and timestamps carries a penalty: the system is brittle to any changes. The use of relative pathnames is recommended: this at least permits whole directory trees to be moved with little pain.
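The benefit of relative pathnames can be illustrated with a short sketch. It assumes HTML-style resolution as provided by Python's `urllib.parse.urljoin`; the helper name `resolve_link` and all addresses are hypothetical.

```python
from urllib.parse import urljoin


def resolve_link(base, link):
    """Resolve a link against a base address using HTML-style rules.

    An absolute link is returned unchanged; a relative one is appended to
    the base. As in HTML, the base should end with "/" to name a directory.
    """
    return urljoin(base, link)


# Hypothetical addresses, for illustration only.
links = ["grids/atmos_tile1.nc", "/shared/grids/ocean.nc"]

before = [resolve_link("/archive/exp1/", link) for link in links]
# After the whole tree moves, only the base changes; relative links survive.
after = [resolve_link("/newdisk/exp1/", link) for link in links]
# The same rule applies to a remote base address.
remote = resolve_link("http://example.org/dods/exp1/", links[0])

print(before)
print(after)
print(remote)
```

Only the relative link follows the tree to its new location. The absolute link still points at the old address, which is the brittleness described above.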

Summary: two new standard names, link_base_path and link_path. Optional attributes: link_spec_version, md5_checksum and timestamp.

3.2.   Grid mosaic

The grid mosaic specification is identified by a unique string name which qualifies its interior namespace. As shown schematically in Figure 13, its children can be mosaics or grid tiles. Contact regions are specified between pairs of grid tiles, using the fully qualified grid tile specification mosaic:mosaic:...:tile.
dimensions:
     nfaces = 6;
     ncontact = 12;
     string = 255;
variables:
     char mosaic(string);
     char gridfaces(nfaces,string);
     char contacts(ncontact,string);
mosaic =
(2)

Summary: a new standard name, grid_mosaic_spec. Grid mosaic specs have attributes mosaic_spec_version, children and contact_regions. Optional attributes children_links and contact_region_links may point to external files containing the specifications for the children and their contacts.
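The fully qualified form mosaic:mosaic:...:tile can be resolved by walking the tree of children. The sketch below assumes a mosaic has been read into nested dictionaries; that in-memory layout, the function name `find_tile` and the mosaic and tile names are hypothetical.

```python
def find_tile(mosaic, qualified_name):
    """Walk a nested mosaic and return the tile named mosaic:mosaic:...:tile.

    A mosaic is modelled as {"name": str, "children": [mosaic or tile, ...]};
    a tile is modelled as a dict with a "name" key and no "children" key.
    """
    parts = qualified_name.split(":")
    if parts[0] != mosaic["name"]:
        raise KeyError(qualified_name)
    node = mosaic
    for part in parts[1:]:
        children = {c["name"]: c for c in node.get("children", [])}
        if part not in children:
            raise KeyError(qualified_name)
        node = children[part]
    if "children" in node:
        raise ValueError(qualified_name + " names a mosaic, not a tile")
    return node


# Hypothetical coupled-model mosaic: the name "tile1" occurs twice, and the
# enclosing mosaic name is what tells the two tiles apart.
coupled = {
    "name": "coupled",
    "children": [
        {"name": "atmos", "children": [{"name": "tile1", "nx": 90}]},
        {"name": "ocean", "children": [{"name": "tile1", "nx": 360}]},
    ],
}

print(find_tile(coupled, "coupled:atmos:tile1")["nx"])  # 90
print(find_tile(coupled, "coupled:ocean:tile1")["nx"])  # 360
```

Because each mosaic name qualifies its interior namespace, the lookup builds the name-to-child map one level at a time instead of keeping a single global table of tile names.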

The grid_descriptor is an optional text description of the grid that uses commonly used terminology, but may not in general be a sufficient description of the field (many grids are numerically generated, and do not admit of a succinct description). Examples of grid descriptors include:

The grid descriptor could additionally contain common shorthand descriptions such as t42, or perhaps could go further toward machine processing using terms like triangular_truncation.

3.3.   Grid tile

dimensions:
     string = 255;
     nx = 90;
     ny = 90;
     nxv = 91;
     nyv = 91;
     nz = 24;
variables:
     char tile(string);
         tile:standard_name =
(3)

Horizontal vertex location specifications may be of different rank depending on their regularity or uniformity. (Note that the geo-referencing information may still be 2D even for regular coordinates.)

An irregular horizontal grid requires a 2D specification of vertex locations:
variables:
     float geolon(ny+1,nx+1);
         geolon:standard_name =
(4)
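The difference in rank can be made concrete with a short sketch. It assumes that a regular grid is one whose vertex longitudes depend only on the x index and whose vertex latitudes depend only on the y index, so that two 1D vectors suffice. The function names are hypothetical.

```python
def expand_regular(lon_1d, lat_1d):
    """Expand 1D vertex coordinates of a regular grid to 2D (ny+1, nx+1) arrays."""
    geolon = [list(lon_1d) for _ in lat_1d]
    geolat = [[lat] * len(lon_1d) for lat in lat_1d]
    return geolon, geolat


def is_regular(geolon, geolat):
    """True if the 2D arrays carry no more information than two 1D vectors."""
    same_lon_rows = all(row == geolon[0] for row in geolon)
    const_lat_rows = all(all(v == row[0] for v in row) for row in geolat)
    return same_lon_rows and const_lat_rows


# A coarse 4 x 2 cell grid has 5 x 3 vertices.
lon_vertices = [0.0, 90.0, 180.0, 270.0, 360.0]
lat_vertices = [-90.0, 0.0, 90.0]
geolon, geolat = expand_regular(lon_vertices, lat_vertices)

print(len(geolat), len(geolon[0]))   # 3 5
print(is_regular(geolon, geolat))    # True

# Displacing a single vertex makes the full 2D specification necessary.
geolon[1][2] = 185.0
print(is_regular(geolon, geolat))    # False
```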

The vertical geo-mapping is expressed by reference to “standard levels”.

Summary: several new standard names to describe properties of a grid: distances, angles, areas and volumes. The arc type is a new variable with no equivalent in CF. Currently, we are considering values of great_circle and small_circle, but others may be imagined. The small_circle arc type requires the specification of a pole.
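The two arc types give different lengths for the same pair of end points, which is why the arc type must be recorded. The sketch below compares them for a cell edge lying along a parallel of latitude. It assumes a spherical Earth with a mean radius of 6371 km and, for the small circle, a pole at the geographic pole; both the radius and the function names are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; not specified in the text


def great_circle_length(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_KM):
    """Length of the great-circle arc between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    dlat = p2 - p1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def parallel_arc_length(lon1, lon2, lat, radius=EARTH_RADIUS_KM):
    """Length of the small-circle arc along a parallel of latitude.

    This is the small circle whose pole is the geographic pole.
    """
    dlon = abs(math.radians(lon2 - lon1))
    return radius * math.cos(math.radians(lat)) * dlon


# An edge from (0E, 60N) to (90E, 60N), measured both ways.
great = great_circle_length(0.0, 60.0, 90.0, 60.0)
small = parallel_arc_length(0.0, 90.0, 60.0)
print(round(great, 1), round(small, 1))
print(great < small)  # True: the great circle is the shorter path
```

On the equator the parallel is itself a great circle, so the two lengths coincide there.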

The grid tile spec has attributes geometry (Section 2.1), projection (Section 2.3: a value of none indicates no projection) and discretization (Section 2.5). The optional attributes regular, conformal and uniform may be used to shrink the grid tile spec.

3.4.   Unstructured grid tile

The unstructured grid tile is a UTG (unstructured triangular grid). The current specification follows an actual example used by the FVCOM model (missing ref: Gross; Signell). While in the LRG (logically rectangular grid) example above the number of vertices can be deduced from the number of cells, it cannot in the unstructured case.
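The contrast in vertex counts can be illustrated with a short sketch. The logically rectangular case follows directly from the cell counts. For the triangular case the sketch uses Euler's formula for a simply connected planar triangulation, which is a simplifying assumption introduced here and not part of the specification. The function names are hypothetical.

```python
def lrg_vertex_counts(nx, ny):
    """Vertex counts of a logically rectangular grid with nx by ny cells."""
    return nx + 1, ny + 1


def triangulation_vertex_count(ncells, nboundary_edges):
    """Vertex count of a simply connected planar triangulation.

    From Euler's formula V - E + F = 1 and the edge count 3F = 2E - B,
    where F is the number of triangles and B the number of boundary edges.
    """
    total = ncells + nboundary_edges
    if total % 2:
        raise ValueError("no such triangulation: ncells + boundary edges is odd")
    return 1 + total // 2


print(lrg_vertex_counts(90, 90))          # (91, 91)

# Two meshes of 4 triangles each: a square cut by both diagonals (4 boundary
# edges) and a strip of triangles (6 boundary edges).
print(triangulation_vertex_count(4, 4))   # 5
print(triangulation_vertex_count(4, 6))   # 6
```

Two meshes with the same number of cells can therefore have different numbers of vertices, so an unstructured tile has to declare both dimensions explicitly.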

Each cell is modeled as triangular. Distances, arc types, angles and areas are cell properties. Additional elements of the UTG specification are variables with standard names of vertex_index and neighbor_cell_index to contain the indices of a cell’s 3 vertices and its 3 neighbors, respectively. The ordering of line segments, neighbors, etc., follows the ordering of vertices.
dimensions:
     string = 255;
     node = 871;
     nele = 1620;
variables:
     char tile(string);
         tile:standard_name =
(5)

variables:
     float geolon(node);
         geolon:standard_name =
(6)
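The relationship between vertex_index and neighbor_cell_index can be sketched by deriving one from the other. The details below are assumptions made for illustration: indices are 0-based, a missing neighbor on the domain boundary is marked -1, and neighbor k of a cell is taken to lie across the edge joining its vertex k to its next vertex. The convention actually used by a given model may differ.

```python
def neighbor_cell_index(vertex_index, missing=-1):
    """Derive each triangle's 3 neighbors from its 3 vertex indices.

    Assumed ordering: neighbor k of a cell lies across the edge that joins
    its vertex k to its vertex (k + 1) % 3.
    """
    # Map every edge (an unordered vertex pair) to the cells that use it.
    edge_to_cells = {}
    for cell, verts in enumerate(vertex_index):
        for k in range(3):
            edge = frozenset((verts[k], verts[(k + 1) % 3]))
            edge_to_cells.setdefault(edge, []).append(cell)

    neighbors = []
    for cell, verts in enumerate(vertex_index):
        row = []
        for k in range(3):
            edge = frozenset((verts[k], verts[(k + 1) % 3]))
            others = [c for c in edge_to_cells[edge] if c != cell]
            row.append(others[0] if others else missing)
        neighbors.append(row)
    return neighbors


# Two triangles sharing the edge between vertices 1 and 2.
cells = [(0, 1, 2), (2, 1, 3)]
print(neighbor_cell_index(cells))  # [[-1, 1, -1], [0, -1, -1]]
```

Storing the neighbor table in the file, as the specification does, saves every reader from repeating this search.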

3.5.   Contact regions

dimensions:
     string = 255;
variables:
     int anchor(2,2);
          anchor:standard_name =
(7)

dimensions:
     string = 255;
     ncells = 1476;
variables:
     double frac_area(2,ncells);
         frac_area:standard_name =
(8)

3.6.   Variables

Variables are held in CF-compliant files that are separate from the gridspec but can link to it following the link spec in Section 3.1. Variables on a single grid tile can follow CF-1.0, with no changes. The additional information provided by the gridspec can be linked in, as shown in this example of a U velocity component on a C grid (Figure 9).
dimensions:
     nx = 46;
     ny = 45;
variables:
     int nx_u(nx);
     int ny_u(ny);
     float u(ny,nx);
         u:standard_name =
(9)

The staggering field expresses what is implicit in the values of nx_u and ny_u, but is useful nonetheless.[8] Possible values of staggering include:

Using this information, it is possible to perform correct transformations, such as combining this field with a V velocity from another file, transforming to an A-grid, and then rotating to geographic coordinates.
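The last two steps of such a transformation can be sketched as follows. The array layout is an assumption made for illustration and is not taken from the listing above: U is assumed to sit on the west and east faces of each cell, V on the south and north faces, and the local grid angle is assumed to be available from the gridspec. The function names are hypothetical.

```python
import math


def c_to_a_grid(u_faces, v_faces):
    """Average C-grid face velocities to cell centers (A-grid).

    Assumed layout: u_faces[j][i] sits on the west face of cell (j, i) and has
    nx + 1 columns; v_faces[j][i] sits on the south face and has ny + 1 rows.
    """
    ny, nx = len(u_faces), len(v_faces[0])
    u_a = [[0.5 * (u_faces[j][i] + u_faces[j][i + 1]) for i in range(nx)]
           for j in range(ny)]
    v_a = [[0.5 * (v_faces[j][i] + v_faces[j + 1][i]) for i in range(nx)]
           for j in range(ny)]
    return u_a, v_a


def rotate_to_geographic(u, v, angle):
    """Rotate grid-aligned components to (eastward, northward).

    angle is the anticlockwise angle, in radians, from east to the grid x-axis.
    """
    c, s = math.cos(angle), math.sin(angle)
    return u * c - v * s, u * s + v * c


# One row of two cells.
u_faces = [[1.0, 3.0, 5.0]]             # ny rows, nx + 1 columns
v_faces = [[0.0, 0.0], [2.0, 4.0]]      # ny + 1 rows, nx columns
u_a, v_a = c_to_a_grid(u_faces, v_faces)
print(u_a, v_a)  # [[2.0, 4.0]] [[1.0, 2.0]]

# If the grid x-axis points due north, grid u becomes the northward component.
east, north = rotate_to_geographic(u_a[0][0], v_a[0][0], math.pi / 2)
print(round(east, 6), round(north, 6))  # -1.0 2.0
```

The rotation is applied after the averaging because both components must be co-located at the cell center before they can be treated as one vector.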

[5] The HDF5 specification, with which netCDF will merge, takes a filesystem-within-a-file approach to this problem, which by all accounts is not very efficient. The approach proposed here will allow very efficient dataset aggregation.

[6] As the gridspec is also intended for use as model input, said software might indeed be an Earth system model.

[7] MD5 checksums are standard practice. One can intentionally generate, by bit exchanges, erroneous files that give the same MD5 checksum, but the probability of this occurring by coincidence is vanishingly small. MD5 checksums have been measured to take about a minute for a 10Gb dataset.

[8] In general, there may be a lot of redundancy in the gridspec, which poses a consistency problem. Consistency checking and validation are usually relatively simple, as in the instance here.


TeX4HT created by v. balaji (balaji@princeton.edu) in emacs using Tex4HT.