


General design specification

This chapter describes a design specification for FMS. It first lays out the general design principles which have informed our choices in code design, in FMS:DesignPrinciples. The different elements that constitute FMS models are laid out in FMS:Elements. The standards followed by FMS are described in subsequent sections. These include a section on the conventions we employ, a section on programming practice, a section on the representation of physical information and a section on programming style. General documentation standards are then described, also with sections on conventions, practice and style. Finally, in FMS:Organization we describe the organization of the FMS, and how it is delivered as a product to the user/developer. This includes details on the organization of the source, version control, compliance verification and review, compilation and data processing requirements, test scripts, and guidelines for interaction with the FMS Development Team.


FMS design principles

The construction of FMS is guided by several principles of design, outlined here.

The principles outlined here - modularity, portability, flexibility, extensibility and community - are in our view the vital elements of a successful distributed development model. The conventions and standards described in the following sections all embody these principles.


Elements of FMS

The design of FMS is principally aimed at the construction of coupled climate models running as a single executable on vector and parallel high-performance computers. The FMS source tree is divided into three principal sections:


Component models

Component models are models for the atmosphere, ice (or ocean surface), land surface, and ocean subsystems. Any model conforming to FMS practice can be used as a component model. FMS permits each component model to be run either solo or coupled.

Each component model comes with:

  1. a solo driver, a program to run the model standalone, if appropriate;
  2. a coupled model driver, a routine for communicating with the coupler layer described below;
  3. a directory of physical parameterizations of unresolved phenomena. Different choices of parameterization of the same process (e.g. moist convection in the atmosphere) must use an identical interface;
  4. an unspecified number of component model cores, each in its own subdirectory. These are choices of a core representation of the component model's role in the coupled system. One of these cores, in conjunction with the items above, provides a comprehensive representation of that climate model component in a coupled system for a particular experiment. These could be dynamical cores (e.g. the B-grid or spectral atmosphere), simplified representations of a model component (e.g. a mixed-layer or slab ocean), or even routines that merely read in a dataset (e.g. an AMIP dataset for sea ice). Different choices of model for the same component must use an identical interface.

Coupler

The coupler consists of the main program driving a coupled model, as well as the exchange grid software for communicating data between component models, which can be on independent grids. Component models communicate only with the coupler, which mediates all interactions between them.


Shared utilities

These consist of fairly general-purpose utility routines that are common to the component models and the coupler layer. These include:


Organization of FMS


The FMS source

CVS is the software used for FMS version control. The FMS source is available under CVS as described in the FMS distribution webpage. The directory tree reflects the modeling system structure described in FMS:Elements.


The FMS executable

An FMS experiment is an instantiation of a source code subset used to run a climate model simulation, solo or coupled. In either case, an experiment consists of a single executable. This executable will call component models as routines. On a parallel system, component models may run serially on the same processor set, or concurrently on independent sets of processors.

The executable must be configurable at runtime as much as possible. Features may be frozen at compile time if a clear performance advantage is demonstrable. In particular, the choice of model size, grid, and domain decomposition must be runtime-configurable.


Compilation

FMS is designed to be written in a high-level abstract language, with the current choice being f90 (see FMS:Lang). The source is split up among many files, and will contain many inter-procedural dependencies. Compilation is potentially slow, and in the case of f90, must be performed in a certain order, following a hierarchy of use statements. The use of Makefiles is thus strongly recommended.

The mkmf utility supplied with FMS performs source file dependency analysis, with particular attention to f90, and will generate Makefiles for the task at hand.


FMS coding conventions


Language

The FMS code is currently written in Fortran 90 [Metcalf and Reid (1999)]. f90 has many of the high-level features required to satisfy the design principles, while retaining adequate numerical performance.

The programming standards below in FMS:ProgrammingStandards all are written specifically for f90. Should code in other languages become part of FMS, these standards will be appropriately extended.

  1. All elements of the ANSI f90 standard are permitted, with a few listed exceptions whose use is discouraged or prohibited. These are enumerated below in FMS:Dontuse.
  2. Language extensions are severely restricted. They may be used in limited fashion, provided a pressing reason exists (e.g. a major performance enhancement using a particular proprietary software system), and an alternate formulation is provided for compiling environments that do not permit the extension.
  3. The language of FMS may change in the future, to Fortran 95 or Fortran 2000, or any other, after review.


Preprocessing

FMS uses preprocessor directives based on cpp. The use is intended for language extensions, and in some circumstances, it is used to generate module procedures under a generic interface for variables of different type, kind and rank (thus circumventing f90's strict typing), while maintaining a single copy of the source.

The use of preprocessor directives in FMS is permitted under the following conditions:

  1. Where language extensions are used (see FMS:Lang), cpp #ifdef statements must be used to shield lines from compilers that may not recognize them.
  2. Use is restricted to the builtin preprocessor of the f90 compiler (based on cpp), and cannot be based on external preprocessors such as m4. This condition may be relaxed on platforms where the builtin preprocessor proves to be inadequate.
  3. Use is restricted to short code sections (a useful rule of thumb is that an #ifdef and the matching #endif must both be visible on a single 80x24 editor window).
  4. Owing to restrictions in certain compilers, preprocessor variable names may not exceed 31 characters.
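Conditions 1 and 3 can be illustrated with a minimal sketch. The macro name use_vendor_sum and the routine vendor_sum are hypothetical, invented only for this example; the point is that the extension is shielded by a short #ifdef block and that a standard-conforming alternative is always present. The file is assumed to be saved with a .F90 suffix so that the compiler's builtin preprocessor is run.

```fortran
! Save as ext_demo.F90: the uppercase suffix asks most compilers to preprocess first.
module ext_demo_mod
  implicit none
  private
  public :: array_total

contains

  function array_total(a) result(total)
    real, intent(in) :: a(:)
    real :: total
#ifdef use_vendor_sum
    ! Hypothetical vendor routine: hidden from compilers unless the macro is defined.
    real, external :: vendor_sum
    total = vendor_sum(a, size(a))
#else
    ! Standard-conforming alternate formulation, used everywhere else.
    total = sum(a)
#endif
  end function array_total

end module ext_demo_mod

program ext_demo
  use ext_demo_mod, only: array_total
  implicit none
  print *, 'total =', array_total((/ 1.0, 2.0, 3.0, 4.0 /))
end program ext_demo
```

Compiled without the macro defined, the program uses the intrinsic sum() and needs no vendor library. The #ifdef and its matching #endif are only a few lines apart, and the macro name is well under the 31-character limit.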


Code units

FMS source is divided into software modules. A software module consists of one of the following:

  1. An f90 module;
  2. An f90 program unit;
  3. A group of f90 modules constituting a package. A package is a software unit that has been separated into multiple f90 modules and source files for convenience, but intended to be used through a single interface, with unified documentation, a single set of public interfaces, a single I/O point, and so on. Examples of packages in FMS include the spectral transform package (which includes a separate source file for the Legendre transform code) and the diagnostics manager, which has been divided for convenience into multiple files. Direct use of the f90 modules within a package is discouraged, as the individual modules may not adhere to the standards specified here. Packages are identified by the f90 module at the head of the package tree.

    f90 modules and source files which are part of a package will be explicitly identified as such, both in the source and in the documentation.

Subsequent discussion only refers to software modules. The manual will explicitly identify items that specifically refer to f90 modules.

  1. A module is responsible for its own I/O, including diagnostics, restarts, and input namelists.
  2. A module has a well-defined set of public interfaces, including its own procedural interfaces, file I/O interfaces, and public derived types.

Each source code file defines a single program or f90 module.

The general coding standard for a software module is described below in FMS:ProgrammingStandards.


Filenames

The basename for the f90 module module_name is module_name. (Note that the module name extension _mod is omitted from the basename). All filenames associated with this module use this basename. The basename for a package is the name at the head of the package tree.

  1. The source file for module_name is module_name.f90 (or .F90 if it contains preprocessor directives).
  2. Compilers produce object code for each source file, usually with a .o extension (module_name.o). During linking, it is required that each object file have a unique name. The module_name must be carefully chosen to prevent name collisions. Extremely generic names must be avoided. The recommended practice is to use suitable prefixes identifying the package to which a file belongs (such as mpp_ or diag_).
  3. The namelist file, if any, associated with module_name is module_name.nml. The namelist itself is called module_name_nml. Any namelist may also appear as an entry in the file input.nml, the general namelist file.
  4. The restart file, if any, associated with module_name is module_name.res. If more than one restart file is present, a unique number is appended, thus module_name#.res. If the restart is written in netCDF, the extension is .res.nc.
  5. An ASCII text output file has the extension .out.
  6. A raw binary output file has the extension .data.
  7. The documentation associated with instructions for use of module_name is preferably formatted for web access, and is named module_name.html. Additional (detailed) technical documentation may also be present in other formats, with the basename module_name.technotes and a standard file extension (e.g. module_name.ps for a PostScript file, module_name.pdf for PDF). PDF is recommended since PDF files are now indexed by many web search engines.
  8. If the documentation was generated from LaTeX source, the file module_name.tex may also be distributed. The use of non-standard LaTeX packages is discouraged.
  9. Distributed datasets are datasets where each processor has written its portion of some global data to a separate file, intended for later assembly offline. These should be identified by the 4-digit processor ID following the standard extensions described here (e.g. module_name.nc.####) so it is evident that this is an incomplete file.

Files are sorted in subdirectories below the working directory. The convention calls for input restart files to be read from the INPUT/ directory, the output restarts to be written to the RESTART/ directory, and input datasets and namelists to be in the DATA/ directory. Documentation for a module will reside in the same directory as its source code.


Binary data formats

The preferred format for binary data in FMS is netCDF, a self-describing dataset format widely used in the climate modeling community. netCDF follows the IEEE standards for binary data representation. We currently follow the COARDS convention for netCDF metadata. We anticipate that very soon we will adopt the CF convention currently under final review. The CF convention is fully backward-compatible with COARDS.

The conventional extension for netCDF files from FMS is .nc.


Indices

By convention, spatial indices (x,y,z) should use indices (i,j,k).


Programming Standards


Scope

Each module in FMS must have private scope by default. Each public interface therefore needs to be explicitly published.
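A minimal sketch of this rule follows. The module counter_demo_mod and its contents are hypothetical names chosen for illustration; the pattern is a bare private statement in the module header followed by an explicit public list.

```fortran
module counter_demo_mod
  implicit none
  private                                   ! everything is hidden unless published below
  public :: counter_increment, counter_value

  integer, save :: ncalls = 0               ! module state, invisible to users of the module

contains

  subroutine counter_increment()
    ncalls = ncalls + 1
  end subroutine counter_increment

  function counter_value() result(n)
    integer :: n
    n = ncalls
  end function counter_value

end module counter_demo_mod

program scope_demo
  use counter_demo_mod, only: counter_increment, counter_value
  implicit none
  call counter_increment()
  call counter_increment()
  print *, 'number of calls =', counter_value()
end program scope_demo
```

Because ncalls is private, a program that tried to import it by name would fail to compile; the only access is through the two published procedures.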


Typing

The use of implicit typing is forbidden. Every module must contain the line:

implicit none

in the module header, and every variable must be explicitly declared.

Variables are generally assumed to be of default KIND. There may sometimes be reason to specify the KIND of a variable:

  1. If KIND must be specified for reasons of precision, the f90 intrinsics SELECTED_REAL_KIND and SELECTED_INT_KIND must be used.
  2. If KIND must be specified in order to control the storage size (bytelength) of a variable (typically in communication and I/O code) it must be done using the parameters r8_kind, r4_kind, i8_kind, i4_kind and i2_kind, supplied by the module platform_mod, which sets various values that are specific to the computing platform. The platform module sets these values to the appropriate KIND values for FP and integer variables of the required bytelength.
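The first case can be sketched with standard intrinsics alone. The module name kinds_demo_mod and the parameter names dp and long_int are assumptions made for this example; the second case depends on platform_mod from FMS and is therefore not reproduced here.

```fortran
module kinds_demo_mod
  implicit none
  private
  public :: dp, long_int

  ! A real kind with at least 12 significant decimal digits and exponent range 100.
  integer, parameter :: dp = selected_real_kind(12, 100)
  ! An integer kind able to hold values of at least 10 decimal digits.
  integer, parameter :: long_int = selected_int_kind(10)

end module kinds_demo_mod

program kinds_demo
  use kinds_demo_mod, only: dp, long_int
  implicit none
  real(kind=dp) :: x
  integer(kind=long_int) :: n

  x = 1.0_dp/3.0_dp
  n = 3000000000_long_int                  ! too large for a 32-bit integer
  print *, 'decimal precision of x:', precision(x)
  print *, 'decimal range of n:    ', range(n)
  print *, x, n
end program kinds_demo
```

The request is stated in terms of required precision, not bytes, so the same source selects a suitable kind on platforms whose kind numbers differ.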


Character variables

There are a few restrictions on the length of a character variable:

  1. Character variables that are arguments to routines must be declared with (len=*). It has been observed that compilers are inconsistent in their "padding" practices, and the standard is silent on the subject.
  2. It is recommended that other character variables be declared with a length that is a multiple of 4, or preferably 8. This is a requirement for variables that are components of derived types, since it has been observed that without this restriction, word alignment faults are occasionally generated.
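Both restrictions can be seen in a short sketch. The type field_info and the routine print_label are hypothetical names used only for illustration.

```fortran
module label_demo_mod
  implicit none
  private
  public :: field_info, print_label

  type field_info
     character(len=32) :: name             ! component length is a multiple of 8
     character(len=16) :: units            ! component length is a multiple of 8
  end type field_info

contains

  subroutine print_label(text)
    character(len=*), intent(in) :: text   ! assumed length: takes the caller's length
    print *, trim(text), ' (declared length', len(text), ')'
  end subroutine print_label

end module label_demo_mod

program label_demo
  use label_demo_mod, only: field_info, print_label
  implicit none
  type(field_info) :: f

  f%name  = 'temperature'
  f%units = 'K'
  call print_label(f%name)                 ! a 32-character variable
  call print_label('short literal')        ! a 13-character literal
end program label_demo
```

The same routine accepts both actual arguments without any padding or truncation decisions being left to the compiler.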


Arguments

The intent of arguments to subroutines and functions must be explicitly specified.


Arithmetic

FMS requires the use of a default real variable kind that is equivalent to IEEE 64-bit floating-point arithmetic.


Constants

Constants shared across packages must never be hardcoded: instead mnemonically useful names are required. This applies to physical constants such as the universal gas constant, gravity, and so on, but also for flags used to select code options. In particular, this coding construct:

subroutine advection(flag)
integer, intent(in) :: flag
...
if( flag.EQ.1 )then
    call upwind_advection( ... )
else if( flag.EQ.2 )then
    call smolar_advection( ... )
...
endif
end subroutine advection
...
call advection(1)

is forbidden. This should instead be written as:

integer, parameter :: UPWIND=1, SMOLAR=2
...
subroutine advection(flag)
integer, intent(in) :: flag
...
if( flag.EQ.UPWIND )then
    call upwind_advection( ... )
else if( flag.EQ.SMOLAR )then
    call smolar_advection( ... )
...
endif
end subroutine advection
...
call advection(UPWIND)


Intrinsics

The f90 language provides a number of intrinsic functions for performing common operations. The use of the standard intrinsics is generally encouraged. The following conditions apply:

  1. The generic form of the intrinsic (e.g. max()) must be used rather than a specific one (e.g. dmax1()). This keeps the code flexible to later changes of type.
  2. Many of the intrinsic array operations (e.g. reshape(), matmul()) have been found to be poorly optimized, since they have to be perfectly general. These must be used with care in code regions that are critical for performance.
  3. Several older intrinsic names have been superseded, and the current names are preferred (e.g. modulo() instead of mod(), real() instead of float()). Note that modulo() and mod() return different results when their arguments differ in sign, so this substitution is not purely mechanical.
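A small sketch of these recommendations, using only standard intrinsics (the program and variable names are illustrative):

```fortran
program intrinsics_demo
  implicit none
  integer, parameter :: dp = selected_real_kind(12, 100)
  real          :: a, b
  real(kind=dp) :: c, d
  integer       :: n

  a = 1.5
  b = 2.5
  c = 1.5_dp
  d = 2.5_dp
  n = 7

  ! The generic name resolves on the argument type, so it survives a change of kind.
  print *, max(a, b), max(c, d)

  ! The specific name amax1 accepts only default real arguments; it would have
  ! to be renamed if a and b were later promoted to a different kind.
  print *, amax1(a, b)

  ! real() is the generic conversion and can also target a chosen kind.
  print *, real(n), real(n, kind=dp)

  ! mod() takes the sign of its first argument, modulo() that of its second.
  print *, mod(-1, 3), modulo(-1, 3)
end program intrinsics_demo
```

The last line prints -1 and 2, which is why replacing mod() by modulo() needs a check wherever negative arguments can occur.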


Deprecated language elements

Deprecated language elements include:

  1. common blocks. Use module global variables instead.
  2. implicit typing: every code unit must be implicit none (see FMS:Typing).
  3. STOP statements (see FMS:Error).


Error exit

Error exit in a parallel environment requires additional care for a graceful exit on all processors. The FMS standard requires that:

  1. the STOP statement not be used anywhere, including for the scheduled exit, since this may cause one processor to exit, and the others to hang.
  2. the exit print an adequate account of its reasons, ID number of the processor where the error occurred, and a call stack traceback if one is accessible, to stderr.
  3. the error exit return a non-zero status to the operating system, so that job scripts are made aware of a failure.

It is strongly recommended that error exits be made through the mpp_error interface, which satisfies all of these conditions.


The module statement

  1. The module name must be an unambiguous description of the module's function, with a _mod extension.
  2. The module statement must appear on the same line as the module name, i.e., do not use:

    module &
      module_name
    

    This is to be consistent with the dependency analysis performed by mkmf outlined in FMS:Compilation.


use statements

  1. The use statement must appear on the same line as the module name, i.e., do not use:

    use &
      module_name
    

    This is to be consistent with the dependency analysis performed by mkmf outlined in FMS:Compilation.

  2. The use, only: clause is required so that all imported elements are explicitly declared.
  3. Variables imported by a use statement must not be modified by the importing module.
  4. Modules cannot publish variables and interfaces imported from another module. Thus, each public element of a module is only available through that module. This does not apply to modules in a package, where the package interface may provide all the required interfaces.
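The rules above can be sketched with two small modules. All names here (constants_demo_mod, hydro_demo_mod, weight) and the numerical values are illustrative assumptions and are not taken from the FMS source.

```fortran
module constants_demo_mod
  implicit none
  private
  public :: GRAV, RDGAS

  real, parameter :: GRAV  = 9.80          ! illustrative value, m/s^2
  real, parameter :: RDGAS = 287.04        ! illustrative value, J/kg/K

end module constants_demo_mod

module hydro_demo_mod
  ! Only GRAV is imported; RDGAS stays out of this module's namespace.
  use constants_demo_mod, only: GRAV
  implicit none
  private
  public :: weight                         ! GRAV is deliberately not re-published

contains

  function weight(mass) result(w)
    real, intent(in) :: mass
    real :: w
    w = mass*GRAV                          ! imported entity is read, never modified
  end function weight

end module hydro_demo_mod

program use_demo
  use hydro_demo_mod,     only: weight
  use constants_demo_mod, only: GRAV       ! obtained from the module that owns it
  implicit none
  print *, weight(2.0), GRAV
end program use_demo
```

Because hydro_demo_mod does not re-export GRAV, any code that needs the constant must name constants_demo_mod directly, which keeps the origin of each public element unambiguous.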


Version identifier

Since the FMS is expected to be in constant evolution, each revision being used must have a unique identifier. We use CVS keywords for this purpose. Each module must contain the following lines:

character(len=128) :: version = '$Id$'
character(len=128) :: tagname = '$Name$'

The first entry returns a unique identifier for a particular revision of the source file. The second entry returns the tag that was used to check out the code from the CVS repository. The author is expected to make these entries exactly as shown prior to the first import of the code into the repository. Subsequently, CVS will expand the keywords and keep the names current.

Additional information can be included in the version and tagname strings if desired. In particular, if you are compiling using a file that has been modified from the repository version, this fact should be signalled by adding a string such as "modified" to the version string.


The logfile

FMS maintains a logfile logfile.out that can be used for an exact reconstruction of the source and inputs used in a run. Each module must make an entry in the logfile on initialization. The entry includes the revision information from FMS:Version, the contents of namelists, and identifiers for input files used. See FMS:Init.


Model fields

  1. A field is a function of space representing the instantaneous state of a model field.
  2. A field is represented by a floating-point array, either declared as such or as a component of a type.
  3. Arrays containing fields may be of higher rank, with the extra dimensions representing, say, tracer number, or timestep. These dimensions must follow the spatial dimensions.

Memory management

  1. The FMS is runtime-configurable, in that all the work arrays are dynamically allocated. Also, the processor count and domain decomposition must be specifiable at runtime.
  2. All fields must be allocated on the data domain of the associated decomposition. In particular, the allocation of 3D fields on the global domain is prohibited. This standardization permits all or most of the allocation to be done at initialization, and reduces the use of assumed-size arrays.
  3. The considerations related to domain decomposition do not apply to modules that are entirely column-oriented and have no horizontal dependencies.
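The allocation rules, together with the dimension ordering required for model fields, can be sketched as follows. The bounds isd:ied and jsd:jed stand in for the local data domain (compute domain plus halo); in FMS they would be obtained from the domain decomposition interfaces, whereas here they are passed as plain integers, and all names are hypothetical.

```fortran
module tracer_demo_mod
  implicit none
  private
  public :: tracer_demo_alloc, tracer_demo_shape

  ! Spatial dimensions (i,j,k) come first; the tracer index follows them.
  real, allocatable, save :: tracer(:,:,:,:)

contains

  subroutine tracer_demo_alloc(isd, ied, jsd, jed, nk, ntracers)
    integer, intent(in) :: isd, ied, jsd, jed   ! local data domain bounds
    integer, intent(in) :: nk, ntracers         ! sizes known only at runtime
    if (allocated(tracer)) return
    allocate(tracer(isd:ied, jsd:jed, nk, ntracers))
    tracer = 0.0
  end subroutine tracer_demo_alloc

  function tracer_demo_shape() result(s)
    integer :: s(4)
    s = 0
    if (allocated(tracer)) s = shape(tracer)
  end function tracer_demo_shape

end module tracer_demo_mod

program memory_demo
  use tracer_demo_mod, only: tracer_demo_alloc, tracer_demo_shape
  implicit none
  ! A 10 x 6 compute domain with a 2-point halo gives data bounds -1:12 and -1:8.
  call tracer_demo_alloc(-1, 12, -1, 8, 20, 3)
  print *, tracer_demo_shape()
end program memory_demo
```

Only the local data domain is allocated, never the global one, and the allocation happens once at initialization so that later procedures can use the array with its declared shape.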


Parallelism

The parallelism discussed here refers to the message-passing between nodes on a cluster. If in addition the user has access to on-node shared memory parallelism, this can easily be applied on top of the existing parallelism interfaces in thread-safe regions of code (ThreadSafety).

  1. All parallel processing is done through the mpp_mod, mpp_domains_mod, and mpp_io_mod interfaces.
  2. Domain decomposition is generally in the horizontal only. For the logically rectilinear grids (see FMS:Hgrid) under consideration, most inter-processor communication can be formulated as halo updates and data transposes, both of which can be handled by the mpp_update_domains procedure. It is anticipated that direct use of mpp_transmit should rarely be necessary.

The parallel processing modules are documented in MPP.

The MPP layers are designed to work on single processors with negligible overhead.


I/O

I/O must flow through the standard I/O packages provided by FMS (the diagnostics manager, the FMS I/O manager, or the MPP I/O layer underlying these). In particular, the direct use of Fortran I/O or other I/O APIs for opening and closing files is forbidden.

A particularly dangerous practice is using Fortran I/O units without checking if they are already in use. The use of the FMS I/O standard interfaces prevents this.

Procedural interfaces

Procedural interfaces are the public interfaces to subroutines and functions provided by a module.

  1. Procedures that perform the same function on different datatypes (e.g. of differing type, kind or rank) must have a single generic interface. When the generic public interface exists, all the module procedures that constitute it must be private.
  2. Optional arguments, if any, should follow the required arguments, so that the procedure may be called without explicit argument keywords.
  3. Argument lists should be short. If necessary, related elements of an argument list should be encapsulated in a public derived type.
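Rules 1 and 2 can be sketched with a generic interface over two ranks. The names scale_field, scale_field_1d and scale_field_2d are hypothetical and chosen only for this example.

```fortran
module scale_demo_mod
  implicit none
  private
  public :: scale_field                    ! only the generic name is published

  interface scale_field
     module procedure scale_field_1d, scale_field_2d
  end interface

contains

  subroutine scale_field_1d(field, factor, offset)
    real, intent(inout)        :: field(:)
    real, intent(in)           :: factor
    real, intent(in), optional :: offset   ! optional argument follows the required ones
    field = field*factor
    if (present(offset)) field = field + offset
  end subroutine scale_field_1d

  subroutine scale_field_2d(field, factor, offset)
    real, intent(inout)        :: field(:,:)
    real, intent(in)           :: factor
    real, intent(in), optional :: offset
    field = field*factor
    if (present(offset)) field = field + offset
  end subroutine scale_field_2d

end module scale_demo_mod

program interface_demo
  use scale_demo_mod, only: scale_field
  implicit none
  real :: a(3), b(2,2)

  a = 1.0
  b = 2.0
  call scale_field(a, 2.0)                 ! resolves to the rank-1 procedure
  call scale_field(b, 0.5, 1.0)            ! rank-2, optional argument given positionally
  print *, a
  print *, b
end program interface_demo
```

The specific procedures remain private, so callers depend only on the generic name and are unaffected if a further rank or kind is added later.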


Module constructor

Each module or package must have an initialization procedure called a constructor. It is generally called once in the course of the run.

  1. The constructor is conventionally a subroutine named module_name_init.
  2. The constructor may be responsible for allocating global storage.
  3. The constructor reads the input data, if any is required. This includes namelists, restart files and any other data files. In every case, the constructor must be capable of generating internal defaults if the input file is not present. It must terminate gracefully if it is neither capable of proceeding without an input file, nor of generating internal defaults.
  4. The constructor writes entries to the logfile logfile.out so that the model output contains a permanent record of the exact state of the code that was used to generate it (see FMS:Logfile). The FMS I/O package returns the unit number stdlog for this. Entries include the version identifier (see  FMS:Version), the namelist contents, and identifiers for any input files.
  5. There must be a private logical variable in the module generally called module_name_initialized that is initialized at runtime to .FALSE. and is set to .TRUE. by the constructor. All module procedures can subsequently check if the module has been properly initialized. If the constructor is called and this variable is .TRUE., it must exit cleanly and silently.
  6. The constructor must attempt to call the constructor of each module that it uses.

The constructor may be omitted from a module if none of the initialization functions described here (accessing input data, allocating storage, logging) are required.
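The constructor conventions can be sketched as below. This is a simplified illustration under stated assumptions: the module names are hypothetical, a print statement stands in for the logfile entry (FMS code would write to the stdlog unit supplied by the FMS I/O package), and namelist and restart reading are omitted because they must go through the FMS I/O interfaces.

```fortran
module grid_demo_mod
  implicit none
  private
  public :: grid_demo_init, grid_demo_nlevels

  logical, save :: grid_demo_initialized = .false.
  integer, save :: nlevels = 0

contains

  subroutine grid_demo_init()
    if (grid_demo_initialized) return
    nlevels = 10                           ! internal default when no input is present
    grid_demo_initialized = .true.
  end subroutine grid_demo_init

  function grid_demo_nlevels() result(n)
    integer :: n
    n = nlevels
  end function grid_demo_nlevels

end module grid_demo_mod

module column_demo_mod
  use grid_demo_mod, only: grid_demo_init, grid_demo_nlevels
  implicit none
  private
  public :: column_demo_init, column_demo_ready

  character(len=128) :: version = '$Id$'
  logical, save :: column_demo_initialized = .false.
  real, allocatable, save :: temp(:)

contains

  subroutine column_demo_init()
    if (column_demo_initialized) return    ! repeated call: exit cleanly and silently
    call grid_demo_init()                  ! construct the modules this one uses
    allocate(temp(grid_demo_nlevels()))    ! allocate module global storage
    temp = 0.0
    print *, 'column_demo_mod: ', trim(version)   ! stand-in for the logfile entry
    column_demo_initialized = .true.
  end subroutine column_demo_init

  function column_demo_ready() result(ready)
    logical :: ready
    ready = column_demo_initialized
  end function column_demo_ready

end module column_demo_mod

program init_demo
  use column_demo_mod, only: column_demo_init, column_demo_ready
  implicit none
  call column_demo_init()
  call column_demo_init()                  ! harmless second call
  print *, 'initialized:', column_demo_ready()
end program init_demo
```

The second call returns immediately, so the version line is logged once and the storage is allocated once, regardless of how many other modules invoke the constructor.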


Module destructor

It is recommended that each module have a termination procedure called a destructor. It terminates use of the module, not of the run. It is generally called once in the course of the run.

  1. The destructor is by convention a subroutine named module_name_end.
  2. The destructor is responsible for deallocating module global storage.
  3. The destructor closes any open files associated with the module.
  4. The variable module_name_initialized (see FMS:Init) is set to .FALSE. by the destructor.
  5. Restart files save the state of a module upon exit. The destructor is responsible for the writing of restarts. Restart files are written in full 64-bit precision to preserve the bitwise exact model state. These are currently being written in Fortran unformatted I/O in IEEE64 format, and will eventually also be written in netCDF.
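A matching sketch of a destructor follows, again with hypothetical names. Writing restarts and closing files are indicated only by a comment, since in FMS they must go through the standard I/O packages.

```fortran
module buffer_demo_mod
  implicit none
  private
  public :: buffer_demo_init, buffer_demo_end, buffer_demo_ready

  logical, save :: buffer_demo_initialized = .false.
  real, allocatable, save :: work(:)

contains

  subroutine buffer_demo_init(n)
    integer, intent(in) :: n
    if (buffer_demo_initialized) return
    allocate(work(n))
    work = 0.0
    buffer_demo_initialized = .true.
  end subroutine buffer_demo_init

  subroutine buffer_demo_end()
    if (.not. buffer_demo_initialized) return   ! nothing to tear down
    ! A real destructor would write restarts and close module files here.
    if (allocated(work)) deallocate(work)       ! release module global storage
    buffer_demo_initialized = .false.           ! module may be constructed again later
  end subroutine buffer_demo_end

  function buffer_demo_ready() result(ready)
    logical :: ready
    ready = buffer_demo_initialized
  end function buffer_demo_ready

end module buffer_demo_mod

program end_demo
  use buffer_demo_mod, only: buffer_demo_init, buffer_demo_end, buffer_demo_ready
  implicit none
  call buffer_demo_init(100)
  print *, 'after init:', buffer_demo_ready()
  call buffer_demo_end()
  print *, 'after end: ', buffer_demo_ready()
end program end_demo
```

Resetting the flag in the destructor means the module ends its own use without ending the run, and a later constructor call starts from a clean state.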


Coding style recommendations

Style is somewhat personal, and it would be needlessly restrictive to attempt to impose style requirements. These are recommendations which we believe will lead to pleasant interactions for developers with clear, legible and understandable code. The only style requirement we place is that of consistency: a single code unit is required to be rigorous in using the author's preferred set of stylistic attributes. It is not onerous to follow a style: modern editors have many language-aware features designed to produce a consistent, customizable style.

Style recommendations include the following:

  1. The use of free format;
  2. The use of do...end do constructs (as opposed to numbered loops as in Fortran-IV);
  3. The use of proper indentation of loops and blocks;
  4. The liberal use of blank lines to delimit code blocks;
  5. The use of comment lines of dashes or dots to delimit procedures;
  6. The use of useful descriptive names for physically meaningful variables; short conventional names for iterators (e.g. (i,j,k) for spatial grid indices);
  7. The use of uppercase for constants (parameters), lowercase for variables;
  8. The use of verbose syntax on end statements (e.g. subroutine sub...end subroutine sub rather than subroutine sub...end);
  9. The use of short comments on the same line to identify variables; longer comments in well-delineated blocks to describe what a portion of code is doing;
  10. Compact code units: long procedures should be split up if possible. 200 lines is a rule-of-thumb procedure length limit.
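As one possible reading of these recommendations, the following sketch uses free format, indented do...end do loops, blank lines and dashed comment lines as delimiters, an uppercase parameter, and verbose end statements. The routine column_mean is a hypothetical example and not part of FMS.

```fortran
module column_mean_demo_mod
  implicit none
  private
  public :: column_mean

contains

!-----------------------------------------------------------------------

  function column_mean(temp) result(mean)
    real, intent(in) :: temp(:,:,:)                   ! temperature on (i,j,k)
    real :: mean(size(temp,1), size(temp,2))          ! vertical mean on (i,j)
    integer :: i, j, k

    mean = 0.0

    ! Accumulate over levels with the first index varying fastest.
    do k = 1, size(temp,3)
       do j = 1, size(temp,2)
          do i = 1, size(temp,1)
             mean(i,j) = mean(i,j) + temp(i,j,k)
          end do
       end do
    end do

    mean = mean/real(size(temp,3))

  end function column_mean

!-----------------------------------------------------------------------

end module column_mean_demo_mod

program style_demo
  use column_mean_demo_mod, only: column_mean
  implicit none
  integer, parameter :: NLEV = 4                      ! uppercase for constants
  real :: temp(2,2,NLEV)

  temp = 3.0
  print *, column_mean(temp)
end program style_demo
```

None of these choices is mandatory; the requirement is only that a code unit applies whichever set of choices it adopts consistently.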


Module documentation standard

There are three categories of documentation:

Internal documentation
consists of comments in the code, expected to be reasonably descriptive but terse. These include:
  1. Descriptions of module and interface functionality, including brief descriptions of interface arguments.
  2. Descriptions of important internal variables.
  3. Frequent comments before sections of code.
User guide
This is external documentation distributed alongside the code. This section of the Manual describes in more detail the standards and conventions to be followed by a user guide.
Technical and scientific documentation
This contains a technical and/or mathematical description of the process or algorithm being solved and should be referenced by the user guide. It may take the form of a scientific paper. As described in FMS:Filenames, these may be in PDF or PostScript, with PDF preferred.

Each FMS module is required to have a user guide, with the exception of modules that are always invoked as part of a package (FMS:Modules).

Language

The user guide documentation is written in HTML. A standard format is required as it is automatically processed by scripts.

Sections

Sections MUST be delimited by the following HTML comments:

<!-- BEGIN section_name -->
... section text ...
<!-- END section_name -->

The section_name must be in uppercase.

The section names are given below (items marked with an asterisk are required). An HTML anchor should be placed before the section title. The form of this anchor should be:

<A NAME="SECTION NAME">

where the section name should be all capitals (see the list of section names below). The section titles should not appear between the delimiters.

HEADER *
module name, contact person, tags link (see template)

OVERVIEW *
A brief description of what the module does.

DESCRIPTION *
A more detailed description of what the module does, including links to technical/scientific documentation.

OTHER MODULES USED
A list of other modules used. It is recommended that the list also include a version number that the module was tested with.

PUBLIC INTERFACE *
A brief description of the entire public interface. This includes all public data and routines. It should also mention whether a namelist interface exists, whether data sets are needed, and how any restart data might be used. One-line summaries should suffice here.

PUBLIC DATA
A detailed description of all public data and data types (includes units, variable types, and dimensions).

PUBLIC ROUTINES
A detailed description of public routines and operators (all arguments must be described including their units, type, and dimensions).

NAMELIST
A detailed description of all namelist variables (includes units, type, and default value).

DIAGNOSTIC FIELDS
A list of possible netCDF diagnostic output fields (includes short name, units, and description).

DATA SETS
Data sets used.

CHANGE HISTORY *
Link to the CVS log history for this module.

ERROR MESSAGES
A list of all error messages in this module with a brief description and solution of the error.

REFERENCES
A list of references and/or link to technical documentation.

COMPILER SPECIFICS
A list of compiler recommendations (might include recommended compiler version or optimization options for a particular system).

PRECOMPILER OPTIONS
A list of precompiler options.

LOADER OPTIONS
A list of loader options (e.g., libraries) and/or recommendations (note that this may be machine dependent).

KNOWN BUGS
A list of known bugs.

NOTES
Developer notes.

FUTURE PLANS
Future plans.


Hyperlinks

Hyperlinks within a document or across documents follow these rules:


Embedded scripts

The use of embedded scripts is forbidden. This includes:

  1. dynamic HTML;
  2. Java and Javascript;
  3. server-side scripts, with the exception of webCVS.


Style

As for the source, we do not place stringent style requirements on documentation, except to require consistency. Issues specific to HTML files:

  1. Browsers vary widely in their adherence to standards, so the HTML standard itself is not much use. Testing on different browsers is recommended;
  2. Leave as much as possible of the choice of fonts and sizes to the reader;
  3. Use cascading stylesheets to provide a uniform look and feel across multiple HTML files. FMS stylesheets are stored in a separate directory and must be invoked in the HTML header using a fully-qualified URL.


Template

A template is provided for simple generation of conformant user guide HTML documentation. Steps to be followed using the template are:

  1. Enter an appropriate name for contact person (line 25).
  2. Change the string sample to the name of your module (lines 3, 21).
  3. Complete the URL to the WebCVS Log link (line 28) and Change History link (line 147) by appending the path in the CVS repository to the source file. For example, for mpp.F90, you would change the default:
    http://www.gfdl.gov/fms-cgi-bin/cvsweb.cgi/FMS/
    
    to:
    http://www.gfdl.gov/fms-cgi-bin/cvsweb.cgi/FMS/shared/mpp/mpp.F90
    
  4. Remove sections that you do not expect to use if they are not applicable to your source file.


Author: V. Balaji
Document last modified