My interests lie at the intersection of science and software: using information technology to facilitate the science we do today, and to enable new kinds of science not yet dreamt of.
I work at Princeton University's Cooperative Institute for Climate Science (CICS), heading a group that provides Modeling Services to the developers of Earth System models at GFDL and Princeton University.
We provide a software environment in which scientific groups can develop new physics and new algorithms concurrently, and coordinate periodically. This is the GFDL Flexible Modeling System (FMS), operational here since 1999 and the basis of our flagship models CM2.0 and CM2.1, used in the IPCC AR4 campaign.
The environment allows algorithms to be expressed on a variety of high-end computing architectures through a common, easy-to-use abstraction of the underlying platform, spanning distributed-memory, shared-memory, and vector architectures. This is the MPP layer of FMS, developed here with the advent of the Cray T3E in 1998, and still in active use and development toward new architectures and new algorithms.
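The core idea of such a layer, hiding the parallel platform behind a common domain-decomposition interface, can be illustrated with a small sketch. The function names here are illustrative only, not the actual FMS/MPP API:

```python
# A minimal sketch of 1-D domain decomposition with halo exchange,
# the kind of abstraction a layer like MPP provides over MPI or
# shared memory.  Names are illustrative, not the FMS API.

def decompose(npoints, npes):
    """Split a global index range [0, npoints) into contiguous
    per-PE compute domains, as evenly as possible."""
    base, extra = divmod(npoints, npes)
    domains, start = [], 0
    for pe in range(npes):
        size = base + (1 if pe < extra else 0)
        domains.append((start, start + size))
        start += size
    return domains

def update_halos(local, left, right):
    """Fill the one-point halos of a local array from the
    neighbouring domains (periodic boundaries).  Only halo points
    are written; compute points are only read."""
    local[0] = left[-2]    # left halo <- neighbour's last compute point
    local[-1] = right[1]   # right halo <- neighbour's first compute point

# Example: a global field of 10 points decomposed over 3 PEs.
domains = decompose(10, 3)
# Each PE holds its compute points plus two halo points.
fields = [[0] + list(range(s, e)) + [0] for s, e in domains]
for pe in range(3):
    update_halos(fields[pe], fields[pe - 1], fields[(pe + 1) % 3])
```

In a real layer the same interface would dispatch to message passing, shared-memory copies, or vectorized loops, which is precisely what lets scientific code stay portable across platforms.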
Long, large climate model runs require a complex runtime environment: sequences of long batch jobs on a supercomputer must be managed, and model output must be post-processed and archived. This is done with the FMS Runtime Environment (FRE), which encapsulates a complete model configuration in a single file. FRE handles the submission of model runs and the archival of data on the petabyte-scale internal archive at GFDL, as well as the delivery of data to our public server. A curator is a software framework that provides such end-to-end management, from model configuration to data archival, and that can answer queries on an archived dataset with complete scientific information about the model it came from. The GFDL Curator database currently provides these services for our IPCC AR4 data holdings.
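The curator idea, a single configuration record that ties runs, archives, and later queries together, can be sketched in a few lines. All names and fields below are hypothetical, not the actual FRE or GFDL Curator schema:

```python
# A toy sketch of the curator idea: one record ties a model
# configuration to its archived output, so that a query on the
# data can recover full provenance.  All names and fields are
# hypothetical, not the actual FRE or GFDL Curator schema.

class Curator:
    def __init__(self):
        self._experiments = {}   # experiment name -> configuration
        self._archive = {}       # dataset id -> experiment name

    def register(self, name, configuration):
        """Record a complete experiment configuration (in FRE this
        would come from the single experiment file)."""
        self._experiments[name] = configuration

    def archive(self, dataset_id, experiment):
        """Associate an archived dataset with the experiment
        that produced it."""
        self._archive[dataset_id] = experiment

    def provenance(self, dataset_id):
        """Query: from a dataset, recover the configuration of
        the model that produced it."""
        return self._experiments[self._archive[dataset_id]]

curator = Curator()
curator.register("CM2.1_AR4", {"model": "CM2.1", "campaign": "IPCC AR4"})
curator.archive("tas_A1B_run1", "CM2.1_AR4")
info = curator.provenance("tas_A1B_run1")   # -> {"model": "CM2.1", ...}
```

The essential design point is that provenance flows one way, from configuration to archive, so any dataset can be traced back to a complete description of the experiment that produced it.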
Many of these ideas at the border between science and software have been influential in the wider community.
Modeling frameworks for constructing coupled models out of independent model components are now prevalent across our field. I work closely with two such efforts: the Earth System Modeling Framework (ESMF), a US multi-agency effort, and PRISM, a similar European effort. I serve as one of the two leads of ESMF's Joint Specification Team, as well as on its Executive Committee. I similarly serve on PRISM's System Specification Workgroup (SSW).
I serve as a PI on the NSF-funded Earth System Curator (ESC) project, which is developing prototype curator software spanning multiple research and modeling groups.
Building curators requires metadata describing the various elements of observational and model output data: the physical information content of the data, the associated materials and methods (models, observational platforms), and the scientific experiment with which the data are associated. ESC and other projects are attempting to define a structure for holding such metadata, especially with a view to facilitating data management for large national and international modeling campaigns such as the IPCC. Metadata development in our field is done formally through the Climate and Forecast (CF) conventions, and is coordinated through an informal consortium, the Global Organization of Earth System Science Portals (GO-ESSP). I serve on the CF Conventions Committee and the GO-ESSP Steering Committee.
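For illustration, here is the kind of metadata a CF-style convention attaches to a single variable, together with a simple completeness check. The attribute names (standard_name, units, cell_methods) follow CF usage; the checking function itself is a hypothetical sketch:

```python
# A minimal sketch of CF-style variable metadata and a simple
# completeness check.  The attribute names follow the CF
# conventions; the checker is illustrative only.

REQUIRED = ("standard_name", "units")

def cf_complete(attrs):
    """Return the list of required CF attributes missing from a
    variable's metadata (empty list means complete)."""
    return [key for key in REQUIRED if key not in attrs]

# Metadata for a surface air temperature variable.
tas = {
    "standard_name": "air_temperature",  # from the CF standard-name table
    "units": "K",
    "cell_methods": "time: mean",        # describes the averaging applied
}

assert cf_complete(tas) == []
assert cf_complete({"units": "K"}) == ["standard_name"]
```

Shared, controlled vocabularies like the CF standard-name table are what make it possible for curators at different institutions to interpret each other's archives mechanically.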
Free software has been extraordinarily beneficial to us, not only in the myriad free software tools we use every day, but also in providing innovative ways of working within large, diverse, distributed communities and openly sharing our results… a lot like science, at least science as it ought to be. To the extent that the large institutions I am obliged to work with will let me, I work with free software, and aim to provide free software.