FREPlatform.pm?

This page outlines the elements of a platform abstraction. The idea is to provide a single module inside which site- and platform-specific operations can be contained. Currently these operations are scattered widely across the FRE Perl code, especially in frerun and frepp, but also in site/fre.cshrc. The principal platform we are targeting is the GFDL HPCS, but the platform module should be able to support computing at other sites as well (Oak Ridge, Berkeley).

The platform abstraction should handle interactions between FRE and the scheduler and storage.

Scheduler

The scheduler submits jobs to a batch system. (Running man qsub on the HPCS gives a good flavour of what a scheduler can do.) In particular:

  • FRE jobs must be able to submit other jobs (e.g., FREPlatform.pm could contain a batch_submit subroutine, which would generate platform-specific code).
  • FRE must be able to query and assign job resources (e.g., through a subroutine called cp_time_used). Resources include CP time, processor count, priority, and potentially disk space and memory. You might also want to direct jobs to a specific host (e.g., submit a child job to the same host as its parent). We are currently maintaining several versions of frerun, each with different hardcoded qsub flags.
  • FRE must be able to retrieve job IDs and declare dependencies between jobs (e.g., see qsub -hold_jid).
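
The requirements above can be sketched as a small translation layer. This is only an illustration in Python, not the Perl module itself: the JobRequest fields and the batch_submit_command name are hypothetical, and the mapping to -N and -l h_cpu assumes an SGE-style qsub, so the flags should be checked against the local man qsub. Only -hold_jid comes from the text above. The function builds the command line and does not submit anything.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobRequest:
    """Platform-neutral description of a batch job (all field names are hypothetical)."""
    script: str
    name: Optional[str] = None
    cp_time_seconds: Optional[int] = None
    depends_on: List[str] = field(default_factory=list)


def batch_submit_command(job: JobRequest) -> List[str]:
    """Translate a JobRequest into an argument list for an SGE-style qsub.

    Only the command is built; nothing is submitted here.
    """
    cmd = ["qsub"]
    if job.name:
        cmd += ["-N", job.name]
    if job.cp_time_seconds is not None:
        # Assumed resource name; sites differ in how CPU-time limits are expressed.
        cmd += ["-l", f"h_cpu={job.cp_time_seconds}"]
    if job.depends_on:
        # Hold this job until the listed job IDs have finished.
        cmd += ["-hold_jid", ",".join(job.depends_on)]
    cmd.append(job.script)
    return cmd


# Example: a post-processing job that waits for job 12345 (made-up ID and script name).
post = JobRequest(script="frepp_job.csh", name="postproc", depends_on=["12345"])
print(" ".join(batch_submit_command(post)))
```

Keeping the flags in one function like this is the point of the abstraction: a second site would supply its own translation, and frerun would not need to carry hardcoded qsub flags.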

Storage

As outlined in our overhaul plans, we plan to work with a 3-level storage model: in HPCS parlance we think of these levels as vftmp, ptmp and archive. vftmp is job scratch space, which may (and on the HPCS, does) vanish when the job ends; ptmp is a longer-term scratch space which is used for continuity between jobs in a job stream, but is not "permanent"; archive is for "results" which must be kept indefinitely. Users must of course be able to specify where each layer lives: for instance, someone doing a quick test whose results need not be archived may point archive to ptmp.
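
As a rough sketch of how the three layers and a user override might be represented (in Python for illustration; the class name, function name and all paths are placeholders, not real HPCS locations):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StorageLayers:
    """The three storage levels described above."""
    vftmp: str    # job scratch, may vanish when the job ends
    ptmp: str     # longer-term scratch shared by a job stream
    archive: str  # results kept indefinitely


# Placeholder site defaults; real values would come from site configuration.
SITE_DEFAULTS = StorageLayers(
    vftmp="/vftmp/user",
    ptmp="/ptmp/user",
    archive="/archive/user",
)


def layers_for_quick_test(defaults: StorageLayers) -> StorageLayers:
    """Return a copy of the layers with archive pointed at ptmp."""
    return replace(defaults, archive=defaults.ptmp)


quick = layers_for_quick_test(SITE_DEFAULTS)
print(quick.archive)
```

Because the layers are an immutable value, the override produces a new object and the site defaults are left untouched for other jobs.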

Data staging between these layers is hugely important to job efficiency and system efficiency.

Each of these layers has a different amount of storage available, and may be subject to disk quotas or inode quotas. Optimizing data transfer may mean reducing file counts (by running mppnccombine or making tar/cpio archives) and may call for different data motion commands (e.g., cp, rcp, bbcp, rsync, or platform-specific ones such as cxfscp). Many of these also have tunable parameters for bandwidth optimization, such as an internal buffer size.
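
A minimal sketch of the two ideas in this paragraph, reducing file counts and selecting a data mover, might look as follows. It is written in Python for illustration; bundle_files and transfer_command are hypothetical names, only cp and rsync are shown, and the tuning options of the other movers are deliberately left out rather than guessed.

```python
import tarfile
from pathlib import Path
from typing import List


def bundle_files(files: List[Path], tar_path: Path) -> Path:
    """Pack many small files into one uncompressed tar archive."""
    with tarfile.open(tar_path, "w") as tar:
        for f in files:
            # Store each file under its base name only.
            tar.add(f, arcname=f.name)
    return tar_path


def transfer_command(src: Path, dest: str, mover: str = "cp") -> List[str]:
    """Return the command line for the chosen data mover (not executed here)."""
    movers = {
        "cp": ["cp", "-p"],        # preserve mode and timestamps
        "rsync": ["rsync", "-a"],  # archive mode
    }
    if mover not in movers:
        raise ValueError(f"unknown data mover: {mover}")
    return movers[mover] + [str(src), dest]
```

A caller would bundle a directory of history files into one archive and then ask for the transfer command appropriate to the destination layer, so that the choice of mover lives in one table rather than in each script.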

Ideally, data transfer in and out of vftmp should be asynchronous or non-blocking, to allow the high-CP-count running job to maximize CP utilization.
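
One way to picture non-blocking staging is a background worker that copies a file while the main job carries on. The sketch below uses a Python thread pool purely as an illustration; stage_out_async is a hypothetical name, and a real implementation would more likely launch a separate transfer process or job.

```python
import shutil
from concurrent.futures import Future, ThreadPoolExecutor
from pathlib import Path

# A single background worker, so transfers are serialized and do not
# compete with each other for bandwidth.
_stager = ThreadPoolExecutor(max_workers=1)


def stage_out_async(src: Path, dest_dir: Path) -> Future:
    """Start copying src into dest_dir and return immediately.

    The returned Future can be waited on later with .result().
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    return _stager.submit(shutil.copy2, src, dest_dir / src.name)
```

The running job would call stage_out_async after writing each output file, continue computing, and call .result() on the outstanding futures only before vftmp is about to disappear.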

This is the weakest part of the current FRE: choices of where to store things, whether to combine output into single files, and so on are hardcoded all across frerun and frepp. Refactoring the code into a few storage subroutines, with flags that say what to do, would be a giant step forward.

Created by v. balaji (balaji@princeton.edu) in emacs using the emacs-muse mode.