Tue, 11 Dec 2007
FRE: hsmfiles
<div class="www">
This item is GFDL-internal only: if you are
authorized and set up to have access to cobweb, change www to cobweb
in the URL bar above.
</div>
<div class="cobweb">
FRE was originally conceived with a 2-layer storage model: there was
\$TMPDIR and there was \$ARCHIVE . The former was local disk attached to
the compute node, and lasted only for the duration of the job; the
latter was connected to tape and "deep" (linear) storage.
Depending on the filesystem and archive configuration and performance,
there may be different ways to achieve this. Over time we have
struggled with multiple optimization targets:
- minimize file transfer time
- minimize file count (i.e. inode count)
- minimize movement of "intermediate" files
- minimize redundant copies of the same data
- optimize file size for tape storage
- minimize total archive size (eliminate redundant copies)
At various times, we have privileged one or another of these
optimization targets. We have tried to achieve an "optimal" file size
(100 GB, as advised in 2004... is that still useful?) by using cpio to
make file archives. We once did mppnccombine "online" (i.e. within the
runscript) to minimize the amount of intermediate data that was
archived; later we moved it into a separate job because mppnccombine
is not parallelizable to O(100) CPUs and was taking too much time;
later still we moved it back in because file transfers were taking too
long.
For a while now we have been proposing a 3-layer storage model: the
third layer, called ptmp , is a fast scratch area for storing
intermediate files. We propose to use this to minimize archive use,
and for fast access to "intermediate" files from multiple jobs. ptmp
could be a shared filesystem (like /ptmp or /work ), or might be on
local disk (vftmp : in this case all the jobs sharing this data have to
run on the same IC node...).
Tim proposes this high-level design for a tool called hsmfiles to
manage data transfers on a 3-level storage model:
FILESYSTEMS
/vftmp 4 TB/host
/ptmp 44 TB
/work 22 TB
DCM 260 TB (/archive nearline disk = 4 x 65 TB)
files > 500 MB do not go to DCM. revise this?
FILESYSTEM PERFORMANCE
dd bs=16M of 4 GB from /dev/zero to:
/vftmp 976 MB/s
/ptmp 625 MB/s
/arch2 423 MB/s
/arch3 115 MB/s
dd bs=16M of 4 GB from /vftmp file to:
/vftmp 192 MB/s
/ptmp 166 MB/s
/arch2 147 MB/s
/arch3 110 MB/s
run on ic4 in cpuset of 4 contiguous cpus
/arch2 is nearly idle
/vftmp needs pre-allocation, bigger allocation units?
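The measurement commands were presumably along these lines (a
reconstruction from the numbers above; paths are illustrative):
dd if=/dev/zero of=/vftmp/\$USER/zero.dat bs=16M count=256   # 4 GB write test
dd if=/vftmp/\$USER/src.dat of=/ptmp/\$USER/dst.dat bs=16M    # 4 GB copy test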
HSMFILES DESIGN
storage levels
/archive, "disk", and "workdir"
"disk" is /ptmp now, could change to /archive nearline disk (DCM)
production job puts to /ptmp, post-processing job puts to /archive
/ptmp and /archive use same directory structure
hsmfiles does mkdir -p in /ptmp and /archive as needed
container types
directory, heirloom cpio -K file, gnu tar file
default = directory
non-blocking parallel copies
default for -put to directory container
in future, could extend to cpio/tar containers
hsmfiles -get -link allows reading from /ptmp
hsmfiles -get -workdir=path allows /ptmp scratch dir
in future, could leave files in /vftmp/hsmfiles
hand off to file transfer daemon
reload job makes soft request for same exec host
if host crashes, files are stranded
HSMFILES OPTIONS
"do what i mean semantics" throughout
get destination/put source is:
-workdir if specified
\$cwd, if a subdirectory of /vftmp
otherwise, \$TMPDIR
USAGE:
hsmfiles operation [options] [container_type] [container_path] [filelist]
operation
-get get from /ptmp if there, otherwise from /archive
-get=archive get from /archive only
-get=stage copy /archive files to /ptmp
-put put to /ptmp
-put=archive put to /archive
-put=migrate put /ptmp files to /archive
-wait with -put, do blocking copies
alone, wait for job's non-blocking copies to complete
-list ls -l, cpio -itv, tar -tv
options
-workdir=name workdir = subdir of \$TMPDIR
-workdir=path workdir = arbitrary pathname
-link symlink to destination
-keep don't remove source files after put
-append append to container
-strict exit if any files in filelist are missing
-dryrun just print what would be done
container_type
-dir directory (default)
-cpio heirloom cpio -K file
-tar gnu tar file
container_path
.cpio/.tar extension added/removed as needed
if omitted for -dir, use \$cwd if in /archive or /ptmp
filelist
basename[=workdirname] ... (no paths, no patterns)
if omitted and -get, get all files in container
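Some hypothetical invocations, to make the semantics concrete (paths
and filenames are illustrative; none of this is implemented yet):
hsmfiles -put /ptmp/\$USER/myexpt/history 19820101.nc 19830101.nc
    # non-blocking parallel put of two files into a directory container
hsmfiles -get=stage -cpio /archive/\$USER/myexpt/restart/19830101
    # stage 19830101.cpio from /archive to /ptmp
hsmfiles -wait
    # wait for this job's outstanding non-blocking copies to complete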
INTERNAL UTILITIES
hsmget, hsmput
run multiple hsmcp's in parallel
in cpuset = private /sge/jobid.1/hsmfiles? shared /sge/hsmfiles?
non-blocking (default) or blocking
process name or arguments includes jobid
hsmfiles -wait uses ps to check non-blocking processes
hsmcp
16 MB I/O's
cxfscp or custom?
hsmcpio, hsmtar
custom heirloom cpio and gnu tar
seek, instead of reading thru data, for -A, -t, -x
would this allow parallel invocations?
hsmtype
return container type
hsmmigrate
if /ptmp fills, migrates /ptmp to /archive
may use xfs extended attributes
RELATED USER COMMANDS
/usr/local/OSoverlay/bin/cpio
enforces blocksize = n*64KB
adds -K
/usr/local/OSoverlay/bin/tar (planned)
gnu tar with mods, for use outside hsmfiles
default blocksize = 64KB for write, 512KB for read
enforce blocksize = n*64KB
Mon, 3 Dec 2007
FRE: HPCS data transfer speeds
<div class="www">
This item is GFDL-internal only: if you are
authorized and set up to have access to cobweb, change www to cobweb
in the URL bar above.
</div>
<div class="cobweb">
FRE was originally conceived with a 2-layer storage model: there was
\$TMPDIR and there was \$ARCHIVE . The former was local disk attached to
the compute node, and lasted only for the duration of the job; the
latter was connected to tape and "deep" (linear) storage.
Depending on the filesystem and archive configuration and performance,
there may be different ways to achieve this. Over time we have
struggled with multiple optimization targets:
- minimize file transfer time
- minimize file count (i.e. inode count)
- minimize movement of "intermediate" files
- minimize redundant copies of the same data
- optimize file size for tape storage
- minimize total archive size (eliminate redundant copies)
At various times, we have privileged one or another of these
optimization targets. We have tried to achieve an "optimal" file size
(100 GB, as advised in 2004... is that still useful?) by using cpio to
make file archives. We once did mppnccombine "online" (i.e. within the
runscript) to minimize the amount of intermediate data that was
archived; later we moved it into a separate job because mppnccombine
is not parallelizable to O(100) CPUs and was taking too much time;
later still we moved it back in because file transfers were taking too
long.
For a while now we have been proposing a 3-layer storage model: the
third layer, called ptmp , is a fast scratch area for storing
intermediate files. We propose to use this to minimize archive use,
and for fast access to "intermediate" files from multiple jobs. ptmp
could be a shared filesystem (like /ptmp or /work ), or might be on
local disk (vftmp : in this case all the jobs sharing this data have to
run on the same IC node...).
There are some questions that arise:
- is it worthwhile making archive files? If it's the archive creation
and extraction itself that's taking time, perhaps it's better to
leave files as they are.
- if we are making archive files, should we continue to use cpio , or
should we use something else, e.g. tar ? (Note that we aren't using
the GNU/linux standard cpio , but a version that's been locally hacked
for performance. We are very likely moving toward multiple
computing sites: one requirement must be that any archive files
created must be portable to systems without custom tools.)
- where should intermediate files be stored? In /vftmp itself, /ptmp
or /work , or maybe /archive ? (The last option is the idea of
making a large filesystem on /archive , but marking "intermediate"
files in a way that prevents them from being written to tape.)
- what's the fastest way to transfer files to various destinations?
cp , rcp , rsync , cxfscp ? Should we make archives and stream single
large files, or stream files as they are, using the -r option that
all of the above support? (The reason I include rsync is that it
seems to be clever about interrupted or duplicated transfers, and only
transfers the needed increments.)
I've done some measurements, which may be useful here.
The script I used is /home/vb/bin/cpspeed.csh . Please feel free to use
it if you feel the urge.
The test assumes you're in the middle of a running production job, and
have a top-level directory that contains a number of files.
- There are 3 source directories, of size 3 GB (small, called
INPUT ),
25 GB (medium, called om3p25 ) and 155 GB (large, called
A19_history ).
- There are 3 destinations:
/vftmp , /work and /archive .
- There are 10 transports:
cp , rcp , rsync and cxfscp ; the same but
with the -r option; and then cpio and tar directly to the
destination. The four with the -r option have the additional
advantage that no time is needed for packing/unpacking the archive.
- the tests have been run on various nodes.
The results so far are very disappointing: no method on any host has
come within an order of magnitude of the expected bandwidth, which
should be in the single-digit GB/s range. Is there any basic
procedural flaw in the tests?
host | size | target | cp | cp -r | rcp | rcp -r | rsync | rsync -r | cxfscp | cxfscp -r | cpio | tar
ic10 | 3 GB | /vftmp | 6.50 | 7.15 | 7.57 | 7.94 | 30.91 | 44.50 | 45.20 | 33.03 | 46.87 | 41.20
ic10 | 3 GB | /work | 37.70 | 23.77 | 149.83 | 153.53 | 67.53 | 34.17 | 22.29 | 21.17 | 59.05 | 40.85
ic10 | 3 GB | /archive | 90.19 | 60.34 | 402.67 | 482.44 | 455.43 | 242.62 | 94.44 | 92.50 | 270.28 | 365.99
ic10 | 25 GB | /vftmp | 61.08 | 88.49 | 69.87 | 70.99 | 247.37 | 286.38 | 164.68 | 189.78 | 326.38 | 182.93
ic10 | 25 GB | /work | 420.17 | 138.81 | 1154.92 | 1123.40 | 756.88 | 325.64 | 118.50 | 120.73 | 205.33 | 256.84
ic10 | 25 GB | /archive | 573.23 | 722.37 | 3582.61 | 3198.38 | 1589.06 | 2143.48 | 582.37 | 694.26 | 1566.29 | 1008.74
ic10 | 166 GB | /vftmp | 3226.53 | 1557.12 | 1444.46 | 1350.96 | 2880.95 | 2738.83 | 713.86 | 565.55 | 1543.45 | 1959.81
ic10 | 166 GB | /work | 1394.47 | 1650.04 | 7537.16 | 778..55 | 3362.38 | 3048.66 | 1000.01 | 1150.76 | 1831.88 | 1953.08
ic10 | 166 GB | /archive | 2890.00 | 2564.34 | 14717.04 | 32917.24 | 20611.61 | 13848.50 | 4189.66 | 5133.96 | 13514.89 |
Wed, 14 Nov 2007
gridspec: second version of gridspec-tools released.
A second version of the gridspec tools has been released, at this
starting point. This now includes a full example of coupled model grid
creation.
Mon, 12 Nov 2007
gridspec: first version of gridspec-tools released.
A first version of the gridspec tools has been released, at this
starting point.
Tue, 6 Nov 2007
linux: dual computer, single keyboard and mouse.
Thanks to Chan Wilson, I can now use a single keyboard and mouse when
my desktop and laptop sit side-by-side. I make my desktop machine run
a VNC server that allows another machine to control it; and I let my
laptop's keyboard and mouse control it. This has to be done over the
daisy tunnel.
- Download
x11vnc for the desktop; and x2vnc for the laptop.
- Set up the
daisy tunnel to do this port-forwarding;
LocalForward 5910 vb.gfdl.noaa.gov:10001
- Create and store (encrypted) a password for this connection.
vncpasswd
x11vnc -storepasswd
- Copy this passwd (which will be in \$HOME/.vnc/passwd ) to the same
location on your laptop.
- Start the desktop VNC server listening on the
10001 port.
x11vnc -nofb -rfbport 10001 -usepw &
- Start the laptop VNC client sending on local port
5910 (the number
on the command line is the port you listed in the tunnel, minus 5900,
don't ask...)
x2vnc -passwdfile ~/.vnc/passwd -west localhost:10 &
The west means that when you slide off the west edge of your laptop,
you'll be on your desktop.
Tue, 6 Nov 2007
linux: dual display
Dual display is supported by the NVidia card, a feature they call
TwinView . Their own site hosts detailed instructions on configuring
TwinView .
It works best for me if the displays are side by side to make a single
wide display; or as a clone display used in conjunction with a
projector.
First, make sure you're using NVidia's own driver instead of the
generic one distributed with most linux distros. (Unfortunately, this
violates free software purity.) You will probably find something
called nvidia-glx or something like that (that's the name of the
Ubuntu package for it).
You can tell by looking at the Driver section of your
/etc/X11/xorg.conf file: the driver should be called nvidia and not
nv .
Clone display, laptop and projector
This runs a clone display suitable for most projectors:
Section "Screen"
Identifier "laptop-projector"
Device "NVIDIA Corporation NVIDIA Default Card"
Monitor "Generic Monitor"
DefaultDepth 24
Option "TwinView" "True"
Option "TwinViewOrientation" "Clone"
Option "UseEdidFreqs" "True"
Option "MetaModes" "DFP-0: 1920x1200, 1024x768 @1920x1200 +4+124"
...
DFP-0 identifies the primary (laptop) display, the second clones a
1024x768 window within it to the second display with a +4+124 offset
from the top left corner. The offset makes allowance for window
manager decorations, which you don't want to see on the projector.
If X is already running when you connect to the projector, you may
need to restart X to detect the secondary display.
Dual display
Side-by-side, you can have two monitors showing one giant display,
like so:
Section "Screen"
Identifier "dual-1600"
Device "NVIDIA Corporation NVIDIA Default Card"
Monitor "Generic Monitor"
DefaultDepth 24
Option "TwinView" "True"
Option "TwinViewOrientation" "RightOf"
Option "UseEdidFreqs" "True"
Option "MetaModes" "1920x1200, 1600x1200"
makes a single 3520x1200 display. Note that this is only effective if
the two monitors have modes that share a Y resolution.
Fri, 2 Nov 2007
FRE: checkpointing, add user control
We currently signal checkpoints by having Ops create a flag in the
directory /home/gfdl/flags :
if ( -f /home/gfdl/flags/fre.checkpoint.\$HOSTNAME || \
-f /home/gfdl/flags/fre.checkpoint.all || \
-f /home/gfdl/flags/jobs/fre.checkpoint.\$JOB_ID ) then
#exit gracefully...
I propose we extend this to allow users also to bring down their own
jobs, by creating the file \$HOME/fre.checkpoint.\$JOB_ID .
The frerun -generated jobscript can be made to delete this file on its
way down, so you don't get a backlog of these files sitting around in
\$HOME . (Though even if you did, it's a one-in-a-million chance, quite
precisely, against your checkpointing another job inadvertently...
that's how long it takes for JOB_ID to get recycled).
Fri, 2 Nov 2007
FMS: online checkpointing bug
Online checkpointing is now working and has been checked in on the
omsk_vb branch.
There is a bug with the debug template (checks and warnings turned
on). An attempt to read the value of the function checkpoint , which
looks like the logical variable checkpoint , results in an RTL error.
This needs to be reported to Intel.
Here's how to recreate the error:
cvs co -r omsk shared
cvs up -r 1.1.2.4 shared/platform
cd shared/platform
mkmf -t /home/fms/bin/mkmf.debugtemplate.ia64 \
-c"-DGFDL_HPCS -Dtest_checkpoint -Duse_libMPI" ../include ../time_manager ../fms \
../mpp ../mpp/include ../constants ../memutils
make
cat << EOF > input.nml
&checkpoint_nml day=1 /
EOF
mpirun -np 1 a.out
This returns
forrtl: severe (193): Run-Time Check Failure. The variable
'checkpoint_mod_mp_checkpoint_\$CHECKPOINT' is being used without being
defined
...
MPI: #10 0x400000000113ccd0 in CHECKPOINT_MOD::checkpoint(time=struct time_type { ... }) "checkpoint.F90":101
MPI: #11 0x400000000113d550 in checkpoint_test() "checkpoint.F90":127
...
That "variable" is actually a function return value and is correct.
The same code will work corrctly if you turn off error checking in the
compiler (i.e use mkmf.template.ia64 instead).
Sun, 14 Oct 2007
FMS: possible time manager bug
So here's a relevant code fragment (from checkpoint_mod ):
integer :: sec, min, hr, dy, mon, yr
type(time_type) :: time, last
call get_date( time-last, yr, mon, dy, hr, min, sec )
When I print the results of this, I get:
checkpoint time= 1982 Jan 01 12:00:00
checkpoint last= 1982 Jan 01 00:00:00
checkpoint diff= 1 1 1 12 0 0
checkpoint time= 1982 Jan 02 00:00:00
checkpoint last= 1982 Jan 01 00:00:00
checkpoint diff= 1 1 2 0 0 0
where the diff line is printing out yr, mon, dy, hr, min, sec .
I think the first diff ought to read 0 0 0 12 0 0 and the second ought
to read 0 0 1 0 0 0 .
What's going on? I find the time_manager_mod code too convoluted to
follow.
Sat, 13 Oct 2007
FMS: Online checkpointing
Checkpoint/restart or CPR is a mechanism to intervene in a running
job, to cause it to exit gracefully upon receiving a signal, and then
be able to restart and continue exactly from the state where it left
off. It's a useful feature that's recently taken on some urgency, as
we're finding the system idles when a large PE-count job is waiting to
be loaded, and Ops is waiting for that many PEs to empty.
Currently checkpoints are enabled in FRE, at the script level. Online
checkpointing refers to checkpoints at the code level, which can be
finer-grained: FRE can only intervene at intervals of a run segment,
which can be several months or years of model time.
To enable online checkpoints I've added a new module checkpoint_mod to
FMS. It's currently added on the omsk_vb branch of shared/platform ,
but perhaps should go elsewhere... shared/fms maybe?
The idea is that the main loop in coupler_main checks every coupling
timestep whether a checkpoint is triggered. The trigger is
site-specific and non-portable; it's thus enclosed in an ifdef . It
currently works only on the HPCS, and the shared code needs to be
compiled with -DGFDL_HPCS to turn it on.
The interval at which the model is checkpointable (yes, I know this
isn't a word) is specified in namelist checkpoint_nml . It should be an
exact multiple of the coupling timestep, obviously, but more subtly,
should be a multiple of any time-averaging interval specified in the
diag_table . Neither of these is currently enforced in the code... you
have to be sure to specify a valid checkpoint interval. We recommend
month=1 in checkpoint_nml .
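For instance, following the input.nml pattern from the test recipe
above (the month variable name is an assumption, by analogy with the
day variable used there):
cat << EOF > input.nml
&checkpoint_nml month=1 /
EOF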
The trigger at this time is the existence of a file that is created by
Ops when they want to signal to jobs to come down.
Initial tests show that there are problems with frerun . It still
seems to apply the wrong timestamp to some files.
Steps remaining:
- make sure
frerun can handle these unscheduled exits.
- restart a checkpointed job and make sure it passes the restart
test.
- CPR a production job to make sure
frepp can handle unpredictable
and uneven segments.
- add some more information to
stdout and stderr about how a
checkpoint was triggered.
It can be checked out using
cvs up -r omsk_vb coupler shared/platform
This is a branch tag and is still evolving. I will send out a static
tag and instructions when it's ready for wider testing.
Mon, 16 Jul 2007
FMS Revised data estimates
Model name | Restarts (GB/y) | History (GB/y) | Thruput (y/d) | #runs | Output (GB/d) | source
MOMp25 | 40 | 76 | | | | jwd mail 26 Jun 2007
M180 | 17 | 66 | | | | bw hurrell (not NARCCAP) run, see note
M90 | 4 | 16.5 | | | |
MOMp25+M90 | 44 | 92 | 2 | 2 | 547 |
C192 | 18 | 70 | 4 | 1 | 354 |
C384 | 72 | 280 | 2 | 1 | 708 |
Total data volume output per day is 1600 GB/d.
Minimum sustained bandwidth to stream this data back is 150 Mbps.
Associated spreadsheet is /home/vb/proposals/ornl/DataVolume.ods .
Wed, 11 Jul 2007
FMS: Data output estimates
Estimates are in GB for a model year:
Model name | Restarts | History | source
MOMp25 | 40 | 76 | jwd mail 26 Jun 2007
M180 | 17 | 91 | bw NARCCAP run, see note
M90 | 4 | 23 | scaled from M180
MOMp25+M90 | 44 | 99 | see above
C192 | 18 | 96 | scaled from M180 (1.06)
C384 | 72 | 192 | scaled from C192
- Source for M180 is /archive/bw/fms/memphis/narccap/m180_narccap_hist .
C192 implies 192x192x6 points; M180 implies 576x360.
- 2 job streams of MOMp25+M90 , one each of C192 and C384, running at 2
years a day imply a total data output of 664 GB/day (or 61 Mbit/s). This
is the sustained bandwidth requirement between ORNL and /archive . In
addition, we should probably require at least 10 days worth of
staging disk and 10 days worth of scratch: 14 TB. At 3 y/day it's all x1.5.
Thu, 7 Jun 2007
FRE: how FRE jobs get checkpointed
The design developed by Amy and Tim is that FRE scripts will
periodically check for the existence of a system file; and if it
exists, bring the model down gracefully, like a deflating Zeppelin.
The scripts resubmit themselves to the queues again.
Here is the relevant fragment of code from the frerun -generated
script:
#checkpoint -- if system requests jobs exit, resubmit this script and exit
if ( -f /home/gfdl/flags/fre.checkpoint.\$HOSTNAME || \
-f /home/gfdl/flags/fre.checkpoint.all || \
-f /home/gfdl/flags/jobs/fre.checkpoint.\$JOB_ID ) then
qsub \$scriptName
unset echo
echo exiting early by HPCS request, resubmitting...
Mail -s "job \$JOB_ID \$name has been checkpointed" \$USER <<END
Your FRE production job ( \$JOB_ID ) has been stopped and
resubmitted to the batch queue. It will be re-run by the operators
as soon as possible.
Job details:
\$name (run \$ireload, loop \$irun) running on \$HOST
Batch job stdout:
\$SGE_STDOUT_PATH
END
sleep 30
exit 5
endif
- Ops will create one of those
fre.checkpoint files to signal to FRE
scripts that they are asking for a shutdown: which might be of the
single HPCS node HOSTNAME , or the entire HPCS ("all "), or of only
the job JOB_ID .
- FRE then will
qsub itself, update any state files (such as
recording the current timestamp), and exit.
- Whether the resubmitted job goes to the top or bottom of the queues
is a matter of site policy, and outside the scope of FRE. However,
it's possible for Ops or SGE (the HPCS scheduler) to recognize
checkpointed jobs, so any policy is implementable. Site policy
will be designed to provide incentives for users to make their jobs
checkpointable.
- Since the mechanism is so simple, you could easily modify your
non-FRE script, or even model code, to check for the existence of
these files, and take appropriate action.
Tue, 29 May 2007
FRE: reducing queue wait time for frepp
A very simple change to FRE might be for the runscript to do its qsub
frepp at the beginning of the job (with -hold_jid \$JOB_ID ), rather
than at the end. See discussion below...
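In sketch form, with \$freppScript standing in for whatever frepp
invocation the runscript currently submits at the end:
#at the top of the frerun-generated runscript, rather than at the end:
qsub -hold_jid \$JOB_ID \$freppScript    #held until this run job completes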
Tue, 29 May 2007
FRE: breaking up FRE scripts into scriptlets.
Several times in the past, we have considered breaking up the
FRE-generated shell scripts into smaller scripts. A higher granularity
of scripts submitted to the scheduler would in principle allow a
tighter control of system load: visualize the queuing problem as a
Tetris puzzle with time on the Y axis, and processors on X.
A simple example of how this might be done would be to consider the
script generated by frerun as having 3 stages: data load ; parallel run ;
data offload . The load and offload stages move data between archive
(slow access) and scratch (fast access) and are essentially single-PE,
multiple-IOstream, large-memory. The run stage is high-PE-count and
runs directly to and from scratch storage. As the resource requirements
are different, we could consider splitting the runscript up into 3
jobs or scriptlets, where only the run scriptlet would hold down many
processors.
The implications of so configuring a job are several:
- scratch storage is required to be persistent: our current model of
data storage is based on scratch being a job-specific directory
created on local disk.
- There would be 3 times as many jobs on the system. On a system
dominated by queue wait times (as is the case now on the HPCS) this
is certainly a consideration.
- Total residence time of data on scratch would be from the beginning
of the first scriptlet to the end of the third.
There are two ways to create persistent scratch :
- one is to make your own subdirectory under
/vftmp/\$user instead of
the \$TMPDIR issued to you by SGE. There are two disadvantages: one
is that /vftmp/\$user is private to a node; all scriptlets seeking
to share an instance must execute on the same node. The second is
that the user becomes responsible for deleting the workspace,
unlike \$TMPDIR which is automatically scrubbed at the end of a job.
The residence time of data in scratch storage (GB-hours) must be
modeled and compared to the total available.
- Alternatively you create it in the "semi-permanent" storage areas
/work and /ptmp . (Current policy is that users cannot directly use
/ptmp ; its use is reserved for the caching archiver written by Tim
Yeager, called hsmfiles .) Semi-permanent storage is on CXFS, so using
it as scratch (where there may be many small writes to disk...) carries
the risk of creating a lot of CXFS metadata traffic.
We can model the behaviour of this by expressing three quantities: the
total time TT (hours), CPU residence time PT (CPU-hours) and scratch
data residence time DT (GB-hours) of the load/run/offload scriptlet
sequence. We cannot of course simultaneously minimize all three.
Broadly speaking, TT is a measure of an individual scientist's
satisfaction; PT is a measure of overall system health and efficient
use; and DT is constrained by the maximum disk size. DT need not be
minimized if it stays below DTMAX: but filling the disk is likely to
result in painful disruption in terms of lost jobs and recovery time.
Call the scriptlets L, R, and O; each has an associated queue wait
time Q, runtime T, processor count P and scratch space S.
The processor count for the L and O scriptlets is set to 2: the
minimum on the current HPCS.
When this is run as a single job:
- TT = QL + TL + TR + TO
- PT = P*(TL + TR + TO)
- DT = TL + S*(TR+TO)
When this is run as 3 scriptlets independently submitted to a
scheduler:
- TT = QL + TL + QR + TR + QO + TO
- PT = P*TR + 2*(TL + TO)
- DT = TL + S*(QR + TR + QO + TO)
While there is a clear advantage in PT for the second method
(especially when P is large) there is a disadvantage in the other two
measures of resource usage, which arises from the additional queue
wait times associated with the later jobs in the sequence, QR and QO.
SGE allows you to minimize QR and QO by submitting all three jobs at
once (rather than having each job submit the next), and declaring a
dependency (see the qsub manpage for the hold_jid flag). We submit the
jobs all at once in sequence, getting jobs j1 , j2 and j3 submitted as
follows:
qsub load_expt (qsub returns a job ID j1 on stdout )
qsub -hold_jid j1 run_expt (returns j2 )
qsub -hold_jid j2 offload_expt
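In a csh script the job IDs can be captured from qsub 's stdout
(assuming the usual SGE submission message, Your job NNNNNN ... ,
where the ID is the third word):
set j1 = `qsub load_expt | awk '{print \$3}'`
set j2 = `qsub -hold_jid \$j1 run_expt | awk '{print \$3}'`
qsub -hold_jid \$j2 offload_expt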
Jobs j2 and j3 are placed in a hold state until their dependencies are
fulfilled, but continue to advance in the queue. Thus, while queue
wait times may not go to zero, some fraction of QR and QO overlaps with
prior jobs in the sequence. Another limiting factor on reduction of
queue wait times is that the three jobs have to go on the same node if
scratch is local disk. Making scratch a shared disk is a significant
re-architecture of the HPCS, and should be approached with caution.
There is also an advantage in making TR very large: this can be
achieved by increasing the maximum job time beyond the current 10
hours.
We are currently exploring this method to see if we can empirically
get ranges of values for some of the numbers above, for typical FRE
production jobs.
It is to be noted that production jobs resubmit themselves, and with
careful design, this sequence of jobs with dependencies and overlaps
can be very long indeed. It can be extended to frepp as well, which is
a separate discussion.
Tue, 29 May 2007
FMS: use of dplace on the HPCS
On the newer installation of the OS on the HPCS, known as Propack5SP1 ,
there is a new version of the command dplace , which has been shown to
significantly improve performance of some codes. It is also unlikely
ever to degrade performance.
The way to tell if a node is running Propack5SP1 is to type cat
/etc/sgi-release :
ic9% cat /etc/sgi-release
SGI ProPack 5SP1 for Linux, Build 501r2-0703010508
ic5% cat /etc/sgi-release
SGI ProPack 5 for Linux, Build 500r1-0607180902
On Propack5SP1 , use the syntax
unsetenv MPI_DSM_DISTRIBUTE
mpirun -np \$npes dplace -o -r bl -s 1 -c 0-`expr \$npes - 1` foo.x
On Propack5 , use the syntax
unsetenv MPI_DSM_DISTRIBUTE
mpirun -np \$npes dplace -o -s 1 -c 0-`expr \$npes - 1` foo.x
... to run the program foo.x on \$npes processors.
There seems to be some complication due to the use of GFDL's mpirun
"wrapper": please follow the progression of the ticket #23330 on the
GFDL HelpDesk to see how this resolves.
Please note the unsetting of MPI_DSM_DISTRIBUTE : it appears that the
dplace arguments have no effect if this is set.
Mon, 28 May 2007
FRE: fremake bugfix for top-level Makefile failure
Several users had reported a sometimes failure from the library-based
compilation script from fremake . The symptom was an incomplete library
list on the ld line, and usually a failure to find any main program:
This turns out to be because the GNU Make automatic variable \$?
behaves somewhat differently from how I was interpreting it... using
\$^ instead seems to fix it:
\$(LD) \$^ \$(LDFLAGS) -o \$@
This has been updated on the branch nalanda_vb_arl of fremake .
Another minor bugfix shows up when using fremake -t foo : the
top-level Makefile was being created in \$HOME and not in \$rootdir/foo .
Also fixed on the same branch.
Thu, 17 May 2007
FRE: strange perl behaviour of \$!
In many places in FRE.pm we have error-checking of shell commands of
the form:
qx/something/;
croak "nevermore: \$!" if \$!;
... however, this doesn't seem to be a good idea, because \$! often
seems to be non-null even when there is no error.
So, I noticed qx/foo/ resolves to the value of stdout at the end of
command foo ... so I replaced the ones in fremake with
print STDERR qx/something/;
(I also noticed that in mkmf I have lines like
unlink \$object or die "\aERROR unlinking \$object: \$!\n";
... which works fine. Perhaps "\$! is non-null if error" holds, but not
"error if \$! is non-null "?)
Wed, 16 May 2007
FMS: cube sphere and lat lon AMIP code
Here is some version 3 FRE to express the cube sphere compilation...
<component name="fms" paths="shared" includeDir="\$FREROOT/c48_amip_test/src/shared/include">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">shared</codeBase>
<csh>cvs update -r nalanda_cube_vb shared/amip_interp shared/topography</csh>
</source>
<compile>
<cppDefs>-Duse_libMPI -Duse_netCDF</cppDefs>
<srcList>
/home/ck/nalanda/src/UPDATES/topography.F90
</srcList>
</compile>
</component>
<component name="atmos_phys" paths="atmos_param atmos_shared" requires="fms" includeDir="\$FREROOT/c48_amip_test/src/shared/include">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">atmos_shared atmos_param</codeBase>
<csh>
cvs update -r latlon2d_bw `/home/fms/bin/list_files_with_tag latlon2d_bw`
cvs update -r merge_rwh_latlon_bw_wfc `/home/fms/bin/list_files_with_tag merge_rwh_latlon_bw_wfc`
</csh>
</source>
<compile>
<srcList>
/home/ck/nalanda/src/UPDATES/interpolator.F90
/home/ck/nalanda/src/UPDATES/mg_drag.F90
</srcList>
</compile>
<compile target="debug">
<cppDefs/>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
<component name="atmos_dyn" requires="fms atmos_phys" paths="atmos_coupled
atmos_cubed_sphere/driver/coupled
atmos_cubed_sphere/model
atmos_cubed_sphere/tools">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">
atmos_coupled atmos_cubed_sphere/driver/coupled atmos_cubed_sphere/model atmos_cubed_sphere/tools
</codeBase>
<csh>
cvs update -r latlon2d_bw `/home/fms/bin/list_files_with_tag latlon2d_bw`
cvs update -r merge_rwh_latlon_bw_wfc `/home/fms/bin/list_files_with_tag merge_rwh_latlon_bw_wfc`
rm -f atmos_cubed_sphere/tools/pp
</csh>
</source>
<compile>
<cppDefs>-DSPMD -DZERO_ZS</cppDefs>
<srcList>
/home/ck/nalanda/src/UPDATES/atmos_model.F90
</srcList>
</compile>
<compile target="debug">
<cppDefs>-DSPMD -DZERO_ZS</cppDefs>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
<component name="ice" paths="ice_amip ice_param" requires="fms">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">ice_amip ice_param</codeBase>
</source>
<compile>
<srcList>
/home/ck/nalanda/src/UPDATES/ice_model.F90
</srcList>
</compile>
<compile target="debug">
<cppDefs/>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
<component name="land" paths="land_lad land_param" requires="fms">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">land_lad land_param</codeBase>
</source>
<compile>
<cppDefs>-DLAND_BND_TRACERS</cppDefs>
<srcList>
/home/ck/nalanda/src/UPDATES/land_model.F90
/home/ck/nalanda/src/UPDATES/land_properties.F90
/home/ck/nalanda/src/UPDATES/rivers.F90
/home/ck/nalanda/src/UPDATES/soil.F90
/home/ck/nalanda/src/UPDATES/vegetation.F90
/home/ck/nalanda/src/UPDATES/numerics.F90
/home/ck/nalanda/src/UPDATES/climap_albedo.F90
</srcList>
</compile>
<compile target="debug">
<cppDefs>-DLAND_BND_TRACERS</cppDefs>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
<component name="ocean" paths="ocean_amip" requires="fms">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04">ocean_amip</codeBase>
</source>
<compile/>
<compile target="debug">
<cppDefs/>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
<component name="coupler" paths="coupler" requires="fms atmos_phys atmos_dyn ice land ocean">
<source versionControl="cvs" root="/home/fms/cvs">
<codeBase version="nalanda_2007_04"> coupler </codeBase>
<csh>
cvs update -r latlon2d_bw `/home/fms/bin/list_files_with_tag latlon2d_bw`
cvs update -r merge_rwh_latlon_bw_wfc `/home/fms/bin/list_files_with_tag merge_rwh_latlon_bw_wfc`
</csh>
</source>
<compile>
<srcList>
/home/ck/nalanda/src/UPDATES/coupler_main.F90
/home/ck/nalanda/src/UPDATES/flux_exchange.F90
/home/ck/nalanda/src/UPDATES/surface_flux.F90
</srcList>
</compile>
<compile target="debug">
<cppDefs/>
<mkmfTemplate>\$FREROOT/site/mkmf.debugtemplate.ia64</mkmfTemplate>
</compile>
</component>
As you can see, almost every component is updated, so you cannot share
libraries with the m45_am2p14 experiment, not even libfms.a !
The sooner we can apply testing to the latlon2d_bw and
merge_rwh_latlon_bw_wfc branches, the better... see
Task 119.
PS. Also note the line rm -f atmos_cubed_sphere/tools/pp under the
component atmos_dyn . The tools/ directory should be parallel with
atmos_cubed_sphere maybe?
Wed, 16 May 2007
FRE: cvs updates and the new fremake
I am having a problem with the new fremake . The cube-sphere is
currently running with two "update" tags, latlon2d_bw and
merge_rwh_latlon_bw_wfc , both of which span multiple components.
So if you're going to add a line under <source> like:
<csh>cvs update -r foo `list_files_with_tag foo`</csh>
... which component should it go under? Ideally it should only be
executed once, but the order of components is unpredictable
(it's ordered by keys %codeBase , and perl hash order is undefined).
Currently I've repeated it in all components, which is wasteful. A
change requires the ability to restrict list_files_with_tag to a
single component: currently it's invoked in src/ so it updates
everything under there. See Task 117.
Second: The <csh> tag above is quite inelegant (that whole
list_files_with_tag construct is annoying to the code aesthetics
mafia...)
I propose:
<component>
<source>
<codeBase version="nalanda">
<codeUpdate version="nalanda_2007_04"/>
<codeUpdate version="latlon2d_bw" type="delta"/>
</codeBase>
</source>
</component>
The first codeUpdate replaces entire modules (i.e. it's equal to cvs
update -r nalanda_2007_04 component ). The type="delta" says
that the second one is incremental:
cvs update -r latlon2d_bw `list_files_with_tag latlon2d_bw`
See Task 118.
Tue, 15 May 2007
FRE: invoking dplace from frerun
Currently the run command is something like
time -p mpirun -np \$npes a.out
There's a need to be able to customize this for that dplace stuff
Chris spoke about on 14 May 2007.
I am currently doing this by hand using an environment variable called
MPIRUN_EXEC_PREFIX and then using it on the mpirun line:
setenv MPIRUN_EXEC_PREFIX "dplace -r bl -s 1 -c 0-`expr \$npes - 1`"
...
time -p mpirun -np \$npes \$MPIRUN_EXEC_PREFIX a.out
MPIRUN_EXEC_PREFIX points to something that will be invoked by mpirun
and which in turn will invoke a.out . This is how you invoke many
performance profilers as well.
Task 116 requests that this behaviour be built into frerun .
Tue, 15 May 2007
FRE: frestatus redesign
As we go through and redesign the FRE scripts, the key issue is to
make the perl code itself site-independent, and retrieve
site-dependent stuff from fre.cshrc .
How do we do that for frestatus ? What it does, mainly, is to grep for
a status string within an output file.
I believe this can be achieved using two environment variables, to be
set in fre.cshrc :
- \$FRE_OUTPUT_FILE_REGEXP_LIST is a list of regular expressions that
match output filenames. For instance, the current output file for
experiment \$expt looks like \$ARCH/\$expt/*/ascii/stdout for frerun
and \$root/\$expt/exec/stdout for fremake . The new style of fremake
output files looks like \$FREROOT/stdout/compile_\$expt.csh.o* or
something. This file regexp will be a function of the system
queuing software.
- \$FRE_STATUS_STRING_REGEXP_LIST is a list of regular expressions
that match the status string you look for. For instance fremake
reports either the string ERROR: make failed or NOTE: make
successful . The status string is a function of the FRE software.
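A hypothetical fre.cshrc fragment illustrating the two variables (the
patterns are examples only, drawn from the cases above):
setenv FRE_OUTPUT_FILE_REGEXP_LIST '\$ARCH/\$expt/*/ascii/stdout \$FREROOT/stdout/compile_\$expt.csh.o*'
setenv FRE_STATUS_STRING_REGEXP_LIST 'ERROR: make failed|NOTE: make successful'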
Some pseudocode (which definitely does not work, but presents an
interface in the new style:-) is checked in on the nalanda_vb tag of
frestatus (14.1.4.1). The request for implementation is
Task 115.
Mon, 30 Apr 2007
FMS FRE: ensemble parallelization
Notes from an exchange with Rich Gudgel about parallelizing
post-processing: can some of this make its way into frepp ?
The question is about simultaneously managing output from a large
number of runs: typically ensembles. In this case it's using the EAKL
ensemble filter.
The first question is about where to store intermediate files. The
recommendation is to use /work :
Running mppnccombine on /ptmp (or actually, /work , because /ptmp
is intended for use by FRE itself...) has the same downsides as
running it in /archive . It generates many small disk writes on
CXFS which bogs things down on the filesystem.
You could copy your uncombined files to /work/rgg after your 200p
job exits, and then launch mppnccombine from a separate script.
But that second script should still copy to /vftmp , run
mppnccombine , then copy the output back to /work (or /archive ).
That's what ptmp -enabled FRE does. /work is underutilized at this
point (3TB used out of 19TB available) and everyone already has
their /work/user directory, so please use it! You can save DMF
traffic by using /work as storage for your "pre-mppnccombined
files".
The second question is about parallelizing mppnccombine . Rich wonders if:
I can utilize a method Ron Stouffer said is available where I can
use my existing requested processors (200) to distribute each
mppnccombine to the desired files (with 16 files per 20 cpus this
actually allows for a pretty reasonable usage of each cpu as long
as I can assign a cpu to each file).
But the answer is:
You cannot parallelize one mppnccombine , but what you could do is
to parallelize across your ensemble. (so, for your 10-member
ensemble, 10 jobs).
How to automate the process? One way of course is for FRE to become
"ensemble-aware", so this will be done by frepp . We have this on
the list of future requirements for the "new FRE", so stay tuned
(but not with bated breath...)
What you can do in the interim:
One mode of parallelism that's underexploited is qsub task
parallelism. (Do man qsub and look for the -t option).
Write a script to do one ensemble member, which is identified by some
shell variable, e.g \$member , so that if \$member=1 , you
process ensemble member 1, etc.
The script should set
set member = \$SGE_TASK_ID
Then launch 10 of these in parallel using qsub -t 1-10 script . You will
then have a 10-task job running, and qsub will set SGE_TASK_ID
differently in each task, so you'll process your whole ensemble.
You could further parallelize by month, if you wanted to get fancy:
@ member = ( \$SGE_TASK_ID - 1 ) / 12 + 1
@ month = ( \$SGE_TASK_ID - 1 ) % 12 + 1
and use qsub -t 1-120 script to launch a 120-way parallel job to do all
the months for all the ensemble members.
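Putting it together, a minimal task-parallel script might look like
this (the processing steps are illustrative):
#!/bin/csh -f
#submit with: qsub -t 1-120 script
@ member = ( \$SGE_TASK_ID - 1 ) / 12 + 1    #ensemble member, 1..10
@ month = ( \$SGE_TASK_ID - 1 ) % 12 + 1     #month, 1..12
echo processing ensemble member \$member month \$month
#...fetch this member/month's files, run mppnccombine, copy back...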
Mon, 30 Apr 2007
FRE: bugs in new FRE
- fremake -n turns on the nocheckout option, which means you never
enter the createSrc routine in FRE.pm . If you don't, then the hash
\$srcDir{\$expt} is not set, which causes createExec to fail.
- we previously had one instance of the perl variable
\$! returning an
error code, even though there was no error: this was in L309 of
FRE.pm , now commented out. I also got the same error (command
succeeded but \$! returns "No such file or directory") on L247. I
changed the error to a warning, which still isn't satisfactory, but
at least it keeps going...
- L398 substitutes instances of
\$FREROOT with \$rootdir{\$fre} , which
is incorrect... isn't it? At least there's no requirement that the
user set <directory type="root"> to FREROOT ...
And these two items are design issues and not really bugs, I think:
- New fremake does not understand shell variables like \$root ...
should it?
- frerun -t behaviour needs to be changed so old frerun can be used
with new fremake .
And this is a feature not a bug!
<pathNames> is used at compilation, not at checkout, so it's a
subnode of <compile> not <source> .
Tue, 24 Apr 2007
FMS: conventions for unit test programs
Unit test programs are now bundled with the module whose function they
test. They are usually an appendage at the bottom of the source file.
Thus, for module modname , we have the file modname.F90 , which has, at
the end:
#ifdef test_modname
program test
...
#endif
If unit tests are stored in their own file, the proposed convention
is:
- the unit test for module
modname_mod should be named test_modname.F90
and be enclosed in #ifdef test_modname...#endif
- if it needs a namelist, it should be
test_modname_nml
Zhi Liang will be implementing this in mpp .
Wed, 11 Apr 2007
FRE FMS: the modules
Here is a list of modules available in the current CVSROOT/modules
file, grouped by the component they might match:
- fms: shared ; may need a version of shared without shared/coupler
- atmos_phys: atmos_param_am2 , atmos_param_am3 , atmos_param_zetac
along with atmos_shared ; may need a module atmos_param_hs for the
Held-Suarez
- atmos_dyn: fv , bgrid , spectral , ...
fv should be given a module consisting of
atmos_fv_dynamics/driver/coupled atmos_fv_dynamics/model
atmos_fv_dynamics/tools atmos_coupled , similarly for the others
Tue, 10 Apr 2007
latex: tex4ht getting better!
I am now able to process images directly following instructions on
Gurari's Troubleshooting page: see section on direct processing of EPS
figures. The relevant lines in balaji.cfg are
\Configure{graphics*}
{eps}
{\Needs{"make -f \$TEXROOT/Makefile -r \csname Gin@base\endcsname.png"}% see \$TEXROOT/Makefile
\Picture[\csname Gin@base\endcsname.eps]{\csname Gin@base\endcsname.png class="graphics"}%
}
and the Makefile knows how to convert eps to png .
setenv TEX4HTENV \$TEXROOT/tex4ht.env
settles the issue of a common .env file. (which can also be done using
the -e argument to tex4ht and t4ht ).
Gurari's Q/A page is another useful page to look at. In general,
stuff is not easily linked up, but a google search on
site:http://www.cse.ohio-state.edu/~gurari foo seems to work...
Many things are fixed directly in \$TEXROOT/Makefile .
Thu, 5 Apr 2007
FRE: more on the TODO list
At present make clean does not work in the top-level directory.
One solution would be for the top-level Makefile to define:
clean:
make -f Makefile.fms clean
make -f Makefile.ocean clean
...
Wed, 4 Apr 2007
latex: hyperlatex
Wed, 4 Apr 2007
latex: tex4ht wrapper t4post
- The standard
dvips is producing small images. I attempted to scale
it up using -x 2000 with and without -E ... no fun. How to get
larger images? Also appears to be no way to correct this in CSS...
the <img> tag has height and width in units of pt .
- In
\$HOME/tex/bin I now have scripts to aid post-processing. After
running htlatex , run t4post in that directory... it puts all HTML
files in the right wrapper.
- CSS problems are mostly fixed by using the
NoFonts flag. I now have
a single file called \$HOME/css/tex4ht.css that may be fine-tuned.
I also downloaded hyperlatex, which might just be easier! (not sure).
Tue, 3 Apr 2007
latex: tex4ht progress
tex4ht looks like it ought to do whatever I want, ("highly
configurable", etc) and besides, the Debian port is run by Kapil
Paranjape! We've been corresponding about it, among other things.
The original doc from Eitan Gurari is somewhat impenetrable, but I
found some decent documentation elsewhere, which may be somewhat out
of date.
Problems:
- I'm having problems with my
\includegraphics{} EPS images.
- my PHP header and footer?
- CSS is generating the same for boldface and plain text.
By using a two-step process using dvips instead of Kapil's Debian
default dvipng I am now generating images from \includegraphics{} .
These lines are now enabled in \$HOME/tex/articles/testhtml/tex4ht.env
(why won't this work when I use \$HOME/.tex4ht.env instead?):
G.png
Gdvips -E -Ppdf -mode ibmvga -D 110 -f %%1 -pp %%2 > zz%%4.ps
Gconvert zz%%4.ps -trim +repage -density 110x110 -transparent '#FFFFFF' %%3
Grm zz%%4.ps
instead of these lines in /usr/share/texmf/tex4ht/tex4ht.env :
G.png
Gdvipng -q -mode ibmvga -D 110 -o %%3 -T tight -pp %%2 -bg 'Transparent' %%1
I think the best way might be to post-process the output from tex4ht
to embed the <body> portion of the output within our standard header
and footer.
By using the NoFonts option I seem to have reduced the number of CSS
lines to fix. Now I think the CSS file is somewhat static... should be
able to replace it with something.
Tue, 3 Apr 2007
fre: libraries for nalanda_2007_04?
F-group will be using the new FRE environment for testing
nalanda_2007_04 to make libraries. I am proposing that we set up the
libraries in /home/fms/... prior to release, and test that we can pass
RTS using those libraries.
Start with downloading your own instance of the FRE scripts:
\$HOME/fms/bin/fre_setup \$HOME/fre
source \$HOME/fre/site/fre.cshrc
(For now use absolute paths for the FREROOT directory argument...
fre_setup isn't dealing with relative paths correctly).
User fms will:
- identify the list of components that will need libraries: though
the release may begin with only
shared , we need to have a list of
the required components.
- make sure each required component has its own CVS module within
CVSROOT/modules .
- Set up three
<compile> nodes for each target: one with no target
and the production template, one with target="debug"
and the debug template, one with target="flt" and the
fltconsistency template.
- Writing the libraries should be part of the checklist for "moving
the
testing tag".
Liaisons will conduct runs using the precompiled libraries as far as
possible.
Mon, 2 Apr 2007
FRE: Notes and fixes
- fre_setup doesn't work correctly if its argument (which will become
FREROOT ) is a relative pathname.
- The compile script uses make instead of gmake . (Actually those are
synonyms, but it's been noted that if you want to re-alias make
(e.g. alias make 'make -j 8' ) you have to re-alias make and not gmake .)
- You might see error messages from perl that may say Bad file
descriptor in some odd instances: those seem to be some bug or
un-understood phenomenon at this point.
Mon, 2 Apr 2007
FRE: immediate TODO list
- fre_setup doesn't work correctly if its argument (which will become
FREROOT ) is a relative pathname. See /home/vb/tmp/fre/foo/fre_setup
for a fix.
- fre_setup checks out the testing tag of bin2 , etc! Is that OK?
Should be the HEAD tag, probably...
- does fremake break frestatus ? We need to settle the issue of where
job output goes, so that frestatus can use it.
- settle the issue of how fre.cshrc is to know where
experiment-specific stuff goes.
- get canonical XML for checkout and compile working again (both
writing the XML file, and verifying).
- do the top-level Makefile along with the top-level component
(actually the component should tell it when a load is required).
- Figure out how to suppress those Bad file descriptor messages.
- To be consistent with what went before, <mkmfTemplate> should
either have a file in a file attribute, or the template directly
loaded in between the start and end tags. Right now we have the
filename between the start and end tags.
Thu, 29 Mar 2007
tlemcen: configuring X for projectors
The safe configuration for projectors at this time seems to be a
1024x768 display. However, tlemcen is a laptop with a 1920x1200 native
configuration running NVidia graphics.
I found Christophe Troestler's page on laptop configuration to be a
useful resource describing how to set things up to project correctly
from a subwindow of a display that's larger than 1024x768.
The key steps are as follows:
- Run a "nested" X display that's a reasonable size for projection,
- Set a "panning domain" overlapping with the nested display and send
only that domain to the second screen (projector).
- Run your presentation full-screen within that nested display.
In greater detail:
- install Xnest: a package that's part of
XFree86 that runs a "nested
X server". (Installing for me is as simple as typing apt-get
install xnest ). Xnest will act like an independent X display,
running something "full-screen" in it will only occupy the Xnest
window of the real screen.
- always run your
Xnest window in a fixed location, e.g Xnest
-geometry 1024x768+0+0 , which will put it at the top left of your
real screen.
- It will probably be slightly offset from the origin (top-left
corner, nominally
(0,0) ) of the real screen: you can find the
actual location of the window by typing xwininfo and pointing at
your Xnest window. Near the end of the output you'll see the list
of corner coordinates. The first corner is the one closest to the
origin, something like +4+28 .
- set up your dual display to clone your screen to the secondary
output. In my case (
nvidia-xconfig ) that gives you lines in the
Screen section of your X configuration file (in my case,
/etc/X11/xorg.conf ) file that looks something like:
Option "TwinView" "True"
Option "TwinViewOrientation" "Clone"
Option "UseEdidFreqs" "True"
Option "MetaModes" "1920x1200, 1024x768 @1920x1200 +4+28"
- This says the second screen (projector) receives a
panning domain
within the main 1920x1200 screen, which is 1024x768 in size and
offset +4+28 from the origin. This panning domain happens to
coincide with your Xnest window. (There is plenty more information
about Nvidia X configuration if you really want it).
- Now any application running full-screen in your
Xnest window will
appear full-screen on the projection. You also have the rest of
your own screen to do stuff that won't be visible on the
projection. The Xnest usage doc describes how to do this. The way I
do it is:
% Xnest -geometry 1024x768+0+0 :1 &
% kpdf --display :1 /home/vb/tex/talks/fms/forum/20070328.pdf &
The first line starts up the Xnest window on display :1 , the second
line runs kpdf on that display.
Wed, 28 Mar 2007
emacs: the latex-beamer class
My first successful attempt at a talk using the latex beamer class!
the talk is at /home/vb/tex/talks/fms/forum/20070328.tex and the PDF
in the same directory.
Pretty classy look!
Wed, 28 Mar 2007
FRE FMS: MI Team meeting 28 March 2007
Today's MI Team talk is posted to the web.
Wed, 28 Mar 2007
FRE: Amy's FAQ on the dual-run capability
- How do I start a dual run for a new experiment or as a
reproducibility test for an old experiment?
- How do I rerun just a subset of a previous experiment? Will I be
charged for the hours it takes to run?
- How do I tell what hosts and cpusets my previous experiments ran on?
- How do I compare the results of the two runs?
- For which experiments should I perform dual runs?
How do I start a dual run for a new experiment or as a reproducibility
test for an old experiment?
To dual-run an experiment, set up the original experiment just as
before: call frerun as usual and submit your job script. To create a
second instance of the experiment as a dual run, invoke
/home/fms/bin/frerun again with the same FRE schema file and
experiment, but this time with the -u option. This will create a new
runscript that differs from the original runscript in the following
ways:
1. The output will be written to a subdirectory of the original experiment
with an integer digit for a name, i.e.,
/archive/\$USER/...
experiment_name/
|-- 1/
| |-- ascii/
| |-- history/
| `-- restart/
|-- ascii/
|-- history/
|-- pp/
`-- restart/
2. No post-processing will be performed on the dual run.
3. Dual-run jobs will be submitted with the qsub options -A repro -l repro .
Jobs submitted with these options will not be charged to allocated
time, and will show up with a '2' in the STATE column of qa :
% qa -u fjz
572364 fjz dc CM2.1U-D4_1861-2000-Aerosol_Q5 ic.a 60 - 600/
- - 03/23 qw
572370 fjz dc CM2.1U-D4_1861-2000-Aerosol_Q4 ic.a 60 - 600/
- - 03/23 2qw
You can add other qsub -l options, e.g. 4700 , bx2 , 3700 , or ic5 , to
direct dual-run jobs (or any jobs) to specific nodes. qconf -scl
reports the list of available values for -l .
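For example (the XML file and experiment name are placeholders):
/home/fms/bin/frerun -u -x myexpts.xml CM2.1U_Control   #creates the dual-run runscript
qsub -l ic5 <dual-run runscript>                        #optionally direct it to a node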
How do I rerun just a subset of a previous experiment? Will I be
charged for the hours it takes to run?
Use frerun -u to generate a dual-run runscript for your experiment, just
as above. You can create as many distinct dual-run runscripts as you like.
Then edit the runscript to change the initial conditions file to point to
the appropriate file from the original experiment's restart directory.
The job will need to be stopped manually, or you can use frepriority
to adjust the number of queue allocations.
If there is a need for this to be more automated, for example by providing a
command line option to frerun to set the initial conditions file for the
dual-run, this may be implemented later.
Since the runscript was created as a dual-run runscript, it will use
the special qsub flags to indicate a dual-run, and you will not be
charged for the hours.
How do I tell what hosts and cpusets my previous experiments ran on?
This information is contained within the \$archivedir/expt/ascii/stdout file
of your experiment. There is an option to frestatus which will help you parse
this file by producing a summary from the tail sheet information of your job
submissions:
frestatus -rlx \$rtsxml \$expt
This produces output such as:
NOTE: Natural end-of-script for
/home/wh/ccsp/ipcc_ar4_volc/scripts/CM2.1U-D4_1PctTo4X_J2b.
jobname CM2.1U-D4_1PctTo4X_J2b
jobnumber 459604
qsub_time Tue Jan 23 22:34:43 2007
start_time Wed Jan 24 02:25:03 2007
end_time Wed Jan 24 10:21:33 2007
failed 0
cpuset 8-37,68-87,238-243,246-249
hostname ic3
NOTE: Natural end-of-script for
/home/wh/ccsp/ipcc_ar4_volc/scripts/CM2.1U-D4_1PctTo4X_J2b.
jobname CM2.1U-D4_1PctTo4X_J2b
jobnumber 460641
qsub_time Wed Jan 24 10:21:33 2007
start_time Wed Jan 24 15:58:09 2007
end_time Wed Jan 24 23:50:59 2007
failed 0
cpuset 54-57,250-253,330-337,362-371,394-403,430-443,450-451,468-475
hostname ic6
...
Note that these cpusets are logical cpusets, not physical cpusets.
How do I compare the results of the two runs?
To compare the results of two runs, you can use frecheck . This will
check all available matching output files from your original and dual
runs.
To compare just two specific restart files, run the resdiff utility.
ls -1 \$archivedir/expt/restart/19830101.cpio
\$archivedir/expt/1/restart/19830101.cpio | resdiff
The resdiff utility is located in /home/fms/bin , and a usage message is
available with resdiff -h . resdiff uses cmp to compare the files
within multiple cpio archives. There is another utility histdiff which
does a similar thing but uses Remik's nccmp tool. This allows for more
detailed comparisons; see histdiff -h for more details.
Currently there is not a more automated way to test the results, but
automated mechanisms may be implemented in the future.
For which experiments should I perform dual runs?
Dual runs should be done at the discretion of the users and scientific
groups. Any runs which are deemed sufficiently important, or which have
shown anomalous behaviour such as unexplained failures, may be worth
rerunning. A history of all jobs on the system, and where they ran, is
available going back more or less indefinitely.
Wed, 28 Mar 2007
FMS: April 2007 patch to nalanda
There has been a flurry of activity since the nalanda release to
integrate some of the irreversible changes introduced by the distribution
of the shortwave flux field into streams; and the changes to
incorporate the new conservation checks on water (and other quantities
soon).
The changes are serious enough to warrant a new patch as soon as
possible. Please follow the wiki page on post-nalanda tag moves as we
begin the patch.
Wed, 21 Mar 2007
FMS: using histx
Will Cooke's notes on using histx for performance analysis:
Here are some notes I made on getting performance data out of our
models. I'd leave the -d out of the histx part of the regression. i.e.
do everything.
If you have time, you could try this to see if there's a sore thumb sticking
out in your model runs.
Will
Method for using histx
See GFDL Wiki page on Altix profiling for detailed info.
For profiling a small portion of the code.
Add
call enable_histx
call disable_histx
around the code you want to profile.
If you're timing the entire code, start here.
Add
source /home/gcs/histx_1.4b/setup.csh
to your XML setup (I'm assuming csh/tcsh is being used). The /home/gcs
references should become /home/fms/ ... sooner rather than later.
Add
-L/home/gcs/histx_1.4b/lib -lhistx
to the LIBS variable of <mkmfTemplate> (which gets tacked on to LDFLAGS ).
The code must be compiled with -g also.
fremake the code as normal; you need to relink at a minimum.
Add
<regression name="prof">
<run days="8" months="0" npes="15" runTimePerJob="00:60:00"
histx="-l -d -f"/>
</regression>
to your XML experiment.
-l gives line information.
-d disables the timing info until it hits call enable_histx . Remove
this if you're timing the entire code.
-f is needed for parallel code.
frerun -r prof -x ...
run the script in a cpuset environment (IC4, IC5 or qsub)
Go to the archive directory and explode the hi*.cpio file.
Run
source /home/gcs/histx_1.4b/setup.csh
iprep hi.* > my_profile.out
to get combined statistics on your section of code.
Wed, 21 Mar 2007
FMS FRE: the site configuration
Amy Langenhorst proposes a method to organize the site configuration
files so that users can easily have their own copy of the site config
file.
The idea is to have a script, say fre_config , which will check out
(or update) a version of the site files into your directory of choice:
fre_config -r nalanda -o \$HOME/fre
will create, under \$HOME/fre , the following tree:
bin/ -- contains mkmf, fremake, frerun, etc.
lib/ -- contains FRE.pm and so on
site/ -- site-configured csh setup, mkmf templates for local architectures
(site could be renamed loc for "local", or etc but site might be
clearer... the distinction I'm making is that this is usually not
local to one host, but to one site. See my .cshrc , it also sources
.cshrc,site , .cshrc.`uname` , .cshrc.`hostname` , ...)
site contains the file fre.cshrc , which will set up other paths, see below.
fre_config will also modify the checked out copy of fre.cshrc to set
\$FREROOT to the root point of this checkout (\$HOME/fre in this
example...), add \$FREROOT/bin to \$PATH , \$FREROOT/lib to \$PERLLIB , and
so on. fre_config will then source this file. (Actually it will print
a line asking you to source this file...)
Users can make mods to their fre.cshrc , and then re-source it.
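In other words, the expected workflow is something like:
fre_config -r nalanda -o \$HOME/fre
source \$HOME/fre/site/fre.cshrc
with the second line setting \$FREROOT , \$PATH and \$PERLLIB as
described above.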
Currently fre.cshrc sets
CVSROOT: delete as this will now move into <checkout root="">
MKMF: path to mkmf
VERSIONS: saves exact per-file cvs checkout info
CVSREDO: redo the checkout
INCLUDE: path to some include files like netCDF
LISTPATHS: path_names file generator, basically a wrapper for find
BATCH_COMPILE: qsub with some defaults
We should go over this list again: e.g BATCH_RUN needs to be added.
And I think CVSREDO etc need to be omitted... does anyone ever use them?
They simply clutter the scripts at this point. You redo based on the
canonical FRE file.
Tue, 20 Mar 2007
FRE: the dual-run capability
There are several possibilities for how to handle it.
- no changes to FRE schema.
frerun -u , which currently works on
regression tests to set up a unique run, will now also work on
production. It will restart from the <initCond> file and not
perform any post-processing. In short, exactly as though you
created a new experiment that inherited exactly an existing one,
and turned off the <postProcess> node.
- The future evolution of FRE schema will move towards having a
realization attribute that identifies members of an ensemble. The
FRE DB will have the capability to return the exact difference
in configuration between two realizations, e.g a difference between
<initCond> files ("initial-condition ensemble") or between settings
of some input parameter ("perturbed-parameter ensemble").
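A purely illustrative sketch of what the realization attribute might
look like (none of this syntax is settled yet):
<experiment name="CM2.1U_Control" realization="2">
<initCond>/archive/\$USER/CM2.1U_Control/restart/19830101.cpio</initCond>
</experiment>
The FRE DB could then report that realization 2 differs from
realization 1 only in its <initCond> file.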
Mon, 19 Mar 2007
emacs: muse, changed extension to "emu"
Acknowledges the "emacs" part... see last two lines of emacs-muse
customization, below.
(setq muse-file-extension "emu") ; cooler than muse
(add-hook 'find-file-hooks 'muse-mode-maybe)
Mon, 19 Mar 2007
FMS FRE: notes on new repository policies and structures
As we are starting to add the feature of precompiled component
libraries, it's time to take a fresh look at how to structure the
repository.
The component-based FRE schema that is currently being built allows
components to be built at various levels of granularity. We
principally aim to provide the standard model components: atmosphere,
ocean, land, sea ice. A list of such components might include:
- atmos_dyn
  - FV
  - FV cube sphere
  - BGRID
  - spectral
  - zetac
  - amip
  - EBM
  - SCM
  - shallow water gridpoint
  - shallow water spectral
  - shallow water cube sphere
  - null
- atmos_phys
  - am2
  - am3
  - simple_physics
  - null(?)
- ocean
  - mom4p0
  - mom4p0_static_om3 (plus other static configs)
  - him
  - him static configs
  - mom4p1
  - mixed layer
  - amip
  - null
- land
- ice
- infrastructure
  - fms with libMPI
  - fms with libSMA
  - fms with nocomm
In addition, we might choose to package items at a higher level of
granularity: e.g groups of atmospheric column physics, or ocean
bio-geochemistry packages as solo components. This would require each
to have at least some solo test configurations. Perhaps one useful
functional definition of a "supported" model component in this setup
would be the existence of a test program for running it.
It also occurs to me that we could package stuff up at a lower level
of granularity: complete coupled models. Currently one way to retrieve
a model configuration that is known to pass the RTS is to retrieve the
RTS itself, using cvs co -r nalanda rts . The executables that are
supported under this scheme could also be delivered in the same way as
libraries for the components.
A package under the proposed design is a component of a recognized model
configuration that is...
Proposal:
Parallel to /home/fms/cvs is /home/fms/components .
- Under which is a long list of components.
- under which is a directory for each release ((city) and also
  (city)_yyyy_mm ). The component and release axes are orthogonal: we
  choose to put component outside because the new approach has
  version as an attribute of <codeBase> .
- under which we have directories src (checked-out source), lib
  (library), include (headers and modfiles), data (input files), xml
  (fre), exec (for application-level components).
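For one hypothetical component (mom4p0) and one release (nalanda),
the tree would look like:
/home/fms/components/
`-- mom4p0/
    `-- nalanda/
        |-- src/
        |-- lib/
        |-- include/
        |-- data/
        |-- xml/
        `-- exec/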
Start with some standard ones and then keep extending? too much work
for liaisons?
Thu, 15 Mar 2007
emacs: muse issues
- I discovered a problem with the RDF files produced by muse-journal
  when there are SGML tags in the early text of the entry (which gets
  stuck in the RDF <description> ). Fixed it by customizing
  muse-journal-rdf-entry-template ...
- still having a problem figuring out how to get complete directory
  trees from the muse directory published... tried the method in the
  example muse-init.el (with the weird-looking ,@(...) lisp
  expressions...) which didn't work, nor does :include yet... For the
  moment I am going to create separate projects for each directory.
Thu, 15 Mar 2007
LaTeX: latex2html is still quite broken
- \usepackage{html} puts AucTeX into PDFLatex mode... quite annoying!
- latex2html picks up latexrc.tex if it's in the local directory...
  appears to ignore \$TEXINPUTS ... is that normal?
- when latexrc is loaded the document is completely haywire...
- I need to fix this to update the FMS Manual!!!
Tue, 13 Mar 2007
FRE: "componentizing" the checkout and compilation
FRE is now set to work with a new version of fremake that can link to
existing libraries and headers, and skip checkout and compilation of
the components you wish to use as a "black box".
Sun, 11 Mar 2007
web: using CSS to create cobweb and www versions of the same page
We have many pages (the FMS and FRE pages are a good example...)
where some information is to be made public (www ) and some is to be
GFDL-internal (cobweb ). Here is a way to use CSS
to achieve this. (CSS stylesheets are the standard way to control how
HTML is actually rendered on screen).
Add this to your HTML header:
<style type="text/css">
.cobweb { display:none; } /* for pages shared between cobweb and www */
</style>
This says, in CSS, that any item of class cobweb is not to be
displayed.
Then, you write your webpages with the GFDL-internal information
included, but enclose the information you don't want displayed on the
external web in:
<span class="cobweb">
GFDL-internal information ...
</span>
<span class="cobweb">
Create your webpage in /home/vb/external_html and use symbolic links
to list the file also in /home/vb/internal_html . You will see the
pages rendered differently in a browser when you invoke the cobweb and
www URLs of the same file: the GFDL-internal information will be
invisible in the www page.
As an exercise in CSS, see if you can figure out how I disabled the
standard drop-down menus and the font-resizer macro in the top right
using CSS...
</span>
Fri, 9 Mar 2007
tlemcen: kscd autoplay
Many "modern" desktop environments seem to take a page out of the
Gates playbook and try to guess what you want under most
circumstances. When it's right you usually don't notice, but when it's
wrong it can be quite a problem.
In this particular instance, the issue is that when you insert an
audio CD, KDE automatically launches kscd , the CD player. You might
not want that; and besides, on my current installation on tlemcen ,
kscd is not connecting to audio (it plays, but silently).
After some frustrating attempts to discover where in the KDE config it
says to autoplay CDs using kscd , I gave up and simply disabled it. The
first attempt,
apt-get remove kscd
can't be recommended, as it also wants to remove kubuntu-desktop .
I settled on
mv /usr/bin/kscd /usr/bin/kscd.DISABLED
which causes the KDE daemon to pop up an error message, but no matter.
Further research on the kubuntu forums shows that the file in question
is ~/.kde/share/config/medianotifierrc , which in turn says that for
audio CDs, start up
/usr/share/apps/konqueror/servicemenus/audiocd_play.desktop , which, at
the bottom, says Exec=kscd .
So, take your pick, kill kscd , delete the audiocd line from
medianotifierrc , or point audiocd_play.desktop to something other than kscd .
Mon, 5 Mar 2007
netCDF: padding the file header
A problem often encountered when making changes to netCDF files is
that, by default, the header is written with exactly the length
required to hold the metadata defined when the file was first
created. Any subsequent attempt to change the header information using
NF_REDEF (which is used for example by ncatted) involves mass data
motion, as the library moves the entire body of actual data down in
order to make a few bytes more space in the header portion.
One way to get around this problem is by using the two-underscore
version of the header completion routine NF__ENDDEF. This version
has extra arguments to create padding after the header (H_MINFREE ) and
after the static data (V_MINFREE ). To quote the NF__ENDDEF user guide:
The minfree parameters allow one to control costs of future calls to
nc_redef , nc_enddef by requesting that minfree bytes be available at
the end of the section.
Here's how the call looks in Fortran:
INTEGER FUNCTION NF__ENDDEF(INTEGER NCID, INTEGER H_MINFREE, INTEGER V_ALIGN,
INTEGER V_MINFREE, INTEGER R_ALIGN)
NCID
NetCDF ID, from a previous call to NF_OPEN or NF_CREATE.
H_MINFREE
Sets the pad at the end of the "header" section.
V_ALIGN
Controls the alignment of the beginning of the data section for
fixed size variables.
V_MINFREE
Sets the pad at the end of the data section for fixed size variables.
R_ALIGN
Controls the alignment of the beginning of the data section for
variables which have an unlimited dimension (record variables).
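A minimal Fortran sketch, assuming we want 16 kB of padding after the
header, with the remaining arguments at what I believe are their
defaults; the error handling is schematic:
      PROGRAM PAD
      INCLUDE 'netcdf.inc'
      INTEGER STATUS, NCID
      STATUS = NF_CREATE('padded.nc', NF_CLOBBER, NCID)
C     ... define dimensions, variables and attributes here ...
C     end define mode, reserving 16384 bytes of header padding
      STATUS = NF__ENDDEF(NCID, 16384, 4, 0, 4)
      IF (STATUS .NE. NF_NOERR) PRINT *, NF_STRERROR(STATUS)
      STATUS = NF_CLOSE(NCID)
      END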
Implications in FMS? We can pad datasets as they are created, by
using the right flavour of NF__ENDDEF in mpp_io . This is controlled
right now by an obscure variable called header_buffer_val , which has
to be set non-zero to turn on this feature.
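In FRE terms that would be something like (the namelist name and value
are from memory, so verify against mpp_io ):
<namelist name="mpp_io_nml">
 header_buffer_val = 16384
</namelist>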
As the ncatted manpage shows, if you have an existing file without the
padding, you can add it using the --hdr_pad argument. This argument
also exists in ncks and ncrename. I interpret the documentation to say
that any subsequent processing of the file will preserve the
padding. If you use it all up, and still continue to add stuff to the
header, you'll just fall back to the old slow behaviour of moving all
the data down in the file.
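For instance (filenames illustrative):
ncks --hdr_pad 10000 in.nc out.nc
rewrites in.nc as out.nc with 10 kB of free space reserved in the
header.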
Once we fix this in mpp_io there is still an issue of code that
doesn't pass through mpp_io . I've seen at least one reference to
NF_ENDDEF with one underscore (which means no padding) in the drifters
package.
Sat, 3 Mar 2007
parallel computing: a new distributed OS?
Limbo programming language and the
Inferno OS
Sat, 3 Mar 2007
emacs: setting muse project headers on a per-project basis
How do I set muse-html-header on a per-file or per-project basis?
This doesn't work...
(add-hook 'muse-before-publish-hook
'(lambda ()
;; (setq muse-html-header
(message
(concat (file-name-directory (buffer-file-name (current-buffer))) "header.html"))
;; (setq muse-html-footer
(message
(concat (file-name-directory (buffer-file-name (current-buffer))) "footer.html"))
(setq muse-html-table-attributes "class=\"muse-table\"")))
Instead I have
(setq muse-html-header "header.html")
(setq muse-html-footer "footer.html")
which seems to create confusion if I'm editing multiple projects
simultaneously...
It also appears I need to set
(setq muse-journal-rdf-base-url "http://cobweb.gfdl.noaa.gov/~vb/weblogs/")
... which it appears you can also set in the project-alist as :header ...
but how to make this come out different per-project?
So, did I finally figure out how to set the project-alist ?
(setq muse-project-alist
'(("gfdlweb" ; GFDL public web
("~/muse/gfdlweb" :default "index")
(:base "html":path "~/external_html"
:header "~/muse/gfdlweb/header.html"
:footer "~/muse/gfdlweb/footer.html"))
;; (:base "pdf" :path "~/external_html/pdf"))
("weblog" ; weblog on cobweb
("~/muse/cobweb/weblogs" :default "journal")
(:base "journal-html" :path "~/internal_html/weblogs"
:header "~/muse/cobweb/weblogs/header.html")
:footer "~/muse/cobweb/weblogs/footer.html")
(:base "journal-rdf" :path "~/internal_html/weblogs"
:base-url "http://cobweb.gfdl.noaa.gov/~vb/weblogs/"))
("cobweb" ; GFDL internal web
("~/muse/cobweb" :default "index")
(:base "html" :path "~/internal_html"
:exclude "weblogs"
:header "~/muse/cobweb/header.html"
:footer "~/muse/cobweb/footer.html"))
;; (:base "pdf" :path "~/internal_html/pdf"))
("web" ; Princeton web
("~/muse/web" :default "index")
(:base "html" :path "~/public_html"))))
;; (:base "pdf" :path "~/public_html/pdf"))))
Seems to work!
Sat, 3 Mar 2007
emacs: htmlize
The muse documentation says that using htmlize we are able to process
<src lang="foo"> but it doesn't seem to work, perhaps
because we have an older version.
To invoke htmlize you seem to need
(add-to-list 'load-path
"/usr/share/emacs/site-lisp/emacs-wiki/contrib")
(require 'htmlize)
htmlize produces nice-looking output, but by default it's a complete
HTML file, with header, style info, and body. Need to figure out how
to embed it within muse .
Tue, 27 Feb 2007
emacs: muse web doc doesn't quite match my version
I now have a working setup and some homepages on cobweb and gfdl.
Still need to find out if the princeton web will accept php...
other minor tweaks are necessary.
One issue is that muse-el from ubuntu dapper is not the latest on the
muse-mode website... however the version found there does not install
cleanly on dapper , and besides, does not validate my muse.
What about magpierss ? I've fixed header.html so it points to
../magpierss instead...
Mon, 26 Feb 2007
emacs: muse musings
Ok, so I seem to have a working muse setup, I now have
- directory
gfdlweb for publishing to the GFDL web in
the external_html directory
- directory
cobweb for publishing to the
GFDL internal web in the internal_html directory
- directory
web for publishing to the Princeton web in
the public_html directory
An additional directory cobweb/weblogs is published using journal-html
and journal-rdf styles (journal-rss appears to have a bug). This is
encoded in .emacs.tlemcen as follows:
(setq muse-project-alist
'(("gfdlweb" ; GFDL public web
("~/muse/gfdlweb" :default "index")
(:base "html" :path "~/external_html"))
;; (:base "pdf" :path "~/external_html/pdf"))
("weblog" ; weblog on cobweb
("~/muse/cobweb/weblogs" :default "journal")
(:base "journal-html" :path "~/internal_html/weblogs")
(:base "journal-rdf" :path "~/internal_html/weblogs"))
("cobweb" ; GFDL internal web
("~/muse/cobweb" :default "index")
(:base "html" :path "~/internal_html" :exclude "weblogs"))
;; (:base "pdf" :path "~/internal_html/pdf"))
("web" ; Princeton web
("~/muse/web" :default "index")
(:base "html" :path "~/public_html"))))
;; (:base "pdf" :path "~/public_html/pdf"))))
Order appears important: weblog must precede cobweb , above.
Thu, 22 Feb 2007
FRE: Changes to FRE
Amy and I are proposing some <a href="{url}071114.html">changes
to FRE</a> in response to some of the most requested features:
namely, avoiding compiling the model components where you do not
expect to modify the source, and second, the ability to compile
multiple experiments from the same source.
The first of these involves certain changes to FRE syntax, and also,
unfortunately, certain changes to the repository, both of which are
explained in this entry. A decision has to be made as to whether to do
these now.
The major change to FRE syntax involves a reordering of the XML node
tree, so that <component> is now a high-level tag. All
the operations are now organized by component (as they are already for
post-processing). The key advantage of the current proposal is that
now checkout and compile instructions are organized by component as
well: this means component developers testing within a coupled model
configuration need only checkout and compile the component they are
interested in, and link to pre-compiled libraries for the other
components.
<component name="fms" paths="shared">
<compile>
<cppDefs>-DSPMD -Duse_libMPI -Duse_netCDF -Duse_shared_pointers -Duse_SGI_GSM</cppDefs>
</compile>
</component>
<component name="atmos_phys" paths="atmos_param" requires="fms">
<compile>
<cppDefs></cppDefs>
</compile>
</component>
<component name="atmos_dyn" paths="atmos_coupled atmos_fv_dynamics
atmos_shared" requires="fms atmos_phys">
<compile>
<cppDefs>-DSPMD -Duse_shared_pointers -Duse_SGI_GSM</cppDefs>
</compile>
</component>
<component name="ice" paths="ice_amip ice_param" requires="fms">
<compile>
<cppDefs></cppDefs>
</compile>
</component>
<component name="land" paths="land_lad land_param" requires="fms">
<compile>
<cppDefs>-DLAND_BND_TRACERS</cppDefs>
</compile>
</component>
<component name="ocean" paths="ocean_amip" requires="fms">
<compile>
<cppDefs></cppDefs>
</compile>
</component>
<component name="coupler" paths="coupler" requires="ocean land ice atmos_dyn fms">
<compile>
<cppDefs>-DLAND_BND_TRACERS</cppDefs>
</compile>
</component>
Each <component> now has its own
<cppDefs> , as well as <mkmfTemplate> ,
etc if desired.
Please note the following advantages:
- CPP macros are only applied to the component where they are relevant;
- For debugging one component, you could potentially compile all
other components at optimization and only this one with
-O0 -g .
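For instance, a hedged sketch (the template path is hypothetical, and
the template itself would carry -O0 -g in its compiler flags): to
debug only the atmospheric physics, give that component its own
template and leave the rest on the optimized default:
<component name="atmos_phys" paths="atmos_param" requires="fms">
<compile>
<cppDefs></cppDefs>
<mkmfTemplate>/home/\$USER/mkmf.template.debug</mkmfTemplate>
</compile>
</component>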
An even more useful feature is that the compilation of components can
be skipped entirely, by pointing to an existing component library.
Just as the <executable> tag allows one now to skip
compilation by invoking an existing executable, the
<library> tag will allow one to skip the compilation of
an existing component. For instance
<component name="ocean" paths="ocean_mom4" requires="fms">
<library path="/home/fms/lib/nalanda/libmom4.a" include="/home/fms/lib/nalanda/include/mom4"/>
<compile>
<cppDefs></cppDefs>
</compile>
</component>
will entirely skip compiling the ocean component, but instead invoke
the library /home/fms/lib/nalanda/libmom4.a at the linker
stage. The ability to perform selective compilation, skipping
especially the shared code, but many other components as
well, is probably the most desired feature in FRE. (Along with
multiple compiles from a single source, on which more later...)
One complication that arises is the include attribute of the
library. This is an attribute that is required for components
coded in F90/F95. F90 compilers store module information in a
.mod file (and I haven't ever figured out why compiler writers
can't just bundle this information into the .o file). The
.mod files will be required in order to process use
statements in higher-level modules.
The include directory can be correctly processed using a
-I flag, but the current setup does not apply this flag to
.f or .f90 files.
One possibility we have often considered is to rename all files in our
repository to .F90 . This can be done without losing the CVS
history of the file, but if you have an existing checked out
.f90 file and you attempt to update it, the update will
likely fail. This is true even for pre-existing tags.
So, question for liaisons: how mad would people around the lab get, if
along with the nalanda release, we did mass file renaming?
A second, unrelated change to the repository arises from an error in
structure noted in the course of the FRE rewrite. There are modules in
the coupler directory called atmos_ocean_fluxes and
coupler_types . These are =use=d by the component
models, ocean and so on. However the coupler is part of the
superstructure and is supposed to sit above the models in the
component hierarchy, and thus get compiled later. I'd like to
propose moving these two modules to a new shared/coupler
directory.
Tue, 6 Feb 2007
grids: Gridspec status
The files in
/archive/z1l/test_xgrid/tripolar1DXregular2.5Dx2D contain
examples of a mosaic consisting of a tripolar grid, a cube-sphere grid
mosaic (and the exchange grid between them?).
Sun, 14 Jan 2007
FRE: notes on the FRE rewrite
The redesign of FRE involves both a refactoring and modularization of
the code, and an evolution of the XML syntax.
Code restructuring
A first cut at the restructured code is seen in
/home/vb/src/perl , with tools fresrc ,
fremake , etc using the module FRE.pm . Site defaults
are in the file /home/vb/src/perl/site/fre.cshrc , but this isn't
quite properly configured yet to accept overrides from a setup
tag in the FRE.
FRE.pm is object-oriented: each FRE XML file is unpacked into
a new object called a fre . In most instances (in fact all, so
far) the script using FRE.pm will only create one
fre . But in principle, one could imagine a tool using several
fre s: for example, to allow inheritance across FRE files.
Within a fre , each experiment is unpacked into a hash. The
key of the hash is the experiment name: the value
of the hash is the expt node. The experiment name is the base
key for all hashes: thus, even in a script spanning multiple
=fre=s, we require that experiment names be unique. (We could
relax this requirement if needed by constructing a unique key from the
concatenation of the fre name and the experiment name, i.e
the key of the expt hash.)
All the information below the experiment node is maintained in the
node that's returned as the value of the expt hash. The great
advantage of XML::LibXML is that there's no need to unpack
the XML hierarchy very deeply: you create unique keys at some high
level, and query for the rest.
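A hypothetical usage sketch (the constructor and accessor names are
illustrative, not the actual FRE.pm interface):
use lib '/home/vb/src/perl';
use FRE;
my \$fre  = FRE->new('mymodel.xml');        # one fre object per XML file
my \$expt = \$fre->experiment('somename');   # the expt node, keyed by name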
Changes to FRE file syntax
The principal elements of the new syntax involve one major
restructuring: everything under the experiment tag is now
configured by component . Most of the other changes are new
and more general synonyms for existing elements, e.g the cvs
node is now replaced by a checkout node, whose type
attribute can have values cvs , svn , etc.
The component tag now appears only around the
compile node. (It already appeared as an element inside the
postProcess node...) Each component can have its own version
of compile , with its own cppDefs ,
mkmfTemplate , srcList , and csh tags. The
output of compilation of one component is an object library (currently
static only, i.e .a not .so ). The final stage of
compilation does the linking of libraries as a separate step.
A complication arising from F90 header files (.mod files) is
that components can only be compiled in a certain order. In general,
child components must be compiled prior to the parents. The
infrastructure component is a "universal child" and the
superstructure is the top-level parent, to be compiled last.
Since FMS has a relatively flat structure, we only need to designate
the infrastructure and the superstructure, which we do using the
role attribute. In future, we'll supply a requires
attribute, which will specify dependencies. In fact I'll go do that
now...
Thu, 21 Dec 2006
ESMF: Iredell's proposal: ~/doc/ms/gfdl/pilot3a.doc for pilot proj
- does not mention FV cubed sphere?
- timeline for ocean model component of hurricane (III?) If timeline
is right, could GUOM be a contender alongside HYCOM?
- post-processor component or service?
Mon, 20 Nov 2006
Curator: Talks on Curator
The <a href="talks/curator.pdf">first talk</a> I can remember giving
on Curator is my first proposal that this would be a natural
development from ESMF, at the <a
href="http://www.esmf.ucar.edu/main_site/meeting_summaries/mtg_0305_commmtg.html">
2nd ESMF Community Meeting</a> in Princeton, May 2003.
<a href="talks/apan2004b.pdf">Talk</a> given at <a
href="http://apan.net/meetings/honolulu2004/">APAN eScience
Workshop</a>, January 2004: describes how Curator flows naturally from
current practice in climate modeling.
<a href="talks/esp2004.pdf">First presented</a> to <a
href="http://go-essp.gfdl.noaa.gov/">GO-ESSP</a> (then called ESP)
community, at <a
href="http://data1.gfdl.noaa.gov/~hap/go-essp/meetings/06_08_04/agenda_presentations.html">
ESP meeting</a> in Princeton, June 2004.
<a href="talks/curator_gridmeta2005.pdf">
First talk</a> post-award, at <a
href="http://www.esmf.ucar.edu/main_site/meeting_summaries/mtg_0507_commmtg.html">
ESMF Community Meeting</a> in Cambridge, MA, July 2005. Also
introduces the grid metadata.
<a href="talks/curator_gridmeta2005.pdf">
Similar talk</a> to <a
href="http://www.cisl.ucar.edu/dir/CAS2K5/index.html">CAS Workshop</a> in
Annecy, September 2005.
<a href="talks/curator-prism2005.pdf">
Introducing Curator to PRISM</a> at <a
href="http://www.prism.enes.org/news_meetings/meetings/CommunityMtg2005/minutes.php">
First PRISM Community Meeting</a>, Toulouse, November 2005. Also a <a
href="talks/prism2005.pdf">keynote address</a> at the same meeting
introduces the use cases.
Overview of <a href="talks/metadata.pdf">ESMF-ESC metadata
activities</a> given to PRISM <a
href="http://www.prism.enes.org/news_meetings/meetings/MetadataMtg_May2006/minutes.php">
metadata meeting</a> in Exeter, May 2006.
This talk to an NSF <a
href="http://www.sdsc.edu/PMaC/GeoScience_Workshop/"> workshop on
Petascale computing in the Geosciences</a> in San Diego, April 2006;
presents <a href="talks/petageo.pdf"> Curator as part of an integrated
approach to the petascale</a> looking at models, data, and multi-model
campaigns.
Talks at the <a
href="http://data1.gfdl.noaa.gov/~ck/go-essp/presentations/06_19_06/agenda_presentations.html">
GO-ESSP Meeting</a> in Livermore CA, June 2006, covered the <a
href="talks/esp2006.pdf"> outlook for IPCC AR5 and beyond</a>, as well
as a detailed look at the <a href="talks/gridmeta2006.pdf"> draft grid
metadata standard</a>.
More outreach beyond climate modeling: <a
href="talks/modest7c-balaji-esmf.pdf"> this talk</a> was solicited by
an astrophysics community applying frameworks to stellar dynamics
models, at the <a
href="http://www.manybody.org/modest/Workshops/modest-7c.html">
MODEST-7c</a> workshop in Philadelphia, September 2006. <a
href="talks/crrc-balaji.pdf"> Another talk</a> later that month to a
community interested in real-time response to coastal disasters, at
the <a href="http://www.crrc.unh.edu/fall_institute/"> CRRC Fall
Institute</a> in Durham NH, September 2006.
The <a
href="http://www.earthsystemcurator.org/index.php?option=com_content&task=view&id=30&Itemid=65">
first ESC Meeting</a> attained <a
href="http://hotitems.oar.noaa.gov/storyPrint_org.php?sid=3759">
some notoriety</a>.
Fri, 17 Feb 2006
parallel computing: FV runs Held-Suarez
Modifications to make Held-Suarez solo driver for FV work with the new
code. This is the experiment HSfvd in idealized.xml .
atmos_nudge.f90 which is now in
atmos_fv_dynamics/driver/coupled needs a new home, so that
driver/solo can also use it.
The version in atmos/shared seems to be "dead"... it's not in
any module and I think Bruce has killed it. Perhaps the version in
atmos_fv_dynamics/driver/coupled could replace it?
Otherwise it could go in atmos_fv_dynamics/tools ?
shared/data_override and shared/time_interp need to
be included in the CVS module fms_fv_dynamics_solo .
driver/solo/atmosphere.f90 modified and renamed to
atmosphere.F90
driver/solo/fv_phys.F90 modified
fv_pack modified to publish two more variables needed by solo
driver.
atmosphere.F90 , fv_phys and fv_pack are now
tagged lima_20060217_vb .
Thu, 9 Feb 2006
FMS: FV core gets testing tag
Ok, it's going into testing ... fingers crossed.
mpp_pset.F90 did not work on Origin: it's now been fixed. (And it even
works!) Changes:
- the use_SGI_GSM flag now can be turned on for Origin.
- if use_SGI_GSM is on, mpp_pset_init asserts that
  SMA_GLOBAL_ALLOC must be on.
- mpp_translate_remote_ptr does nothing on Irix...
  SMA_GLOBAL_ALLOC means no translation required.
There was one place where a Cray pointer was being passed to an
integer(POINTER_KIND) , which MIPSpro doesn't like.
Now we have
#ifdef sgi_mipspro
real :: dummy
pointer( ptr, dummy )
#else
integer(POINTER_KIND), intent(in) :: ptr
#endif
instead.
Also may need to correct fv_physics and
atmosphere.f90 ... looking into it.
Thu, 9 Feb 2006
SciDAC proposal notes
SciDAC proposal notes:
At the current time (2005-2015) the principal mode of advancement in
climate modeling is by the study of a process across many
models: multi-model ensembles, where we achieve many independent
realizations of a simulation to construct a PDF. This involves very
large datasets (Tb-Pb) created at sites distributed around the world,
which must be analyzed on a common footing:
- Petabyte-scale storage by itself is achievable, but delivery over
the network is a problem. Need a fresh look at compression and
analysis of large data stored on a network of distributed servers.
- since storage isn't an issue we can look at both lossy and lossless
compression; the original data is still archived.
- simple compression techniques that aren't aware of the content of
the bits can be improved upon: 1) where FP numbers can be
identified, specialized compression can be applied (dynamic range of
exponent bits much smaller than mantissa bits) 2) knowing the file
contents to be gridded physical fields, multi-grid (or PCA or
wavelets) or other methods can be applied to store the dataset as an
overlay of several files whose size scales inversely with
wavenumber. "Domain-aware compression".
- AMR and nested models: explore the extension of techniques to
complex grid mosaics.
- techniques for manipulation of remote data, expressing and applying
sophisticated computation server-side;
Prerequisites:
- standards for describing complex grid mosaics, development of
regridding algorithms on mosaics.
- federation of climate data archives around the world.
Some work underway in other funded projects on the prerequisites, but
that work isn't complete. I'm not sure whether to provide these as
linked efforts or base work on this grant.
Wed, 1 Feb 2006
FMS: FV core ready for testing?
resuming the FV weblog entry... will try and patch the HTML over to
the wiki.
The new FV core is ready to be introduced into the testing
code stream. All required changes are within the directory
atmos_fv_dynamics , plus a new file in the shared/mpp
directory, mpp_pset.F90 . PSET stands for Persistent
Shared-memory Execution Thread and is the implementation of
shared-memory on Altix and Origin (so far). I'll be giving a lunchtime
seminar on PSETs in about a month, if you're interested in how it's
done.
All files are tagged lima_20060131_vb .
Instructions for moving the testing tag:
In shared/mpp , checkout mpp_pset.F90 and apply
testing tag.
In atmos_fv_dynamics many files will not be in the release.
These files will disappear:
model/benergy.f90
model/cd_core.F90
model/d2a3d.F90
model/drymadj.f90
model/geo_map.F90
model/geop_d.F90
model/geopk.f90
model/mapz_module.f90
model/p_var.f90
model/pkez.f90
model/polavg.f90
model/te_map.F90
tools/read_fv_rst.F90
tools/upper.F90
tools/write_fv_rst.F90
These files will remain:
driver/coupled/atmos_nudge.f90
model/dyn_core.F90
model/ecmfft.f90
model/fill_module.f90
model/fv_dynamics.F90
model/fv_pack.F90
model/pft_module.F90
model/shr_kind_mod.f90
model/sw_core.F90
model/tracer_2d.F90
model/update_fv_phys.F90
tools/age_of_air.F90
tools/fv_diagnostics.F90
tools/getmax.F90
tools/gmean.F90
tools/init_dry_atm.F90
tools/init_sw_ic.F90
tools/mod_comm.F90
tools/par_vecsum.F90
tools/pmaxmin.F90
tools/pv_module.F90
tools/set_eta.f90
tools/timingModule.F90
These files are new:
model/fv_arrays.F90
model/fv_arrays.h
model/fv_point.inc
model/mapz_module.F90
tools/fv_restart.F90
These files need to be renamed from .f90 to .F90 :
driver/coupled/atmosphere.f90
driver/coupled/fv_physics.f90
model/tp_core.f90
Note that model/mapz_module was already changed from
.f90 to .F90 between lima and memphis, in the
code I inherited...
The easiest way to get the testing tag on the right files, I think is
this:
cd atmos_fv_dynamics
cvs tag -d testing
cvs update -r lima_20060131_vb
cvs tag testing
... but there will still be the issue of the files whose names need to
be changed.
Compiling and running:
The code requires one set of flags for reproducing Lima answers, and
another set for the new code, which will become the standard version
shortly, after the usual stringent tests, climate runs and so on. I
am going to call the new version the Memphis version in this document,
even though it isn't officially sanctioned yet.
Running the lima version:
To run the Lima version, use the following flags in the
cppDefs XML tag:
-DSPMD -Duse_libMPI -Duse_netCDF -DUSE_LIMA
This is supposed to reproduce answers against any current run using
FV, but I've only tested it for m45_am2p13 , and only on the
Altix.
You need to set fv_core_nml as follows:
<namelist name="fv_core_nml">
nlon=144, mlat=90, nlev=24, ncnst=4,
consv_te = 0.7, layout=1,\$npes
</namelist>
(If you're running a concurrent coupled model, the value of
layout(2) of course is no longer \$npes but whatever
atmos_npes is...)
Running the memphis version:
Your running the new version is not a requirement for Memphis
testing, as we don't have official reference runs yet. However, if
you are curious about the shared-memory stuff, here's how to use it.
To run the new version on Altix, use:
-DSPMD -Duse_libMPI -Duse_netCDF -Duse_shared_pointers -Duse_SGI_GSM
The numbers specified in the layout argument of fv_core_nml
specify the PSET count and the MPI count. Typically, I set the MPI
count to 15 or 30, and let it pick the PSET count from \$npes.
For example
<namelist name="fv_core_nml">
nlon=144, mlat=90, nlev=24, ncnst=4,
consv_te = 0.7, layout=0,15
</namelist>
can be run on 1,2,3 or 6 threads, using 15, 30, 45 or 90 PEs.
The reason for choosing 2, 3 or 6 threads but not 4 or 5 is that
thread-parallelism is mostly applied in the k or j
direction within the FV core, and along i in the AM physics.
So I chose numbers that divide nlon , mlat/15 , and
nlev exactly (here 144, 6 and 24, whose common divisors are 1, 2, 3 and 6).
The size of the physics window is set in atmosphere_nml . The
code currently requires that the physics window divide the 2D domain
decomposition exactly. For instance, in the example above the 6x15
distribution yields a 2D domain decomposition for physics that's 24x6.
It's best to pick one that divides 24x6... that will also work for 3
threads (48x6) or 2 threads (72x6). For instance:
<namelist name="atmosphere_nml">
physics_window = 24,1
</namelist>
Setting physics_window to (0,0) yields a window that
fills the whole domain, so that the window loop only loops once;
that's been set as the default.
Here is a reference run for m45_am2p13 using the memphis
version... as I said, this isn't officially sanctioned yet. Answers
match on any thread count, of course only if -fltconsistency
is used.
<reference restart="/archive/vb/fms/lima_vb/rts/ia64/
m45_am2p13_shpbase/1x0m8d_30pe/restart/19820109.cpio"/>
Tue, 8 Nov 2005
FMS: FV core mod_comm changes
Transformation of mod_comm to 2D: it's now written so that mp_init
alone is called by the ypelist (one PE per latitude band). Every other
routine can be called by the whole pelist, but PEs not in ypelist will
exit the routine immediately. How?
- added variable no_mod_comm, default TRUE, set FALSE at the top of
mp_init. Every routine other than mp_init has as its first line
if( no_mod_comm )return
mp_init is now called from fv_arrays_init, not fv_init:
!initialize mod_comm on the ypelist
call mpp_declare_pelist(ypelist)
if( allocator )then
call mpp_set_current_pelist(ypelist, commID=commID)
call mp_init( nx, ny, nz, commID )
end if
call mpp_set_current_pelist(pelist)
fv_arrays_allocate is eliminated, this is now done by fv_arrays_init.
Public variables of mod_comm that must be set correctly outside
mod_comm: gid, numpro, numcpu, yfirst, ylast. The last 4 in fv_init,
yfirst/last are for prints, numpro/numcpu are for n_zonal.
Moved n_zonal into fv_arrays_init: no external dependence
numpro/numcpu
Eliminated yfirst/ylast prints from fv_init, but they still need to be
initialized internally, so y_decomp is still called.
gid is now == mpp_pe, does anyone require it to be 0 on master?
numpro/numcpu now silenced. (there were unused references to them in
fv_dynamics, eliminated)
Need to add layout to fv_core_nml
In this version all mp_* calls only work on shared arrays?
added two new shared arrays: penorth in fv_dynamics, cymax in tracer_2d
Possible problems routines: par_vecsum.F90 calls mp_sum1d, replace
with mpp_sum?: callers gmean, mapz_module
fv_restart calls mp_bcst_* on non-shared scalars/arrays: replace with
mpp_broadcast?
getmax calls mp_reduce_max, replace with mpp_max? called by timingModule.F90
Tue, 25 Oct 2005
FMS: FV core, minor changes
Some name changes and reorganization are necessary in fv_arrays_mod :
fv_arrays_allocate (plural) to be merged into fv_arrays_init
fv_array_allocate to generate address without communication (since
fv_stack is already a shared stack). Instead, call fv_array_check on
it so that you get a noop when #ifndef debug_shared_pointers
Remove the len argument to fv_array_allocate , instead make that a
module global, and add a new routine fv_stack_reset which is called
once per timestep, from fv_dynamics .
Thu, 20 Oct 2005
FMS: FV core, pointer shortcomings
It turns out that the use-associated pointer cannot directly be
applied to a Cray pointee. The test code shown here will fail to
inherit the pointer... however when you define the pointer pp
and assign the value p to it (see commented lines), it works:
module test_p
implicit none
private
integer(8), public :: p
public :: make_a
contains
subroutine make_a
real, allocatable, save :: a(:)
allocate( a(100) )
call random_number(a)
p = loc(a)
end subroutine make_a
end module test_p
program test
use test_p, only: p, make_a
real :: b(100)
pointer(p,b)
! pointer(pp,b)
! pp = p
call make_a
print *, b
end program test
Wed, 19 Oct 2005
FMS: FV core works
Ok, now the lima_vb code without the sharedptr flag exactly matches
the shpbase code:
ic1 9:35pm> dmget
/archive/vb/fms/lima_vb/rts/ia64/m45_am2p13_shpbase/1x0m8d_15pe/restart/19820109.cpio
/archive/vb/fms/lima_vb/rts/ia64//m45_am2p13_lima_vb/1x0m8d_15pe4/restart/19820109.cpio
ic1 10:01pm> ls -1
/archive/vb/fms/lima_vb/rts/ia64/m45_am2p13_shpbase/1x0m8d_15pe/restart/19820109.cpio
/archive/vb/fms/lima_vb/rts/ia64//m45_am2p13_lima_vb/1x0m8d_15pe4/restart/19820109.cpio | resdiff
193408 blocks
193408 blocks
/// /archive/vb/fms/lima_vb/rts/ia64//m45_am2p13_lima_vb/1x0m8d_15pe4/restart/19820109.cpio
\\\\\\ /archive/vb/fms/lima_vb/rts/ia64/m45_am2p13_shpbase/1x0m8d_15pe/restart/19820109.cpio
Comparing atmos_coupled.res.nc...
Comparing coupler.res...
Comparing fv_rst.res...
Comparing fv_srf_wnd.res...
Comparing ice_model.res.nc...
Comparing mg_drag.res.nc...
Comparing ocean_model.res.nc...
Comparing physics_driver.res.nc...
Comparing radiation_driver.res.nc...
Comparing radiative_gases.res.nc...
Comparing soil.res.nc...
Comparing strat_cloud.res.nc...
Comparing vegetation.res.nc...
Note that this was produced with the 1x0m8d_15pe4 script.
The 1x0m8d_15pe5 script is now testing it with the shared pointers
turned on, but a single thread.
Executable is in ~/fms/lima_vb/rts/ia64/m45_am2p13_lima_vb/shp
compiled with
fvmk -DSPMD -Duse_libMPI -Duse_netCDF -Duse_shared_pointers -Ddebug_shared_pointers
Tue, 18 Oct 2005
FMS: FV core, fv_domain closed
Working now after Gerardo's help and a few other minor bugfixes
Next is to correct Will's read_fv_rst references to fv_domain .
Close off fv_domain .
Tue, 11 Oct 2005
FMS: FV core, debugger problem
trouble debugging in totalview... no symbol table seems to be created
for read_fv_rst , which is not in a module.
Cannot simply make it into a module, because then there is use-
circularity between fv_pack and this module.
Better is to create a new module fv_restart_mod , containing
read_fv_restart , write_fv_restart , fv_restart .
This module should be used/called from atmosphere_init/end ,
not fv_init/end . (right after fv_init and right
before fv_end )
Sun, 9 Oct 2005
FMS: FV core, almost there
Ok the bulk of the code changes look complete.
Next, pause and take stock; see if M45 runs ok on 1x15 and 1x30.
All experiments are in the table below: each is run 8d at 1x15 and
1x30, and 4x2d at 1x15.
<table summary="Regression test table for FV SHP validation" border>
<caption>
Experiments to validate shared pointers based on the
m45_am2p13 RTS. Hover on the column header for additional
info. Hover on the experiment to get the archive directory (that you
can pass to frecheck -c , for instance).
</caption>
<tr>
<th>Name</th>
<th title="relative to /home/vb/fms: XML file is in .../rts/am2.xml">Root</th>
<th title="applied as an update relative to lima">Tag</th>
<th title="also -Duse_libMPI -Duse_netCDF on all">CPP flags</th>
<th>Comments</th>
<th>Status</th>
<tr>
<td title="/archive/vb/fms/lima/rts/ia64">lima</td>
<td class="code">lima</td>
<td>lima</td>
<td class="code">-DSPMD</td>
<td>Baseline from lima</td>
<td title="passes RTS">ok</td>
<tr>
<td title="not used for frecheck">lima_vb</td>
<td class="code">lima_vb</td>
<td>lima_vb (branch tag: <b>unstable!</b>)</td>
<td>various</td>
<td>branch code used for quick testing</td>
<td>ok</td>
<tr>
<td title="/archive/vb/fms/testing/rts">testing</td>
<td class="code">testing</td>
<td>testing</td>
<td class="code">-DSPMD</td>
<td>Baseline from testing</td>
<td title="passes RTS; matches lima">ok</td>
<tr>
<td title="/archive/vb/fms/lima_vb/rts/ia64">shpbase_lima</td>
<td class="code">lima_vb</td>
<td>lima_shpbase_vb</td>
<td class="code">-DSPMD -DUSE_LIMA</td>
<td>Baseline merged code from kkg, matching lima</td>
<td title="passes RTS; matches lima">ok</td>
<tr>
<td title="/archive/vb/fms/lima_vb/rts/ia64">shpbase</td>
<td class="code">lima_vb</td>
<td>lima_shpbase_vb</td>
<td class="code">-DSPMD</td>
<td>Baseline merged code from kkg, matching lima_sjl</td>
<td title="passes RTS">ok</td>
<tr>
<td>shpdevel</td>
<td class="code">lima_vb</td>
<td>lima_shpbase_vb</td>
<td class="code">-DSPMD -Duse_shared_pointers -Ddebug_shared_pointers</td>
<td>Baseline from testing</td>
<td>nil</td>
</table>
SPMD: we could leave it in place and call the mp_ routines with the pe
subset? If they all call, it's going to be a problem in some places.
The 4D array transfers have an OMP loop that could become thread-parallel.
Sat, 8 Oct 2005
FMS: FV code flags
The code as it stands compiles with a variety of switches: I've tried
-DSPMD ("original")
-DSPMD -Duse_shared_pointers
" " ("new code compiles without shared pointers")
-Duse_shared_pointers (" the target: replace SPMD calls")
Add the following settings to your .cshrc :
# main fv source directory
set fvsrc = ~/fms/lima_vb/rts/ia64/m45_am2p13_lima_vb/src
set fvdir = \$fvsrc/atmos_fv_dynamics/model
# exec directory without use_shared_pointers
set fvnoshpx = ~/fms/lima_vb/shp/no_shared_pointers
# exec directory with use_shared_pointers
set fvshpx = ~/fms/lima_vb/shp/use_shared_pointers
# pathnames file
set fvpaths = ~/fms/lima_vb/shp/path_names
# alias for making, pass CPPDEFS in args
alias fvmk 'mkmf -t ~/chepauk/mkmf.template.chepauk -c"\!*" \
-a \$fvsrc \$fvpaths /usr/include/netcdf shared/mpp/include'
Then you can compile any of the above by using fvmk and
make :
fvmk -DSPMD -Duse_shared_pointers
make
Some routines are yet to be parallelized; the original list is:
List of 39 OMP-parallel routines: init_dry_atm pmaxmin
geopk get_height_given_pressure get_pressure_given_height
pv_entropy pkez tracer_mass fv_init Ga_Get4d_i4 hydro_eq
geop_d Ga_Put4d_i4 Ga_Put4d_r4 te_map benergy fv_dynamics
drymadj add_tracers fv_diag mp_reduce_max update_fv_phys
omp_start get_bottom_mass p_var Ga_Get4d_r8 vort_d cd_core
read_fv_rst geo_map Ga_Put4d_r8 fv_restart p_energy
age_of_air maxmin_global Ga_Get4d_r4 tracer_2d_lima
compute_vdot_gradp d2a3d pmaxming
Next, pmaxmin.F90 : pmaxming is easy (copying a halo array into a naked
array).
pmaxmin dimensions things (im,jt) where jt is JxK . We need to
calculate the division... add a routine fv_array_limits to fv_arrays .
Still need to do the mpi_reduce ... mpp_max , min .
prt_maxmin_local can be parallelized also, but isn't in the original.
pv_module.F90 : OK
te_map has the =ixj=1,jp= loop... use fv_array_limits to calculate the
loop limits on these... this needs n_zonal from fv_pack to be
calculated.
Modified fv_array and fv_pack
cd_core :
update_fv_phys : updated argument lists to remove u_dt /etc... isn't it
better to have them still in the arglist, and mpp_malloc them in
update_phys_up ?
mapz_module.F90 completed: te and dz are
actually reuses of the shared arrays ua and va . Also
ps_bp is directly use-associated in the fv_arrays.h
method in geo_map and te_map . Both of these are only
called by fv_dynamics .
There is also an erroneous reference to the variable gid outside
#ifdef SPMD . Have initialized =gid=0 #ifndef SPMD= in
mapz_module :
fv_dynamics.F90 : now along with ps_bp , we also need
to use-associate the *_phys shared arrays.
Thu, 6 Oct 2005
FMS: FV details
mapz_module: stick with the argument list for now. But it's been shown
that specifying start and end indices in module procedures can force
copies... may need to change the way arguments are passed, or switch
to use association.
routine te_map :
Lots of SPMD message-passing to clean up.
Need to figure out what pem/tte0/hs are in the calling routine
fv_dynamics , required to be parallel.
the 2000 loop is odd... not parallelized. Everyone executes over whole
space...!
Too many switches between OMP k -loops and j -loops, could be cleaned
up.
2 j -loops are split by a call to par_vecsum at L687
k -loop at L751 is kmap,km ... I changed it to max(ksp,kmap),kep !!
should be OK, no?
Not parallelizing loop at L807
geo_map:
not parallelizing 2000 and 4000 loops
Similarly in routine pkez, L2506 loop is deferred parallelization, but
the comment there gives a hint as to how to fix it.
benergy stalled: te/dz were local arrays in the old (Lima) version,
but now are arguments. It also appears that fv_dynamics calls benergy
passing ua/va here... why?
Also ps_bp is one shared array that is use-associated while the rest
are args... why?
Wed, 5 Oct 2005
FMS: FV details (minor)
Just following the makefiles path won't do...
init_dry_atm: changed POLVANI read to mpp_open
did NOT duplicate changes to USE_LIMA version
fv_physics: still need to do windows logic properly, take from 1D2D
fv_diagnostics:
atmosphere.f90:
get_bottom_mass and wind, should use assumed shape arrays and offsets
ip,jp
these routines are called with unshared arguments, shouldn't be
parallelized
(they were called outside the original calling tree, that's why they
show up as orphaned)
read_fv_rst.F90: did not touch the I/O bit, Will has rewritten it,
merge
corrected routines read_fv_rst and add_tracers
write_fv_rst.F90: corrected how arrays are acquired, MERGE IO from Will.
maxmin_global is NOT parallelized, uses unshared arrays.
mapz_module.F90: complicated interactions with fv_dynamics, save for tomorrow.
Tue, 4 Oct 2005
FMS: FV core, added fv_array_check
Added a routine fv_array_check, which verifies if an array is a shared
array. The check is only performed #ifdef debug_shared_pointers. This
is needed because the check requires an mpp_sync... it should not
be used in production.
This call is added in the preamble, where we expect shared arrays to
be passed in through the argument list. (e.g d2a3d).
Current syntax requires Cray pointers (actually the LOC function).
This should perhaps be cleaned up later.
fv_pack.F90: finished (both with and without USE_LIMA)
use mpp_malloc for local (auto) shared arrays
use fv_array_check for shared arrays through the argument list
use #include "fv_arrays.h" for use-associated arrays
still need to replace SPMD
next in compilation sequence is update_fv_phys
update_fv_phys.F90:
add {t,q,u,v}_dt to fv_arrays.h and fv_arrays.F90
use the include method... argument list changed to eliminate _dt
shared arrays
What is du_s? There is a recv but no corresponding send
this routine uses beglon/endlon/beglat/endlat, which is redundant with
is/ie/js/je, probably should be replaced: scope for error/mismatch.
Or add error check in fv_init.
Note use of mpp_sync in this routine: this is because we need to sync
inbetween an OMP k-loop and a j-loop.
tp_core.f90 needs to become .F90! currently uses special command in
path_names file, which for some reason is not used in the makefile.
I edited the Makefile by hand!
Thu, 29 Sep 2005
FMS: FV core shared pointers
init_dry_atm: arrays are use-associated
replace read(61)+scatter with parallel read
pmaxmin: looking to see if all instances of the main array are shared
arrays... exceptions:
fv_diag:529(a2)
fv_diag:770(age)
- - - - - - -
=Annotated tree of OpenMP-containing subroutines=
get_bottom_mass in file driver/coupled/atmosphere.f90:397: (no callers, no calls)
tracer_mass in file model/fv_pack.F90:1965: (no callers, no calls)
add_tracers in file tools/read_fv_rst.F90:443: (no callers, no calls)
fv_init in file model/fv_pack.F90:294:
array args: none
calls: fv_arrays_init
set_fv_geom
pft_init
fftfax
fv_arrays_allocate
tm_set_tracer_profile
fv_restart
is called by: atmosphere_init
fv_restart in file model/fv_pack.F90:819:
array args: none
calls: init_sw_ic
set_eta
init_dry_atm
read_fv_rst
check_eta
d2a3d
is called by: fv_init
read_fv_rst in file tools/read_fv_rst.F90:3:
array args: none [1 omp directive - OK]
calls: set_eta
get_number_tracers
get_tracer_indices
get_tracer_names
set_tracer_profile
pmaxmin
pmaxming
p_var
d2a3d [with km=1]
is called by: fv_restart:901
init_dry_atm in file tools/init_dry_atm.F90:478:
array args: none
calls: p_var
jet2d_symm
hydro_eq
d2a3d [with km=1]
is called by: fv_restart
hydro_eq in file tools/init_dry_atm.F90:652:
array args: use-associated
calls: pmaxming
is called by: init_dry_atm
fv_diag in file tools/fv_diagnostics.F90:279:
array args: none
calls: get_time
get_date
pmaxmin
drymadj
zsmean
vort_d
pv_entropy
get_pressure_given_height
get_height_given_pressure
pmaxming
age_of_air
is called by: atmosphere_up
get_pressure_given_height in file tools/fv_diagnostics.F90:578: (openmp leaf)
array args: wz, a2 (local to caller)
ts [pt(1,beglat,nlev) in calls; use-associated]
is called by: fv_diag
get_height_given_pressure in file tools/fv_diagnostics.F90:645: (openmp leaf)
array args: wz, a2 (local to caller)
is called by: fv_diag
age_of_air in file tools/fv_diagnostics.F90:710:
array args: delp, peln, q (use-associated)
calls: pmaxmin
is called by: fv_diag
Note: age_of_air in file tools/age_of_air.F90:1: is not used!
pmaxming in file tools/pmaxmin.F90:4:
array args: a [varies by caller, all use-associated]
calls: pmaxmin
is called by: fv_diag [a -> q (use-associated)]
hydro_eq [a -> pt (use-associated)]
read_fv_rst [a -> u, v, pt, q, u_srf, v_srf (use-associated)]
pmaxmin in file tools/pmaxmin.F90:30: (openmp leaf)
array args: a [varies by caller]
is called by: fv_diag [a -> ps, ua, va, omga (use-associated), a2 (local)]
age_of_air [a -> age (local)]
pmaxming [a -> tmp (local)]
read_fv_rst [a -> zsurf (local)]
write_fv_rst [a -> ps(1,beglat) (use-associated)]
vort_d in file tools/pv_module.F90:16: (openmp leaf)
array args: u, v [use-associated]
vort [local or use-associated in caller - see below]
is called by: fv_diag [use-associated omga as vort in call]
init_sw_ic [local vort in call]
pv_entropy in file tools/pv_module.F90:97:
array args: pt, pkz, delp [use-associated]
vort [use-associated, actual arg is omga]
calls: ppme
is called by: fv_diag
fv_dynamics in file model/fv_dynamics.F90:39:
array args: u, v, delp, pt, q, ps, pe, pk, pkz, phis, omga, peln, ua, va [use-associated]
calls: benergy
p_energy
pft2d
cd_core
tracer_2d
tracer_2d_lima
geo_map
te_map
compute_vdot_gradp
is called by: atmosphere_down
benergy in file model/mapz_module.F90:2193: (openmp leaf)
array args: u, v, pt, delp, q, pe, peln, phis [use-associated]
tte [local to caller]
te, dz [work, actual args use-associated: ua, va (resp.)]
is called by: fv_dynamics
te_map in file model/mapz_module.F90:26:
array args: pk, q, delp, pe, ps, u, v, pt, ua, va, omga, peln, pkz [use-associated]
pem [local to caller]
calls: pkez
map1_ppm
map1_q2
par_vecsum
d2a3d
is called by: fv_dynamics
pkez in file model/mapz_module.F90:2383: (openmp leaf)
array args: pe, pk, pkz [use-associated]
is called by: te_map
geo_map in file model/mapz_module.F90:830:
array args: u, v, pt, delp, q, pe, pk, ps, omga, peln, pkz, phis [use-associated]
ua, va [use-associated, work]
calls: mapn_ppm
map1_ppm
p_energy
par_vecsum
d2a3d
is called by: fv_dynamics [phis(1,jfirst) in call]
p_energy in file model/mapz_module.F90:1298:
array args: v, pt, delp, q, pe, peln, phis, ua, va [use-associated]
calls: d2a3d
is called by: fv_dynamics [phis(1,jfirst) in call]
geo_map [phis in call -> phis(1,jfirst) in call from fv_dynamics]
compute_vdot_gradp in file model/fv_dynamics.F90:568:
array args: cx, cy [local to caller]
pe, omga [use-associated]
calls: pft2d
is called by: fv_dynamics
tracer_2d_lima in file model/tracer_2d.F90:10:
array args: q [use-associated]
cx, mfx, cy, mfy [local to caller]
dp1 [actual arg use-associated: va]
flx, va [work arrays ("useless output");
actual args use-associated: ua, pkz (resp.)]
calls: split_trac
tp2c
is called by: fv_dynamics
cd_core in file model/dyn_core.F90:32:
array args: dx, rdx, [both 1d]
u, v, pt, delp, delpf, pe, pk [delpf not in fv_array.h]
uc, vc, delpc, ptc, dpt, wz3, pkc, wz ["useless output"]
calls: get_eta_level
c_sw
geopk <- leaf, openmp (in file model/dyn_core.F90:917:)
pft2d
upol5
prt_maxmin_local
d_sw
geop_d <- leaf, openmp (in file model/dyn_core.F90:795)
is called by: fv_dynamics
geop_d in file model/dyn_core.F90:795: (openmp leaf)
array args: pe, delp, pk, wz, hs, pt
[pk, wz: work arrays not in fv_array.h]
is called by: cd_core
actual args: pe, delp, pkc, wz, hs(1,jfirst), pt
geopk in file model/dyn_core.F90:917: (openmp leaf)
array args: pe, delp, pk, wz, hs, pt
[delp, pk, wz, pt: work arrays not in fv_array.h]
is called by: cd_core
actual args: pe, delpc, pkc, wz, hs(1,jfirst), ptc
update_fv_phys in file model/update_fv_phys.F90:1:
array args: u_dt,v_dt,t_dt,q_dt
[allocated in fv_physics_down; driver/coupled/fv_physics.f90]
calls: pft2d_phys [called separately with u_dt, v_dt and t_dt]
polavg [called separately with t_dt and q_dt]
get_atmos_nudge [called with all arr args]
is called by: fv_physics_up [driver/coupled/fv_physics.f90]
d2a3d in file model/fv_pack.F90:1602: (openmp leaf)
is called by: fv_restart
te_map
geo_map
p_energy
init_dry_atm (km=1!)
read_fv_rst (km=1!)
p_var in file model/fv_pack.F90:1719:
array args: delp,ps,pe,peln,pk,pkz,q [all use-associated - ok]
calls: drymadj
is called by: init_dry_atm [no array args]
read_fv_rst [no array args]
drymadj in file model/fv_pack.F90:1839: (openmp leaf)
array args: ps,delp,pe,pk,peln,pkz,q [all use-associated - ok]
is called by: p_var
fv_diag
maxmin_global in file tools/write_fv_rst.F90:265:
is called by: write_fv_rst
[first 4 places inside #ifdef SPMD;
5th outside, omp directive in maxmin_global
probably can be eliminated and call
parameters adjusted]
Wed, 28 Sep 2005
data: IPCC AR5
Points for Ron/Karl's talk at WGCM:
Several technologies are already deployed, or will be ready to deploy,
in time for IPCC/AR5 (i.e prototypes ready for test flights on IPCC
AR4 data holdings by around 2008). Much of the energy needs to go to
building consensus around standards.
Data volumes are likely to be too high to be served off a single site
(even if it is all mirrored in one place... we will probably need to
spread the bandwidth).
Greater tolerance of grid diversity is needed: this is already evident
on the ocean side in AR4 and will continue to be so; it is likely to
be manifest in the atmosphere as well in AR5, e.g the cubed-sphere.
the "1 dataset = 1 file" paradigm will likely have to be broken - it
is unlikely that data formats and web servers where the bulk of data
reside will be able to handle projected dataset sizes gracefully.
Robust and transparent aggregation is a requirement.
It will be possible to describe the differences between model
configurations in greater detail: the CMOR vocabulary for describing
IPCC experiments will be expanded to include model metadata describing
model components (media) and subcomponents (physics options,
algorithms).
Basic server-side analysis capabilities will be possible: aggregation,
regridding, subsetting, perhaps some degree of construction of new
fields (e.g fluxes from mass and velocity fields).
Required operations in this environment:
Certification: will be handled via a metadata database, whose
canonical version will reside at PCMDI, but which might be mirrored
elsewhere. PCMDI will certify data quality as being up to snuff by
passing it through their validator, which will verify that
experiments, models and grids are correctly described, and that
required fields are present and correctly organized. The database
contains only the metadata; the actual data may continue to reside at
the home institution. The DB will contain checksum information so the
consumer
can verify that the dataset being analyzed is the one certified.
Standardization: A grid metadata standard will be agreed upon by the
modeling and data framework communities, and enshrined in CF soon.
Both client-side (e.g ferret, grads, vcs) and server-side (cdat, fds,
las) tools will learn to display and analyze data using the new grid
descriptors. In particular, with such a standard, regridding for the
purpose of analysis is likely to become a more routine operation than
it is now. This will give consumers the option of analyzing data on
the native grid, or, perhaps with a more limited palette of options,
on a "standard" grid.
Also needed: model metadata for searching and understanding
differences in model configurations, given the greater incidence of
shared components compared to AR4.
Server-side data processing: at the minimum, subsetting and some
regridding capability will be performed on the servers where data
reside before delivery to the consumer. The data producer may supply
custom regridding software adapted to the native grid, and will take
responsibility for testing and QA of on-the-fly regridding. Some of
this is compute-intensive and may require deferred processing and data
delivery using web tokens (e.g "batch LAS").
So the bullets:
A distributed data archive: the data are dispersed, but can be
searched and located through a relational database ("Curator")
containing the metadata. The metadata are centrally held and certified
as meeting AR5 requirements by PCMDI.
Extended metadata standards: via CF conventions, also centred at
PCMDI. Metadata standards will allow diverse native grids, at the
minimum, displaced-pole, tripolar, cubed-sphere. Description of
experiments and scenarios will be extended to more precise
descriptions of model configurations, components and subcomponents,
also via CF model metadata.
Server-side data processing: aggregation, regridding, subsetting,
including "batch" web services for computationally intensive analysis.
<!-- FV shared pointers -->
The implementation of shared pointers is ready to be deployed: see
fv_arrays.F90 and fv_arrays.h for the implementation.
Deployment will take two flavours:
- routines where shared arrays are use-associated
- routines where shared arrays are passed by arguments: in this
case, one needs to track back up the calling tree to find calling
instances and get them right.
Calling tree (created by \$HOME/src/perl/ftree : OMP-parallel
routines in bold):
{include "calling_tree" nil t}
Tue, 27 Sep 2005
FMS: FV core shared pointers will be Cray!
The implementation of shared pointers is ready to be deployed: see
fv_arrays.F90 and fv_arrays.h for the implementation.
Deployment will take two flavours:
- routines where shared arrays are use-associated
- routines where shared arrays are passed by arguments: in this
case, one needs to track back up the calling tree to find calling
instances and get them right (see the sketch below).
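A sketch of the two flavours (the module and array names below are
illustrative, following fv_arrays.F90 , not the exact interface):
subroutine diag_use_assoc()
  !flavour 1: the shared array is use-associated, so the routine can
  !be converted in place without touching its callers
  use fv_arrays, only: omga       !shared pointee set up in fv_arrays.F90
  omga(:,:,:) = 0.                !every PE addresses the same storage
end subroutine diag_use_assoc

subroutine diag_by_arg( a )
  !flavour 2: the shared array arrives as a dummy argument: whether a
  !is shared depends on the call site, so each caller in the tree must
  !be checked (this is what the annotations in the calling tree record)
  real, intent(inout) :: a(:,:,:) !may be shared (e.g omga) or caller-local
  a(:,:,:) = 0.
end subroutine diag_by_arg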
Calling tree (created by \$HOME/src/perl/ftree : OMP-parallel
routines in bold):
{include "full_calling_tree" nil t}
Mon, 8 Aug 2005
Curator talk?
We (me and who? Steve H? Cecelia etc.?) should probably present
something on the <a
title="Curator and Grid standards talk at MIT, July 2005"
href="{url}curator_gridmeta2005.pdf">Curator</a> at <a
title="IN15: Multidisciplinary Global Modeling: The Really Big Picture"
href="http://www.agu.org/meetings/fm05/?pageRequest=search&show=detail&sessid=413">
AGU</a>. Abstracts are <a href="http://submissions4.agu.org/submission/entrance.asp">
due 8 September 2005</a>, and one must join the AGU by 15 August
2005.
Thu, 4 Aug 2005
FMS: FV core shared pointers on stack and heap?
The lima_vb branch has been updated with tests for two
methods for creating shared memory arrays that are remotely
addressable. Both require the use of the module mpt-1.12 and
the CPP macros -Duse_libMPI -Duse_SGI_GSM .
use_SGI_GSM requires MPI: I will modify
fms_platform.h so that this is automatic.
For stack and automatic arrays, we use the GSM_Alloc (get
techpubs/intel reference) call. This has now been bound to our
mpp_malloc call. Here is an example of how to create a remotely
addressable automatic array:
subroutine sub(n)
  integer, intent(in) :: n
  real :: auto(n)          !an ordinary automatic array by default...
#ifdef use_SGI_GSM
  pointer( p, auto )       !...but a pointee when use_SGI_GSM is set
  integer, save :: len=0   !mpp_malloc re-allocates only when n > len
  call mpp_malloc( p, n, len )
#endif
  !... use auto(1:n) as usual ...
end subroutine sub
Now auto is an automatic array shared between all the PEs in
the current_pelist .
mpp_malloc relies on the Cray pointer p
automatically having the save attribute. This was true on
Cray/Origin; can we verify that it is also true on Altix?
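If there is any doubt, the attribute can be made explicit rather than
relied upon; a minimal sketch (not current mpp code, and assuming the
compiler permits save on a Cray pointer):
subroutine sub(n)
  integer, intent(in) :: n
  real :: auto(n)
  pointer( p, auto )
  save p                 !make the save attribute explicit instead of
                         !relying on the Cray/Origin default behaviour
  integer, save :: len=0
  call mpp_malloc( p, n, len )
end subroutine sub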
For allocatable arrays, we use the MPI_SGI_GlobalPtr (get
techpubs/intel reference) mechanism. This requires allocating on one
of the PEs, and sending the address to the other PEs. On the Origin,
it was sufficient to set the environment variable
SMA_GLOBAL_ALLOC , which ensured that the numerical value of the
address was the same everywhere: the allocating PE merely needed to
broadcast the address.
On Altix, the address as seen by a remote PE does not have the same
numerical value. SGI added a call for us, MPI_SGI_GlobalPtr , which
translates the remote address to one that is valid from the calling
PE. This has been bound to two new calls: mpp_send_ptr and
mpp_recv_ptr . Here is an example of using it:
if( pe.EQ.root )then
    allocate( a(n) )
    l = loc(a)                   !address valid on the allocating PE
    do i = 1,npes-1
       call mpp_send_ptr( l, mod(root+i,npes) )
    end do
else
    call mpp_recv_ptr( l, root ) !receives the translated address
end if
call sub(l) !instead of call sub(a), or make sub argument-less

subroutine sub(p)
  real :: a(n)
  pointer( p, a )                !p points a(:) at the shared allocation
end subroutine sub
Now all PEs in the current_pelist are looking at the same
array a(:) .
In writing subroutine sub you have to be <b>very careful</b>
to assign exclusive portions of a(:) to different PEs. You'll
have race condition errors if you don't. However, any existing OMP
code is guaranteed to have eliminated such race conditions.
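For instance, a minimal sketch of the kind of explicit decomposition
required (the pe / npes bookkeeping here is illustrative, not the FMS
API):
  integer :: i, is, ie, chunk
  chunk = (n + npes - 1)/npes    !ceiling(n/npes)
  is = pe*chunk + 1              !pe assumed to run from 0 to npes-1
  ie = min( n, (pe+1)*chunk )
  do i = is,ie
     a(i) = 0.                   !no two PEs write the same element
  end do
  call mpp_sync()                !barrier before anyone reads a(:)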
The checkout for this experiment (based on m45_am2p13 in
am2.xml ):
<cvs>
<codeBase>fms_fv_am2</codeBase>
<modelConfig>lima</modelConfig>
<cvsUpdates>
cvs update -r lima_vb atmos_fv_dynamics shared/mpp
</cvsUpdates>
</cvs>
<compile>
<csh>
cp /home/vb/fms/lima_vb/rts/ia64/m45_am2p13_lima_vb/src/path_names \
\$code_dir
</csh>
<cppDefs>-DUSE_LIMA -DSPMD -Duse_libMPI -Duse_netCDF</cppDefs>
</compile>
The correct versions are obtained by setting the following compilation
target:
<target platform="ia64">
<csh>
source /opt/modules/default/init/tcsh
module purge
module load modules icc.8.1.026 ifort.8.1.023 mpt-1.12 \
scsl-1.5.1.0 idb.7.3.2
setenv MALLOC_CHECK_ 0
</csh>
</target>
Next step is to turn on the -Duse_SGI_GSM flag. Working on that now.
Wed, 3 Aug 2005
FMS: FV core shared pointers, cpp flag?
#ifdef use_GSM is used by Jeff for the changes to do direct
copies into halos.
#ifdef use_MPI_GSM is used by Gerardo for the changes to do
mpp_malloc -like stuff... should that be folded in?
Ask Gerardo about the use of sizeof() : does the Fortran binding
return its answer in bytes? does it work on types?
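A quick check one could compile and run (a sketch: sizeof() here is
the vendor extension, not a standard intrinsic):
  real(4) :: r4
  real(8) :: r8(10)
  print *, sizeof(r4)   !4 if the answer is in bytes
  print *, sizeof(r8)   !80 if it accepts arrays and counts bytes
  !whether sizeof(real) - a type name rather than a variable - is
  !accepted at all is part of the question for Gerardo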
So the steps to use on Altix are
<blockquote>
allocate on 1 PE<br>
share the pointer... does this mean everyone else but root_pe
has to declare differently?
</blockquote>
Mon, 1 Aug 2005
FMS: FV shared pointers
Create m45_am2p13_prelima by inheritance from m45_am2p13 .
Link the src : we want to use the same
~/fms/lima_vb/rts/ia64/m45_am2p13/src files, as well as
the path_names file with special compilation for
atmos_fv_dynamics/model/tp_core.f90 .
Added lima_vb tag to
atmos_fv_dynamics/driver/coupled/atmos_nudge.f90
Also added lima_vb tag to
atmos_fv_dynamics/model/shr_kind_mod.f90 . Koushik had created
it in atmos_fv_dynamics/tools : that file no longer has the
tag.
The XML to make it is now
<cvs>
<codeBase>fms_fv_am2</codeBase>
<modelConfig>lima</modelConfig>
<cvsUpdates>
cvs update -r lima_vb atmos_fv_dynamics
</cvsUpdates>
</cvs>
<compile>
<csh>
cp /home/vb/fms/lima_vb/rts/ia64/m45_am2p13_lima_vb/src/path_names \
\$code_dir
</csh>
<cppDefs>-DUSE_LIMA -DSPMD -Duse_libMPI -Duse_netCDF</cppDefs>
</compile>
Thu, 28 Jul 2005
FMS: MPP: Mosaic
While getting underway with the updates to MPP, Jeff, Zhi and I are
also looking at a major code overhaul, as well as incorporation of
features requested earlier but never implemented for one reason or
another (usually sound ones:-).
Here's a list of possible features to incorporate into MPP (in no
particular order):
- making pelist opaque: currently it is a simple integer array of PE
IDs, and we test internally (in get_peset ) whether this list, when
ordered, corresponds to an existing communicator. This is a
potential performance problem. The proposed change will require the
use of the existing mpp_declare_pelist routine to create
pelist s before use; perhaps the optional on-the-fly
pelist argument available in many routines will be suppressed (see
the sketch after this list).
- remove restriction that arrays passed into
mpp_domains methods be
declared on the data domain; instead this could be on a domain
larger than the data domain. The domain2D datatype now understands
three classes of subdomains.
- support partial-width halo update
- more compact code, removal of nested
#include s, and
perhaps elimination of the _old versions of stuff. This requires
proof that the _new version performs at least as well, in a range
of test cases, unit tests as well as system tests, over a range of
resolutions and scaling (PE counts).
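A sketch of the proposed pelist discipline for the first item
( mpp_declare_pelist and mpp_set_current_pelist are existing mpp
calls; the point is that the communicator is created once, up front,
rather than on the fly):
  integer :: i
  integer :: pelist(npes/2)
  pelist = (/ (i, i=0,npes/2-1) /)      !e.g the first half of the PEs
  call mpp_declare_pelist( pelist )     !create the communicator once
  call mpp_set_current_pelist( pelist ) !subsequent mpp calls use it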
For Mosaic I am proposing a new definition interface,
mpp_define_mosaic . It will reuse as much as possible the
current, single-tile version of the software, called
mpp_define_domains .
It is done in two stages:
do n = 1,ntiles
   !stage 1: define each tile's own domain decomposition
   call mpp_define_domains( ... tile=n, pelist=pelist(n) )
end do
!stage 2: declare how tile edges are in contact with one another
call mpp_define_contact_point( tile1, tile2, &
     (/i1,j1/), (/i2,j2/), align='X' )
Provide examples of this for cubesphere, tripolar with horizontal and
vertical division.
Also, the cyclic and folded cases, which we currently treat as
keywords, can become contact_points instead!
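For example, a doubly-periodic single-tile domain might then be
declared as follows (a sketch against the proposed interface:
mpp_define_contact_point does not exist yet, and the argument
conventions are assumptions):
call mpp_define_domains( (/1,ni,1,nj/), layout, domain, tile=1, pelist=pelist(1) )
!x-cyclic: the east edge of tile 1 is in contact with its own west edge
call mpp_define_contact_point( 1, 1, (/ni,1/), (/1,1/), align='X' )
!y-cyclic: the north edge is in contact with its own south edge
call mpp_define_contact_point( 1, 1, (/1,nj/), (/1,1/), align='Y' )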