Perhaps these schemes are best discussed with an example. Imagine a model with five levels. At one grid point, levels 2 & 3 and levels 3 & 4 are statically unstable. The standard scheme will, at the first pass, mix 3 & 4 and then 2 & 3. It will repeat this ncon times. Marotzke's scheme will mark the unstable pairs and then mix 2-4 in one go. It will then return to this column for a second pass and check all levels once more. My scheme will mix 2 & 3, then compare the densities of 3 & 4 and (if unstable) mix 2-4 like Marotzke's scheme. It will then recompute the density of level 4, compare levels 4 & 5, and mix 2-5 if unstable. Finally it will compare 1 & 2 again: the density of level 2 has changed in the mixing process, so levels 1 & 2 might have become statically unstable. Only the density of level 2 has to be recalculated for this check.
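The procedure described for my scheme in the example above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not the model code: the function names (density, mix, unstable, convect_column) are hypothetical, all levels are assumed to have equal thickness, and a toy linear equation of state replaces the non-linear one used in the model, so that a single density per level suffices.

```python
def density(temp, salt):
    """Toy linear equation of state (an assumption, not the model's own)."""
    return 1027.0 - 0.2 * (temp - 10.0) + 0.8 * (salt - 35.0)


def unstable(temp, salt, upper, lower):
    """True if level `upper` is denser than level `lower` beneath it."""
    return density(temp[upper], salt[upper]) > density(temp[lower], salt[lower])


def mix(temp, salt, top, bot):
    """Homogenise levels top..bot (inclusive), assuming equal thickness."""
    n = bot - top + 1
    t_mean = sum(temp[top:bot + 1]) / n
    s_mean = sum(salt[top:bot + 1]) / n
    for k in range(top, bot + 1):
        temp[k] = t_mean
        salt[k] = s_mean


def convect_column(temp, salt):
    """Remove all static instability from one column in a single sweep."""
    nlev = len(temp)
    k = 0
    while k < nlev - 1:
        if not unstable(temp, salt, k, k + 1):
            k += 1                      # this pair is stable, move down
            continue
        top, bot = k, k + 1
        mix(temp, salt, top, bot)       # mix the first unstable pair
        while True:
            if bot < nlev - 1 and unstable(temp, salt, bot, bot + 1):
                bot += 1                # extend the mixed region downward
            elif top > 0 and unstable(temp, salt, top - 1, top):
                top -= 1                # extend the mixed region upward
            else:
                break                   # stable at both edges
            mix(temp, salt, top, bot)
        k = bot + 1                     # everything above is known to be stable
    return temp, salt


# Five levels; levels 2 & 3 and 3 & 4 are unstable (colder water above warmer).
temp = [10.0, 4.0, 5.0, 6.0, 3.0]
salt = [35.0] * 5
convect_column(temp, salt)
print(temp)  # [10.0, 5.0, 5.0, 5.0, 3.0]
```

In this sketch only the edges of the mixed region are rechecked after each mixing step, which mirrors the idea of not revisiting levels that are already known to be stable.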
Note that Marotzke's scheme handles the initial mixing of levels 2-4 more efficiently. Probably my scheme could be made slightly faster still by including the "marking" feature from Marotzke's scheme (the schemes were developed independently). However, in the typical convection situation only levels 1 & 2 are initially unstable, due to surface cooling. In this situation marking doesn't help. My scheme saves time by "remembering" which parts of the water column we already know to be stable, and rechecking only those levels necessary.
There is a subtlety that should be mentioned: due to the non-linear equation of state, the task of removing all static instability from the water column may not have a unique solution. In the example above, mixing 2 & 3 could yield a mixture with a lower density than level 4, even though 3 was originally denser than 4 and 2 denser than 3. In this case, my scheme would only mix 2 & 3, while Marotzke's scheme would still mix 2-4. So the two schemes are not strictly equivalent, though for all practical purposes they almost certainly are.
I performed some test runs with the GFDL modular ocean model (MOM) in a two-basin configuration (the same as used by Marotzke and Willebrand 1991). The model has ca. 1000 horizontal grid points and 15 levels, and was integrated for 1 year (time step 1.5 h) on a Cray YMP. Three different model states were used: (A) a state with almost no static instability, achieved by strong uniform surface heating; (B) a state with convection occurring at about 15% of all grid points; (C) a state with convection at 30% of all grid points. The latter two were near equilibrium, with permanent convection. I compared the overall cpu time consumed by these runs with different convection schemes. The standard scheme was tried for three different numbers of iterations ncon. The results are summarized in the table; the overall cpu time is given relative to a run with no convection scheme.
Relative cpu time for model states A, B and C

Convection scheme        A       B       C
No convection scheme     1       1       1
standard, ncon=1         1.13    1.13    1.13
standard, ncon=7         1.88    1.89    1.92
standard, ncon=10        2.25    2.27    2.32
implicit                 1.52    1.52    1.52
complete                 1.12    1.20    1.36
It was surprising to find that the few innocent-looking lines of model code that handle the convection consume a large percentage of the overall processing time. The numbers are probably an upper limit; a model with realistic topography and time-dependent forcing will use a bigger chunk of the cpu time for iterations in the relaxation routine for the stream function, so that the relative amount spent on convection will be lower. In my test runs, the standard scheme adds 13% cpu time per pass. My complete convection scheme used as much time as 1-3 iterations of the standard scheme, depending on the amount of convection. For zero convection it is as fast as one pass of the standard scheme, because it does the same job in this case. Additional cpu time is only used at those grid points where convection actually occurs. My scheme is considerably faster than the implicit scheme, especially for models where convection happens only at a few grid points, or only part of the time. I did not have Marotzke's scheme available for the test, but in his 1991 paper he mentions a comparison where the computation time with the implicit scheme was 60% of that with his scheme. This would give Marotzke's scheme a relative cpu time of about 2.5 in the table, with strong dependence on the amount of convective activity.
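The two estimates quoted above (about 13% per pass of the standard scheme, and about 2.5 for Marotzke's scheme) follow directly from the table entries. A short check, using the state A values and hypothetical variable names:

```python
# Relative cpu time of the standard scheme for model state A, keyed by ncon.
standard_a = {1: 1.13, 7: 1.88, 10: 2.25}

# Extra cpu time per pass, relative to a run with no convection scheme.
per_pass = {ncon: (t - 1.0) / ncon for ncon, t in standard_a.items()}
for ncon, extra in per_pass.items():
    print(f"ncon={ncon:2d}: {100 * extra:.1f}% per pass")

# Marotzke (1991): the implicit scheme took 60% of the time of his scheme.
implicit = 1.52
marotzke_estimate = implicit / 0.60
print(f"estimated relative cpu time, Marotzke's scheme: {marotzke_estimate:.2f}")
```

The per-pass cost comes out between 12.5% and 13% for all three values of ncon, and the estimate for Marotzke's scheme is about 2.53.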
Surface heat fluxes looked identical in the runs with the implicit and complete schemes. The standard scheme showed significant deviations, however, in the surface flux as well as the convective heat flux at different depths. This is not surprising, since the rate at which heat is brought up by convection will be reduced if mixing is incomplete. The runs with ncon=7 and ncon=10 still differed noticeably from each other, and from the complete mixing case. It is possible that this could affect the deep circulation, which is driven by convective heat loss, but I didn't do long integrations to test this. The problem gets worse for longer time steps; with the standard scheme the rate of vertical mixing depends on the time step length. If acceleration techniques are used ("split time stepping", Bryan 1984), the final equilibrium could differ from one without acceleration due to this unwanted time-step dependence. Marotzke (1991) reports a case where the choice of convective scheme had a decisive influence on the deep circulation. The intention of this note is not to examine these problems any further; it is to provide an efficient alternative.
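The incomplete mixing of the standard scheme can be illustrated with a toy column. The sketch below is an assumption-laden simplification, not the model code: density itself is treated as the mixed quantity (equivalent to a linear equation of state), levels have equal thickness, each pass is assumed to sweep the pairs 1 & 2, 3 & 4, ... and then 2 & 3, 4 & 5, ..., and the function names are hypothetical.

```python
def standard_pass(rho):
    """One pass of pairwise mixing: pairs (1,2),(3,4),... then (2,3),(4,5),..."""
    n = len(rho)
    for start in (0, 1):
        for k in range(start, n - 1, 2):
            if rho[k] > rho[k + 1]:                  # upper level is denser
                rho[k] = rho[k + 1] = 0.5 * (rho[k] + rho[k + 1])
    return rho


def residual_instability(rho):
    """Largest remaining density inversion between adjacent levels."""
    worst = max(rho[k] - rho[k + 1] for k in range(len(rho) - 1))
    return max(worst, 0.0)


def run(ncon):
    """Apply ncon passes to a column that is unstable from top to bottom."""
    rho = [5.0, 4.0, 3.0, 2.0, 1.0]                  # arbitrary units
    for _ in range(ncon):
        standard_pass(rho)
    return rho


for ncon in (1, 7, 10):
    rho = run(ncon)
    print(ncon, rho, residual_instability(rho))
```

Complete mixing of this column would give 3.0 at every level. After one pass the toy column is [4.5, 3.5, 3.5, 1.75, 1.75], so density inversions remain; further passes reduce them only gradually.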
Conclusion
A convection scheme has been described which completely removes static instability from the water column in one pass, and which is much faster than the implicit scheme of the GFDL model. This scheme avoids possible problems resulting from the incomplete mixing in the standard scheme, while only using as much computer time as 1-3 iterations of the standard scheme.