HAM is incompatible with technically new implementation of operator splitting
Posted by David Neubauer 11/04/2021 https://redmine.hammoz.ethz.ch/issues/803
ECHAM and ICON with ECHAM-physics using lparamcpl=.true. apply “operator strang splitting”, i.e. every process uses as the state an updated state from all previous processes. The reason is to avoid negative tracer concentrations (a process can’t remove what a previous process had already removed). The disadvantage is that the order of processes is important. With lparamcpl=.false. in ICON with ECHAM-physics all tendencies are collected and added only in the end. I.e. all processes use the same state. This is only compatible with small timesteps (not meant for storm-resolving (km-scale) simulations but presumably to have more options available for operator splitting).
The switch-lparamcpl was implemented in ICON after the first coupling of HAM to it. Presumably this switch was implemented to be more flexible about how operator splitting is handled in ICON.
In ECHAM the “operator strang splitting” was implemented such that the fields were not updated during the model time steps but each process contributed to the tendencies and therefore a current “intermediate” state was used for each process by adding the current tendencies to the fields. In ICON this was changed. With lparamcpl=.true. the fields AND the tendencies are updated after each process and each process takes the fields as the current state NOT field + tendencies. I.e. all physics routines were accordingly adapted. Except of course the HAM subroutines which still included fields+tendencies as state of the model. Presumably it is in interplay between vdiff and sedimentation that causes an exponential increase in aerosol tracers, crashing then the model (vdiff transports aerosol away from the surface layer, reducing removal at the surface, while sedimentation can move the aerosol into the surface layer -> under certain conditions if both tendencies are counted twice due the incompatibility of HAM to lparamcpl, this can lead to exponential growth). But what finally crashes the model is not so important, the problem is that with lparamcpl=.true. fields AND tendencies are updated by processes, i.e. if a HAM subroutine uses field+tendency for the state the impact of a process is counted TWICE, causing in some cases exponential changes.
The code where in the HAM subroutines field+tendendy are used were checked and replaced with field only. The code is adapted the code to lparamcpl=.true only so far. The changes are documented in a branch on github (gpu/fix_lparamcpl_david) and marked with DNfix in the code (30a670aae43d588a1fef817ecb11c26e084dafa2). Also a few small things were fixed (also marked with DNfix).