restart reproducibility (without waves) when USE_LA_LI2016=True #46
Comments
I have created a reproducer branch here. This branch contains additional settings and tests which reproduce the restart issue at C96mx100. Two additional control and restart tests are added demonstrating that: a) when When running this test branch, the results are not compared against the baseline. Instead, the control and restart runs are compared directly, so the "LIST_FILES" within the tests is an empty string. Also, all components and the coupling run at a single DT, which removes any possible averaging effects. The tests can be run using: ./rt.sh -ek -l rt.cpldrestart.conf >output 2>&1 & The results are contained within the coupler history files, which are written at every timestep. The coupler history file ufs.cpld.cpl.hi.2016-10-03-04500.nc in the restart run should be compared to the same coupler history file for the continuous run. When comparing ufs.cpld.cpl.hi.2016-10-03-04500.nc between restart and continuous runs for the |
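A restart test of this kind passes only if the two runs agree bit for bit, so a tolerance-based comparison is not enough. The following is a minimal sketch of such a check, assuming the fields from the two coupler history files have already been loaded into plain Python dictionaries of floats (the loading step, the helper names `bits` and `differing_fields`, and the field names in the usage example are hypothetical and not part of the test scripts):

```python
import struct


def bits(x):
    """Return the raw IEEE-754 bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def differing_fields(continuous, restart):
    """Compare two {field name: list of floats} dictionaries bit for bit.

    Returns {field name: index of first differing value}. A value of None
    means the field is missing from one run or the lengths differ.
    """
    diffs = {}
    for name in sorted(set(continuous) | set(restart)):
        a, b = continuous.get(name), restart.get(name)
        if a is None or b is None or len(a) != len(b):
            diffs[name] = None
            continue
        for i, (x, y) in enumerate(zip(a, b)):
            if bits(x) != bits(y):
                diffs[name] = i
                break
    return diffs


# Made-up values standing in for fields read from the two history files.
continuous_run = {"field_a": [0.1, 0.2, 0.3], "field_b": [1.0, 2.0]}
restart_run = {"field_a": [0.1, 0.2, 0.3], "field_b": [1.0, 2.0000000000000004]}

print(differing_fields(continuous_run, restart_run))
```

Comparing bit patterns rather than using `==` also distinguishes `0.0` from `-0.0` and treats identical NaN payloads as equal, which matches what a bitwise file comparison would report.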
I've been able to run the coupled model on gaea in this branch and reproduce the error. |
@breichl the code and test case should now work on Gaea. |
My latest test run on Gaea is in: /lustre/f2/scratch/Denise.Worthen/FV3_RT/rt_25674 |
Are there instructions for building the executable on Gaea? I tried to run/understand the test script in the tests folder as above, but I seem to be missing something as nothing is happening. I then tried to build from build.sh, which asks for the ESMFMKFILE environment variable to be set. I'm not sure what this should be. Unfortunately I can't look at your directory for guidance, I assume because there aren't cross-permissions between GFDL/EMC accounts. |
I suspect you'll need to export an ACCNR variable. I used "export ACCNR=nggps_emc", Justin uses "export ACCNR=gfdl_b". I don't know why build.sh doesn't work, but Justin got the same behaviour. It must have something to do w/ how Gaea uses modules. I used the standard method, which is shown in this screenshot, where you first load the required modules: Using rt.sh (i.e., ./rt.sh -ek -l rt.cpldrestart.conf >output 2>&1 &), the run directory will be created at /lustre/f2/scratch/username/FV3_RT/rt_number |
Thanks Denise. I seem to be failing in the "module purge" step of "module-setup.sh.inc" within rt.sh. I've pinged Justin to see if he is familiar with this. |
I think it's running now. I'll keep you posted if I can spot the issue. |
It looks to me like there are missing halo updates on taux and tauy in A&B grid configurations. The changes here appear to fix the restart issue: https://github.com/breichl/MOM6/tree/user/bgr/Tau_halo_updates_in_nupoc |
@breichl Thanks, let me give it a try. So this lack of halo update must be benign when LI_2016 is not used, is that right? |
It appears so, which indicates that it matters only because taux/tauy is used to set ustar_gustless on cell centers (via set_derived_forcing_fields in mom_ocean_model_nuopc). ustar_gustless is only used when LI_2016 is true. I presume ustar_gustless is the only place where taux/tauy are averaged to cell centers (the ustar averaging, for example, is computed from other terms within these loops, hence ustar is not sensitive to the halo update). The halo updates are there for the C-grid already, hence the C-grid case working. So this fix seems consistent with all the symptoms. |
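The dependence on halo values can be illustrated with a toy one-dimensional example. This is only a sketch under simplifying assumptions: it is not the MOM6 code, the function `center_average` and the stress values are made up, and the real calculation in set_derived_forcing_fields works on two-dimensional staggered grids. The point it shows is that averaging a field to cell centers reads a neighbor value owned by the adjacent domain, so a halo that was never updated changes the answer at the domain boundary:

```python
def center_average(tau, halo_left):
    """Average point values to cell centers: c[i] = 0.5 * (tau[i-1] + tau[i]).

    halo_left supplies tau[-1], the neighbor value owned by the adjacent
    domain (or a fixed boundary value at the edge of the global grid).
    """
    padded = [halo_left] + tau
    return [0.5 * (padded[i] + padded[i + 1]) for i in range(len(tau))]


# Made-up global stress values on a 1-D grid.
tau_global = [0.10, 0.20, 0.40, 0.80, 1.60, 3.20]

# Reference answer: the whole grid on a single domain.
reference = center_average(tau_global, halo_left=0.0)

# The same grid split across two domains.
left, right = tau_global[:3], tau_global[3:]

# Halo never updated: the right domain's halo still holds a stale 0.0.
stale = center_average(left, 0.0) + center_average(right, 0.0)

# Halo updated: the right domain's halo holds the left domain's last value.
updated = center_average(left, 0.0) + center_average(right, left[-1])

print("updated halo matches reference:", updated == reference)
print("stale halo matches reference:  ", stale == reference)
```

With the updated halo the split calculation uses the same operands in the same order as the single-domain one, so the results are identical. With the stale halo only the first cell of the right domain differs, which is the kind of decomposition imprint described in this issue.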
I've tested in all our non-wave benchmark configurations and they all pass the restart test now so I think you correctly isolated and solved the issue. Would you like to make a PR back to noaa-emc w/ the fix or should I use my fork (with you credited w/ fix)? |
Just sent it up, but feel free to use yours if it's easier. |
No, that is fine. Thanks. I know Jiande wants to wait on MOM6 updates until the new FMS is ready. So I'm not sure of the exact timing. This PR will also need an issue on ufs-weather so I'll create that. |
close |
Merge in latest dev/gfdl updates
Stop logging the deprecated run-time parameter NEW_SPONGES, and always log INTERPOLATE_SPONGE_TIME_SPACE as if NEW_SPONGES were not used. This commit will address MOM6 issue NOAA-EMC#46, which can be closed if it is accepted. This will change the MOM_parameter_doc entries in some cases, but all answers are bitwise identical.
When running the regression tests without waves for restart reproducibility, USE_LA_LI2016 is set to false to reproduce. Further description from @DeniseWorthen: This is a long-standing issue where the grid decomposition appears in the MOM6 fields after restart if this parameter is set true. It was not resolved when we switched to vertex_shear=F in the update to MOM6 (PR mom-ocean#290).