Update fix submodule (Rcov_crisn21) & GSI_BINARY_SOURCE_DIR #718

Merged

RussTreadon-NOAA
Contributor

@RussTreadon-NOAA RussTreadon-NOAA commented Mar 13, 2024

DUE DATE for merger of this PR into develop is 4/24/2024 (six weeks after PR creation).

Description
This PR includes two changes:

  • update the hash for the fix submodule in order to pull correlated observation error file Rcov_crisn21 into fix/ when the GSI is built.
  • update GSI_BINARY_SOURCE_DIR to point at 20240208

Fixes #714
Fixes #717

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • clone and build PR branch on WCOSS2, Hera, Orion and Hercules
  • run ctests on each platform

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code

@RussTreadon-NOAA
Contributor Author

Clone and build RussTreadon-NOAA:feature/rcov_crisn21 on Cactus, Hera, and Orion. A diff -r of fix/ from this installation against develop shows the expected Rcov_crisn21 differences:

Orion-login-3:/work2/noaa/da/rtreadon/git/gsi/pr718$ diff -r fix ../develop/fix/
diff -r fix/.gitignore ../develop/fix/.gitignore
159d158
< Rcov_crisn21
diff -r fix/gsi_binary_files.cmake ../develop/fix/gsi_binary_files.cmake
123d122
<   Rcov_crisn21
Only in fix: Rcov_crisn21

The fix/ directory associated with the PR includes the file Rcov_crisn21 after the build, whereas the develop fix/ does not. This difference is expected and correct.
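The same kind of recursive comparison can be scripted when diff -r is not convenient. Below is a minimal sketch using Python's standard filecmp module. The function name tree_diff and the throwaway directories in demo are hypothetical and only stand in for the PR and develop fix/ trees; the file contents are placeholders, not the real files.

```python
import filecmp
import os
import tempfile


def tree_diff(left, right):
    """Recursively list names only in left, only in right, and files that differ.

    Note: filecmp.dircmp compares files shallowly (type, size, mtime) first and
    only reads contents when those signatures differ.
    """
    only_left, only_right, differing = [], [], []

    def walk(cmp, prefix=""):
        only_left.extend(os.path.join(prefix, n) for n in cmp.left_only)
        only_right.extend(os.path.join(prefix, n) for n in cmp.right_only)
        differing.extend(os.path.join(prefix, n) for n in cmp.diff_files)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right))
    return sorted(only_left), sorted(only_right), sorted(differing)


def _write(path, text):
    with open(path, "w") as handle:
        handle.write(text)


def demo():
    """Build two tiny stand-in trees (placeholder contents) and compare them."""
    with tempfile.TemporaryDirectory() as root:
        pr = os.path.join(root, "pr", "fix")
        dev = os.path.join(root, "develop", "fix")
        os.makedirs(pr)
        os.makedirs(dev)
        _write(os.path.join(pr, ".gitignore"), "placeholder\nRcov_crisn21\n")
        _write(os.path.join(dev, ".gitignore"), "placeholder\n")
        _write(os.path.join(pr, "gsi_binary_files.cmake"), "placeholder\n  Rcov_crisn21\n")
        _write(os.path.join(dev, "gsi_binary_files.cmake"), "placeholder\n")
        _write(os.path.join(pr, "Rcov_crisn21"), "stand-in for the binary file\n")
        return tree_diff(pr, dev)


only_pr, only_develop, changed = demo()
print("Only in PR fix:", only_pr)
print("Only in develop fix:", only_develop)
print("Differing files:", changed)
```

With the stand-in trees, the report mirrors the shape of the diff -r output above: one file present only in the PR tree and two text files that differ.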

ctests are underway on each platform and will be reported as tests complete.

@RussTreadon-NOAA
Contributor Author

Cactus ctests

Install RussTreadon-NOAA:feature/rcov_crisn21 at 4913729 and develop at f282a94. Run ctests with the following results:

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr718/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_glbens
    Start 4: netcdf_fv3_regional
    Start 5: hafs_4denvar_glbens
    Start 6: hafs_3denvar_hybens
    Start 7: global_enkf
1/7 Test #4: netcdf_fv3_regional ..............   Passed  663.43 sec
2/7 Test #3: rrfs_3denvar_glbens ..............   Passed  726.20 sec
3/7 Test #7: global_enkf ......................   Passed  1091.32 sec
4/7 Test #2: rtma .............................   Passed  1389.90 sec
5/7 Test #6: hafs_3denvar_hybens ..............   Passed  1393.46 sec
6/7 Test #5: hafs_4denvar_glbens ..............   Passed  1454.51 sec
7/7 Test #1: global_4denvar ...................   Passed  2043.14 sec

100% tests passed, 0 tests failed out of 7

Total Test time (real) = 2043.26 sec

Note that while all the above ctests pass, the tests that assimilate cris-fsr_n21 have different initial cris-fsr_n21 obs-ges statistics. Below is the o-g 01 difference for loproc_updat from PR #718 (<) and PR #692 (>):

< o-g 01 rad  n21       cris-fsr      333749160        67667         6943    1782.9       1782.9      0.25679      0.25679    
---
> o-g 01 rad  n21       cris-fsr      333749160        67667         7589    1718.7       1718.7      0.22648      0.22648 

This is expected because PR #692 did not include Rcov_crisn21 in the run directory.
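To see at a glance which fields of the two o-g lines changed, a small column-wise comparison is enough. This is a minimal sketch: the helper name differing_columns is hypothetical, columns are identified only by position because their meanings are not stated in this thread, and the two strings are copied from the diff above.

```python
def differing_columns(line_a, line_b):
    """Return (index, value_a, value_b) for each whitespace-separated column that differs."""
    cols_a, cols_b = line_a.split(), line_b.split()
    if len(cols_a) != len(cols_b):
        raise ValueError("lines have different numbers of columns")
    return [(i, a, b) for i, (a, b) in enumerate(zip(cols_a, cols_b)) if a != b]


# Lines copied from the diff above: PR #718 (<) and PR #692 (>).
pr718 = "o-g 01 rad  n21       cris-fsr      333749160        67667         6943    1782.9       1782.9      0.25679      0.25679"
pr692 = "o-g 01 rad  n21       cris-fsr      333749160        67667         7589    1718.7       1718.7      0.22648      0.22648"

for index, left, right in differing_columns(pr718, pr692):
    print(f"column {index}: {left} (PR #718) vs {right} (PR #692)")
```

The first two numeric columns agree between the runs; only the last five columns differ.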

@RussTreadon-NOAA
Contributor Author

Orion ctests

Install RussTreadon-NOAA:feature/rcov_crisn21 at 4913729f and develop at f282a944. Run ctests with the following results:

Test project /work2/noaa/da/rtreadon/git/gsi/pr718/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_glbens
    Start 4: netcdf_fv3_regional
    Start 5: hafs_4denvar_glbens
    Start 6: hafs_3denvar_hybens
    Start 7: global_enkf
1/7 Test #4: netcdf_fv3_regional ..............   Passed  483.89 sec
2/7 Test #3: rrfs_3denvar_glbens ..............   Passed  1147.13 sec
3/7 Test #2: rtma .............................   Passed  1270.33 sec
4/7 Test #6: hafs_3denvar_hybens ..............   Passed  1697.65 sec
5/7 Test #5: hafs_4denvar_glbens ..............   Passed  1814.86 sec
6/7 Test #7: global_enkf ......................   Passed  1927.98 sec
7/7 Test #1: global_4denvar ...................   Passed  2703.04 sec

100% tests passed, 0 tests failed out of 7

Total Test time (real) = 2703.05 sec
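When results from several platforms need to be collected, the per-test lines of a ctest summary can be parsed rather than read by eye. The following is a rough sketch only: the regular expression is fitted to the lines as they appear in this thread (including the ***Failed form), the name parse_ctest_summary is hypothetical, and other ctest versions or options may format lines differently.

```python
import re

# Matches lines such as
#   "1/7 Test #4: netcdf_fv3_regional ..............   Passed  663.43 sec"
#   "3/7 Test #1: global_4denvar ...................***Failed  4390.18 sec"
TEST_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(?P<num>\d+):\s+(?P<name>\S+)\s+\.+\s*"
    r"(?:\*\*\*)?(?P<status>Passed|Failed)\s+(?P<sec>[\d.]+)\s+sec"
)


def parse_ctest_summary(text):
    """Return a list of (test_number, name, status, seconds) tuples."""
    results = []
    for line in text.splitlines():
        match = TEST_LINE.match(line)
        if match:
            results.append(
                (int(match["num"]), match["name"], match["status"], float(match["sec"]))
            )
    return results


sample = """\
1/7 Test #7: global_enkf ......................   Passed  2595.72 sec
3/7 Test #1: global_4denvar ...................***Failed  4390.18 sec
7/7 Test #2: rtma .............................   Passed  6013.35 sec
"""

for number, name, status, seconds in parse_ctest_summary(sample):
    print(f"#{number} {name}: {status} in {seconds} s")
```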

@RussTreadon-NOAA RussTreadon-NOAA changed the title from "Update fix submodule to bring in Rcov_crisn21" to "Update fix submodule (Rcov_crisn21) & GSI_BINARY_SOURCE_DIR" on Mar 14, 2024
@RussTreadon-NOAA
Contributor Author

Hera ctests

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr718/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_glbens
    Start 4: netcdf_fv3_regional
    Start 5: hafs_4denvar_glbens
    Start 6: hafs_3denvar_hybens
    Start 7: global_enkf
1/7 Test #7: global_enkf ......................   Passed  2595.72 sec
2/7 Test #3: rrfs_3denvar_glbens ..............   Passed  4273.25 sec
3/7 Test #1: global_4denvar ...................***Failed  4390.18 sec
4/7 Test #4: netcdf_fv3_regional ..............***Failed  4394.20 sec
5/7 Test #6: hafs_3denvar_hybens ..............   Passed  4464.69 sec
6/7 Test #5: hafs_4denvar_glbens ..............   Passed  4885.36 sec
7/7 Test #2: rtma .............................   Passed  6013.35 sec

71% tests passed, 2 tests failed out of 7

Total Test time (real) = 6013.40 sec

The following tests FAILED:
          1 - global_4denvar (Failed)
          4 - netcdf_fv3_regional (Failed)

global_4denvar failed due to

The runtime for global_4denvar_hiproc_updat is 371.105873 seconds.  This has exceeded maximum allowable threshold time of 358.092849 seconds, resulting in Failure of timethresh2 the regression test.

gsi.x wall times for the various configurations are

global_4denvar_hiproc_contrl/stdout:The total amount of wall time                        = 325.538954
global_4denvar_hiproc_updat/stdout:The total amount of wall time                        = 371.105873
global_4denvar_loproc_contrl/stdout:The total amount of wall time                        = 453.565875
global_4denvar_loproc_updat/stdout:The total amount of wall time                        = 447.536802

The wall times do not exhibit anomalous behavior. This is not a fatal fail.

netcdf_fv3_regional failed due to

The runtime for netcdf_fv3_regional_hiproc_updat is 95.267409 seconds.  This has exceeded maximum allowable threshold time of 78.094755 seconds, resulting in Failure of timethresh2 the regression test.

gsi.x wall times for the various configurations are

netcdf_fv3_regional_hiproc_contrl/stdout:The total amount of wall time                        = 62.475804
netcdf_fv3_regional_hiproc_updat/stdout:The total amount of wall time                        = 95.267409
netcdf_fv3_regional_loproc_contrl/stdout:The total amount of wall time                        = 73.202860
netcdf_fv3_regional_loproc_updat/stdout:The total amount of wall time                        = 77.570720

The hiproc_updat ran noticeably longer than hiproc_contrl (95.3 versus 62.5 seconds), while the two loproc wall times are comparable (77.6 versus 73.2 seconds). This is not a fatal fail.
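The two timethresh2 messages can be checked by hand from the wall times. The quoted thresholds equal the hiproc_contrl wall time multiplied by 1.1 for global_4denvar (325.538954 × 1.1 ≈ 358.092849) and by 1.25 for netcdf_fv3_regional (62.475804 × 1.25 = 78.094755). That arithmetic follows from the numbers above; treating it as the rule the regression script applies is an assumption. A minimal sketch, with hypothetical helper names:

```python
import re

# Matches grep output such as
#   "netcdf_fv3_regional_hiproc_contrl/stdout:The total amount of wall time   = 62.475804"
WALL_RE = re.compile(
    r"^(?P<run>\S+)/stdout:The total amount of wall time\s*=\s*(?P<sec>[\d.]+)"
)


def wall_times(grep_output):
    """Map run directory name -> wall time in seconds."""
    times = {}
    for line in grep_output.splitlines():
        match = WALL_RE.match(line.strip())
        if match:
            times[match["run"]] = float(match["sec"])
    return times


def exceeds_threshold(updat_sec, contrl_sec, factor):
    """Assumed rule: fail when updat runs longer than factor * contrl."""
    threshold = contrl_sec * factor
    return updat_sec > threshold, threshold


listing = """\
netcdf_fv3_regional_hiproc_contrl/stdout:The total amount of wall time                        = 62.475804
netcdf_fv3_regional_hiproc_updat/stdout:The total amount of wall time                        = 95.267409
netcdf_fv3_regional_loproc_contrl/stdout:The total amount of wall time                        = 73.202860
netcdf_fv3_regional_loproc_updat/stdout:The total amount of wall time                        = 77.570720
"""

times = wall_times(listing)
FACTOR = 1.25  # inferred from the quoted threshold, not taken from the script
for procs in ("hiproc", "loproc"):
    failed, limit = exceeds_threshold(
        times[f"netcdf_fv3_regional_{procs}_updat"],
        times[f"netcdf_fv3_regional_{procs}_contrl"],
        FACTOR,
    )
    print(f"{procs}: limit {limit:.6f} s, exceeded={failed}")
```

Under that assumed rule only the hiproc pair exceeds its limit, which is consistent with the single failure message reported for this test.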

@RussTreadon-NOAA
Contributor Author

Hercules ctests

Test project /work2/noaa/da/rtreadon/git/gsi/pr718/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_glbens
    Start 4: netcdf_fv3_regional
    Start 5: hafs_4denvar_glbens
    Start 6: hafs_3denvar_hybens
    Start 7: global_enkf
1/7 Test #4: netcdf_fv3_regional ..............   Passed  483.06 sec
2/7 Test #3: rrfs_3denvar_glbens ..............***Failed  543.97 sec
3/7 Test #7: global_enkf ......................   Passed  726.85 sec
4/7 Test #2: rtma .............................   Passed  965.39 sec
5/7 Test #6: hafs_3denvar_hybens ..............   Passed  1093.41 sec
6/7 Test #5: hafs_4denvar_glbens ..............   Passed  1331.44 sec
7/7 Test #1: global_4denvar ...................   Passed  1683.32 sec

86% tests passed, 1 tests failed out of 7

Total Test time (real) = 1683.51 sec

The following tests FAILED:
          3 - rrfs_3denvar_glbens (Failed)

The rrfs_3denvar_glbens failure is due to

The fv3_dynvars are reproducible
The fv3_sfcdata are reproducible
The results between the two runs (rrfs_3denvar_glbens_loproc_updat and rrfs_3denvar_glbens_hiproc_updat) are not reproducible
Thus, the case has Failed siganl of the regression tests.

Comparison of fv3_tracer between the two updat runs shows differences in the sphum field:

xaxis_1 min/max 1=1.0,396.0 min/max 2=1.0,396.0 max abs diff=0.0000000000
yaxis_1 min/max 1=1.0,232.0 min/max 2=1.0,232.0 max abs diff=0.0000000000
zaxis_1 min/max 1=1.0,65.0 min/max 2=1.0,65.0 max abs diff=0.0000000000
Time min/max 1=1.0,1.0 min/max 2=1.0,1.0 max abs diff=0.0000000000
sphum min/max 1=0.0,0.022526605 min/max 2=0.0,0.022526605 max abs diff=0.0000016253
liq_wat min/max 1=0.0,0.0025440012 min/max 2=0.0,0.0025440012 max abs diff=0.0000000000
ice_wat min/max 1=0.0,0.0005032422 min/max 2=0.0,0.0005032422 max abs diff=0.0000000000
rainwat min/max 1=0.0,0.007244442 min/max 2=0.0,0.007244442 max abs diff=0.0000000000
snowwat min/max 1=0.0,0.0068065105 min/max 2=0.0,0.0068065105 max abs diff=0.0000000000
graupel min/max 1=0.0,0.0047608074 min/max 2=0.0,0.0047608074 max abs diff=0.0000000000
water_nc min/max 1=0.0,1336478600.0 min/max 2=0.0,1336478600.0 max abs diff=0.0000000000
ice_nc min/max 1=0.0,4201064.0 min/max 2=0.0,4201064.0 max abs diff=0.0000000000
rain_nc min/max 1=0.0,613399.6 min/max 2=0.0,613399.6 max abs diff=0.0000000000
o3mr min/max 1=3.852443e-08,1.5621523e-05 min/max 2=3.852443e-08,1.5621523e-05 max abs diff=0.0000000000
liq_aero min/max 1=10846143.0,71099390000.0 min/max 2=10846143.0,71099390000.0 max abs diff=0.0000000000
ice_aero min/max 1=0.0,6482549.0 min/max 2=0.0,6482549.0 max abs diff=0.0000000000
sgs_tke min/max 1=9.934678e-05,42.890892 min/max 2=9.934678e-05,42.890892 max abs diff=0.0000000000

This is a known problem for regional ctests on Hercules. PR #698 contains changes which address this problem.
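For long comparison listings like the one above, the fields with a nonzero "max abs diff" can be picked out automatically. This is a minimal sketch fitted to the line layout shown in this comment; the function name nonzero_diff_fields is hypothetical and the sample holds only a few of the lines above.

```python
import re

# Matches lines such as
#   "sphum min/max 1=0.0,0.022526605 min/max 2=0.0,0.022526605 max abs diff=0.0000016253"
DIFF_RE = re.compile(r"^(?P<field>\S+) min/max 1=.* max abs diff=(?P<diff>[\d.eE+-]+)\s*$")


def nonzero_diff_fields(report):
    """Return (field, max_abs_diff) pairs for every field whose difference is not zero."""
    hits = []
    for line in report.splitlines():
        match = DIFF_RE.match(line.strip())
        if match and float(match["diff"]) != 0.0:
            hits.append((match["field"], float(match["diff"])))
    return hits


report = """\
zaxis_1 min/max 1=1.0,65.0 min/max 2=1.0,65.0 max abs diff=0.0000000000
sphum min/max 1=0.0,0.022526605 min/max 2=0.0,0.022526605 max abs diff=0.0000016253
liq_wat min/max 1=0.0,0.0025440012 min/max 2=0.0,0.0025440012 max abs diff=0.0000000000
o3mr min/max 1=3.852443e-08,1.5621523e-05 min/max 2=3.852443e-08,1.5621523e-05 max abs diff=0.0000000000
"""

for field, diff in nonzero_diff_fields(report):
    print(f"{field}: max abs diff {diff:.10f}")
```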

@RussTreadon-NOAA
Contributor Author

@CatherineThomas-NOAA, may I add you as a reviewer to this PR? The changes are simple:

  1. update fix submodule hash to bring in Rcov_crisn21
  2. update GSI_BINARY_SOURCE_DIR to 20240208

@CatherineThomas-NOAA
Collaborator

@RussTreadon-NOAA Yes, you can add me as a reviewer. I'll get to this today.

@RussTreadon-NOAA
Contributor Author

Thank you @CatherineThomas-NOAA

@CatherineThomas-NOAA
Collaborator

I compared the feature branch with GSI develop and found the same differences that @RussTreadon-NOAA reported: Rcov_crisn21 entries in fix/.gitignore and fix/gsi_binary_files.cmake as well as pointing to the new GSI_BINARY_SOURCE_DIR in the modulefiles.

I also compared the updated GSI_BINARY_SOURCE_DIR with the previous one on Hera, Orion, and Cactus. I found the same differences in all 3 comparisons:

  • Add crisn21 in the anavinfo file
  • New text files for the soil DA
  • Assimilate ASCAT MetOp-C in convinfo
  • Assimilate ATMS N21, CrIS N21, ABI G18, AHI H9 in satinfo (with some additional value changes for CrIS N21)

This is mostly consistent with the GSI-fix updates for that time period (20230911 to 20240208). The only inconsistency is that the text of the PR and commit mention also turning on AHI Himawari 8, but it looks like it was already on and no changes were made. @ADCollard Can you confirm that this is as expected?

@ADCollard
Contributor

@CatherineThomas-NOAA That is a mistake. It should just be Himawari-8.

@CatherineThomas-NOAA
Collaborator

Quoting @ADCollard: "@CatherineThomas-NOAA That is a mistake. It should just be Himawari-8."

@ADCollard
Only Himawari-8 should be assimilated? Or did you mean only Himawari-9 should be newly added?

@ADCollard
Contributor

@CatherineThomas-NOAA G'ah! I made the same mistake again! It should only be Himawari-9. I need more coffee!

@CatherineThomas-NOAA
Collaborator

@ADCollard Thanks for clearing things up (even if they got less clear along the way). We got there in the end :).

Contributor Author

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Looks good.

@RussTreadon-NOAA RussTreadon-NOAA merged commit a8d670c into NOAA-EMC:develop Mar 14, 2024
@RussTreadon-NOAA RussTreadon-NOAA deleted the feature/rcov_crisn21 branch March 14, 2024 15:58