Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #998 - coupling update error fix #999

Merged

Conversation

ukmo-juan-castillo
Copy link
Collaborator

@ukmo-juan-castillo ukmo-juan-castillo commented Apr 24, 2023

Pull Request Summary

In certain coupled configurations, the piece of code testing the coupling frequency to check if 'receive' coupling exchanges need to take place fail, resulting in an infinite loop causing the integration between time zero and the first time step to repeat indefinitely.

Description

The intention of this PR is to fix issue #998, which describes in detail what the problem and the proposed solution are.

  • Suggested reviewer: ukmo-ccbunney, mickaelaccensi
  • Labels that should be added: bug
  • Are answer changes expected from this PR? No, just fixing infinite loop and deadlocks in certain coupled configurations

Issue(s) addressed

Commit Message

In certain coupled configurations, the piece of code testing the coupling frequency to check if 'receive' coupling exchanges need to take place fail, resulting in an infinite loop causing the integration between time zero and the first time step to repeat indefinitely. This check needs to be rewritten, which fixes also issue #816 in a simpler way.

Check list

Testing

  • How were these changes tested?
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?)
  • Have the matrix regression tests been run (if yes, please note HPC and compiler)?
  • Please indicate the expected changes in the regression test output, (Note the list of known non-identical tests.)
  • Please provide the summary output of matrix.comp (matrix.Diff.txt, matrixCompFull.txt and matrixCompSummary.txt):

@MatthewMasarik-NOAA
Copy link
Collaborator

@ukmo-juan-castillo, thank you for this fix for coupling exchange check. I'll start the regression tests.

@MatthewMasarik-NOAA
Copy link
Collaborator

@ukmo-juan-castillo, could you please sync your branch to bring it current with develop?

@MatthewMasarik-NOAA
Copy link
Collaborator

Thank you for syncing @ukmo-juan-castillo.

@MatthewMasarik-NOAA MatthewMasarik-NOAA self-requested a review April 26, 2023 12:37
Copy link
Collaborator

@MatthewMasarik-NOAA MatthewMasarik-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review Pass

Regtests Pass

The only non-identicals are the known non-b4b's.

**********************************************************************                            
********************* non-identical cases ****************************                            
**********************************************************************                            
mww3_test_03/./work_PR1_MPI_e                     (1 files differ)                                
mww3_test_03/./work_PR3_UNO_MPI_e                     (1 files differ)                            
mww3_test_03/./work_PR2_UNO_MPI_e                     (1 files differ)                            
mww3_test_03/./work_PR2_UNO_MPI_d2                     (18 files differ)                          
mww3_test_03/./work_PR1_MPI_d2                     (10 files differ)                              
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (17 files differ)                        
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (15 files differ)                         
mww3_test_03/./work_PR3_UNO_MPI_d2                     (20 files differ)                          
mww3_test_03/./work_PR2_UQ_MPI_d2                     (16 files differ)                           
mww3_test_03/./work_PR3_UQ_MPI_e                     (1 files differ)                             
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)                          
mww3_test_03/./work_PR3_UQ_MPI_d2                     (16 files differ)                           
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)                                   
ww3_tp2.16/./work_MPI_OMPH                     (4 files differ)                                   
ww3_ufs1.3/./work_a                     (3 files differ)                                          
                                                                                                  
**********************************************************************                            
************************ identical cases *****************************                            
**********************************************************************

matrixCompSummary.txt
matrixCompFull.txt
matrixDiff.txt

This is OK to proceed from the EMC side. I'll give some time today for @ukmo-ccbunney @mickaelaccensi to comment since it deals with OASIS coupling.

Copy link
Collaborator

@ukmo-ccbunney ukmo-ccbunney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me @ukmo-juan-castillo. It's great that this has also simplified and cleaned up the code of the bugfix from the other issue too - double win!

Results of the the OASIS coupling regtests are all looking fine.

@MatthewMasarik-NOAA
Copy link
Collaborator

Changes look good to me @ukmo-juan-castillo. It's great that this has also simplified and cleaned up the code of the bugfix from the other issue too - double win!

Results of the the OASIS coupling regtests are all looking fine.

Great! Thank you for your review @ukmo-ccbunney.

@MatthewMasarik-NOAA MatthewMasarik-NOAA self-requested a review April 28, 2023 13:35
Copy link
Collaborator

@MatthewMasarik-NOAA MatthewMasarik-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has previously been reviewed (#999 (review)), and passed both the code review and regtests. @ukmo-ccbunney has given the OK with respect to the OASIS coupling (#999 (review)).

Approved.

@MatthewMasarik-NOAA MatthewMasarik-NOAA merged commit 24514ac into NOAA-EMC:develop Apr 28, 2023
@MatthewMasarik-NOAA
Copy link
Collaborator

@ukmo-juan-castillo, Thank you for providing this fix that addresses #998, as well as #816! Excellent work!

ukmo-ccbunney added a commit to ukmo-waves/WW3 that referenced this pull request Apr 28, 2023
ukmo-ccbunney added a commit to ukmo-waves/WW3 that referenced this pull request May 2, 2023
* Backport of coupling update error fix from develop.
See: NOAA-EMC#999

* Fixed typo
MatthewMasarik-NOAA added a commit to MatthewMasarik-NOAA/WW3 that referenced this pull request Jun 27, 2023
* origin/develop:
  handle NaN air-sea temperatures from nearest land points (NOAA-EMC#869)
  Increase valid_max for f in ounp (NOAA-EMC#1014)
  Bugfix deallocation of invalid memory in ww3_prnc (NOAA-EMC#1016)
  Update to orion intel module path and two typo corrections. (NOAA-EMC#1011)
  Bugfix to out of bounds array write in w3profsmd_pdlib.f90 (NOAA-EMC#1013)
  Simple logic fix for time interpolation of boundary nodes at the end of W3XYPFSNIMP. (NOAA-EMC#1005)
  in w3iors use NSEA instead of NSEAL in serial write/read of VA (NOAA-EMC#954)
  Update documentation for UNST namelist (NOAA-EMC#986)
  In certain coupled configurations, the piece of code testing the coupling frequency to check if 'receive' coupling exchanges need to take place fail, resulting in an infinite loop causing the integration between time zero and the first time step to repeat indefinitely. This check needs to be rewritten, which fixes also issue NOAA-EMC#816 in a simpler way.  (NOAA-EMC#999)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Coupling update error
3 participants