Port UFS_UTILS to S4 #543

Closed
DavidHuber-NOAA opened this issue Jun 25, 2021 · 12 comments · Fixed by #557

Comments

@DavidHuber-NOAA
Collaborator

DavidHuber-NOAA commented Jun 25, 2021

UFS_UTILS should be ported to SSEC's S4 cluster to enable GFS and DA research efforts on that system.

@GeorgeGayno-NOAA
Collaborator

As you may already know, UFS_UTILS uses the HPC-stack: https://github.com/NOAA-EMC/hpc-stack/wiki

So that should be ported first.

@DavidHuber-NOAA
Collaborator Author

Yes, I have a ported version of HPC-stack on S4. It's a little outdated (version 1.0.0), so I will be updating to version 1.1.0 before finishing the port.

@edwardhartnett
Collaborator

Since UFS_UTILS is standard Fortran, the only "porting" necessary is adding support for S4 to the CMake build, which means setting some compiler flags.

Even that is not strictly necessary, since the user can specify Fortran flags on the build command line without them being in the CMake file. So @DavidHuber-NOAA, you should be able to build the existing release on S4 just by providing the correct flags.

I have no objection to supporting S4 with additional flags in CMake to make that easier in future releases, but it should work out of the box right now.
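
As a rough illustration of passing Fortran flags on the command line rather than editing the CMake files, here is a minimal sketch. `CMAKE_Fortran_FLAGS` is a standard CMake variable; the flag values, the `..` source path, and the `print_configure_cmd` helper are placeholders assumed for illustration, not the project's recommended settings. Whether the project's own CMake logic appends to or overrides these flags is not confirmed in this thread.

```shell
# Placeholder flags: substitute whatever the S4 compiler actually needs.
FFLAGS_EXAMPLE="-O2 -g"
# Assumed location of the UFS_UTILS checkout, relative to the build directory.
SRC_DIR_EXAMPLE=".."

# Hypothetical helper: print the configure command instead of running it,
# so the flags can be reviewed before the real cmake call.
print_configure_cmd() {
  printf 'cmake -DCMAKE_Fortran_FLAGS="%s" %s\n' "$1" "$2"
}

print_configure_cmd "$FFLAGS_EXAMPLE" "$SRC_DIR_EXAMPLE"
```

Copying the printed line into a shell inside an empty build directory would run the actual configure step.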

@DavidHuber-NOAA
Collaborator Author

@GeorgeGayno-NOAA @edwardhartnett S4 does not have access to HPSS, so the gdas_init driver cannot be fully ported. However, there is some interest in being able to run chgres. Thus I'd like to write a partial driver for S4 that runs just the chgres portion and issues an error if someone attempts to extract from HPSS. Are there any qualms with this solution?

@GeorgeGayno-NOAA
Collaborator

> @GeorgeGayno-NOAA @edwardhartnett S4 does not have access to HPSS, so the gdas_init driver cannot be fully ported. However, there is some interest in being able to run chgres. Thus I'd like to write a partial driver for S4 that runs just the chgres portion and issues an error if someone attempts to extract from HPSS. Are there any qualms with this solution?

You can turn off the data extraction by setting EXTRACT_DATA=no.

@GeorgeGayno-NOAA
Collaborator

You would need to place the input data in the directory expected by the gdas_init utility.

@DavidHuber-NOAA
Collaborator Author

That should just be the EXTRACT_DIR variable in the config file, correct?

@GeorgeGayno-NOAA
Collaborator

> That should just be the EXTRACT_DIR variable in the config file, correct?

Yes, but when you untar the HPSS tarball, it will add subdirectories under EXTRACT_DIR. The utility expects the full path.

Do you plan to run the HPSS step on another machine, then push the data to S4?
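
The setup discussed above (extraction turned off, input data staged by hand) could be guarded in a driver script along these lines. This is only a sketch: `EXTRACT_DATA` and `EXTRACT_DIR` are the variable names mentioned in this thread, while `check_extract_settings` and its messages are hypothetical and not part of UFS_UTILS. It does not check the subdirectory layout that the utility expects under `EXTRACT_DIR`.

```shell
# Hypothetical guard for a machine without HPSS access.
# Succeeds only when extraction is disabled and the staging directory exists.
check_extract_settings() {
  if [ "${EXTRACT_DATA:-}" != "no" ]; then
    echo "ERROR: HPSS is not available here; set EXTRACT_DATA=no" >&2
    return 1
  fi
  if [ -z "${EXTRACT_DIR:-}" ] || [ ! -d "${EXTRACT_DIR}" ]; then
    echo "ERROR: EXTRACT_DIR must be an existing directory with pre-staged data" >&2
    return 1
  fi
  echo "OK: using pre-staged data in ${EXTRACT_DIR}"
}
```

A driver could call `check_extract_settings || exit 1` before launching the chgres step.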

@DavidHuber-NOAA
Collaborator Author

Yes, I believe that would be the plan. Although now that you ask, I'm not sure whether any of the data that comes off of HPSS is restricted. If it is, then porting this driver is pointless, since restricted data is not allowed on S4. Do you know offhand if any of the data is restricted?

@GeorgeGayno-NOAA
Collaborator

> Yes, I believe that would be the plan. Although now that you ask, I'm not sure whether any of the data that comes off of HPSS is restricted. If it is, then porting this driver is pointless, since restricted data is not allowed on S4. Do you know offhand if any of the data is restricted?

I don't think the restart files input to chgres are restricted. Only certain datasets ingested by the GSI are restricted. But I will let @KateFriedman-NOAA verify this.

@KateFriedman-NOAA
Collaborator

@GeorgeGayno-NOAA Correct, the restart files themselves aren't restricted. However, if they are in a tarball that contains even a single restricted file (e.g. a dump file), then the entire tarball will have rstprod permissions, and the user will need rstprod group permissions to access its contents. Here are the production tarballs from an example 00z cycle that end up with rstprod permissions because one or more files within them are restricted (note that the gdas restart tarball is one of them):

~> hpsstar dir /NCEPPROD/hpssprod/runhistory/rh2021/202107/20210701 | grep gfs | grep rstprod | grep _00
[connecting to hpsscore1.fairmont.rdhpcs.noaa.gov/1217]
-rw-r-----    1 nwprod    rstprod  50577001984 Jul  3 12:42 com_gfs_prod_enkfgdas.20210701_00.enkfgdas.tar
-rw-r-----    1 nwprod    rstprod   9253903360 Jul  3 07:36 com_gfs_prod_gdas.20210701_00.gdas.tar
-rw-r-----    1 nwprod    rstprod  65228770304 Jul  3 07:52 com_gfs_prod_gdas.20210701_00.gdas_restart.tar
-rw-r-----    1 nwprod    rstprod  12622054400 Jul  3 08:23 com_gfs_prod_gfs.20210701_00.gfs.tar
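
The same check can be done by inspecting the group column of the listing rather than grepping the whole line. A minimal sketch, assuming the `ls -l`-style column layout shown above (group in the fourth field, file name last); `list_rstprod` is a hypothetical helper name:

```shell
# Read an ls -l style listing on stdin and print the names of entries
# whose group (4th field) is rstprod. Lines with fewer fields, such as
# the "[connecting to ...]" banner, are skipped automatically.
list_rstprod() {
  awk '$4 == "rstprod" { print $NF }'
}

# Example use with the command from this thread:
#   hpsstar dir /NCEPPROD/hpssprod/runhistory/rh2021/202107/20210701 | list_rstprod
```

Matching on the field avoids false hits when the string "rstprod" happens to appear in a file name.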

@DavidHuber-NOAA
Collaborator Author

Alright, thanks @KateFriedman-NOAA @GeorgeGayno-NOAA. I will give this a try on Hera/S4.

DavidHuber-NOAA added a commit to DavidHuber-NOAA/UFS_UTILS that referenced this issue Aug 9, 2021
DavidHuber-NOAA added a commit to DavidHuber-NOAA/UFS_UTILS that referenced this issue Aug 9, 2021
GeorgeGayno-NOAA pushed a commit that referenced this issue Aug 9, 2021
Update the repository build for S4.

New gdas_init utility driver script for S4. Script only invokes the chgres_cube step
as S4 does not have access to HPSS.

Fixes #543.