Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create env_postprocessing.xml in case root #4742

Merged
merged 11 commits into from
Mar 5, 2025

Conversation

mnlevy1981
Copy link
Contributor

The contents of this new file are based on a config file in CUPiD (cime_config/config_tool.xml) that is comparable to a component's config_component.xml file

I will mark this as ready for review after I've had a chance to run test suites and fix any issues raised by those tests. For example, if there is no tools/CUPiD/cime_config/config_tool.xml then it raises the following error:

ERROR: Makes no sense to have empty read-only file: /glade/work/mlevy/codes/CESM/cesm3_0_alpha05d/tools/CUPiD/cime_config/config_tool.xml

[ Description of the changes in this Pull Request. It should be enough
information for someone not following this development to understand.
Lines should be wrapped at about 72 characters. Please also update
the CIME documentation, if necessary, in doc/source/rst and indicate
below if you need to have the gh-pages html regenerated.]

Test suite:
Test baseline:
Test namelist changes:
Test status: [bit for bit, roundoff, climate changing]

Fixes [CIME Github issue #]

User interface changes?:

Update gh-pages html (Y/N)?:

The contents of this new file are based on a config file in CUPiD
(cime_config/config_tool.xml) that is comparable to a component's
config_component.xml file
@@ -356,6 +358,9 @@ def read_xml(self):
self._env_entryid_files.append(
EnvWorkflow(self._caseroot, read_only=self._force_read_only)
)
self._env_entryid_files.append(
EnvPostprocessing(self._caseroot, read_only=self._force_read_only)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I think that you want to make sure POSTPROCESSING_SPEC_FILE exists before you create an EnvPostprocessing object.

Copy link
Contributor Author

@mnlevy1981 mnlevy1981 Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this on Slack, but realized I should keep this thread going as well... self.get_value("POSTPROCESSING_SPEC_FILE") is None at this point. I tried other env_case.xml variables like MODEL and got None as well, so I think get_values() doesn't work in the Case.read_xml() method. How can I get the value of POSTPROCESSING_SPEC_FILE?

Copy link
Contributor

@jedwards4b jedwards4b Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe generate the almost empty env_postprocessing.xml here and then delete it (or not) when POSTPROCESSING_SPEC_FILE is available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll see how testing goes, but I think the logic for removing env_postprocessing.xml if it is empty actually belongs in Case.flush() instead of Case.configure(); that's what d393f0c does (though I realized I spun up a new sandbox on my laptop to test this off the supercomputer and didn't run pre-commit... so there will be at least one more "fix CI" commit coming)

@jedwards4b
Copy link
Contributor

Perhaps @jasonb5 should review this and provide guidance on how it might make better use of #4271.

@billsacks
Copy link
Member

Perhaps @jasonb5 should review this and provide guidance on how it might make better use of #4271.

I was just thinking about #4271 in a different context, so thought I'd post some questions here in case there's consideration of leveraging that here:

  • Do we actually use anything from Remove provenance/model specific configuration #4271 currently in CESM? In particular, I see that @jasonb5 provided a cesm_config.py file, but I don't see that anywhere
  • If I understand the code from Remove provenance/model specific configuration #4271 correctly, it seems like it expects to find any extensions in the cime_config directory. For CESM, though, to avoid duplication between our various top-level repositories (CESM, CAM, CTSM, etc.), each of which have their own cime_config, I'm thinking we'd want to put these extensions in ccs_config, which is shared between these top-level repositories. I'm wondering about the feasibility of having that flexibility. For example, could the location of this customize location be provided in config_files.xml?

@jasonb5
Copy link
Collaborator

jasonb5 commented Feb 3, 2025

@mnlevy1981 @jedwards4b
Where do you want to use the customize feature in this PR?

@billsacks

I was just thinking about #4271 in a different context, so thought I'd post some questions here in case there's consideration of leveraging that here:

* Do we actually use anything from [Remove provenance/model specific configuration #4271](https://github.com/ESMCI/cime/pull/4271) currently in CESM? In particular, I see that @jasonb5 provided a `cesm_config.py` file, but I don't see that anywhere

While I provided cesm_config.py it isn't really needed as the default values are the CESM values. You'd only need to provide it if deviating from the defaults.

* If I understand the code from [Remove provenance/model specific configuration #4271](https://github.com/ESMCI/cime/pull/4271) correctly, it seems like it expects to find any extensions in the `cime_config` directory. For CESM, though, to avoid duplication between our various top-level repositories (CESM, CAM, CTSM, etc.), each of which have their own cime_config, I'm thinking we'd want to put these extensions in `ccs_config`, which is shared between these top-level repositories. I'm wondering about the feasibility of having that flexibility. For example, could the location of this customize location be provided in config_files.xml?

I'll need to make a change to accommodate the CESM structure for ccs_config, I'm thinking you're suggestion will be the most straight forward.

@billsacks
Copy link
Member

Thanks for your thoughts @jasonb5 . The use case I'm thinking about is to add some simple sanity / consistency checks for compsets in CESM - rules like, "with an active land model and an active ocean model, the compset needs to include a river model". Since some of these rules will probably be CESM-specific, this felt like a good use of this model-specific customization machinery that you implemented – though I'm open to feedback with that. But, from my perspective, I feel like it makes sense to wait to generalize this customization location until I get time to prototype this and see that it will actually be as straightforward as it is in my head, because I'm going to drop this idea if it turns out to be difficult. If @mnlevy1981 and @jedwards4b are considering this customization for this PR, I'd like to hear from them if they agree that ccs_config would be a better location than CESM's cime_config.

@jasonb5
Copy link
Collaborator

jasonb5 commented Feb 4, 2025

@billsacks @jgfouca A feature to provide sanity/consistency checks to compsets sounds like a great idea. I think it would be nice if we could implement a generalized framework in CIME and design it in a way where the model configuration can maintain the specific specific rules. Could you create an issue with this and we can discuss it further.

I'm curious to hear the use of customize here. If I'm understanding this correctly the goal is to provide case specific configuration to external tooling. It would be good to keep this as generalized as possible in CIME and move any model specific configuration into their respective repositories.

@mnlevy1981
Copy link
Contributor Author

Where do you want to use the customize feature in this PR?

@jasonb5 this is a tough question to answer, given my relatively infrequent contributions to CIME and my lack of familiarity with recent developments. That said, I think the big change to make to this PR is that POSTPROCESSING_SPEC_FILE should currently only be defined by CESM. If E3SM or other models want to use it later, each model will likely want a different default value for it so each model should define it (perhaps via the customize feature?).

Then maybe @jedwards4b's comment isn't about making sure the file specified in POSTPROCESSING_SPEC_FILE exists, but instead checking to see if POSTPROCESSING_SPEC_FILE has been defined (I mean, we should probably also abort if the file can't be read... but for non-CESM cases the variable itself wouldn't exist in env_case.xml)

@jasonb5
Copy link
Collaborator

jasonb5 commented Feb 4, 2025

@mnlevy1981 I like the idea of checking if POSTPROCESSING_SPEC_FILE is defined/exists rather than using customize to toggle it. This would make the transition simple for models that want to enable this feature in the future.

@TeaganKing
Copy link

Just wanted to add that this depends on the following PRs being merged:

@billsacks
Copy link
Member

A feature to provide sanity/consistency checks to compsets sounds like a great idea. I think it would be nice if we could implement a generalized framework in CIME and design it in a way where the model configuration can maintain the specific specific rules. Could you create an issue with this and we can discuss it further.

FYI for those following this tangent: I have opened a draft PR for discussion here: #4754. (Sorry for the continued tangent, but wanted to point people there for further discussion on that point.)

If POSTPROCESSING_SPEC_FILE is not found, then we do not call
env_postprocessing.add_elements_by_group(). This still creates an
env_postprocessing.xml file that only contains

<?xml version="1.0"?>
<file id="env_postprocessing.xml" version="2.0">
  <header>
      These variables may be changed anytime during a run, they
      control jobs that are dependent on case.run.
    </header>
</file>

I think we'd ideally not create this file at all?

Also, I ran pre-commit :)
@mnlevy1981
Copy link
Contributor Author

1e9b23e creates an almost-empty xml file if POSTPROCESSING_SPEC_FILE doesn't exist:

<?xml version="1.0"?>
<file id="env_postprocessing.xml" version="2.0">
  <header>
      These variables may be changed anytime during a run, they
      control jobs that are dependent on case.run.
    </header>
</file>

This fails the linting test, because there is no <group></group>. I think the ideal solution would still be to avoid creating env_postprocessing.xml if the spec file doesn't exist, but I'm not clear on how to do that given my comments in #4742 (comment)

I mentioned this on Slack, but realized I should keep this thread going as well... self.get_value("POSTPROCESSING_SPEC_FILE") is None at this point. I tried other env_case.xml variables like MODEL and got None as well, so I think get_values() doesn't work in the Case.read_xml() method. How can I get the value of POSTPROCESSING_SPEC_FILE?

If the spec file can not be found, remove env_postprocessing.xml from list of
files that CIME will create
Got a message that v2 was no longer supported, and the recommendation was to
switch to v3 or v4. I chose v3 because that is also used in docs.yml
This is now done in case.flush() since so many commands create this file (we
don't just want to remove it after creating a new case because it comes back
after running case.setup)
In Case.read_xml(), we only want to include EnvPostprocessing object if one of
the following is true:
1. The case is being created (added Case._existing_case to make it easy to
check)
2. env_postprocessing.xml exists (which should only be the case if the file in
POSTPROCESSING_SPEC_FILE exists)
pytest was raising flags in Case.flush() [maybe because of the way
make_valid_case() created a new case directory?]; I'm sure there's a far more
pythonic way to write this logic, but I need to avoid os.path.isfile(None)
In postprocessing.py, files.get_value("POSTPROCESSING_SPEC_FILE") will return
None if the model does not have this variable defined. In that case, we don't
want to call os.path.isfile(), we just want to set file_exists=False
@mnlevy1981
Copy link
Contributor Author

abbff8a was failing the E3SM testing, and it was definitely due to code I changed (I forgot to account for POSTPROCESSING_SPEC_FILE not being written to env_case.xml). b1f8a44 fixed that, still failed the E3SM test with

 =========================== short test summary info ============================
FAILED CIME/tests/test_sys_cime_case.py::TestCimeCase::test_cime_case_test_walltime_mgmt_8 - AssertionError: 1 != 0 : 
    COMMAND: unset CIME_GLOBAL_WALLTIME && /src/E3SM/cime/scripts/create_test --no-setup --non-local --machine=frontier --compiler=gnu --project e3sm SMS_P5600.f19_g16_rx1.A -t fake_testing_only_20250304_203554 --baseline-root /storage/cases/scripts_regression_test.20250304_203312/baselines --test-root=/storage/cases/scripts_regression_test.20250304_203312 --output-root=/storage/cases/scripts_regression_test.20250304_203312
    FROM_DIR: /src/E3SM/cime
    SHOULD HAVE WORKED, INSTEAD GOT STAT 1
    OUTPUT: 
    ERRPUT: ERROR: Compiler gnu not valid for machine frontier
============ 1 failed, 63 passed, 26 skipped in 1051.48s (0:17:31) =============

It looks like cime6.1.69 addresses this, so I've merged the latest master into my branch; it's going to take 20 minutes for the tests to run, but I'm going to mark this as ready for review and then go home. If the tests fail, I'll address it tonight / tomorrow but 🤞

@mnlevy1981 mnlevy1981 marked this pull request as ready for review March 4, 2025 23:15
@jedwards4b jedwards4b merged commit 71d2b71 into ESMCI:master Mar 5, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants