Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Refine example scripts for versioned schema #1056

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

slangeveld
Copy link
Contributor

@slangeveld slangeveld commented Mar 6, 2025

Resolves #1010

First stab at refining example scripts, documentation, metadata and tests now that we have moved to versioned schemas.

The following changes has been made:

  • Versioned schema metadata used by the tests will now be automatically generated as part of the example scripts
  • /examples have been refactored to reduce the amount of subfolders to hopefully make it a bit easier to get an overview
  • Added more export scripts to make sure all required metadata is exported
  • Modified existing export scripts to make sure the exported metadata is up to date
  • Updated run_examples.sh to run the required export scripts in order to update all required metadata.
  • Updated paths in the example documentation to point where the files are located after refactoring

The thought is to have the automatically generated matadata placed in the folder /share/metadata under the /examples folder. When these are updated they will use the current schema so they will always be up to date with the latest schema. As it is designed now we will therefore not keep older metadata examples, just overwrite what is currently there. The pattern of storing the metadata example files in a folder with the version name they were generated (e.g. 0.8.0) will therefore be deprecated (if we do it this way). Will keep the folder 0.8.0 until we have decided on what to do and this feature has been merged.

Todo:

  • Run the script run_examples.sh as part of the update_schema.py script to update metadata when a schema has been updated

Checklist

  • Tests added (if not, comment why): Test have been refactored to use the metadata that is automatically generated
  • Test coverage equal or up from main (run pytest with --cov=<packagename> --cov-report term-missing)
  • If not squash merging, every commit passes tests
  • Appropriate commit prefix and precise commit message used
  • All debug prints and unnecessary comments removed
  • Docstrings are correct and updated
  • Documentation is updated, if necessary
  • Latest main rebased/merged into branch
  • Added comments on this PR where appropriate to help reviewers
  • Moved issue status on project board
  • Checked the boxes in this checklist ✅

@slangeveld slangeveld marked this pull request as ready for review March 6, 2025 15:37
@slangeveld
Copy link
Contributor Author

This is a massive one, so we could maybe set up a PR review meeting to look at it

:language: yaml

|
Using fmu-dataio for post-processed data
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this as aggregation should not be something for the user to know about from the fmu-dataio side

Copy link
Collaborator

@jcrivenaes jcrivenaes Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if the aggradation you need is avaliable in Sumo. But what if the user requires a tailored new aggradation that is not covered in Sumo? I think Per Olav wrote this --> @perolavsvendsen

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just to be clear, it was me again who suggested this after this topic came up in a few different contexts)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was written prior to the central service for surface statistics. So I think it's fine to remove it 👍

(Could be that it needs to come back in a slightly different form at some point. There are lots of custom post-processing scripts floating around, and it is probably not realistic to create centralized services for all these things. In those cases, and in some future where we no longer store results on /scratch, this might have to be covered.)

@@ -92,12 +92,12 @@ of easy export of RMS volumetrics can be found here :ref:`example-export-volumes
Python script
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference example-export-volumes-rms above at line 90 (expand the codelines github is hiding) seems to no longer exist. Is this something that we should removed from the documentation or is the reference located somewhere else?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly sure it can be removed, or at least should reference standard_results/initial_inplace_volumes

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is outdated and was pointing to an example showing how to export volumes using the simplified export function for inplace_volumes. You can update it to point to the standard result documentation mentioned by Matthew 👍

@@ -45,7 +45,7 @@ def _isoformat_all_datetimes(indate):
def _metadata_examples():
return {
path.name: _isoformat_all_datetimes(_parse_yaml(path))
for path in Path(".").absolute().glob("examples/0.8.0/*.yml")
for path in Path(".").absolute().glob("examples/share/metadata/*.yml")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the automatically generated example metadata

@slangeveld

This comment was marked as resolved.

Copy link
Collaborator

@mferrera mferrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, nice job. Some initial comments here but we'll do a group review too.

Do we want keep the old files there (with the comments) as a reference for the user? It could be nice, but it will by time be out of date so might not be the best solution.

I don't think we need to. We should be able to point them to https://fmu-dataio.readthedocs.io/en/latest/datamodel/model/fmu.dataio._models.fmu_results.fmu_results.ObjectMetadata.html and https://fmu-dataio.readthedocs.io/en/latest/datamodel/model/fmu.dataio._models.fmu_results.fmu_results.CaseMetadata.html and rely on the doc strings to serve that purpose

Users themselves probably do not care about these details (and we don't want them to, since they could start trying to build scripting logic on it and calcify parts of the data model...), but consumers will. So i think it's OK to point these details out this way

@@ -28,14 +28,14 @@ Exporting fault polygons
Python script
~~~~~~~~~~~~~

.. literalinclude:: ../../examples/s/d/nn/xcase/realization-0/iter-0/rms/bin/export_faultpolygons.py
.. literalinclude:: ../../examples/realization-0/iter-0/scripts/export_polygons.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep this as fault_polygons, since there are other polygons we may want to show as an example?

Probably the most correct name is fault lines, but we don't want to conflate this with the forthcoming simple export

how `fmu-dataio` can be used for different contexts, and for different data types.

In addition, you will also find examples of metadata that `fmu-dataio` produces under the
`/share/metadata` folder. These files are used by the unit tests to test the pydantic logic.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`/share/metadata` folder. These files are used by the unit tests to test the pydantic logic.
`/share/metadata` folder. These files are used by the unit tests to test the Pydantic logic.

nittiest of nits

Comment on lines +10 to +12
* Most of the logic and data exports are done by the scripts in `realization-0`.
* The two other realizations, `realization-1` and `realization-9` are only there to export some simple
surfaces needed in order to be able to run aggregations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate this clarification

@@ -9,13 +9,13 @@
logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)

CFG = utils.yaml_load("../../fmuconfig/output/global_variables.yml")
CFG = utils.yaml_load("../fmuconfig/output/global_variables.yml")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we discussed and this is in response to feedback I gave, but I think it would actually be preferred to structure things such that this double parent directory remains. Most users just copy-paste these things, and in the normal FMU context it should always be up two parent directories

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, keep ../../ ...

@@ -0,0 +1,100 @@
# Autogenerated from global configuration.
# DO NOT EDIT THIS FILE MANUALLY!
# Machine st-linrgsxx.st.statoil.no by user nn, at 2021-06-23 07:35:54.428896, using fmu.config ver. 1.0.6.dev7+g6aee329
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Machine st-linrgsxx.st.statoil.no by user nn, at 2021-06-23 07:35:54.428896, using fmu.config ver. 1.0.6.dev7+g6aee329
# Machine host.name by user nn, at 2021-06-23 07:35:54.428896, using fmu.config ver. 1.0.6.dev7+g6aee329

We should probably remove this as well, even if it was sort-of sanitized...

vertical_domain: depth
domain_reference: msl
file:
absolute_path: /Users/SLANG/Repos/fmu-dataio/examples/iter-0/share/results/maps/aggregated_surfaces.gri
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily a bad thing, but not ideal to have our paths to end up included in documentation. Maybe there is some post processing we can do for a few fields that contain our machine stuff

Comment on lines +27 to +32
operating_system:
hostname: AC-P66X0XDL40
operating_system: macOS-15.3.1-arm64-arm-64bit
release: 24.3.0
system: Darwin
version: 'Darwin Kernel Version 24.3.0: Thu Jan 2 20:24:23 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T6031'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well

- datetime: '2025-03-06T14:56:31.006502Z'
event: created
user:
id: SLANG
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnatt would have to confirm, but we might be able to remove this one

@@ -98,7 +98,7 @@ venv.bak/
# setuptools_scm version
src/fmu/dataio/version.py
# aggregated examples directory
examples/s/d/nn/xcase/iter-0/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The /s/d/nn/xcase/... had an intention, mimicking /scratch/diskproject/nn_user/fmucase folder structure. Of course, /s/d/nn may perhaps be removed, but I do not see the point of removing xcase as case-metadata is written under this folder from fmu-dataio

Copy link
Collaborator

@mferrera mferrera Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keeping some case directory would also preserve the CFG = "../../...yml" pattern rather than single up directory, so probably some case or project idicator would be good

(Full disclosure: I suggested removing the s/d/nn directory since in GitHub it ignores all the middle directories with nothing in them, so it is effectively treated as one directory anyway)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, would be good to discuss the best way to do this in a standup or separate meeting. If keeping the xcase folder would make it easier for the users to understand as it is similar to how it will be structured in their projects, I'm also in favor for that.

Adding the xcase folder back would not preserve the CFG = "../../...yml" though, as the config file is located within the /realization-0/iter-0 folder. This path had to be changed because the script used to live inside rms/bin/.

Comment on lines +110 to +115
if __name__ == "__main__":
print("\nExporting polygons and metadata...")
export_faultlines()
export_field_region()
export_field_outline()
print("Done exporting polygons and metadata.")
Copy link
Collaborator

@jcrivenaes jcrivenaes Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In examples/realization-0/iter-0/scripts/export_faultroom_surfaces.py further above you have renamed functions to main() and also moved comments, leaving this:

if __name__ == "__main__":
     main()

To be consistent, you could call all export*() from a main() routine and hence have samme pattern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will adapt that pattern to all the export scripts

Comment on lines +35 to +38
if __name__ == "__main__":
print("\nExporting a preprocessed surface and metadata...")
export_preprocessed_surface()
print("Done exporting a preprocessed surface and metadata.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same her

@@ -22,7 +22,7 @@
logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)

CFG = ut.yaml_load("../../fmuconfig/output/global_variables.yml")
CFG = ut.yaml_load("../fmuconfig/output/global_variables.yml")
Copy link
Collaborator

@jcrivenaes jcrivenaes Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

../../ ... and other places below

@@ -0,0 +1,22 @@
,ZONE,REGION,FLUID,BULK_OIL,PORV_OIL,HCPV_OIL,STOIIP_OIL,ASSOCIATEDGAS_OIL,BULK_GAS,PORV_GAS,HCPV_GAS,GIIP_GAS,ASSOCIATEDOIL_GAS,BULK_TOTAL,PORV_TOTAL
Copy link
Collaborator

@tnatt tnatt Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is on the old volume format. We probably want to update the example to use the simplified export function for inplace_volumes 🙂 to prevent this legacy way of exporting volumes to be implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meeting notes from PR review meeting: Agreed to keep this the way it is for now and look into restructuring this when we have more simplified functions ready

main()
print("Done aggregating surfaces and exporting files and metadata.\n")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just drop the entire aggregated example ❓ Aggregations should only be done through sumo to ensure data consistency for the consumers. And the entire fmu.dataio.AggregatedData will go through a makeover in the future, so this example will need to be updated then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meeting notes from PR review meeting: Agreed to remove aggregation completely from the example documentation and only keep the minimum needed to generate the aggregation metadata file (for the tests) in the scripts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refine examples/ for versioned schemas
5 participants