Add NameConstraint with relaxed name loading #3463

bjlittle · 2019-10-14T18:25:44Z

This PR introduces the names property to both cubes and coordinates - a property that returns a tuple of the standard_name, long_name, var_name and STASH.

This property is needed to support more flexible name loading, i.e., an iris._constraints.Constraint._coordless_match that will succeed for a match on any of the names property values, rather than depending on the behaviour of the iris._cube_coord_common.CFVariableMixin.name method. Note that this also applies to iris.cube.Cube.extract and iris.cube.CubeList.extract et al.
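As a rough illustration of the relaxed matching described above, the following sketch uses a hypothetical stand-in class (`FakeCube`, which is not the iris cube) and a hypothetical helper (`relaxed_match`). It only assumes that a match succeeds when the requested name equals any value in the names tuple; it is not the iris implementation, and the example values such as the var_name `ta` are made up.

```python
class FakeCube:
    """Hypothetical stand-in for a cube; NOT the iris.cube.Cube class."""

    def __init__(self, standard_name=None, long_name=None, var_name=None,
                 attributes=None):
        self.standard_name = standard_name
        self.long_name = long_name
        self.var_name = var_name
        self.attributes = dict(attributes or {})

    @property
    def names(self):
        # Tuple of (standard_name, long_name, var_name, STASH), where the
        # STASH attribute is converted to a string when it is present.
        stash_name = self.attributes.get("STASH")
        if stash_name is not None:
            stash_name = str(stash_name)
        return (self.standard_name, self.long_name, self.var_name, stash_name)


def relaxed_match(cube, name):
    """Assumed rule: succeed if `name` equals any of the cube's names."""
    return name in cube.names


cube = FakeCube(
    standard_name="air_temperature",
    long_name="air temp",
    var_name="ta",
    attributes={"STASH": "m01s16i203"},
)
# Any one of the four names is enough for a relaxed match.
matches = [
    relaxed_match(cube, name)
    for name in ("air_temperature", "air temp", "ta", "m01s16i203")
]
```

With this rule, a cube whose standard_name is set can still be found through its long_name, var_name or STASH string.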

This PR also introduces the iris.NameConstraint for more specific control of name-related matching.

Closes #3407

lib/iris/tests/test_constraints.py

bjlittle · 2019-10-14T18:26:27Z

Requires a whatsnew entry...

rcomer

I’m curious about the inclusion of STASH in the names property. It’s a code, rather than a name. Also we have advice to use an AttributeConstraint on STASH, further down that same section of user guide.

rcomer · 2019-10-14T19:52:33Z

lib/iris/_constraints.py

+            iris.NameConstraint(standard_name='air_temperature',
+                                STASH=lambda stash: stash.item == 203)
+
+            .. note:: Name constraint names are case sensitive.


Name constraint names are case sensitive. Is it worth having an example where the lambda uses e.g. long_name.lower(), to show how you can make a case-insensitive constraint?

@rcomer Just to be clear (and perhaps this is what needs to be communicated more clearly), what I mean here is that it's:

STASH=lambda stash: stash.item == 203, and

not stash=lambda stash: stash.item == 203

It's the keyword argument key that is case sensitive, and the user can't change that.
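The point about the keyword key can be seen with plain Python, independent of iris. The sketch below uses a hypothetical function, `name_constraint`, whose keyword names mirror those in this PR; the body is invented purely for illustration. Passing `stash=` instead of `STASH=` fails because Python keyword names are case sensitive.

```python
def name_constraint(standard_name="none", long_name="none",
                    var_name="none", STASH="none"):
    """Hypothetical function; returns only the keywords the caller supplied."""
    given = dict(standard_name=standard_name, long_name=long_name,
                 var_name=var_name, STASH=STASH)
    return {key: value for key, value in given.items() if value != "none"}


# The upper-case keyword is accepted.
accepted = name_constraint(STASH=lambda stash: stash.item == 203)

# The lower-case keyword is not a parameter of the function, so Python
# raises a TypeError before the function body is even entered.
rejected = None
try:
    name_constraint(stash=lambda stash: stash.item == 203)
except TypeError as err:
    rejected = str(err)
```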

Ah OK, that makes sense.

@rcomer I've updated the doc-string in an attempt to clarify, HTH 😄

Much better, thanks 😀

pp-mo · 2019-10-15T09:40:36Z

I haven't time to look at this in detail yet.

But I want to say right away that I'm strongly opposed to the specific support for the STASH attribute.
Not just because it is format-specific, which seems pretty questionable anyway, but also because supporting UM files as a special case is highly MetOffice-centric, which I think is unhelpful to the project as a whole.
I recently proposed a similar load/save control attribute for GRIB: SciTools/iris-grib#156. So should that maybe be added in here too? I really don't think so.

rcomer · 2019-10-15T09:45:04Z

format-specific, which seems pretty questionable anyway

I was thinking that, but then I remembered that so is var_name....

stephenworsley · 2019-10-15T09:50:09Z

lib/iris/_constraints.py

+
+        """
+        self._names = names
+        super(NameConstraint, self).__init__(cube_func=self._cube_func)


Since we're in Python 3 now, I believe this can be written as super().__init__(cube_func=self._cube_func).
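For reference, a minimal sketch of the two spellings, using hypothetical classes rather than the real iris ones. In Python 3 both forms resolve to the same parent initialiser, so the zero-argument form is simply shorter.

```python
class Constraint:
    """Hypothetical base class; only stores the callable it is given."""

    def __init__(self, cube_func=None):
        self.cube_func = cube_func


class ExplicitSuper(Constraint):
    def __init__(self, **names):
        self._names = names
        # Python 2 compatible spelling: class and instance named explicitly.
        super(ExplicitSuper, self).__init__(cube_func=self._cube_func)

    def _cube_func(self, cube):
        return True


class ZeroArgSuper(Constraint):
    def __init__(self, **names):
        self._names = names
        # Python 3 only spelling: the arguments are filled in implicitly.
        super().__init__(cube_func=self._cube_func)

    def _cube_func(self, cube):
        return True


explicit = ExplicitSuper(standard_name="air_temperature")
zero_arg = ZeroArgSuper(standard_name="air_temperature")
```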

bjlittle · 2019-10-15T10:01:36Z

@pp-mo This isn't anything new. This PR is just highlighting the existing behaviour of iris that was previously implicit i.e., the iris._cube_coord_common.CFVariableMixin.name method will return the STASH if the cube doesn't have a standard_name, long_name or var_name. Therefore you can filter on STASH for loading and extracting:

>>> print(cube)
air_temperature / (K)               (latitude: 73; longitude: 96)
     Dimension coordinates:
          latitude                           x              -
          longitude                          -              x
     Scalar coordinates:
          forecast_period: 6477 hours, bound=(-28083.0, 6477.0) hours
          forecast_reference_time: 1998-03-01 03:00:00
          pressure: 1000.0 hPa
          time: 1998-12-01 00:00:00, bound=(1994-12-01 00:00:00, 1998-12-01 00:00:00)
     Attributes:
          STASH: m01s16i203
          source: Data from Met Office Unified Model
     Cell methods:
          mean within years: time
          mean over years: time
>>> cube.standard_name = None
>>> cube.extract('m01s16i203')
<iris 'Cube' of m01s16i203 / (K) (latitude: 73; longitude: 96)>

The above example, which uses iris from master (prior to this PR), highlights this.
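The fallback described here can be sketched as a plain function. This is a simplified illustration with a hypothetical name (`fallback_name`), not the iris source; the order standard_name, long_name, var_name, then STASH, and the 'unknown' default, are assumptions consistent with the behaviour discussed in this thread.

```python
def fallback_name(standard_name=None, long_name=None, var_name=None,
                  attributes=None, default="unknown"):
    """Return the first available name, falling back to STASH, then default."""
    attributes = attributes or {}
    stash = attributes.get("STASH")
    for candidate in (standard_name, long_name, var_name, stash):
        if candidate is not None:
            return str(candidate)
    return default


attrs = {"STASH": "m01s16i203"}
with_standard_name = fallback_name(standard_name="air_temperature",
                                   attributes=attrs)
# Once the standard_name is removed (and no other name is set), the STASH
# string becomes the name, which is why a string match on it can succeed.
without_names = fallback_name(attributes=attrs)
nothing_at_all = fallback_name()
```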

So you seem to be opposed to something that already exists. I'm not quite sure how that can be addressed here in this PR?

stephenworsley · 2019-10-15T10:12:37Z

As it stands, I don't think the iris.NameConstraint is able to precisely emulate the past behaviour of _coordless_match in constraints. Comparing against any one of standard_name, long_name, var_name or STASH is not the same as comparing against cube.name(). Would it make sense to add name to names or to compare against it in iris.NameConstraint?

bjlittle · 2019-10-15T10:15:53Z

@stephenworsley The iris.NameConstraint is not emulating any past behaviour, it's new behaviour.

It's allowing you to perform a conjunction of names for a constraint match against a cube. It has no association whatsoever with the _coordless_match. It's targeting a use case where the standard_name is overloaded and cubes can only be discriminated by long_name, hence the need to easily, within our constraints framework, constrain against standard_name and long_name.
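A minimal sketch of what a conjunction of names means, using plain dictionaries in place of cubes. The helper `conjunction_match` and the example long_name values are hypothetical; the only assumption carried over from the PR is that every requested name must match, and that a requested value may be a callable.

```python
def conjunction_match(cube_names, **wanted):
    """Succeed only if ALL requested names match (hypothetical helper)."""
    for key, expected in wanted.items():
        actual = cube_names.get(key)
        if callable(expected):
            # A callable is applied to the cube's value for that name.
            if not expected(actual):
                return False
        elif actual != expected:
            return False
    return True


# Two cubes sharing an overloaded standard_name, differing in long_name.
first = {"standard_name": "air_temperature", "long_name": "air temp at 1.5m"}
second = {"standard_name": "air_temperature", "long_name": "air temp at surface"}

by_standard_name = [
    conjunction_match(cube, standard_name="air_temperature")
    for cube in (first, second)
]
by_both = [
    conjunction_match(cube, standard_name="air_temperature",
                      long_name="air temp at 1.5m")
    for cube in (first, second)
]
by_callable = [
    conjunction_match(cube, long_name=lambda name: name.endswith("surface"))
    for cube in (first, second)
]
```

Constraining on the standard_name alone cannot separate the two cubes, whereas adding the long_name to the same constraint selects exactly one.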

pp-mo · 2019-10-15T10:22:44Z

cube.extract('m01s16i203')
So you seem to be opposed to something that already exists. I'm not quite sure how that can be addressed here in this PR?

Well, I didn't actually know that !

Mind you, even if it does that already, I don't have to like it.
As for changing behaviour, you are certainly doing that anyway. Isn't that exactly why we're doing this now??

pp-mo · 2019-10-15T10:26:49Z

format-specific, which seems pretty questionable anyway

I was thinking that, but then I remembered that so is var_name....

I'm not too convinced by that. var_name is a part of the Iris data model, following CF, not just a product of the format loader: any other file-format loader could set it, if that was appropriate.

I guess in a sense that "could" be true of STASH too, but I don't see how it could have an unambiguous value for any data not from a UM source.

stephenworsley · 2019-10-15T11:20:59Z

@bjlittle I was considering the acceptance criteria of #3407, specifically:

Include an example of how to load a cube in a manner that replicates the current behaviour, so users can easily retain our original functionality should they need to.

Is this backwards compatibility being addressed? I would have thought there would still need to be some way to compare against Cube.name() using a Constraint in order to be fully backwards compatible. There doesn't seem to be anywhere in iris._constraints which calls Cube.name() any more.

(On a related note, this line in the docstring probably needs changing too).

bjlittle · 2019-10-15T11:39:56Z

@stephenworsley Thanks. We don't need to be fully backwards compatible here, regardless of the demands of #3407, hence the major release. This PR is addressing several use cases, not just #3407.

Given the relaxed load/extract behaviour of this PR my expectation is that most users will see no change in behaviour. It'll just work. For those that find the relaxed behaviour results in additional cubes being returned, then iris.NameConstraint is their new best friend. It seems a very reasonable compromise to me.

If you think this isn't the case, then I'd love to know, with a concrete example to support it, if possible.

stephenworsley · 2019-10-15T12:38:20Z

@bjlittle Have you considered extracting cubes with no names whose name() is 'unknown'? It's not unreasonable to imagine a situation where you might even expect some cubes to have no names and some to have the name 'unknown' and to want to treat those two cases similarly. It may well be possible to get the same results using NameConstraint, but probably not as elegantly as with the current behaviour.
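To make the concern concrete, here is a small sketch comparing the two matching rules on bare tuples of (standard_name, long_name, var_name, STASH). It assumes that the relaxed rule is a pure membership test on the names tuple and that name() falls back to 'unknown'; both are simplifications for illustration and say nothing about what the merged code finally does.

```python
def name(names, default="unknown"):
    """Simplified name(): first non-None entry, else the default."""
    for candidate in names:
        if candidate is not None:
            return candidate
    return default


nameless = (None, None, None, None)
literally_unknown = (None, "unknown", None, None)

# Matching on name(): both cubes are selected by 'unknown'.
by_name_method = [name(n) == "unknown" for n in (nameless, literally_unknown)]

# Matching on membership in names: the nameless cube is missed.
by_names_tuple = ["unknown" in n for n in (nameless, literally_unknown)]
```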

bjlittle · 2019-10-15T21:25:49Z

lib/iris/_constraints.py

@@ -454,7 +454,7 @@ def __init__(self, **attributes):

        """
        self._attributes = attributes
-        Constraint.__init__(self, cube_func=self._cube_func)
+        super().__init__(cube_func=self._cube_func)


This one comes for free, whilst I'm here...

bjlittle · 2019-10-16T05:46:27Z

I'm going to add a convenience to NameConstraint where it will be possible to use positional arguments and/or keyword arguments, e.g., the following constraints are all equivalent:

NameConstraint(standard_name='air_temperature', long_name='air temp')
NameConstraint('air_temperature', 'air temp')
NameConstraint('air_temperature', long_name='air temp')

etc... as I can imagine users typically performing a NameConstraint(None, 'air temp') type of constraint.
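The equivalence follows directly from Python's argument binding, as the sketch below shows with a hypothetical function (`name_constraint_spec`) that only records its arguments. The parameter order and the 'none' string defaults follow the signature quoted in this PR; reading 'none' as "not specified", so that an explicit None can mean "this name must be unset", is an assumption made for illustration.

```python
from collections import namedtuple

# Hypothetical record of what the caller asked for.
Spec = namedtuple("Spec", ["standard_name", "long_name", "var_name", "STASH"])


def name_constraint_spec(standard_name="none", long_name="none",
                         var_name="none", STASH="none"):
    """Record the arguments; positional and keyword forms bind identically."""
    return Spec(standard_name, long_name, var_name, STASH)


keywords_only = name_constraint_spec(standard_name="air_temperature",
                                     long_name="air temp")
positional_only = name_constraint_spec("air_temperature", "air temp")
mixed = name_constraint_spec("air_temperature", long_name="air temp")

# An explicit None is distinct from the 'none' default.
unset_standard_name = name_constraint_spec(None, "air temp")
```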

bjlittle · 2019-10-17T01:35:21Z

@lbdreyer I'm now satisfied with this PR. It meets a thorny brief in an area of iris that I would consider heavy with technical debt, and a slightly dubious/not-fully-considered strategy for name constraint matching.

The changes in this PR provide the user with more explicit and flexible control to load and extract cubes.

On reflection, we have been relying on a very weak contract to identify cubes through the iris.cube.Cube.name() method. A richer form of this is the iris.cube.Cube.names property, which comes closer to identifying a cube through metadata - it's not perfect, but it's certainly better. The iris.NameConstraint embodies this within the context of our constraints framework.

Anyways, over to you. I'm biased, but I'd vote for banking this PR, moving forward, and supplementing it with further documentation and/or doc-strings if you deem that further clarity is required.

lbdreyer · 2019-10-21T13:32:55Z

lib/iris/_cube_coord_common.py

+        stash_name = self.attributes.get('STASH')
+        if stash_name is not None:
+            stash_name = str(stash_name)
+        return (standard_name, long_name, var_name, stash_name)


I'm not sure about the addition of this names method on CFVariableMixin.

stash_name makes sense for a cube but not a coordinate (or cell measure or ancillary variable etc.)

The introduction of the names property is in total alignment with the existing name method and its behaviour.

The name method also doesn't make sense for anything other than a cube, and yet we have it applied to coordinates and elsewhere. So your argument also applies to that, and yet it still stands. This PR is highlighting existing behaviour already available in iris - to the surprise of @pp-mo.

So either we honour this relationship, or we remove the use of STASH from both the names property and the name method. If we did, then that would be a breaking change for the API, rather than an additive change.

Note that the names property is only public because it's referenced from within the iris.Constraint._coordless_match method.

lbdreyer · 2019-10-21T13:36:52Z

lib/iris/_constraints.py

+
+    """
+    def __init__(self, standard_name='none', long_name='none',
+                 var_name='none', STASH='none'):


The trouble is that this does introduce another way of constraining on a stash code, which could now be done either by NameConstraint or AttributeConstraint, which goes against the zen of Python.

Again, this is in alignment with the name method. As it stands, it's completely possible to perform a stash constraint at least four other ways that I know of, excluding this PR, without thinking about it too much. IMHO iris is so anti-zen here already, that really the whole interface needs a re-design.

That said, the NameConstraint gives explicit control back to the user, so it does have value, and it also rights some wrongs to boot. The question that really needs answering here is whether stash should be part of the name method, names property and NameConstraint?

I know what my answer is, but I'm keen to know what others think... and in formulating an answer I suspect the conflict will come between a purist zen of Python philosophy and user convenience to avoid an ugly iris constraints framework.

pp-mo

A bunch of small suggestions, nothing earthshaking 📳

docs/iris/src/userguide/loading_iris_cubes.rst

lib/iris/_constraints.py

lib/iris/tests/test_constraints.py

lib/iris/_constraints.py

lib/iris/tests/unit/constraints/test_NameConstraint.py

lib/iris/tests/unit/cube_coord_common/test_CFVariableMixin.py

lib/iris/tests/test_constraints.py

pp-mo · 2019-11-15T15:31:06Z

Thanks @bjlittle.
Over to you 🎾

bjlittle · 2019-11-20T10:02:53Z

@pp-mo Serviced requested review actions or commented on why not.

Back to you 👍

lib/iris/_constraints.py

pp-mo · 2019-11-20T14:00:10Z

Nice, definitely improved.
Thanks @bjlittle, back to you again.

Just a couple of outstanding queries. And one brand new one...
Well, we must be close ! 🤞

bjlittle · 2019-11-20T14:10:32Z

@pp-mo Okay, this PR is nearing saturation point for discovering what new actions you want me to take...there is a lot of noise.

Okay, I've tackled a couple of the issues that you raised. If you've raised more and I've not addressed them, then that's simply because I can't find them, sorry.

BTW I've just nuked the additional note in the doc-string - on reflection, I really don't think it's necessary nor appropriate.

Please let me know if there are more outstanding actions to perform or if I've missed something.

Thanks!

bjlittle · 2019-11-20T14:28:44Z

@pp-mo Sweet 🎉

Thanks for persevering... above and beyond! 👏

stickler-ci reviewed Oct 14, 2019

lib/iris/tests/test_constraints.py

bjlittle added Release: Major Type: Enhancement labels Oct 14, 2019

bjlittle added this to the v3.0.0 milestone Oct 14, 2019

rcomer reviewed Oct 14, 2019

stephenworsley reviewed Oct 15, 2019

bjlittle force-pushed the relax-name-loading-and-add-name-contraint branch from 48e7e76 to a759170 Compare October 15, 2019 10:24

bjlittle commented Oct 15, 2019

bjlittle force-pushed the relax-name-loading-and-add-name-contraint branch from bb75f43 to 258b3ab Compare October 17, 2019 00:57

bjlittle assigned lbdreyer Oct 17, 2019

lbdreyer removed their assignment Oct 17, 2019

lbdreyer reviewed Oct 21, 2019

pp-mo reviewed Nov 15, 2019

bjlittle added 12 commits November 20, 2019 05:48

Add NameConstraint with relaxed name loading

ad34259

appease the god of stickler-ci

58f12f6

add whatsnew newfeature entries

9d24096

review actions

a25e8e4

review actions

3a4f9c2

review actions

d73621a

make NameConstraint convenient

6d1b42b

Fix NameConstraint doc-string

d6b7db2

Add new license header to new files

b52f7c3

Names property returns namedtuple

71e7929

fix manual typo from rebase and blackify

7073811

review actions

1e10bda

bjlittle force-pushed the relax-name-loading-and-add-name-contraint branch from 7ba58bb to 1e10bda Compare November 20, 2019 09:40

add _coordless_match comment

c3ce008

pp-mo mentioned this pull request Nov 20, 2019

Re-order constraints topics in User Guide #3553

Closed

pp-mo reviewed Nov 20, 2019

lib/iris/_constraints.py

further review actions

b535c77

pp-mo merged commit 2d75d7f into SciTools:master Nov 20, 2019

bjlittle deleted the relax-name-loading-and-add-name-contraint branch December 6, 2019 14:48

trexfeathers pushed a commit to trexfeathers/iris that referenced this pull request Jan 14, 2020

Add NameConstraint with relaxed name loading (SciTools#3463)

fdad3dc

pp-mo pushed a commit to pp-mo/iris that referenced this pull request Jan 14, 2020

Add NameConstraint with relaxed name loading (SciTools#3463)

e9fcace

pp-mo pushed a commit to pp-mo/iris that referenced this pull request Jan 14, 2020

Add NameConstraint with relaxed name loading (SciTools#3463)

32875ed
