Add NameConstraint with relaxed name loading #3463

Merged
26 changes: 22 additions & 4 deletions docs/iris/src/userguide/loading_iris_cubes.rst
@@ -166,18 +166,36 @@ As we have seen, loading the following file creates several Cubes::
cubes = iris.load(filename)

Specifying a name as a constraint argument to :py:func:`iris.load` will mean
only cubes with a matching :meth:`name <iris.cube.Cube.name>`
only cubes with a matching :attr:`name <iris.cube.Cube.names>`
will be returned::

filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, 'specific_humidity')
cubes = iris.load(filename, 'surface_altitude')

To constrain the load to multiple distinct constraints, a list of constraints
Note that the provided name will match against any of the standard name,
long name, NetCDF variable name or STASH metadata of a cube. Therefore, the
previous example using the ``surface_altitude`` standard name constraint can
also be achieved using the STASH value of ``m01s00i033``::

filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, 'm01s00i033')

If finer control over name constraints is required, i.e. to constrain
against a combination of standard name, long name, NetCDF variable name and/or
STASH metadata, consider using :class:`iris.NameConstraint`. For example,
to constrain against both a standard name of ``surface_altitude`` **and** a STASH
of ``m01s00i033``::

filename = iris.sample_data_path('uk_hires.pp')
constraint = iris.NameConstraint(standard_name='surface_altitude', STASH='m01s00i033')
cubes = iris.load(filename, constraint)

To constrain the load to multiple distinct constraints, a list of constraints
can be provided. This is equivalent to running load once for each constraint
but is likely to be more efficient::

filename = iris.sample_data_path('uk_hires.pp')
cubes = iris.load(filename, ['air_potential_temperature', 'specific_humidity'])
cubes = iris.load(filename, ['air_potential_temperature', 'surface_altitude'])

The :class:`iris.Constraint` class can be used to restrict coordinate values
on load. For example, to constrain the load to match
@@ -0,0 +1 @@
* The :class:`~iris.NameConstraint` provides richer name constraint matching when loading or extracting against cubes, by supporting a constraint against any combination of ``standard_name``, ``long_name``, NetCDF ``var_name`` and ``STASH`` from the attributes dictionary of a :class:`~iris.cube.Cube`.
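The matching rule described in this entry can be sketched without Iris installed. The following is a minimal illustration only, not the library's implementation: `name_constraint_matches` and `fake_cube` are hypothetical names, and the STASH value is modelled as a plain string rather than a real STASH object.

```python
from types import SimpleNamespace

# Sentinel meaning "this name was not supplied". None cannot be used
# because None is itself a legitimate value to match against.
_UNSET = "none"


def name_constraint_matches(cube, **expected):
    """Return True only if every supplied name matches the cube (hypothetical helper)."""
    for key, want in expected.items():
        if want == _UNSET:
            continue
        # STASH lives in the attributes dictionary; the other names are
        # plain attributes on the cube-like object.
        if key == "STASH":
            actual = cube.attributes.get("STASH")
        else:
            actual = getattr(cube, key)
        if callable(want):
            # None is never passed through to a callable.
            ok = actual is not None and bool(want(actual))
        else:
            ok = want == actual
        if not ok:
            return False
    return True


# A stand-in for a cube, assumed for illustration only.
fake_cube = SimpleNamespace(
    standard_name="surface_altitude",
    long_name=None,
    var_name=None,
    attributes={"STASH": "m01s00i033"},
)

both = name_constraint_matches(
    fake_cube, standard_name="surface_altitude", STASH="m01s00i033"
)
unset_long_name = name_constraint_matches(fake_cube, long_name=None)
callable_on_none = name_constraint_matches(
    fake_cube, long_name=lambda name: "alt" in name
)
```

Under these assumptions a constraint succeeds only when all supplied names agree, and a callable is skipped (treated as a non-match) when the cube's value is None.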
@@ -0,0 +1 @@
* Cubes and coordinates now have a new ``names`` property that contains a tuple of the ``standard_name``, ``long_name``, NetCDF ``var_name``, and ``STASH`` attributes metadata.
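As a rough illustration of the shape of this property, the sketch below builds the same four-field named tuple outside of Iris. The helper `build_names` is hypothetical, and the STASH attribute is assumed to be anything that `str()` can render.

```python
from collections import namedtuple

# Same field order as the container added in this PR.
Names = namedtuple("Names", ["standard_name", "long_name", "var_name", "STASH"])


def build_names(standard_name, long_name, var_name, attributes):
    """Assemble a Names tuple from name metadata (hypothetical helper)."""
    stash = attributes.get("STASH")
    if stash is not None:
        # Store the STASH code as a string so it compares with string names.
        stash = str(stash)
    return Names(standard_name, long_name, var_name, stash)


names = build_names("surface_altitude", None, None, {"STASH": "m01s00i033"})
```

Because the result is a tuple, a membership test such as `"m01s00i033" in names` checks all four fields at once, while the fields remain accessible by name.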
@@ -0,0 +1 @@
* Name constraint matching against cubes during loading or extracting has been relaxed from strictly matching against the :meth:`~iris.cube.Cube.name`, to matching against any of the ``standard_name``, ``long_name``, NetCDF ``var_name``, or ``STASH`` attributes metadata of a cube.
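The relaxed rule amounts to a membership test over the four names, with a fallback comparison against the cube's name for the "unknown" default case. A minimal sketch, in which `relaxed_name_match` is a hypothetical function and the tuples stand in for a cube's name metadata:

```python
def relaxed_name_match(wanted, names, fallback_name):
    """Match against any of the names, or the fallback name (hypothetical helper).

    names is a tuple of (standard_name, long_name, var_name, STASH).
    """
    return wanted in names or wanted == fallback_name


names = ("surface_altitude", None, None, "m01s00i033")

by_standard_name = relaxed_name_match("surface_altitude", names, "surface_altitude")
by_stash = relaxed_name_match("m01s00i033", names, "surface_altitude")
no_match = relaxed_name_match("air_temperature", names, "surface_altitude")

# With no name metadata at all, only the fallback comparison can succeed.
unnamed = relaxed_name_match("unknown", (None, None, None, None), "unknown")
```

The fallback comparison matters because a cube with no name metadata reports a default name that does not appear in any of the four fields.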
2 changes: 2 additions & 0 deletions lib/iris/__init__.py
@@ -117,6 +117,7 @@ def callback(cube, field, filename):
"save",
"Constraint",
"AttributeConstraint",
"NameConstraint",
"sample_data_path",
"site_configuration",
"Future",
@@ -127,6 +128,7 @@ def callback(cube, field, filename):

Constraint = iris._constraints.Constraint
AttributeConstraint = iris._constraints.AttributeConstraint
NameConstraint = iris._constraints.NameConstraint


class Future(threading.local):
113 changes: 110 additions & 3 deletions lib/iris/_constraints.py
@@ -35,7 +35,8 @@ def __init__(self, name=None, cube_func=None, coord_values=None, **kwargs):
Args:

* name: string or None
If a string, it is used as the name to match against Cube.name().
If a string, it is used as the name to match against the
`~iris.cube.Cube.names` property.
* cube_func: callable or None
If a callable, it must accept a Cube as its first and only argument
and return either True or False.
@@ -128,7 +129,9 @@ def _coordless_match(self, cube):
"""
match = True
if self._name:
match = self._name == cube.name()
# Also check against cube.name() to cover the fallback "unknown"
# default case, when there is no name metadata available.
match = self._name in cube.names or self._name == cube.name()
if match and self._cube_func:
match = self._cube_func(cube)
return match
@@ -466,7 +469,7 @@ def __init__(self, **attributes):

"""
self._attributes = attributes
Constraint.__init__(self, cube_func=self._cube_func)
super().__init__(cube_func=self._cube_func)
Member Author

This one comes for free, whilst I'm here...


def _cube_func(self, cube):
match = True
@@ -490,3 +493,107 @@ def _cube_func(self, cube):

def __repr__(self):
return "AttributeConstraint(%r)" % self._attributes


class NameConstraint(Constraint):
"""Provides a simple Cube name based :class:`Constraint`."""

def __init__(
self,
standard_name="none",
long_name="none",
var_name="none",
STASH="none",
):
"""
Provides a simple Cube name based :class:`Constraint`, which matches
against each of the names provided. These may be any combination of
standard name, long name, NetCDF variable name and the STASH from the
attributes dictionary.

The name constraint will only succeed if *all* of the provided names
match.

Kwargs:
* standard_name:
A string or callable representing the standard name to match
against.
* long_name:
A string or callable representing the long name to match against.
* var_name:
A string or callable representing the NetCDF variable name to match
against.
* STASH:
A string or callable representing the UM STASH code to match
against.

.. note::
The default value of each of the keyword arguments is the string
"none", rather than the singleton None, as None may be a legitimate
value to be matched against e.g., to constrain against all cubes
where the standard_name is not set, then use standard_name=None.

.. note::
The None value will not be passed through to a callable. Instead
use the "<name>=None" pattern.

Returns:
* Boolean

Example usage::

iris.NameConstraint(long_name='air temp', var_name=None)

iris.NameConstraint(long_name=lambda name: 'temp' in name)

iris.NameConstraint(standard_name='air_temperature',
STASH=lambda stash: stash.item == 203)

"""
self.standard_name = standard_name
self.long_name = long_name
self.var_name = var_name
self.STASH = STASH
self._names = ("standard_name", "long_name", "var_name", "STASH")
super().__init__(cube_func=self._cube_func)

def _cube_func(self, cube):
def matcher(target, value):
if callable(value):
result = False
if target is not None:
#
# Don't pass None through into the callable. Users should
# use the "name=None" pattern instead. Otherwise, users
# will need to explicitly handle the None case, which is
# unnecessary and pretty darn ugly e.g.,
#
# lambda name: name is not None and name.startswith('ick')
#
result = value(target)
else:
result = value == target
return result

match = True
for name in self._names:
expected = getattr(self, name)
if expected != "none":
if name == "STASH":
actual = cube.attributes.get(name)
else:
actual = getattr(cube, name)
match = matcher(actual, expected)
# Make this a short-circuit match.
if match is False:
break

return match

def __repr__(self):
names = []
for name in self._names:
value = getattr(self, name)
if value != "none":
names.append("{}={!r}".format(name, value))
return "{}({})".format(self.__class__.__name__, ", ".join(names))
42 changes: 42 additions & 0 deletions lib/iris/_cube_coord_common.py
@@ -4,6 +4,8 @@
# See COPYING and COPYING.LESSER in the root of the repository for full
# licensing details.


from collections import namedtuple
import re

import cf_units
@@ -15,6 +17,30 @@
_TOKEN_PARSE = re.compile(r"""^[a-zA-Z0-9][\w\.\+\-@]*$""")


class Names(
namedtuple("Names", ["standard_name", "long_name", "var_name", "STASH"])
):
"""
Immutable container for name metadata.

Args:

* standard_name:
A string representing the CF Conventions and Metadata standard name, or
None.
* long_name:
A string representing the CF Conventions and Metadata long name, or
None.
* var_name:
A string representing the associated NetCDF variable name, or None.
* STASH:
A string representing the `~iris.fileformats.pp.STASH` code, or None.

"""

__slots__ = ()


def get_valid_standard_name(name):
# Standard names are optionally followed by a standard name
# modifier, separated by one or more blank spaces
@@ -181,6 +207,22 @@ def _check(item):

return result

@property
def names(self):
"""
A tuple containing all of the metadata names. This includes the
standard name, long name, NetCDF variable name, and attributes
STASH name.

"""
standard_name = self.standard_name
long_name = self.long_name
var_name = self.var_name
stash_name = self.attributes.get("STASH")
if stash_name is not None:
stash_name = str(stash_name)
return Names(standard_name, long_name, var_name, stash_name)

def rename(self, name):
"""
Changes the human-readable name.