Analysis percentile method - update applicability test for fast_percentile_method #3301

bayliffe · 2019-04-03T10:31:44Z

Since Iris 2 was introduced all cube data arrays have become MaskedArray type, though by default no mask is set. The fast_percentile_method implemented in analysis uses the test isMaskedArray to determine if the input cube is masked, which precludes use of the fast percentile method, however this is overly restrictive. The now default MaskedArray type used for cubes does not by default have a mask set (instead it is simply mask=False). If no mask is set, the faster numpy method works fine and remains orders of magnitude faster that the scipy equivalent.

This PR changes the test to use np.ma.is_masked which differentiates between a MaskedArray with and without an applied mask. This enables the use of the fast_percentile_method once again for most cubes.

Tests have been added to cover the case of using a MaskedArray with no set mask. The functions used in the unit tests have also been modified for all tests to check that the returned cube.data type matches that which was passed in.

rcomer · 2019-08-28T14:58:47Z

Hi @bayliffe, at least some of your Travis test failures are things that have been fixed in the master branch since you submitted this. Can I suggest you try rebasing to pick up the changes?

bjlittle · 2019-10-03T17:42:26Z

Hey @bayliffe, thanks for this PR. Would you mind rebasing?

… restrictive. This numpy method will work with arrays of masked type is no mask is set.

bayliffe · 2019-10-04T07:26:32Z

@bjlittle
PR rebased, and now travis is my friend. Thanks for the prompt (sorry I missed the previous one).

pp-mo · 2019-10-14T16:21:47Z

@bjlittle @rcomer from what I see this can be merged,
but perhaps we were all waiting for the others ??

rcomer · 2019-10-14T17:36:09Z

@pp-mo I have not read through in detail so, if you have, please go ahead. I’m looking forward to taking the workarounds out of my user code 😀

bjlittle · 2019-10-17T14:21:37Z

@pp-mo @rcomer There's a bigger question here, as to whether we deal with this in a systemic way across iris i.e., should all masked arrays with no mask points automatically be converted to ordinary arrays?

I'd pretty much say yes, given the poor performance of masked arrays, see SciTools/iris-agg-regrid#37

Anyways, regardlessly, I'm banking this... Awesome thanks @bayliffe 👍

rcomer · 2019-10-17T15:00:26Z

@bjlittle I guess I’d want to understand the rationale for the NetCDF4 folks making this change, and then decide if I agree with it!

I don’t know what iris-agg-regrid does, but we have a similar check in the area weighted regridder. Fortunately that one already uses is_masked.

bjlittle · 2019-10-17T16:12:21Z

@rcomer See https://github.com/Unidata/netcdf4-python and the associated news for release 1.4.0. There have been many breaking changes IMHO in minor releases which have caused issues to the community. I don't think we can reasonably expect netcdf4-python to change or back out certain behaviour that we take issue with... Well I don't think so.

rcomer · 2019-10-17T18:41:39Z

@bjlittle sorry, I wasn’t clear. I meant that if their rationale for this behaviour makes sense, then it might also make sense for Iris and therefore we shouldn’t override it. Wasn’t thinking we could ask them to revert it.

rcomer · 2019-10-17T19:52:35Z

Possibly related: #2046 #2856

bjlittle · 2019-10-17T20:29:37Z

@rcomer I'm assuming their rational is related to consistency of the type of array exposed to users i.e., no need to check if it's masked or not - everything is just masked. That sounds attractive, as you don't need to "if" and "but" within application code... however, this rational is independent to performance. If I'm guessing right, they bought in to the utility of consistency, but at the cost of perhaps being (4-5 times) slower, but surely unbeknownst to them.

Having such consistency at the cost of performance is a pretty hard sell... if only I got a penny from every scientist who wanted their Python code to run faster 🤪

…ask_test Analysis percentile method - update applicability test for fast_percentile_method

* Bump version to 2.4.0rc0. * unpin mpl (#3468) * Merge pull request #3301 from bayliffe/fastpercentilemethod_mask_test Analysis percentile method - update applicability test for fast_percentile_method * Have Travis test with iris-grib, remove problem tests (#3469) * Have Travis test with iris-grib, remove problem tests * mock GribInternalError correctly * Update license headers * account for changes in handling of grib message defaults * Test against the latest version of python-eccodes * Moved irir-grib skip to iris.tests * Merge pull request #2608 from cpelley/PICKLEABLE_FORMATS TEST: Extends #2569 to unpickle * _regrid_area_weighted_array: Move axes creation over which weights are calculated to before loop (#3519) * Purge iris.experimental.regrid np<1.7 support (#3539) * Add NameConstraint with relaxed name loading (#3463) * _regrid_area_weighted_array: move indices variable nearer to use (#3564) * _regrid_area_weighted_array: Tweak variable order to near other use in code (#3571) * Fix problems with export and echo command. (#3577) * Pushdocs fix2 (#3580) * Revert to single-line command for doctr invocation. * Added script comment, partly to force Github respin. * Bracketed six.moves and __future__ imports. * Fixes required due to the release of iris-grib v0.15.0 (#3582) * Fix python-eccodes pin in travis (#3593) * PI-2472: Optimise the area weighted regridding routine (#3598) * PI-2472: Tweak area weighting regrid move xdim and ydim axes (#3594) * _regrid_area_weighted_array: Set axis order to y_dim, x_dim last dimensions * _regrid_area_weighted_array: Extra tests for axes ordering * PI-2472: Tweak area weighting regrid enforce xdim ydim (#3595) * _regrid_area_weighted_array: Set axis order to y_dim, x_dim last dimensions * _regrid_area_weighted_array: Extra tests for axes ordering * _regrid_area_weighted_array: Ensure x_dim and y_dim * PI-2472: Tweak area weighting regrid move averaging out of loop (#3596) * _regrid_area_weighted_array: Refactor weights and move averaging outside loop * Pin pillow to make graphics tests work again. (#3630) * PI-2472: Area weighted regridding (#3623) * The Area-weights routine is refactored into the "__prepare" and "__perform" structure, in-line with our other regridding methods. * The area-weights are now computed in the "__prepare", so are calculated once. * The information required for checking the regridding weights and target grid info are now cached on the "regridder" object. * This is inline with the general use already described in the documentation. * pep8 conformance. * pep8 compliance. * Allow some 'key=None' args in Geostationary creation. (#3628) LGTM * Allow some 'key=None' args in Geostationary creation. * Integration test loading netcdf geostationary without offset properties. * pep8 conformance. * pep8 conformance. * pep8 conformance. * test_NameConstraint get mock from iris.tests. * Remove use of super() in _constraints.py for Py2 compatibility. * Updated license headers. * Updated iris-grib reference in extensions.txt. * Py2 support for iris-grib in Travis. * Updates for auto docs for Iris 2.4 release. * What's new entry to unpinning mpl. * Edited Py2 support for iris-grib in Travis. * Renamed whatsnew contributions folder for v2.4. * Hacked tests.integration.test_grib2 to avoid import error from iris-grib version < 0.15. * Only test grib with python 3. * Compiled v2.4 whatsnew. Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> Co-authored-by: Bill Little <bill.little@metoffice.gov.uk> Co-authored-by: stephenworsley <49274989+stephenworsley@users.noreply.github.com> Co-authored-by: abooton <anna.booton@metoffice.gov.uk> Co-authored-by: lbdreyer <lbdreyer@users.noreply.github.com> Co-authored-by: Emma Hogan <ehogan@users.noreply.github.com>

bayliffe added 2 commits October 4, 2019 08:05

Modifies the fast_percentile_method test for masked arrays to be less…

1854796

… restrictive. This numpy method will work with arrays of masked type is no mask is set.

Update license date in file header.

1f93de2

bayliffe force-pushed the fastpercentilemethod_mask_test branch from 93bb244 to 1f93de2 Compare October 4, 2019 07:06

bjlittle self-assigned this Oct 4, 2019

bjlittle added Release: Major Type: Performance labels Oct 4, 2019

bjlittle added this to the v3.0.0 milestone Oct 4, 2019

bjlittle merged commit f0d4a1d into SciTools:master Oct 17, 2019

bjlittle mentioned this pull request Oct 17, 2019

Masked array performance #3470

Closed

6 tasks

trexfeathers pushed a commit to trexfeathers/iris that referenced this pull request Jan 13, 2020

Merge pull request SciTools#3301 from bayliffe/fastpercentilemethod_m…

f6a209d

…ask_test Analysis percentile method - update applicability test for fast_percentile_method

pp-mo pushed a commit to pp-mo/iris that referenced this pull request Jan 14, 2020

Merge pull request SciTools#3301 from bayliffe/fastpercentilemethod_m…

a7407a9

…ask_test Analysis percentile method - update applicability test for fast_percentile_method

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Analysis percentile method - update applicability test for fast_percentile_method #3301

Analysis percentile method - update applicability test for fast_percentile_method #3301

bayliffe commented Apr 3, 2019 •

edited

Loading

rcomer commented Aug 28, 2019

bjlittle commented Oct 3, 2019

bayliffe commented Oct 4, 2019

pp-mo commented Oct 14, 2019

rcomer commented Oct 14, 2019

bjlittle commented Oct 17, 2019

rcomer commented Oct 17, 2019

bjlittle commented Oct 17, 2019

rcomer commented Oct 17, 2019

rcomer commented Oct 17, 2019

bjlittle commented Oct 17, 2019 •

edited

Loading

Analysis percentile method - update applicability test for fast_percentile_method #3301

Analysis percentile method - update applicability test for fast_percentile_method #3301

Conversation

bayliffe commented Apr 3, 2019 • edited Loading

rcomer commented Aug 28, 2019

bjlittle commented Oct 3, 2019

bayliffe commented Oct 4, 2019

pp-mo commented Oct 14, 2019

rcomer commented Oct 14, 2019

bjlittle commented Oct 17, 2019

rcomer commented Oct 17, 2019

bjlittle commented Oct 17, 2019

rcomer commented Oct 17, 2019

rcomer commented Oct 17, 2019

bjlittle commented Oct 17, 2019 • edited Loading

bayliffe commented Apr 3, 2019 •

edited

Loading

bjlittle commented Oct 17, 2019 •

edited

Loading