Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analysis percentile method - update applicability test for fast_percentile_method #3301

Merged
merged 2 commits into from
Oct 17, 2019

Conversation

bayliffe
Copy link
Contributor

@bayliffe bayliffe commented Apr 3, 2019

Since Iris 2 was introduced all cube data arrays have become MaskedArray type, though by default no mask is set. The fast_percentile_method implemented in analysis uses the test isMaskedArray to determine if the input cube is masked, which precludes use of the fast percentile method, however this is overly restrictive. The now default MaskedArray type used for cubes does not by default have a mask set (instead it is simply mask=False). If no mask is set, the faster numpy method works fine and remains orders of magnitude faster that the scipy equivalent.

This PR changes the test to use np.ma.is_masked which differentiates between a MaskedArray with and without an applied mask. This enables the use of the fast_percentile_method once again for most cubes.

Tests have been added to cover the case of using a MaskedArray with no set mask. The functions used in the unit tests have also been modified for all tests to check that the returned cube.data type matches that which was passed in.

@rcomer
Copy link
Member

rcomer commented Aug 28, 2019

Hi @bayliffe, at least some of your Travis test failures are things that have been fixed in the master branch since you submitted this. Can I suggest you try rebasing to pick up the changes?

@bjlittle
Copy link
Member

bjlittle commented Oct 3, 2019

Hey @bayliffe, thanks for this PR. Would you mind rebasing?

… restrictive. This numpy method will work with arrays of masked type is no mask is set.
@bayliffe bayliffe force-pushed the fastpercentilemethod_mask_test branch from 93bb244 to 1f93de2 Compare October 4, 2019 07:06
@bayliffe
Copy link
Contributor Author

bayliffe commented Oct 4, 2019

@bjlittle
PR rebased, and now travis is my friend. Thanks for the prompt (sorry I missed the previous one).

@bjlittle bjlittle self-assigned this Oct 4, 2019
@bjlittle bjlittle added this to the v3.0.0 milestone Oct 4, 2019
@pp-mo
Copy link
Member

pp-mo commented Oct 14, 2019

@bjlittle @rcomer from what I see this can be merged,
but perhaps we were all waiting for the others ??

@rcomer
Copy link
Member

rcomer commented Oct 14, 2019

@pp-mo I have not read through in detail so, if you have, please go ahead. I’m looking forward to taking the workarounds out of my user code 😀

@bjlittle
Copy link
Member

@pp-mo @rcomer There's a bigger question here, as to whether we deal with this in a systemic way across iris i.e., should all masked arrays with no mask points automatically be converted to ordinary arrays?

I'd pretty much say yes, given the poor performance of masked arrays, see SciTools/iris-agg-regrid#37

Anyways, regardlessly, I'm banking this... Awesome thanks @bayliffe 👍

@bjlittle bjlittle merged commit f0d4a1d into SciTools:master Oct 17, 2019
@bjlittle bjlittle mentioned this pull request Oct 17, 2019
6 tasks
@rcomer
Copy link
Member

rcomer commented Oct 17, 2019

@bjlittle I guess I’d want to understand the rationale for the NetCDF4 folks making this change, and then decide if I agree with it!

I don’t know what iris-agg-regrid does, but we have a similar check in the area weighted regridder. Fortunately that one already uses is_masked.

@bjlittle
Copy link
Member

@rcomer See https://github.com/Unidata/netcdf4-python and the associated news for release 1.4.0. There have been many breaking changes IMHO in minor releases which have caused issues to the community. I don't think we can reasonably expect netcdf4-python to change or back out certain behaviour that we take issue with... Well I don't think so.

@rcomer
Copy link
Member

rcomer commented Oct 17, 2019

@bjlittle sorry, I wasn’t clear. I meant that if their rationale for this behaviour makes sense, then it might also make sense for Iris and therefore we shouldn’t override it. Wasn’t thinking we could ask them to revert it.

@rcomer
Copy link
Member

rcomer commented Oct 17, 2019

Possibly related: #2046 #2856

@bjlittle
Copy link
Member

bjlittle commented Oct 17, 2019

@rcomer I'm assuming their rational is related to consistency of the type of array exposed to users i.e., no need to check if it's masked or not - everything is just masked. That sounds attractive, as you don't need to "if" and "but" within application code... however, this rational is independent to performance. If I'm guessing right, they bought in to the utility of consistency, but at the cost of perhaps being (4-5 times) slower, but surely unbeknownst to them.

Having such consistency at the cost of performance is a pretty hard sell... if only I got a penny from every scientist who wanted their Python code to run faster 🤪

trexfeathers pushed a commit to trexfeathers/iris that referenced this pull request Jan 13, 2020
…ask_test

Analysis percentile method - update applicability test for fast_percentile_method
pp-mo pushed a commit to pp-mo/iris that referenced this pull request Jan 14, 2020
…ask_test

Analysis percentile method - update applicability test for fast_percentile_method
trexfeathers added a commit that referenced this pull request Jan 16, 2020
* Bump version to 2.4.0rc0.

* unpin mpl (#3468)

* Merge pull request #3301 from bayliffe/fastpercentilemethod_mask_test

Analysis percentile method - update applicability test for fast_percentile_method

* Have Travis test with iris-grib, remove problem tests (#3469)

* Have Travis test with iris-grib, remove problem tests

* mock GribInternalError correctly

* Update license headers

* account for changes in handling of grib message defaults

* Test against the latest version of python-eccodes

* Moved irir-grib skip to iris.tests

* Merge pull request #2608 from cpelley/PICKLEABLE_FORMATS

TEST: Extends #2569 to unpickle

* _regrid_area_weighted_array: Move axes creation over which weights are calculated to before loop (#3519)

* Purge iris.experimental.regrid np<1.7 support (#3539)

* Add NameConstraint with relaxed name loading (#3463)

* _regrid_area_weighted_array:  move indices variable nearer to use (#3564)

* _regrid_area_weighted_array: Tweak variable order to near other use in code (#3571)

* Fix problems with export and echo command. (#3577)

* Pushdocs fix2 (#3580)

* Revert to single-line command for doctr invocation.

* Added script comment, partly to force Github respin.

* Bracketed six.moves and __future__ imports.

* Fixes required due to the release of iris-grib v0.15.0 (#3582)

* Fix python-eccodes pin in travis (#3593)

* PI-2472: Optimise the area weighted regridding routine (#3598)

* PI-2472: Tweak area weighting regrid move xdim and ydim axes (#3594)

* _regrid_area_weighted_array: Set axis order to y_dim, x_dim last dimensions

* _regrid_area_weighted_array: Extra tests for axes ordering

* PI-2472: Tweak area weighting regrid enforce xdim ydim (#3595)

* _regrid_area_weighted_array: Set axis order to y_dim, x_dim last dimensions

* _regrid_area_weighted_array: Extra tests for axes ordering

* _regrid_area_weighted_array: Ensure x_dim and y_dim

* PI-2472: Tweak area weighting regrid move averaging out of loop (#3596)

* _regrid_area_weighted_array: Refactor weights and move averaging outside loop

* Pin pillow to make graphics tests work again. (#3630)

* PI-2472: Area weighted regridding (#3623)

* The Area-weights routine is refactored into the "__prepare" and "__perform" structure, in-line with our other regridding methods.
* The area-weights are now computed in the "__prepare", so are calculated once.
* The information required for checking the regridding weights and target grid info are now cached on the "regridder" object.
* This is inline with the general use already described in the documentation.

* pep8 conformance.

* pep8 compliance.

* Allow some 'key=None' args in Geostationary creation. (#3628)

LGTM

* Allow some 'key=None' args in Geostationary creation.

* Integration test loading netcdf geostationary without offset properties.

* pep8 conformance.

* pep8 conformance.

* pep8 conformance.

* test_NameConstraint get mock from iris.tests.

* Remove use of super() in _constraints.py for Py2 compatibility.

* Updated license headers.

* Updated iris-grib reference in extensions.txt.

* Py2 support for iris-grib in Travis.

* Updates for auto docs for Iris 2.4 release.

* What's new entry to unpinning mpl.

* Edited Py2 support for iris-grib in Travis.

* Renamed whatsnew contributions folder for v2.4.

* Hacked tests.integration.test_grib2 to avoid import error from iris-grib version < 0.15.

* Only test grib with python 3.

* Compiled v2.4 whatsnew.

Co-authored-by: Martin Yeo <40734014+trexfeathers@users.noreply.github.com>
Co-authored-by: Bill Little <bill.little@metoffice.gov.uk>
Co-authored-by: stephenworsley <49274989+stephenworsley@users.noreply.github.com>
Co-authored-by: abooton <anna.booton@metoffice.gov.uk>
Co-authored-by: lbdreyer <lbdreyer@users.noreply.github.com>
Co-authored-by: Emma Hogan <ehogan@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants