
Issue with xarray indexing in v0.9.0 #56
Closed · CapeHobbit opened this issue Feb 14, 2023 · 15 comments

@CapeHobbit commented Feb 14, 2023

Hi. Firstly, thanks for this excellent package, I am using it extensively in my Copernicus marine training.

Historically, I was using version 0.8.0 without issue. To do this, though, I had to "pin" to an older version of xarray, which is getting harder to maintain as part of a wider environment. I wanted to move up to 0.9.0 and free xarray up to iterate to a newer version, but I run into the following problem when I run the threshold routine:

Error

This may be related to how I treat the initial data, as I average it into a 1-dimensional time series before I try to run the thresholding. However, I have tried this without the spatial averaging and get different errors related to dask.

My processing is as follows:

    import glob
    import os
    import xarray as xr
    from xmhw.xmhw import threshold

    SST_data = xr.open_mfdataset(glob.glob(os.path.join(os.getcwd(), 'OSTIA_*.nc')))
    SST_spatial_average = SST_data.mean(dim='latitude').mean(dim='longitude')
    SST_time_series = SST_spatial_average["analysed_sst"].load() - 273.15
    clim = threshold(SST_time_series)

Are you able to offer any insight?

Cheers,

Ben

You can find my test data set here (needs to be unzipped):

OSTIA_SST.nc.zip

@paolap (Member) commented Feb 15, 2023

Hi Ben, I've never had this issue before. Could you just confirm the following, so I'm sure I'm testing with the same packages:
- your xarray and pandas versions
- whether the exact same workflow was working with 0.8.0 and an older version of xarray

@CapeHobbit (Author)

My working example has the following versions under Python 3.9:

    xarray   2022.3.0   pyhd8ed1ab_0     conda-forge
    pandas   1.5.3      py39hecff1ad_0   conda-forge
    xmhw     0.8.0      py_0             coecms

The xarray and xmhw versions are "pinned" in my yaml file. If I let them float freely, then my versions are:

    xarray   2023.2.0   pyhd8ed1ab_0     conda-forge
    pandas   1.5.3      py39hecff1ad_0   conda-forge
    xmhw     0.9.0      py_0             coecms

I haven't made any changes to the routine.

@florianboergel

Hey, could you check whether you have a singleton coordinate left over that has no use? For example, I iterate over depth levels and need to drop the depth information:

    # Drop the leftover singleton depth coordinate, then rechunk
    # so that time is a single chunk
    sst = temperatureData.drop("depth").TEMP
    sst = sst.chunk({'time': -1, 'lat': 'auto', 'lon': 'auto'})
    sst.data

With that I can avoid this error.

@CapeHobbit (Author)

Will do and report back tomorrow. Thanks!

@CapeHobbit (Author)

Unfortunately, I can confirm that I have no singleton dimensions, and yet the error persists. Below is the xarray dump of the array I submit to threshold:

    <xarray.DataArray 'analysed_sst' (time: 3804)>
    dask.array<sub, shape=(3804,), dtype=float32, chunksize=(3804,), chunktype=numpy.ndarray>
    Coordinates:
      * time     (time) datetime64[ns] 2012-01-01T12:00:00 ... 2022-05-31T12:00:00

It has dims of ('time',).

I admit that the final comma in the dims makes me nervous, but it seems to be how xarray displays the dimension data, and not indicative of a singleton dimension.
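
For what it's worth, the trailing comma is just how Python prints a one-element tuple, so it doesn't point to a hidden extra dimension. A quick stand-alone check (plain Python, nothing xarray-specific):

    dims = ('time',)   # a one-element tuple; the trailing comma is required syntax
    print(len(dims))   # 1 -> only the 'time' dimension, no hidden singleton
    print(dims[0])     # 'time'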

@florianboergel

Maybe it is related to the `point` dimension that is added in the function `land_check`:

    dims = list(temp.dims)
    # Add an extra fake dimension if the array is 1-dimensional
    # so a 'cell' dimension can still be created
    dims.remove(tdim)
    if len(dims) == 0:
        temp = temp.expand_dims({"point": [0.0]})
        dims = ["point"]

Your error also points towards a variable called `point`, but I cannot really offer more help, to be honest.

@CapeHobbit (Author)

Thanks @florianboergel - I'll dig a little deeper. I don't suppose you could add a small, known-working netCDF file here, could you? It would be good to separate the toolkit from the data source before I start debugging.

@paolap (Member) commented Feb 19, 2023

It looks like the workaround that fixes the coordinates issue with the new xarray version doesn't work for this extra dimension. "point" is added if you have a single-point time series, so that stacking the dimensions will still work. I'll try to fix this today.

paolap mentioned this issue Feb 20, 2023

@paolap (Member) commented Feb 20, 2023

I made a new release (0.9.2) that should fix this. As it was getting annoying to fix the weird coordinates behaviour, I just introduced a check at the start of each function: if a 1-dimensional series is passed, it sets a boolean variable `point` to True and uses this to skip all the functions that handle multiple dimensions. Before, it was trying to add the "point" fake dimension, so there was only one workflow regardless of the number of dimensions.

@CapeHobbit (Author)

Yup, works perfectly! Thank you so much.

@florianboergel commented Feb 22, 2023

I think the new `point` variable needs to be initialized as False, to avoid errors if the time series is not 1-dimensional. Referring to
xmhw.py line 133 @paolap

Something like

    if len(dims) == 1:
        point = True
    else:
        point = False

since later on point is evaluated:

    if point:
        ts = temp
    else:
        ts = land_check(temp, tdim=tdim, anynans=anynans)

@paolap (Member) commented Feb 22, 2023

No, you're right! I forgot, as I first introduced it as an extra argument and then decided it was easier to just work it out. I should add a test for this, as none of the existing tests failed.
I'm pushing the changes now, thanks for pointing that out!
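
A minimal sketch of such a regression test (the synthetic data, the `point_id` dimension name and the import path are illustrative assumptions, not the package's actual test suite): call `threshold` on a small multi-dimensional array and check that it still runs now that the 1-D shortcut exists.

    import numpy as np
    import pandas as pd
    import xarray as xr
    from xmhw.xmhw import threshold

    # Fake daily SST for two grid cells over 15 years (values in degrees C)
    time = pd.date_range("2000-01-01", "2014-12-31", freq="D")
    data = 15 + 3 * np.random.rand(len(time), 2)
    sst = xr.DataArray(data, dims=("time", "point_id"),
                       coords={"time": time, "point_id": [0, 1]})

    # Before the 0.9.3 fix this raised an error, because `point`
    # was only assigned in the 1-dimensional branch
    clim = threshold(sst)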

@paolap (Member) commented Feb 22, 2023

I've done a new release (0.9.3) with the correction above, thanks!

@florianboergel

@CapeHobbit I guess the issue can be closed then ;-)

paolap closed this as completed Feb 26, 2023

@CapeHobbit (Author)

@florianboergel @paolap - yes! Sorry about that :)
