Issue with xarray indexing in v0.9.0 #56
Hi Ben, I never had this issue before. Could you just confirm the following, so I'm sure I'm going to test with the same packages:
My working example has the following versions under Python 3.9: xarray 2022.3.0 (pyhd8ed1ab_0, conda-forge). The xarray and xmhw versions are "pinned" in my yaml file. If I let them float freely, then my versions are: … I haven't made any changes to the routine.
Hey, could you check if you have a singleton coordinate left that has no use? For example, I iterate over depth levels and need to drop the depth information:

```python
sst = temperatureData.drop("depth").TEMP
sst = sst.chunk({'time': -1, 'lat': 'auto', 'lon': 'auto'})
sst.data
```

With that I can avoid this error.
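To illustrate the suggestion above, here is a minimal self-contained sketch (the array and coordinate names are made up for the example) of how a leftover scalar coordinate can be dropped before further processing:

```python
import numpy as np
import xarray as xr

# Hypothetical example: a surface slice that still carries a scalar
# "depth" coordinate after selecting a single depth level
temp = xr.DataArray(
    np.random.rand(10),
    dims=["time"],
    coords={"time": np.arange(10), "depth": 0.0},
    name="TEMP",
)

# Drop the unused scalar coordinate so only 'time' remains
sst = temp.drop_vars("depth")
print("depth" in sst.coords)  # False
```

`drop_vars` is the current name for the older `drop` method used in the snippet above.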
Will do and report back tomorrow. Thanks!
Unfortunately, I can confirm that I have no singleton dimensions, and yet the error persists. Below is the xarray dump of the array I submit to threshold:

```
<xarray.DataArray 'analysed_sst' (time: 3804)>
```

It has dims of `('time',)`. I admit that final comma in the dims makes me nervous, but it seems to be how xarray displays the dimension data, and not indicative of a singleton dimension.
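For anyone puzzled by the same display, a quick sketch: the trailing comma in `('time',)` is just Python's one-element tuple syntax, not a hidden singleton dimension:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3.0), dims=["time"], name="analysed_sst")

# A 1-D array reports its dims as a one-element tuple
print(da.dims)       # ('time',)
print(len(da.dims))  # 1

# squeeze() removes any size-1 dimensions; here there are none,
# so the array is unchanged
print(da.squeeze().dims)  # ('time',)
```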
Maybe it is related to this dimension handling:

```python
dims = list(temp.dims)
# Add an extra fake dimension if array is 1-dimensional
# so a 'cell' dimension can still be created
dims.remove(tdim)
if len(dims) == 0:
    temp = temp.expand_dims({"point": [0.0]})
    dims = ["point"]
```

Since your error is also pointing towards a variable called …
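As a self-contained sketch of what that block does (variable names follow the snippet; the input array is made up), running it on a 1-D series prepends a size-1 "point" dimension:

```python
import numpy as np
import xarray as xr

temp = xr.DataArray(np.arange(5.0), dims=["time"])  # hypothetical 1-D series
tdim = "time"

dims = list(temp.dims)
dims.remove(tdim)
if len(dims) == 0:
    # 1-D input: add a fake 'point' dimension so a 'cell' dim can be built
    temp = temp.expand_dims({"point": [0.0]})
    dims = ["point"]

print(temp.dims)  # ('point', 'time')
```

`expand_dims` inserts the new dimension first by default, which is why it shows up before `time`.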
Thanks @florianboergel - I'll dig a little deeper. I don't suppose you could add a small, known-working netCDF file here, could you? It would be good to separate the toolkit from the data source before I start debugging.
It looks like the workaround to fix the coordinates issue with the new xarray dimension doesn't work for this extra dimension. `point` is added if you have a "point" time series so that stacking the dimensions will still work. I'll try to fix this today.
I made a new release (0.9.2) that should fix this. As it was getting annoying to fix the weird coordinates behaviour, I just introduced a check at the start of each function. If a 1-dimensional series is passed, it sets a boolean variable `point` to True and uses this to skip all the functions that handle multiple dimensions. Before, it was trying to add the "point" fake dimension, so there was only one workflow independently of the number of dimensions.
Yup, works perfectly! Thank you so much.
I think the new `point` flag could be set automatically. Something like:

```python
if len(dims) == 1:
    point = True
else:
    point = False
```

since later on `point` is evaluated:

```python
if point:
    ts = temp
else:
    ts = land_check(temp, tdim=tdim, anynans=anynans)
```
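The suggestion can be sketched without xarray at all; assuming `dims` is the list of a data array's dimension names, the flag falls out of its length:

```python
def infer_point(dims):
    """Hypothetical helper: True when only one dimension (e.g. time) remains."""
    return len(dims) == 1

print(infer_point(["time"]))                # True
print(infer_point(["time", "lat", "lon"]))  # False
```

Deriving the flag this way means callers never need to pass it as an argument, which is the simplification being proposed.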
No, you're right! I forgot, as I first introduced it as an extra argument and then decided it was easier to just work it out! I should add a test for this, as none of them failed.
I've done a new release (0.9.3) with the correction above, thanks!
@CapeHobbit I guess the issue can be closed then ;-) |
@florianboergel @paolap - yes! Sorry about that :) |
Hi. Firstly, thanks for this excellent package, I am using it extensively in my Copernicus marine training.
Historically, I was using version 0.8.0 without issue. To do this, though, I had to "pin" to an older version of xarray, which is getting harder to maintain as part of a wider environment. I wanted to move up to 0.9.0 and free up xarray to iterate to a newer version, but I run into the following problem when I run the threshold routine:
This may be related to how I treat the initial data, as I average it into a 1-dimensional time series before I try to run the thresholding. However, I have tried this without the spatial averaging and get different errors related to dask.
My processing is as follows:

```python
import glob
import os
import xarray as xr
from xmhw.xmhw import threshold  # import path may vary by version

SST_data = xr.open_mfdataset(glob.glob(os.path.join(os.getcwd(), 'OSTIA_*.nc')))
SST_spatial_average = SST_data.mean(dim='latitude').mean(dim='longitude')
SST_time_series = SST_spatial_average["analysed_sst"].load() - 273.15
clim = threshold(SST_time_series)
```
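As a sanity check on the same pipeline, here is a runnable sketch with synthetic data in place of the OSTIA files (shapes and values are invented), confirming that averaging over both spatial dimensions leaves a pure time series:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the OSTIA SST data, in Kelvin
sst = xr.DataArray(
    np.random.rand(4, 3, 2) + 290.0,
    dims=["time", "latitude", "longitude"],
    name="analysed_sst",
)

# Average over both spatial dims in one call, then convert to Celsius
series = sst.mean(dim=["latitude", "longitude"]) - 273.15

print(series.dims)  # ('time',)
```

A 1-D result like this is exactly the case that triggers the `point` handling discussed above.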
Are you able to offer any insight?
Cheers,
Ben
You can find my test data set here (needs to be unzipped):
OSTIA_SST.nc.zip