Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Netcdf improvements #27

Merged
merged 8 commits into from
Feb 14, 2020
Merged

Netcdf improvements #27

merged 8 commits into from
Feb 14, 2020

Conversation

jessicaaustin
Copy link

@jessicaaustin jessicaaustin commented Jan 30, 2020

note: i based this off of gen_agg_take2, so that would be merged into master before this PR. and if the approach for generating the aggregate flag changes, then this branch would need updates as well

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

You'll be able to see Jupyter notebook diff and discuss changes. Powered by ReviewNB.

@jessicaaustin jessicaaustin mentioned this pull request Jan 30, 2020
@jessicaaustin jessicaaustin changed the title WIP: Netcdf updates WIP: Netcdf improvements Jan 31, 2020
@jessicaaustin jessicaaustin changed the base branch from master to gen_agg_take2 February 3, 2020 19:25
@jessicaaustin
Copy link
Author

note: i based this off of gen_agg_take2, so that would be merged into master before this PR. and if the approach for generating the aggregate flag changes, then this branch would need updates as well

@jessicaaustin
Copy link
Author

@kwilcox could you please review?

In particular, these are some changes I'd love to get feedback on:

  • I updated the NcQcConfig constructor to add arguments to indicate the name of the dimensions in the netcdf file
    • this is useful so you don't have to extract that data out yourself when calling run() -- it can just use the data from the proper variables in the netcdf file
    • see TestRunNcConfigClimatology for an example of the difference in approach here
    • there are a lot of ways to solve this same problem; curious if this approach feels pythonic/netcdf-y or not
  • In save_to_netcdf, I added a fill value attribute to the qc variable. Seems reasonable?
  • In save_to_netcdf, I removed valid_min and valid_max attributes. Does it make sense to have those for a list of flags anyway? (as opposed to a data variable which is continuous)
  • To store the proper standard_name with the variable, I added an annotation, add_metadata, to each of the test functions

@jessicaaustin jessicaaustin requested a review from kwilcox February 3, 2020 23:09
@kwilcox kwilcox self-assigned this Feb 14, 2020
@jessicaaustin jessicaaustin changed the base branch from gen_agg_take2 to master February 14, 2020 20:23
@kwilcox
Copy link
Member

kwilcox commented Feb 14, 2020

I updated the NcQcConfig constructor to add arguments to indicate the name of the dimensions in the netcdf file, this is useful so you don't have to extract that data out yourself when calling run() -- it can just use the data from the proper variables in the netcdf file.

I like it, wish this could be represented in the NcQcConfig file as well, but for now this is helpful. Would also be nice to default to the CF convention (extract the time and z axes if they are not provided).

In save_to_netcdf, I added a fill value attribute to the qc variable. Seems reasonable?

There was some discussion if a flag value of UNKNOWN should be filled/masked. Do you remember where the convo happened and what the outcome was? Would you assume that if the quality flag was UNKNOWN it should be masked? The opposing side would be to only mask the quality flag when the data was masked (we do that now). Right now: data input mask and the quality flag mask are always be equal. After PR: the quality flag mask would include input data mask and all UNKNOWN).

In save_to_netcdf, I removed valid_min and valid_max attributes. Does it make sense to have those for a list of flags anyway? (as opposed to a data variable which is continuous)

Yeah, this is still useful. Some tools use it for color bar bounds and other things. They could pull the info from flag_values, but this makes it more generic. I'll add back in unless you object.

To store the proper standard_name with the variable, I added an annotation, add_metadata, to each of the test functions

This is cool, works great for the purpose.

@kwilcox kwilcox assigned jessicaaustin and unassigned kwilcox Feb 14, 2020
@jessicaaustin
Copy link
Author

I updated the NcQcConfig constructor to add arguments to indicate the name of the dimensions in the netcdf file, this is useful so you don't have to extract that data out yourself when calling run() -- it can just use the data from the proper variables in the netcdf file.

I like it, wish this could be represented in the NcQcConfig file as well, but for now this is helpful. Would also be nice to default to the CF convention (extract the time and z axes if they are not provided).

Agreed. I'll add an issue for that to track it.

In save_to_netcdf, I added a fill value attribute to the qc variable. Seems reasonable?

There was some discussion if a flag value of UNKNOWN should be filled/masked. Do you remember where the convo happened and what the outcome was? Would you assume that if the quality flag was UNKNOWN it should be masked? The opposing side would be to only mask the quality flag when the data was masked (we do that now). Right now: data input mask and the quality flag mask are always be equal. After PR: the quality flag mask would include input data mask and all UNKNOWN).

Hmmmm I see what you mean. When you put it that way, I think "data input mask and the quality flag mask are always be equal" is the correct behavior. So I will revert my changes.

In save_to_netcdf, I removed valid_min and valid_max attributes. Does it make sense to have those for a list of flags anyway? (as opposed to a data variable which is continuous)

Yeah, this is still useful. Some tools use it for color bar bounds and other things. They could pull the info from flag_values, but this makes it more generic. I'll add back in unless you object.

Makes sense, thanks for the explanation.

Copy link
Member

@kwilcox kwilcox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with whatever you decide on the fill_value, good to go after that

@jessicaaustin jessicaaustin changed the title WIP: Netcdf improvements Netcdf improvements Feb 14, 2020
@jessicaaustin jessicaaustin merged commit b21e7cb into master Feb 14, 2020
@jessicaaustin jessicaaustin deleted the netcdf_updates branch March 16, 2020 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NcQcConfig Object Ancillary Variable Issues
2 participants