-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compress #123
Compress #123
Conversation
A new property added to namelist: compressed. Defaults to False. If True, the output of the namelist is compressed by writing successive identical values of arrays as n*v, where n is the number of successive values and v is the value. In order to do so, two new methods were added: _compress: converts a list of values into a list of list of values, e.g. [1, 1, 1, 2, 3, 3] -> [[3, 1], [1, 2], [2, 3]] _f90comprepr: takes the elements of this compressed list and converts them to their string components. In the above example: "3*1", "2", and "2*3" Call to these methods has been added to the _var_strings method
TODO: No changes have yet been made to the version number or the documentation. |
The coverage decrease is not due to my code changes. All my code is covered by unit tests. |
Thanks Holger, this looks like very nice work, and thank you also for adding the tests. I am very happy to merge this. The code is good, the only real issue I have is with describing it as compression. The Fortran standard often describes as a "repeat" or "repeat factor". For example, the "data-stmt-repeat" is the label used for the repeat counter in a data statement. (Annoyingly, the IO section calls them "r* form" which really doesn't help with naming conventions. My old Metcalf/Reid/Cohen book also calls them "repeat counters". (That name is suspiciously absent from the standard, but nevermind that for now...) The other reason is that "compression" can also refer to more elaborate forms of data compression. I guess this is a kind of compression, but might not be what comes to mind for most people. I can make changes here if you are happy with them, or you are welcome to do them yourself. In the future, we could modify this so that No worries about the coverage. I think something changed with respect to YAML support on either the Travis or Ubuntu side of things. |
Also
|
I have one more comment which is more of a long-term concern and which need not be addressed right now. I am a little concerned that we are now implicitly purposing the list-of-list I have not been able to find any case which causes problems, and it obviously passes all the current tests, so I don't mind merging this. But we may want to at least add a comment describing the potential conflict in any future work. |
Instead of storing repeat lists as lists of lists, use a new subclass RepeatValue. This makes it easier to distinguish between repeat lists and multidimensional lists. Also, use itertools.groupby to replace the whole _repeat method.
I've taken your suggestions to heart.
|
Still TODO:
|
Thanks, I like this very much. I think there could also be additional benefit by introducing more of these micro-classes. I agree that Don't worry about versioning or version numbers, I will handle that after this is merged. |
I think I will go ahead and merge this, despite a few minor issues (repeat counters with nulls?). This should not effect default behaviour and will only be noticed when this feature is enabled. The other issues like documentation can be sorted out in future PRs. If no objections from you, I'd like to merge it. |
Allows output in compressed form, resolves issue #122
New namelist property:
compressed
. Defaults to False, if True, arrays are compressed on output.Changes method _var_strings by two methods:
_compress
converts thev_values
list to a compressed list where each element is a 2-element list with the number of repetitions and the original element._f90comprepr
converts each of these individual 2-element lists into the correct compressed format.Several tests were added for this feature.