
Cleanup and simplification of code structure (Suggestion) #4

Merged
merged 2 commits into emdb-empiar:dev from clean-file-attr-loading on Jan 18, 2021

Conversation

hernot
Contributor

@hernot hernot commented Dec 15, 2019

Hi Paul

I finally managed to come up with a suggestion for simplifying and cleaning up the code structure, so that future extensions become simple and easy and dropping Python 2 support is nearly as simple as removing the related code sections from core.py and converting some other places marked with TODO back to proper, clean Python >= 3 code.

Here is a short summary of the changes; the elaborate details are in the commit message.
Admittedly, some of the code may still need comments and doc strings.

  • All parsing is now handled exclusively by the grammar.py file, including the parsing of AmiraMesh data streams.
  • The HyperSurface data structure is parsed along with the file header, removing the asymmetry between HyperSurface and AmiraMesh file trees.
  • HyperSurface data streams are loaded along with the structure information, unless stream loading is set to HEADERONLY or False, in which case they are not loaded at all.
  • Added functions for defining and modifying the global stream-loading behaviour; possible values are HEADERONLY, ONDEMMAND (current default, open to discussion) and IMMEDIATE. The global setting can be overridden by the load_streams argument of the Header and AmiraFile classes.
  • AmiraFile is a subclass of Header, not of Wrapper.
  • Introduced meta array_declarations, which allow proper mapping of the BoundaryCurves, Patches and Surfaces declarations of HyperSurface files.
  • Extended AmiraDispatchProcessor to group lists of array_declarations which share the same base name and differ only by an integer counter, using meta_array_declarations. This can in future be extended with more complex and sophisticated heuristics for deciding whether grouping is valid or should not happen; HxSpreadSheet files would be an example.
  • Improved access to attributes added using add_attr; there is no more need to fall back to __getattr__, except for loading data streams on demand.
  • The parent attribute is now a weak reference which is automatically reset to None when the parent is deleted.
  • Some metaclass magic to flag attributes declared on __slots__ as read-only when accessed from outside the Block class or a subclass.
  • Renamed decodersmodule.cpp to decodersmodule.c, dropping the single static_cast which was the only thing requiring compilation as C++ and which caused a compiler error when compiling for Python 2.
  • All tests should run OK on Python 3 and Python 2, as long as they are still valid; all others are currently commented out. Please verify.
  • Extended the lists of relative imports with an `if __package__:` / `else:` clause switching between the relative import (when imported from the installed ahds package) and the plain import (when run from within the ahds directory for debugging purposes); see the sketch after this list. In contrast to a `try:`/`except:` clause, this works flawlessly for both Python 2 and Python 3.
  • In core.py, changed your assert-based argument checks back to simple if-based checks, because asserts are converted to no-ops when Python is called with the -O or -OO optimization options, and then the checks always "succeed". We should not rely on people playing fair and either using the code properly or not asking Python for optimization.
  • A first suggestion on how to identify and support various ContentTypes, especially for AmiraMesh files; see the suggested implementation using HxSpreadSheet as an example.
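To illustrate the import switching mentioned in the list above, here is a minimal sketch of the pattern; the module and symbol names are examples only, not the exact imports used in ahds:

```
# sketch of the __package__ based import switch; names are illustrative
if __package__:
    # the module is imported as part of the installed ahds package
    from .core import Block
    from .grammar import get_parsed_data
else:
    # the module is run directly from within the ahds source directory (debugging)
    from core import Block
    from grammar import get_parsed_data
```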

@hernot
Contributor Author

hernot commented Dec 15, 2019

Regarding the Travis setup for Python 2: I have no idea what the pywavelets package is used for; it is not in the installation requirements. Is it really needed, or could it be dropped for Python 2?
For Python 3 I fixed it by limiting scikit-image to <= 0.15, one more pin as for Python 2.7. The builds for Python 3.6 and Python 3.7 are working after undoing a temporary direct debugging hack.
For test coverage we should make a plan on how to best cover the new stuff.

Will continue to improve this pull request if I come across something while working on other projects which could be beneficial for ahds or further simplify things.

@hernot
Contributor Author

hernot commented Jan 3, 2020

The list of packages whose latest versions do not support anything below Python 3.5, or even 3.6, has grown again. I fear that by the end of the first quarter of 2020 we will be doing nothing but identifying the latest versions of indirect dependencies that still support Python 2.7. This time Pillow dropped support for Python 2.7 starting with version 7.0.0.

@hernot hernot closed this Feb 11, 2020
@hernot hernot reopened this Feb 11, 2020
@hernot
Contributor Author

hernot commented Feb 11, 2020

Any idea why Travis is not running?

1 similar comment

@hernot
Contributor Author

hernot commented Aug 31, 2020

Not sure why pytest hiccups on Python 3.5; possibly because travis-ci.org is not hosted any more and we should finally switch to travis-ci.com. Whatever the reason is, I do not have any idea.

Apart from that, I added unit tests to achieve 100 % coverage for the core modules; only extra is not tested at all, those tests you will have to write.

There are several lines which I marked with nocover for one of the following reasons:

  1. not reached unless the code is broken elsewhere -> raises an exception if executed
  2. lack of test data to check whether the line is correct or should throw an exception
     if the code is broken -> emits a warning if executed, as a reminder to collect test
     data triggering the line and to record what happened to reach it
  3. exclusively relevant for Python 2 or Python >= 3; marked with cover_py2 or cover_py3

EDIT: pathlib2, required by pytest for Python < 3.6, tries to compare the three-element sys.version_info with a two-element tuple, which is valid and possible on Python 2 but not on Python 3. I fear that fixing this would require opening a related issue on pytest, which would be rejected anyway (see). The alternative is to drop Python 3.5 testing on Travis to avoid pytest switching back to pathlib2 for Python 3.5. How many installations of ahds would you expect to break if it is not tested under Python 3.5? I will remove it for now, as that is the only fix I could come up with.

@paulkorir
Member

Hi Christoph,
I'm very happy to merge these. I would like to suggest some cleanup:

  • squash commits so that we don't have 'oh no!' commits
  • make the commit message helpful

Thanks so much for your contribution.
Paul

@hernot
Contributor Author

hernot commented Sep 18, 2020

Hi Paul,
good to hear from you. Yes, same procedure as every time. Maybe, if I encounter some more things that would make it a less complicated hassle, I will do those in addition. But I think I have everything so far.

Parsing:
========
Collects all code dealing with parsing in grammar.py. This includes the
regular expressions and the code handling the parsing of data streams,
ASCII and binary, for AmiraMesh and HyperSurface files. As a result the
structure of HyperSurface files is now fully parsed before the
get_parsed_data method returns. This removes an annoying asymmetry
between HyperSurface and AmiraMesh files; for both, the header now shows
the full structure.
For extracting binary and ASCII AmiraMesh data streams two new methods
are introduced in grammar.py: next_amiramesh_binary_stream and
next_amiramesh_ascii_stream. The encoded stream data for HyperSurface
files is read and loaded while the file structure is parsed; no second
pass for reading HyperSurface data is used.

Data stream loading:
====================
Added the set_stream_policy and get_stream_policy methods. These allow
selecting the global default setting for loading streams. It can be
overridden by the load_streams argument of the Header and AmiraFile
classes. The following values can be used to control stream loading
globally via set_stream_policy and per instance via the load_streams
attribute.

   HEADERONLY ... Only header data and structural information is loaded;
                  trying to access the data attribute of AmiraDataStream
                  type objects triggers an AttributeError exception.
   ONDEMMAND .... Load the stream data as soon as the data attribute of
                  a specific AmiraDataStream is accessed for the first
                  time. The stream data of any stream encountered while
                  searching for the targeted one is loaded implicitly.
                  (*) (**)
   IMMEDIATE .... Load the stream data immediately during creation of the
                  Header or AmiraFile object. (*) (**)
   (*) For HyperSurface files ONDEMMAND and IMMEDIATE expose the same
       behaviour (see above) and can be considered synonymous.
   (**) Decoding of data streams happens in any case on first access of
        the data attribute.

The load_streams attribute of the Header and AmiraFile classes may in
addition be set to the following values:
   True ......... Selects IMMEDIATE for the specific instance
   False ........ Selects HEADERONLY for the specific instance
   None ......... Selects the global default setting as returned by the
                  get_stream_policy method
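A minimal usage sketch of the policy interface described above; the
import locations and the file name are assumptions for illustration only:

```
import ahds
# assumed import locations for the policy helpers and constants; adjust
# to the actual module layout
from ahds import set_stream_policy, get_stream_policy, ONDEMMAND

set_stream_policy(ONDEMMAND)        # global default for all new instances
print(get_stream_policy())          # reports the currently active policy

# per-instance override of the global policy via load_streams
header_only = ahds.AmiraFile('example.am', load_streams=False)  # HEADERONLY
eager = ahds.AmiraFile('example.am', load_streams=True)         # IMMEDIATE
lazy = ahds.AmiraFile('example.am', load_streams=None)          # global default
```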

AmiraFile:
==========
The AmiraFile class is now a subclass of the Header class.

Meta array_declarations:
========================
In order to simplify parsing and the creation of the Header structure of
HyperSurface files, parsing prepends so-called meta_array_declarations to
the array_declarations structure. Their array_dimension is set to the
number of BoundaryCurves, Patches or Surfaces, as for any other array
declaration. In addition they have an array_blocktype field reading
'list' which is used by the header load_declarations method to create a
ListBlock instead of a normal Block attribute. Further, for every
following BoundaryCurve, Patch and Surface an additional declaration is
added for which the array_dimension is set to 0, the additional
array_parent field references the corresponding meta_array_declaration
and the additional array_itemid field hosts the item index at which the
load_declarations method shall insert it into the corresponding
BoundaryCurves, Patches or Surfaces ListBlock. The Block for each
BoundaryCurve, Patch and/or Surface is temporarily added to the Header
class instance __dict__ with its name prefixed by the '_@' sequence;
this ensures that it can only be found and retrieved using getattr and
not the '.' operator. The load_definitions method is extended such that
it also tests for hidden attributes prefixed by '_@'.
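Schematically, a grouped pair of declarations produced this way could
look like the following sketch; the array_dimension, array_blocktype,
array_parent and array_itemid fields are taken from the description
above, while the key holding the name is assumed for illustration:

```
# illustrative sketch only: a meta declaration for three Patches and the
# declaration of the first Patch referencing it
meta_patches = {
    'array_name': 'Patches',       # key name assumed for this sketch
    'array_dimension': 3,          # number of Patch blocks in the file
    'array_blocktype': 'list',     # load_declarations creates a ListBlock
}
patch_1 = {
    'array_name': 'Patch1',        # key name assumed for this sketch
    'array_dimension': 0,
    'array_parent': meta_patches,  # links back to the meta declaration
    'array_itemid': 0,             # insertion index within the ListBlock
}
```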

Extended the AmiraDispatchProcessor class such that the array_declaration
method now applies ContentType-specific filters to every array
declaration. Such a filter would, for example, check the name of an
array_declaration for trailing digits which could form a valid
consecutive counter, such as __Column1, __Column2, ... __ColumnN used
by HxSpreadSheet type AmiraMesh files.
If the filter finds a match, the triggering array_declaration is extended
as described above for the HyperSurface file structure.
In addition, on the first hit a meta_array_declaration is created. It
additionally contains a field counting the number of array_declarations
belonging to the same set of array_declarations forming the
meta_array_declaration.
If the final ContentType of the file corresponds to the ContentType
assumed by the filter function, the meta_array_declaration is prepended
to the list of array_declarations, and discarded otherwise.

A meta_array_declaration is not prepended to the list of
array_declarations unless a corresponding ContentType parameter entry is
encountered and the meta_array_declaration refers to more than one
array_declaration read from the file. All other collected
meta_array_declarations are ignored.

Blocks representing meta arrays such as Patches, Surfaces and
HxSpreadSheet have to register a dedicated filter function. This
function has to return a tuple containing
```
(base_name, array_index, block_type, extra_meta_data, extra_declaration_data)
```
in order to indicate that the inspected array_declaration shall be
linked to the meta_array_declaration representing the specified
ContentType.
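As an illustration, a hedged sketch of such a filter function for
counter-suffixed array names follows; the declaration is treated as a
plain dict, and the 'array_name' key as well as the "no match means
return None" convention are assumptions of this sketch, not the actual
ahds interface:

```
import re

# counter-suffixed names such as __Column1, __Column2, ... __ColumnN
_COUNTER = re.compile(r'^(?P<base>.+?)(?P<index>\d+)$')

def column_filter(array_declaration):
    """Sketch of a ContentType specific array_declaration filter."""
    match = _COUNTER.match(array_declaration['array_name'])   # key assumed
    if match is None:
        return None   # not part of a meta array (convention assumed)
    return (
        match.group('base'),         # base_name of the meta_array_declaration
        int(match.group('index')),   # array_index of this child declaration
        'list',                      # block_type used for the resulting ListBlock
        {},                          # extra_meta_data for the meta declaration
        {},                          # extra_declaration_data for this declaration
    )
```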

As a first example, the HxSpreadSheet ContentType and the hxsurface
ContentType assumed for HyperSurface files are implemented. The
AmiraSpreadSheet block provides a static array_declaration filter
function which returns the constant string 'Sheet1' instead of simply
stripping the counter and returning the base_name. The aim is to
demonstrate that filter functions may not only identify
array_declarations which shall be grouped together into a meta array,
but can also specify the name of this meta array, rename the child
arrays by modifying the inspected array declaration, and inject
additional parameters into any of the array_declarations even when not
triggering the creation of a meta_array_declaration.

For all other AmiraMesh files the ContentType and the corresponding
array_declarations still have to be captured and the appropriate filter
functions implemented, just like for other explicit ContentTypes.

Dynamic attributes created by add_attr
======================================
All dynamic attributes are now accessible through a custom data
descriptor which is created and maintained by the add_attr method and
automatically vanishes from the Block class hierarchy as soon as the
last Block type object hosting the corresponding attribute is deleted or
the attribute of that specific instance is deleted or renamed. This
makes calls to __getattr__, including the repeated AttributeError raised
by the builtin object.__getattribute__ method, obsolete. The data
descriptor is configured such that it behaves like an instance
attribute. As a consequence, an attribute which was added to an instance
of a Block type class using setattr can not be added to it using
add_attr, and vice versa. Attributes whose names are listed on the
__slots__ special attribute of the class or any of its base classes can
not be added to an instance using add_attr in any case.
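A short usage sketch of the descriptor-backed dynamic attributes; the
import path and the exact call signature of add_attr are assumed here:

```
from ahds.core import Block   # import path assumed for this sketch

materials = Block('Materials')
exterior = Block('Exterior')
materials.add_attr('Exterior', exterior)   # backed by the custom data descriptor
assert materials.Exterior is exterior      # plain attribute access, no __getattr__ fallback
# the descriptor disappears from the Block class hierarchy again once the
# last Block instance carrying an 'Exterior' attribute is deleted
```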

parent attribute:
=================
The parent attribute is now a weak reference which switches back to None
as soon as the parent Block type class instance is deleted or garbage
collected. The parent attribute has two interfaces: a public one named
'parent' and a protected one named '_parent'. Like the read-only
attributes below, the 'parent' attribute is read-only and may only be
modified by assigning to the '_parent' attribute.

Read only attributes:
=====================
Replaced @property accessor methods which merely access protected
__slots__ attributes, with little other logic, by a
read_only_datadescriptor which directly guards the corresponding public
attribute. From within the Block type class and its subclasses it can
still be freely assigned through the protected __slots__ attribute. For
example, the public 'name' attribute allows reading the name of the
block, but assigning a new name to the Block is only possible by
assigning to the '_name' attribute.
To flag a __slots__ attribute as read-only, its name has to be decorated
with the READONLY method when listed on the __slots__ attribute of the
new Block type class. The rest is handled by some metaclass magic, which
also ensures that the READONLY method is available from within the Block
class declaration code.
For details see the core.py file.

NOTE: Protected, private and special attributes, whose names by
convention start with at least one leading '_', are ignored by the
READONLY decorator.
NOTE: In Python 2, due to the lack of the __prepare__ method, READONLY is
injected into the list of __builtins__ methods and functions.
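A sketch of how a read-only slot would be declared under this scheme,
based purely on the description above; the class, slot and import names
are made up:

```
from ahds.core import Block   # import path assumed for this sketch

# READONLY is made available inside the class body by the metaclass
# (via __prepare__ on Python 3)
class Vertices(Block):
    __slots__ = (READONLY('length'),)

    def __init__(self, name, length):
        super(Vertices, self).__init__(name)
        self._length = length    # assignable through the protected slot name

v = Vertices('Vertices', 42)
print(v.length)                  # 42: public, read-only access
# v.length = 0                   # would fail: read-only outside the class hierarchy
```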

Byte-RLE decoders:
==================

Dropped the state machine used by the byterle_decoder methods to decode
hxbyterle streams, which switched between reading the counter, expanding
an RLE-encoded block and copying a non-compressed block. The
implementation now resorts to a memcpy- and memset-based approach for
the native C decoder function and to slicing for the slow pure-Python
implementation. Both changes yield a significant speedup.
The arguments accepted by both implementations can also be passed as
keywords and have been adjusted to allow using the byterle_decoder
methods interchangeably with the numpy.frombuffer method, at least when
called with the count and dtype keyword arguments only.
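A small sketch of the keyword-interchangeable use described above; the
import path and the helper function are assumptions:

```
import numpy
from ahds.decoders import byterle_decoder   # import path assumed

def stream_to_array(raw_bytes, dtype, count, hxbyterle_encoded):
    """Decode a raw data stream into a numpy array (illustrative helper)."""
    if hxbyterle_encoded:
        # same keyword interface as numpy.frombuffer (dtype and count)
        return byterle_decoder(raw_bytes, dtype=dtype, count=count)
    return numpy.frombuffer(raw_bytes, dtype=dtype, count=count)
```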

Python & packaging:
===================

Renamed decodersmodule source:
------------------------------
As I got a gcc compiler error when calling python setup.py build, about
"ISO C++ forbids" something or other, I renamed the decodersmodule.cpp
file to decodersmodule.c. It was a cpp file only to be able to use
static_cast in one single place; a plain C cast works there too.

Pytest:
------
Improved the pytest unit-test collection required to properly test and
verify the suggested approach. All AmiraMesh and HyperSurface files
necessary for proper testing are included.

Added warnings for code paths which currently can not be tested, or
checked whether they are required at all, due to the lack of appropriate
test data. These code paths are hidden from the coverage check using the
nocover pragma.

Introduced a dedicated coverage configuration file to ensure coverage
does not report code paths which are not supported by Python 2 and thus
have to be ignored by coverage. Code paths only relevant for Python 3
are marked with the cover_py3 pragma and those only relevant for Python
2 are marked with the cover_py2 pragma. Updated tox.ini, .coveragerc and
.coverage_27 accordingly. tox.ini is also used to configure pytest.
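A sketch of how the pragma handling could look in the two coverage
configuration files; the exact pragma strings and section contents in
the pull request may differ:

```
# .coveragerc (Python 3 runs): hide untestable and Python-2-only lines
[report]
exclude_lines =
    pragma: nocover
    pragma: cover_py2

# .coverage_27 (Python 2.7 runs): hide untestable and Python-3-only lines
[report]
exclude_lines =
    pragma: nocover
    pragma: cover_py3
```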

Travis:
-------
Switched to tox-travis for seamless interaction between tox and Travis
and to make the pytest and pytest-cov changes take effect on Travis too.

Python 2.7:
-----------
Additional scikit-image requirements:
   PyWavelets<=1.0.3, Pillow<=6.2.1, kiwisolver<=1.1.0
   total: 9

A minor mocking fix for object.__dir__, which was introduced in Python 3
but caused an exception on Python 2 for the overloaded __dir__ method
filtering file-defined attributes which are not hosted by the inspected
Block instance.

Python 3.5:
-----------
Additional scikit-image requirements:
   scikit-image<=0.15, scipy<=1.4
   total: 3

Disabled pytest and pytest-cov:
Disabling was necessary because the pytest team decided to switch back
to the pathlib2 Python 2 backport on Python 3.5, as the builtin pathlib
does not meet their requirements until Python 3.6. At the same or some
later time the pathlib2 team decided not to care any more whether
pathlib2 is usable under Python 3, as Python 3 versions have native
pathlib anyway. These two opposite decisions lead to pytest being
broken, at least on Travis, for Python 3.5. The reason is that while in
Python 2 the comparison of a list and a tuple is possible, in Python 3
the two can not be compared, and pathlib2 does exactly that when
checking whether it has to apply some compatibility handling for
Python <= 2.7. The offending line is
    ```sys.version_info <= (2,7)```
which causes a "TypeError: unorderable types" in Python 3.5.

Thus I suggest not to test on Python 3.5 anymore and to rely on testing
on >= 3.6 being sufficient. In the worst case drop Python 3.5 support,
if it turns out not to be.
@hernot hernot force-pushed the clean-file-attr-loading branch from 3b452d4 to ccf1ea7 on September 18, 2020 20:33
@hernot
Contributor Author

hernot commented Sep 18, 2020

Unless Travis finds some packages which have meanwhile dropped Python 2 or Python 3.5 support, the squashing should be done and the commit message should be updated accordingly.

… proper build of byterle_decoder from C source in the tox virtual environment with pytest. The old structure caused the source tree, and not the installed package, to be imported.
@paulkorir paulkorir merged commit 71a1477 into emdb-empiar:dev Jan 18, 2021
@hernot
Contributor Author

hernot commented Jan 18, 2021

Thank you very much, even though I meant that you would just update dev with your changes from master related to Travis moving to .com only, so that I could then amend the pull request before you finally merge it.

@paulkorir
Member

We ran out of free Travis credits and I don't have the bandwidth to figure it out. Therefore, I've got things working with GitHub Actions. We can then unmerge your changes, and you can simply copy the .github folder and make the necessary changes to get the tests running. Is that OK?

@hernot
Contributor Author

hernot commented Jan 19, 2021

Sure, it would have been easier to just put the file in dev, because then I would have fetched the changes and force-pushed them back to my fork, but no worry. I have already fetched the updated master and the updated dev, so is the unmerge still necessary, or which files did you change to make GitHub Actions work? Maybe I can merge over to dev on my fork and then open a pull request against upstream dev. That would presumably be easier than unmerging again.

@paulkorir
Member

That's OK. I'll find a workaround.
