-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attributes of type NDAttrString are not written as H5T_STRING in HDF5 files #176
Comments
I have implemented a fix for this issue in branch hdf5_string2. I added a new parameter, NDFileHDF5_stringAttributeDataType. This has 2 possible values: hdf5::nativeChar and hdf5::CString. These are new enums added to NDFile HDF5Layout.h The default is hdf5::nativeChar, which uses the current data type of an array of H5T_NATIVE_CHAR[NumArrays, MAX_ATTRIBUTE_STRING_SIZE] If the parameter is set to hdf5::CString then NDAttributes of type NDAttrString are written as H5T_C_S1[numArrays] with size MAX_ATTRIBUTE_STRING_SIZE. There is a new bo record in NDFileHDF5.template, This is the contents of an NDAttribute of type NDAttrString when strAttrDataType=Char (default, old behavior) corvette:~/scratch>h5dump --dataset=/entry/instrument/NDAttributes/CameraModel test_001.h5 HDF5 "test_001.h5" { DATASET "/entry/instrument/NDAttributes/CameraModel" { DATATYPE H5T_STD_I8LE DATASPACE SIMPLE { ( 1, 256 ) / ( H5S_UNLIMITED, H5S_UNLIMITED ) } DATA { (0,0): 66, 97, 115, 105, 99, 32, 115, 105, 109, 117, 108, 97, 116, 111, (0,14): 114, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,35): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,57): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,79): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,101): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,123): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,145): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,167): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,189): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,211): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,233): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (0,255): 0 } This is the result when strAttrDataType=String corvette:~/scratch>h5dump --dataset=/entry/instrument/NDAttributes/CameraModel test_003.h5 HDF5 "test_003.h5" { DATASET "/entry/instrument/NDAttributes/CameraModel" { DATATYPE H5T_STRING { STRSIZE 256; STRPAD H5T_STR_NULLTERM; CSET H5T_CSET_ASCII; CTYPE H5T_C_S1; } DATASPACE SIMPLE { ( 1 ) / ( H5S_UNLIMITED ) } DATA { (0): "Basic simulator" } This is the result for a file containing 10 arrays rather than just 1: corvette:~/scratch>h5dump --dataset=/entry/instrument/NDAttributes/CameraModel test_004.h5 HDF5 "test_004.h5" { DATASET "/entry/instrument/NDAttributes/CameraModel" { DATATYPE H5T_STRING { STRSIZE 256; STRPAD H5T_STR_NULLTERM; CSET H5T_CSET_ASCII; CTYPE H5T_C_S1; } DATASPACE SIMPLE { ( 10 ) / ( H5S_UNLIMITED ) } DATA { (0): "Basic simulator", "Basic simulator", "Basic simulator", (3): "Basic simulator", "Basic simulator", "Basic simulator", (6): "Basic simulator", "Basic simulator", "Basic simulator", (9): "Basic simulator" } It seems like a reasonable fix to me, so I am issuing a pull request for this branch. Mark |
Note that when the NDFileNexus plugin is used, the resulting HDF5 file for this same NDAttribute CameraModel is corvette:~/scratch>h5dump --dataset=/Collected_Data/rawData/model nexus_test_001.h5 HDF5 "nexus_test_001.h5" { DATASET "/Collected_Data/rawData/model" { DATATYPE H5T_STRING { STRSIZE 16; STRPAD H5T_STR_NULLTERM; CSET H5T_CSET_ASCII; CTYPE H5T_C_S1; } DATASPACE SIMPLE { ( 1 ) / ( 1 ) } DATA { (0): "Basic simulator" } } } So it has DATATYPE=H5T_STRING, which is what makes sense. There appears to be no way with the existing version of the HDF5 plugin to produce such a Nexus file. |
Fixed in #177, merged into master. |
NDFileHDF5 does not treat attributes of type string consistently. Constant attributes are written with datatype H5T_C_S1 and appear as type H5T_STRING. However, NDArray attributes of type NDAttrString are written as type H5T_NATIVE_CHAR, and appear as 2-D arrays of type H5T_STD_I8LE (on my Linux system). This seems to me to be in violation of the HDF5 documentation, which states that H5T_NATIVE_CHAR is a numeric datatype, not intended for storing string data.
This is an example of using h5dump to display the CameraModel attribute. Note that h5dump is displaying integer values. This is not very user-friendly to say the least. Note also that the NDAttrDescription is an H5T_STRING, and is nicely legible.
I made a trivial 1-line change to the plugin to use type H5T_C_S1 for NDAttributes of type NDAttrString. This results in the following output:
This is better, since at least the ASCII value of each character is shown. However, it is a 2-D array of 1 character strings, which is of course not what we really want, which is a 1-D array of strings.
I am willing to change the code to do this. However, I have a few questions:
Mark
The text was updated successfully, but these errors were encountered: