Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot use decode with std::array #32

Open
theodelrieu opened this issue Aug 10, 2017 · 9 comments
Open

Cannot use decode with std::array #32

theodelrieu opened this issue Aug 10, 2017 · 9 comments

Comments

@theodelrieu
Copy link

Hello,

It seems you are forcing the Result template type to have a push_back method, which std::array doesn't have.

I was wondering if it would be easy to use the iterator range constructors of standard containers instead, to allow using a lot more container types.

I didn't dive into the code yet, do you think this is an easy change? If so, I could open a PR soon.

Thanks

@jpetso
Copy link
Collaborator

jpetso commented Aug 10, 2017

Hi,

cppcodec/data/access.hpp contains the basic level of of access, which is not push_back() (that's just the default) but a set of template functions to implement for the given type. Quoting https://github.com/tplgy/cppcodec/blob/master/cppcodec/data/access.hpp#L32

// This file contains a number of templated data accessors that can be
// implemented in the cppcodec::data namespace for types that don't fulfill
// the default type requirements:
// For result types: init(Result&, ResultState&, size_t capacity),
//     put(Result&, ResultState&, char), finish(Result&, State&)
// For const (read-only) types: char_data(const T&)
// For both const and result types: size(const T&)

(Also just noticed that I wrote "State" instead of "ResultState" in one place. Please excuse this mistake.)

I'd have to look into the trade-offs of using iterator ranges for constructors. Off the top of my head, iterators don't have .size() functions so unless the size is encoded in the type itself, like for std::array, the iterator-based constructor will have to do consecutive increasing memory allocations since it can't know the entire size in advance. (Or it would have to iterate through twice, which is also not much fun.) So I'm a bit wary of making that the default.

Do you want to try implementing the above-mentioned functions for std::array? I could eventually take a shot at it, but it might take me a while.

@theodelrieu
Copy link
Author

They don't indeed, but one can use std::distance to compute the size.

I just remembered std::array only uses aggregate initialization (i.e. no constructors) so iterator-range won't solve this issue.

If I understood correctly, one has to specialize several methods for its custom type, including put.

However, this is inconvenient for fixed-size containers like std::array, it would require an additional index to store the last position inserted. (like boost::container::static_vector IIRC).

I could take a look this week-end yes.

@jpetso
Copy link
Collaborator

jpetso commented Aug 10, 2017

std::distance is merely a helper function that counts the number of iterations, so that's the same thing.

The last position inserted is something that's always going to be necessary - iterators would keep one such variable around, in my case here I'm using a ResultState& to keep the position around in between calls.

For std::array, it might be necessary to use the "second layer" of template specializations - see https://github.com/tplgy/cppcodec/blob/master/cppcodec/data/access.hpp#L103 and the following bits. In fact, this bit might be what you want for std::array in particular - I guess the existing implementation doesn't catch it because std::array doesn't provide the reserve() and resize() functions. Maybe another such specialization can be added that only applies to types where std::tuple_size exists, and then check the size to make sure it matches the expected one.

Actually this brings up an interesting point - how were you going to call decode()? Can you paste the line of code that failed for you? Thanks!

@jpetso
Copy link
Collaborator

jpetso commented Aug 10, 2017

Also note that you're always able to use the decode() variant with [uint8_t|char]* binary_result, size_t binary_buffer_size. Since the array is fixed-size, this might actually be the correct way to go about it (it returns you the decoded size as result, as opposed to the Result templated version which assumes that the result type provides that number for you).

@jpetso
Copy link
Collaborator

jpetso commented Aug 10, 2017

Finally, I was going to correct my above statement about std::distance. Since it can be specialized by the author of the iterator, it's indeed possible to make it perform well without having to iterate through the whole thing. As long as it's being used by the allocating constructor and a specialized std::distance implementation exists, it would indeed be perfectly fine to use.

@theodelrieu
Copy link
Author

I believe it was something like this:

auto decoded = base64::decode<std::array<uint8_t, 32>>(encodedText);

The last position inserted is something that's always going to be necessary

From that statement, I assume you are storing characters one by one? That would explain the put method that only takes one character.

Maybe we could try to replace that mechanism (or at least overhaul it) with the support of iterator ranges, e.g. by adding methods in direct_data_access_result_state.

I will take a further look to the code to better understand the design choices, to avoid asking questions here and make you lose time.

@jpetso
Copy link
Collaborator

jpetso commented Aug 11, 2017

Thanks, I appreciate your thoughtful replies.

The challenge with using the API like this is that you have to know the size of decoded in advance, statically. I would assume that decode() will throw if your result array is either too small (decoded bytes don't fit) or too large (you're in danger of using an array with invalid bytes). It has to fit exactly. This needs to be checked at runtime. If you're not storing the last position inserted somewhere during the process, you'll have to store some other kind of iterator.

put() is where this information is embedded, so far I haven't seen a performance need to replace it by range-copy mechanisms. Encode/decode operations read blocks of bytes but populating the result is character by character, as you say. If the API can be improved by changing the current system, I'm open to considering that.

That said, since cppcodec takes responsibility for both initializing and populating the array, and you don't need it to be constructed as const, I feel like there might be an easier (iterator-less) solution to the issue.

I look forward to your findings :)

@theodelrieu
Copy link
Author

Hello, did you have time to look at the PR?

@theodelrieu
Copy link
Author

I fixed the PR, but I did introduce a copy for simplicity. Let me know if you have a better solution, I'll try to allocate a bit of time to dig deeper in the library's code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants