Protobuf messages containing Any aren't serialized deterministically even when deterministic serialization is requested #5731

Closed
lizan opened this issue Feb 14, 2019 · 5 comments
Labels: bug, c++, customer issue, inactive

Comments

@lizan
Contributor

lizan commented Feb 14, 2019

What version of protobuf and what language are you using?
Version: master (currently at 7492b56)
Language: C++

What operating system (Linux, Windows, ...) and version?
Linux

What runtime / compiler are you using (e.g., python version or gcc version)
gcc-7 / clang-7

What did you do?
Deterministically serialize a message and calculate a hash from the serialized binary.
Source: https://github.com/envoyproxy/envoy/blob/master/source/common/protobuf/utility.h#L162

What did you expect to see
The same proto generates the same serialized binary and hash.

What did you see instead?
The same proto generates a different serialized binary and hash.

Anything else we should know about your project / environment
This is a follow-up to #5668. Even when we use a CodedOutputStream with SetSerializationDeterministic(true), the same protobuf message (judging by its JSON debug dump) doesn't produce the same binary serialization. My suspicion is that the Any fields in the message carry different serialized bytes even though the messages packed inside them are logically the same. Deterministic serialization should normalize the value in Any too.
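
One way such a mismatch could arise, as a minimal sketch rather than a confirmed diagnosis: if the packed message contains a map, its entries may have been written in a different order each time the Any was packed. The encoder below is a hand-written toy that follows the protobuf wire format (varints and length-delimited fields) and does not use the protobuf library. The helper names and the inner message with a single `map<string, string>` field are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Appends a base-128 varint, the integer encoding of the protobuf wire format.
void AppendVarint(std::string& out, std::uint64_t v) {
  while (v >= 0x80) {
    out.push_back(static_cast<char>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<char>(v));
}

// Appends a length-delimited field (wire type 2): tag, length, payload.
void AppendBytesField(std::string& out, std::uint32_t field_number,
                      const std::string& payload) {
  assert(field_number >= 1);  // field numbers start at 1
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
  AppendVarint(out, payload.size());
  out += payload;
}

using Entries = std::vector<std::pair<std::string, std::string>>;

// Hypothetical inner message with `map<string, string> labels = 1;`.
// Each map entry is a nested message (key = 1, value = 2), written in the
// order given, which is what a non-deterministic serializer may do.
std::string EncodeInner(const Entries& entries) {
  std::string out;
  for (const auto& kv : entries) {
    std::string entry;
    AppendBytesField(entry, 1, kv.first);
    AppendBytesField(entry, 2, kv.second);
    AppendBytesField(out, 1, entry);
  }
  return out;
}

// Any on the wire: type_url = 1, value = 2. The payload is copied verbatim,
// so its byte order is frozen at the moment it was packed.
std::string EncodeAny(const std::string& type_url, const std::string& value) {
  std::string out;
  AppendBytesField(out, 1, type_url);
  AppendBytesField(out, 2, value);
  return out;
}
```

Packing the entries `{a: 1, b: 2}` and `{b: 2, a: 1}` this way gives two Any encodings of equal length but different bytes. Because the outer serializer copies the `value` bytes as they are, serializing the outer message deterministically would not make them equal.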

@wattli

wattli commented Feb 15, 2019

@liujisi, any ideas? The fix is critical and your help is greatly appreciated.

@lizan
Contributor Author

lizan commented Feb 19, 2019

@acozzette any thoughts?

@acozzette
Member

@lizan This may be a hard problem to solve. The way Any was designed, during parsing and serialization we don't treat an Any field in a special way and we just treat its payload as an opaque blob. If we want to serialize it deterministically, we would probably need to parse and reserialize the Any payload during serialization. To do that we need to figure out what kind of message the Any is. We could do that by looking up the name in the generated descriptor pool, but that solution is not ideal since it won't work with lite protos (i.e. protos built without reflection support).

If you need a quick short-term fix, I think the best solution would be to have your hash function reflectively examine the proto and normalize all Any fields before doing the deterministic serialization of the full message. That may end up being slow, though. It might also be worthwhile to just avoid Any fields in messages that you want to be able to hash. Do you have a lot of Any fields or just a few?
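
The normalize-then-hash idea can be sketched with a toy model that does not use the protobuf library. Everything here is an assumption made for illustration: `ToyAny` stands in for an Any, the `key=value;` payload encoding stands in for a serialized message whose record order carries no meaning, and `Registry()` stands in for a descriptor pool that knows how to parse a given type. With real protos the same steps would go through reflection instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for google.protobuf.Any: a type URL plus an opaque payload.
struct ToyAny {
  std::string type_url;
  std::string value;
};

// Hypothetical payload encoding used only in this sketch: "key=value;"
// records whose order carries no meaning, like map entries in a message.
// Parsing and re-emitting the records in sorted order gives a canonical form.
std::string CanonicalizeRecords(const std::string& payload) {
  std::vector<std::string> records;
  std::istringstream in(payload);
  std::string record;
  while (std::getline(in, record, ';')) {
    if (!record.empty()) records.push_back(record);
  }
  std::sort(records.begin(), records.end());
  std::string out;
  for (const std::string& r : records) out += r + ";";
  return out;
}

using Canonicalizer = std::function<std::string(const std::string&)>;

// Stand-in for a descriptor pool: maps a fully qualified type name to a
// function that can parse and re-serialize that type's payload.
const std::map<std::string, Canonicalizer>& Registry() {
  static const std::map<std::string, Canonicalizer> registry = {
      {"demo.Labels", CanonicalizeRecords}};
  return registry;
}

// The type name is whatever follows the last '/' in the type URL.
std::string TypeNameFromUrl(const std::string& type_url) {
  const std::size_t slash = type_url.rfind('/');
  return slash == std::string::npos ? type_url : type_url.substr(slash + 1);
}

// Rewrites the payload in canonical form. Returns false and leaves the
// payload untouched when the type is not in the registry.
bool NormalizeAny(ToyAny& any) {
  const auto it = Registry().find(TypeNameFromUrl(any.type_url));
  if (it == Registry().end()) return false;
  assert(static_cast<bool>(it->second));  // registry entries must be callable
  any.value = it->second(any.value);
  return true;
}

// Hashes the normalized copy. std::hash is not guaranteed to be stable
// across processes or builds; a real config hash would use a fixed algorithm.
std::size_t HashAny(ToyAny any) {
  NormalizeAny(any);
  return std::hash<std::string>{}(any.type_url + '\n' + any.value);
}
```

Two `ToyAny` values of type `demo.Labels` holding `b=2;a=1;` and `a=1;b=2;` hash to the same value here, while a payload whose type is missing from the registry is hashed byte for byte. That second case mirrors the limitation described above: normalization only works when the payload's type can be looked up.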


github-actions bot commented May 5, 2024

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

github-actions bot added the inactive label May 5, 2024

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.

This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

github-actions bot closed this as not planned May 19, 2024