Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip hash validation when it's unsafe when downloading Storage objects #9507

Closed
jskeet opened this issue Jan 6, 2023 · 3 comments · Fixed by #9553
Closed

Skip hash validation when it's unsafe when downloading Storage objects #9507

jskeet opened this issue Jan 6, 2023 · 3 comments · Fixed by #9553
Assignees
Labels
api: storage Issues related to the Cloud Storage API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@jskeet
Copy link
Collaborator

jskeet commented Jan 6, 2023

In #1641 I proposed a HashValidationMode of WhenAvailable or similar - but we couldn't implement it at the time.

With the new x-goog-stored-content-encoding header, we may be able to implement this - see the Node PR for an example.

Ideally, we'd want to make this the default - but that should only happen at the next major version of Google.Cloud.Storage.V1.

@jskeet jskeet added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. api: storage Issues related to the Cloud Storage API. labels Jan 6, 2023
@jskeet jskeet changed the title Skip hash validation when it's unsafe Skip hash validation when it's unsafe when downloading Storage objects Jan 6, 2023
@amanda-tarafa
Copy link
Contributor

FYI: There was some recent movement in internal issues around this. One of the issues with hash validations was that for files doubly compressed, Storage was removing the outer compression layer on serve, see #7154 as well.
@frankyn should know more about the state of things in general. There was also go/decompression-hash-validation and the newly added header seems to work towards the option of providing a strong signal regarding the hash matching the content.

@frankyn
Copy link
Contributor

frankyn commented Jan 7, 2023

Hey folks, skipping validation can now be done through new headers in the JSON API x-goog-stored-content-encoding and x-goog-stored-content-length which were originally only available in the XML API; It's not exactly what we are looking for as a boolean that would tell clients that decompression occurred (it can happen outside of GCS' control in GFE); so this is the next best thing. Go and Python have this support for download.

@amanda-tarafa amanda-tarafa assigned jskeet and unassigned shivgautam Jan 11, 2023
@amanda-tarafa
Copy link
Contributor

Assigning to @jskeet as per internal discussion with the wider team.

jskeet added a commit to jskeet/google-cloud-dotnet that referenced this issue Jan 13, 2023
Fixes googleapis#9507 (modulo the change to default in the next majro version)

This detects if the x-goog-stored-content-encoding header is set to gzip, but the actual content encoding of the response is non-gzip, and disables validation in that specific case - while validating in the same way as normal for everything else.

We will make this the default in the next major version. (This is recorded in the internal document.)

Implementation note: this now *always* creates a HashValidatingDownloader, and puts that in control of whether validation is actually performed or not. That's simpler than the previous split of the choice being in StorageClientImpl.DownloadObject.cs with the hashing in HashValidatingDownloader.
jskeet added a commit to jskeet/google-cloud-dotnet that referenced this issue Jan 17, 2023
Fixes googleapis#9507 (modulo the change to default in the next majro version)

This detects if the x-goog-stored-content-encoding header is set to gzip, but the actual content encoding of the response is non-gzip, and disables validation in that specific case - while validating in the same way as normal for everything else.

We will make this the default in the next major version. (This is recorded in the internal document.)

Implementation note: this now *always* creates a HashValidatingDownloader, and puts that in control of whether validation is actually performed or not. That's simpler than the previous split of the choice being in StorageClientImpl.DownloadObject.cs with the hashing in HashValidatingDownloader.
jskeet added a commit that referenced this issue Jan 17, 2023
Fixes #9507 (modulo the change to default in the next majro version)

This detects if the x-goog-stored-content-encoding header is set to gzip, but the actual content encoding of the response is non-gzip, and disables validation in that specific case - while validating in the same way as normal for everything else.

We will make this the default in the next major version. (This is recorded in the internal document.)

Implementation note: this now *always* creates a HashValidatingDownloader, and puts that in control of whether validation is actually performed or not. That's simpler than the previous split of the choice being in StorageClientImpl.DownloadObject.cs with the hashing in HashValidatingDownloader.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants