Storage: Compressed files and Content-Encoding #7154
For background see #1641 and #1784. If you want to validate the download, then the only solution is to set `Cache-Control: no-transform` on the object so that GCS does not transcode it on the way down. If you want to skip hash validation instead, you can use the download option that disables it. As you can see in #1784, which is a similar corner case, automatically detecting all of these cases is unreliable: there is no reliable header that describes whether the server has stripped a compression layer, and the object's metadata is not enough for us to know whether to ignore the hash.
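For illustration, a minimal sketch of the `no-transform` workaround, assuming hypothetical bucket and file names (the names and the pre-compressed file are placeholders, not taken from this issue):

```csharp
using System.IO;
using Google.Cloud.Storage.V1;
using GcsObject = Google.Apis.Storage.v1.Data.Object;

// Sketch: upload a gzip-compressed zip with Cache-Control: no-transform so that
// GCS serves the bytes exactly as stored and the SDK's hash validation succeeds.
// "my-bucket", "archive.zip" and "archive.zip.gz" are placeholder names.
var client = StorageClient.Create();
var destination = new GcsObject
{
    Bucket = "my-bucket",
    Name = "archive.zip",
    ContentType = "application/zip",
    ContentEncoding = "gzip",
    CacheControl = "no-transform" // disables decompressive transcoding on download
};
using var source = File.OpenRead("archive.zip.gz"); // already gzip-compressed
client.UploadObject(destination, source);
```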
Thanks for the quick response. It looks like there is no way to validate the download if GCS decompresses the object, short of re-compressing and re-hashing the downloaded data and checking whether that matches the value on the object in GCS. Ideally we want to leave server-side decompression enabled for the clients that need it. In our case we control file uploads, so we could perhaps add the uncompressed hash to the object metadata and then do our own hash validation on download (see the sketch below), but it is a shame this can't be solved inside the SDK. We also maintain a list of content types not to apply gzip compression to, and we can keep adding content types to it as we see this error, but obviously that isn't a great solution either. I know this is a question for the API team, but do you know what the use case is for forcibly decompressing the outer gzip layer for certain "compressed" types even when the user specifically sends `Accept-Encoding: gzip`?
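A sketch of that metadata idea, under assumptions: the key `uncompressed-md5` is made up for illustration, the names are placeholders, and the download side assumes the SDK's own hash validation is bypassed somehow, which is exactly the gap discussed in this issue:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using Google.Cloud.Storage.V1;
using GcsObject = Google.Apis.Storage.v1.Data.Object;

// Upload side: hash the *uncompressed* bytes and record them as custom metadata.
var client = StorageClient.Create();
byte[] uncompressed = File.ReadAllBytes("archive.zip");
string uncompressedMd5 = Convert.ToBase64String(MD5.HashData(uncompressed));

var destination = new GcsObject
{
    Bucket = "my-bucket",
    Name = "archive.zip",
    ContentType = "application/zip",
    ContentEncoding = "gzip",
    Metadata = new Dictionary<string, string> { ["uncompressed-md5"] = uncompressedMd5 }
};
// ... gzip `uncompressed` into a stream and call client.UploadObject(destination, stream) ...

// Download side: GCS has already gunzipped the body, so validate the received
// bytes against the stored metadata rather than the object's MD5/CRC32C.
static void ValidateDownload(byte[] received, GcsObject metadata)
{
    string actual = Convert.ToBase64String(MD5.HashData(received));
    if (actual != metadata.Metadata["uncompressed-md5"])
        throw new IOException("Uncompressed content hash mismatch");
}
```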
I'll raise these issues again with the API team, as they are best positioned to offer a solution that works for everyone, instead of us trying to patch the .NET library based on assumptions. As for why they remove the outer compression layer, I really don't know. I'll move this issue to the backlog now, where #1784 is, but I'll update it if/when I know more. Do feel free to add a comment if you think there's something else we can address.
The .NET SDK will throw an exception like

IOException: Incorrect hash: expected 'jdwXEg==' (base64), was '0RBSEw==' (base64)

when trying to download a ZIP file which has `Content-Type: application/zip` and `Content-Encoding: gzip`, unless the file has `Cache-Control: no-transform`. This is because of the decompressive transcoding feature documented at https://cloud.google.com/storage/docs/transcoding#gzip-gzip. I am not really sure of the logic behind the GCS feature, but I guess the horse has bolted on that one.
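A minimal sketch reproducing the failure described above, assuming placeholder bucket and file names:

```csharp
using System.IO;
using System.IO.Compression;
using Google.Cloud.Storage.V1;
using GcsObject = Google.Apis.Storage.v1.Data.Object;

// Sketch: upload a gzip-wrapped zip, then download it. GCS strips the gzip
// layer on the way down, so the received bytes no longer match the stored
// hash and the SDK throws the IOException shown above.
var client = StorageClient.Create();

byte[] zipBytes = File.ReadAllBytes("archive.zip");
using var compressed = new MemoryStream();
using (var gzip = new GZipStream(compressed, CompressionMode.Compress, leaveOpen: true))
{
    gzip.Write(zipBytes, 0, zipBytes.Length);
}
compressed.Position = 0;

client.UploadObject(new GcsObject
{
    Bucket = "my-bucket",
    Name = "archive.zip",
    ContentType = "application/zip",
    ContentEncoding = "gzip"
}, compressed);

// Throws: IOException: Incorrect hash: expected '...' (base64), was '...' (base64)
client.DownloadObject("my-bucket", "archive.zip", new MemoryStream());
```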
My question now is how I can tell whether GCS is going to ignore the `Accept-Encoding` header sent by the .NET SDK. The documentation doesn't give a definitive list of content types for which this holds. Is there any way this list can be built into the .NET SDK so that it knows it should ignore the hash for these files? Maybe the SDK could check whether the `X-GUploader-Response-Body-Transformations: gunzipped` response header exists and skip the hash check when it does (see the sketch below)? Is there some other solution where we can still validate the download?

To avoid the suggestion that there is no point compressing already-compressed files: a ZIP can benefit from gzip compression if, for example, it contains duplicate files.
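A sketch of that header idea, with caveats: this is not SDK behaviour but a hand-rolled check that downloads the media directly over HTTP, bypassing the SDK's downloader; it assumes the `gunzipped` token appears verbatim in the header, and the bucket and object names are placeholders:

```csharp
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using Google.Apis.Auth.OAuth2;

// Fetch the object media directly and inspect
// X-GUploader-Response-Body-Transformations to learn whether GCS gunzipped
// the body. If it did, the stored hash cannot be used to validate these bytes.
var credential = await GoogleCredential.GetApplicationDefaultAsync();
string token = await credential.UnderlyingCredential
    .GetAccessTokenForRequestAsync("https://storage.googleapis.com/");

using var http = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Get,
    "https://storage.googleapis.com/storage/v1/b/my-bucket/o/archive.zip?alt=media");
request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
request.Headers.AcceptEncoding.ParseAdd("gzip");

using var response = await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
bool gunzipped = response.Headers.TryGetValues(
        "X-GUploader-Response-Body-Transformations", out var values)
    && values.Contains("gunzipped");

// If gunzipped is true, skip comparing against the object's stored MD5/CRC32C.
byte[] body = await response.Content.ReadAsByteArrayAsync();
```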