Fails to upload changed file when multiple S3 keys have the same md5sum #156

Closed
mgreensmith opened this issue Apr 9, 2015 · 2 comments

@mgreensmith

If a file is changed, but a different file key already exists on S3 with the same md5sum, the file is not recognized as changed, and is not uploaded.

It appears that UploadHelper takes the set of all S3 md5sums and assumes that if the md5sum of the current file appears anywhere in that set, the file is up to date. However, that md5sum could belong to a different key with the same content. See https://github.com/laurilehmijoki/s3_website/blob/master/src/main/scala/s3/website/UploadHelper.scala#L30
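
To make the suspected behavior concrete, here is a minimal Python sketch of a set-membership check. It is an illustration only, not the project's Scala code; the names `md5_hex`, `looks_unchanged`, and `remote_md5s` are hypothetical, and the file contents assume the trailing newline that `echo` writes.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex md5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def looks_unchanged(local_content: bytes, remote_md5s: set) -> bool:
    """Suspected flawed check: only asks whether ANY remote object has this md5."""
    return md5_hex(local_content) in remote_md5s


# Assumed bucket state after the first push: 1.txt and 3.txt hold "bar", 2.txt holds "foo".
remote_md5s = {md5_hex(b"bar\n"), md5_hex(b"foo\n")}

# 3.txt is rewritten locally to "foo". Its new md5 matches 2.txt on S3,
# so the check wrongly reports the file as up to date.
changed_file_skipped = looks_unchanged(b"foo\n", remote_md5s)
print(changed_file_skipped)
```

Because the check never looks at which key owns the digest, any local change that happens to match another object's content is silently skipped.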

Example:

~$ find ./dupetest -type f -print0 | xargs -0 md5
MD5 (./dupetest/1.txt) = c157a79031e1c40f85931829bc5fc552   # file contents: 'bar'
MD5 (./dupetest/2.txt) = d3b07384d113edec49eaa6238ad5ff00  # file contents: 'foo'
MD5 (./dupetest/3.txt) = c157a79031e1c40f85931829bc5fc552   # file contents: 'bar'

~$ bundle exec s3_website push --site dupetest --verbose
[debg] Using /usr/local/var/rbenv/versions/2.0.0-p451/gemsets/marketing-jekyll/gems/s3_website-2.8.4/s3_website-2.8.4.jar
[info] Deploying dupetest/* to mg-testbucket
[debg] Querying S3 files
[succ] Created 3.txt (text/plain; charset=utf-8 | 4 B | 0 B/s)
[succ] Created 1.txt (text/plain; charset=utf-8 | 4 B | 0 B/s)
[succ] Created 2.txt (text/plain; charset=utf-8 | 4 B | 0 B/s)
[info] Summary: Created 3 files. Transferred 12 B, 0 B/s.
[info] Successfully pushed the website to http://mg-testbucket.s3-website-us-east-1.amazonaws.com

~$ echo "foo" > dupetest/3.txt

~$ find ./dupetest -type f -print0 | xargs -0 md5
MD5 (./dupetest/1.txt) = c157a79031e1c40f85931829bc5fc552   # file contents: 'bar'
MD5 (./dupetest/2.txt) = d3b07384d113edec49eaa6238ad5ff00  # file contents: 'foo'
MD5 (./dupetest/3.txt) = d3b07384d113edec49eaa6238ad5ff00  # file contents: 'foo'

~$ bundle exec s3_website push --site dupetest --verbose
[debg] Using /usr/local/var/rbenv/versions/2.0.0-p451/gemsets/marketing-jekyll/gems/s3_website-2.8.4/s3_website-2.8.4.jar
[info] Deploying dupetest/* to mg-testbucket
[debg] Querying S3 files
[info] Summary: There was nothing to push

Expected: 3.txt to be uploaded.
Actual: 3.txt was not uploaded, because its new md5sum was already present on S3 under the key 2.txt.

Possible solution: the UploadHelper could map the existing S3 file key names to their md5sums and compare md5sums for the file in question only, rather than searching all md5sums.
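
A rough Python sketch of that per-key comparison follows. It is an assumption about how such a fix could look, not the change that actually shipped in s3_website; `files_to_upload`, `local_files`, and `remote_md5_by_key` are hypothetical names.

```python
import hashlib
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    """Return the hex md5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def files_to_upload(local_files: Dict[str, bytes],
                    remote_md5_by_key: Dict[str, str]) -> List[str]:
    """Return keys whose local md5 differs from the md5 stored under the SAME key.

    Keys missing on the remote side are treated as new and are also returned.
    """
    changed = []
    for key in sorted(local_files):
        if remote_md5_by_key.get(key) != md5_hex(local_files[key]):
            changed.append(key)
    return changed


# Assumed bucket state after the first push in the example above.
remote_md5_by_key = {
    "1.txt": md5_hex(b"bar\n"),
    "2.txt": md5_hex(b"foo\n"),
    "3.txt": md5_hex(b"bar\n"),
}

# Local state after `echo "foo" > dupetest/3.txt`.
local_files = {"1.txt": b"bar\n", "2.txt": b"foo\n", "3.txt": b"foo\n"}

pending = files_to_upload(local_files, remote_md5_by_key)
print(pending)
```

With the digest looked up by key, 3.txt is flagged even though 2.txt already holds identical content.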

@laurilehmijoki
Owner

Thanks @mgreensmith for pointing out this bug. I also appreciate your detailed bug report.

I've just released version 2.8.6, which contains a fix for this problem. Please try it out and let me know whether it solves the problem.

@mgreensmith
Author

👍 This fixes the issue for me. Thanks for the fast response!
