fix Swin Transformer inplace mutation #6266

Merged: 5 commits merged into pytorch:main from ain-soph:fix-in-place-mutation on Jul 14, 2022

Conversation

@ain-soph (Contributor) commented Jul 13, 2022

Fixes an issue where, in the previous code, shift_size was modified in place, which could also modify the attribute value self.shift_size.

cc @YosuaMichael
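
For readers skimming this thread, here is a minimal, self-contained sketch of the aliasing pattern being fixed. It is not the actual torchvision code (the real functional helper also pads the feature map and computes the attention itself); the clamp_shift_* helpers below are hypothetical and only show why mutating the shift_size argument also mutates the module attribute it aliases:

def clamp_shift_buggy(feat_h, feat_w, window_size, shift_size):
    # BUG: shift_size is the very list object stored as the module's
    # self.shift_size, so these item assignments silently change the
    # module attribute as well.
    if window_size[0] >= feat_h:
        shift_size[0] = 0
    if window_size[1] >= feat_w:
        shift_size[1] = 0
    return shift_size

def clamp_shift_fixed(feat_h, feat_w, window_size, shift_size):
    # Fix: work on a local copy so the caller's list is left untouched.
    shift_size = list(shift_size)
    if window_size[0] >= feat_h:
        shift_size[0] = 0
    if window_size[1] >= feat_w:
        shift_size[1] = 0
    return shift_size

shift = [3, 3]                          # stands in for a module's self.shift_size
clamp_shift_buggy(4, 4, [7, 7], shift)
print(shift)                            # [0, 0] -- the attribute was clobbered
shift = [3, 3]
clamp_shift_fixed(4, 4, [7, 7], shift)
print(shift)                            # [3, 3] -- unchanged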

@ain-soph (Contributor, Author)

@YosuaMichael Do you think this fix requires retraining the currently released pretrained Swin Transformer weights?

ain-soph added 3 commits July 13, 2022 11:45

@ain-soph changed the title from "fix inplace mutation" to "fix Swin Transformer inplace mutation" on Jul 13, 2022

@YosuaMichael (Contributor) left a comment

Thanks @ain-soph, the PR looks good!

I don't think we need to do any retraining since the model parameter layout is not changed.

@NicolasHug (Member)

Thanks for the fix, @ain-soph!

The layout isn't changed, but if self.shift_size was set to 0 in some cases, as in the current main branch, it's still possible that this affects the training procedure (and thus the accuracy) eventually, in particular when:

# If window size is larger than feature size

I don't really know enough to know whether that case gets hit during our trainings though.

@ain-soph (Contributor, Author)

@NicolasHug It should be okay. For each attention layer, the input size is actually fixed during training on the ImageNet dataset because we resize the input images. If shift_size is set to 0 for one image, it means shift_size should be set to 0 for all images.

Also, different blocks have individual shift_size attributes (while sharing the same window_size), so I think this doesn't influence the current training pipeline.

@YosuaMichael (Contributor)

Just to give more context for @NicolasHug: previously, when we trained the models, shift_size was actually an int, so it didn't have any in-place mutation danger. I am actually the one who introduced the bug, when I needed shift_size to be a list, in this PR: https://github.com/pytorch/vision/pull/6088/files#diff-58328880bdc0a38a5a29afcfedd7bca85a4b72631a5d0416d312f6505331deefL64

So this PR should fix that problem, and the training behaviour should be more similar to the behaviour when we trained the weights.
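
A tiny illustration of the int-versus-list point above, assuming the hypothetical clamp_int / clamp_list helpers below: an int argument is immutable, so assigning to it inside a function only rebinds a local name, while a list argument is a shared mutable object whose items can be changed in place:

def clamp_int(shift_size):
    shift_size = 0          # rebinds a local name; the caller is unaffected
    return shift_size

def clamp_list(shift_size):
    shift_size[0] = 0       # mutates the caller's list in place
    shift_size[1] = 0
    return shift_size

s_int = 3
clamp_int(s_int)
print(s_int)                # 3 -- unchanged

s_list = [3, 3]
clamp_list(s_list)
print(s_list)               # [0, 0] -- the shared attribute would be clobbered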

@YosuaMichael YosuaMichael merged commit 418d8a6 into pytorch:main Jul 14, 2022
@github-actions

Hey @YosuaMichael!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@ain-soph ain-soph deleted the fix-in-place-mutation branch July 14, 2022 15:21
facebook-github-bot pushed a commit that referenced this pull request Jul 21, 2022
Summary:
* fix inplace mutation

* Different attn shouldn't share the same attribute

* a simpler solution

Reviewed By: jdsgomes

Differential Revision: D37993419

fbshipit-source-id: 2a08a62168c4e6ee6c5a2ca934de88aa04361016

Co-authored-by: YosuaMichael <yosuamichaelm@gmail.com>