There is a Difference between Torchvision Shear and PIL Shear #5204
Comments
@SamuelGabriel Thanks for reporting. I agree that's important to investigate. @vfdev-5 I think you might be the original author of `affine` (#2444). Do you have the bandwidth to have a look?
@SamuelGabriel thanks for asking. I'd say the answer is that it was a convention choice: I think I was inspired by Keras when implementing and submitting PR #2444.
Here is how the affine matrix is defined: `vision/torchvision/transforms/functional.py`, lines 952 to 967 at commit `4946827`.
We may have some differences in how shear is parametrized. EDIT: we can see that the shear angle is not exactly the same, due to the different parametrizations. EDIT2: torchvision interprets `shear` as an angle and uses its tangent as the affine matrix coefficient, whereas the PIL-based reference implementations put the magnitude `m` into the matrix directly, so the two coincide only when `m = tan(shear * pi / 180)`, where `shear` is the angle in degrees passed to torchvision.
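If that reading is right, a PIL-style shear magnitude maps to torchvision's angle via `atan`. A minimal sketch (the helper name is ours, not torchvision API):

```python
import math

def pil_magnitude_to_tv_degrees(magnitude: float) -> float:
    # The PIL-based reference code puts `magnitude` directly into the affine
    # matrix; torchvision applies tan(shear angle) as that coefficient,
    # so the equivalent torchvision angle is atan(magnitude), in degrees.
    return math.degrees(math.atan(magnitude))

print(pil_magnitude_to_tv_degrees(0.3))  # ~16.70 degrees, not 0.3 rad converted
```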
Related: #5194
Fixes #5204. Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
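For context, the direction of the linked fix (a sketch, not the exact diff; the `center` argument to `F.affine` landed in torchvision around the same time, so treat its availability as an assumption for older versions) is to convert the magnitude to an angle and pin the shear origin to the top-left corner:

```python
import math
import torch
import torchvision.transforms.functional as F

def shear_x_like_pil(img, magnitude: float):
    # Match the PIL reference behaviour: convert the raw magnitude to an
    # angle in degrees and shear about the top-left corner instead of
    # the image center.
    return F.affine(
        img,
        angle=0.0,
        translate=[0, 0],
        scale=1.0,
        shear=[math.degrees(math.atan(magnitude)), 0.0],
        center=[0, 0],  # assumes a torchvision version where `center` exists
    )

out = shear_x_like_pil(torch.zeros(3, 32, 32), 0.3)
```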
🐛 Describe the bug
As @agaldran pointed out in the TrivialAugment repo (automl/trivialaugment#6), there seems to be a difference between the behavior of a shear implemented with the PIL affine transform and with the torchvision affine transform.
This is an issue for the auto-augmentation algorithms, as it might yield different results compared to other implementations of these algorithms; it might be an issue for other applications as well. Both the AutoAugment (https://github.com/tensorflow/models/blob/fd34f711f319d8c6fe85110d9df6e1784cc5a6ca/research/autoaugment) and the TrivialAugment (https://github.com/automl/trivialaugment) reference implementations use PIL, while RandAugment has no reference implementation.
A short reproduction script (sketched below) yields two images which should be the same but are visibly different.
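The script and output images embedded in the original issue are not reproduced here; a minimal reproduction along these lines (input path and magnitude are placeholders) shows the mismatch, even once the angle is converted so that the shear coefficients agree:

```python
import math
from PIL import Image
import torchvision.transforms.functional as F

magnitude = 0.3               # shear coefficient, as in the PIL reference code
img = Image.open("test.png")  # placeholder input image

# PIL: the coefficient goes straight into the affine matrix and the
# top-left corner stays fixed under the shear.
pil_out = img.transform(img.size, Image.AFFINE, (1, magnitude, 0, 0, 1, 0))

# torchvision: even with the coefficient converted to the equivalent
# angle, the shear is applied about the image center, so the result
# still differs from PIL.
tv_out = F.affine(
    img, angle=0.0, translate=[0, 0], scale=1.0,
    shear=[math.degrees(math.atan(magnitude)), 0.0],
)

pil_out.save("pil_shear.png")
tv_out.save("torchvision_shear.png")
```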
It looks like torchvision shears around a fixed center, while PIL keeps the top fixed. I did not dig into the code much yet, though. Is there maybe someone here who implemented `affine` and can give the reasoning for the different shearing?
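That observation is consistent with the matrix algebra: PIL applies the shear matrix S about the origin, while torchvision conjugates it with a translation to the image center, C · S · C⁻¹, so only in the PIL case is the top-left corner a fixed point. A quick check in plain Python (helper names are ours):

```python
import numpy as np

def shear_x(m: float) -> np.ndarray:
    # Plain shear about the origin, as PIL applies it.
    return np.array([[1.0, m, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def about_center(mat: np.ndarray, cx: float, cy: float) -> np.ndarray:
    # Conjugate by a translation to (cx, cy): C @ mat @ C^-1,
    # which is how a center-fixed shear is built.
    c = np.array([[1.0, 0.0, cx], [0.0, 1.0, cy], [0.0, 0.0, 1.0]])
    c_inv = np.array([[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]])
    return c @ mat @ c_inv

s = shear_x(0.3)
s_centered = about_center(s, 16.0, 16.0)  # e.g. a 32x32 image

top_left = np.array([0.0, 0.0, 1.0])
print(s @ top_left)           # [0. 0. 1.]     -> top-left fixed (PIL)
print(s_centered @ top_left)  # [-4.8  0.  1.] -> top-left moves (torchvision)
```

Versions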
Collecting environment information...
PyTorch version: 1.10.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-94-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.0
[pip3] torch==1.10.1
[pip3] torchvision==0.11.2
[conda] mypy-extensions 0.4.3 pypi_0 pypi
[conda] numpy 1.22.0 pypi_0 pypi
[conda] torch 1.10.1 pypi_0 pypi
[conda] torchvision 0.11.2 pypi_0 pypi
cc @vfdev-5 @datumbox