There is a Difference between Torchvision Shear and PIL Shear #5204

SamuelGabriel · 2022-01-17T17:21:18Z

🐛 Describe the bug

As @agaldran pointed out in the TrivialAugment repo (automl/trivialaugment#6), there seems to be a difference between the behavior of a shear implemented with PIL and one implemented with the TorchVision affine transform.

This is an issue for the autoaugment algorithms, as it might lead to different results compared to other implementations of these algorithms. It might be an issue for other applications as well. Both the AutoAugment (https://github.com/tensorflow/models/blob/fd34f711f319d8c6fe85110d9df6e1784cc5a6ca/research/autoaugment) and the TrivialAugment (https://github.com/automl/trivialaugment) reference implementations use PIL, while RandAugment has no reference implementation.

from PIL import Image
import math
import torchvision # >= 0.11

from torchvision.transforms import functional as F
interpolation = torchvision.transforms.InterpolationMode.NEAREST
fill = None

img = Image.new('RGB', (32,32), (255,255,0))
magnitude = .7

# shear_x as seen in torchvision https://github.com/pytorch/vision/blob/b5aa0915fe16e82ee4c24919032b4e7afae3ae1b/torchvision/transforms/autoaugment.py#L17
im_torch = F.affine(img, angle=0.0, translate=[0, 0], scale=1.0, shear=[math.degrees(magnitude), 0.0],
                    interpolation=interpolation, fill=fill)

# shear_x as seen in https://github.com/automl/trivialaugment/blob/3bfd06552336244b23b357b2c973859500328fbb/aug_lib.py#L156 and https://github.com/tensorflow/models/blob/fd34f711f319d8c6fe85110d9df6e1784cc5a6ca/research/autoaugment/augmentation_transforms.py#L290
im_pil = img.transform(img.size, Image.AFFINE, (1, magnitude, 0, 0, 1, 0))

im_torch.show()
im_pil.show()

This script displays two images, which should be the same but are not.

It looks like torchvision keeps the center fixed, while PIL keeps the top fixed. I have not dug into the code much yet, though. Is there someone here who implemented affine and can explain the reasoning behind the different shearing?
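To make the fixed-center versus fixed-top observation concrete, here is a minimal sketch in plain Python (no image libraries). It assumes inverse mapping as in the PIL AFFINE coefficients (1, k, 0, 0, 1, 0), where output pixel (x, y) samples input column x + k * y, and it assumes the center pivot sits at height / 2. The helper `source_x` and the pivot values are hypothetical names for illustration, not torchvision or PIL code.

```python
def source_x(x, y, k, pivot_y):
    """Input column sampled by output pixel (x, y) for an x-shear about row pivot_y."""
    return x + k * (y - pivot_y)


k = 0.7        # shear coefficient, same value as `magnitude` in the script above
height = 32    # image height used in the script above

top_pivot = 0.0             # fixed top row, as with (1, k, 0, 0, 1, 0)
center_pivot = height / 2   # assumed center convention

# Horizontal offset applied to each row under the two pivot choices.
top_offsets = [source_x(0, y, k, top_pivot) for y in range(height)]
center_offsets = [source_x(0, y, k, center_pivot) for y in range(height)]

# Row that is left unshifted in each case.
fixed_row_top = top_offsets.index(0.0)
fixed_row_center = center_offsets.index(0.0)
```

Under these assumptions the two variants differ only by a constant horizontal shift of k * height / 2 in every row, which would explain why one image looks like a translated copy of the other even when the slant is the same.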

Versions

Collecting environment information...
PyTorch version: 1.10.1+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: 10.0.0-4ubuntu1
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-94-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.0
[pip3] torch==1.10.1
[pip3] torchvision==0.11.2
[conda] mypy-extensions 0.4.3 pypi_0 pypi
[conda] numpy 1.22.0 pypi_0 pypi
[conda] torch 1.10.1 pypi_0 pypi
[conda] torchvision 0.11.2 pypi_0 pypi

cc @vfdev-5 @datumbox


datumbox · 2022-01-17T17:58:07Z

@SamuelGabriel Thanks for reporting. I agree that it's important to investigate.

@vfdev-5 I think you might be the original author of affine (#2444). Do you have the bandwidth to have a look?

vfdev-5 · 2022-01-17T18:28:57Z

@SamuelGabriel thanks for raising this. I'd say the answer is mostly that it was a convention. I think I was inspired by Keras when implementing and submitting PR #2444.
IMO, the rotational part of affine makes sense if the pivot point is at the center, so everything else follows the same convention.
Maybe we can treat this issue as a feature request to introduce a center arg, as is done for F.rotate.

Torchvision vs PIL affine parametrization?

Here is how affine matrix is defined:

vision/torchvision/transforms/functional.py

Lines 952 to 967 in 4946827

    
    # As it is explained in PIL.Image.rotate
    # We need compute INVERSE of affine transformation matrix: M = T * C * RSS * C^-1
    # where T is translation matrix: [1, 0, tx | 0, 1, ty | 0, 0, 1]
    #       C is translation matrix to keep center: [1, 0, cx | 0, 1, cy | 0, 0, 1]
    #       RSS is rotation with scale and shear matrix
    #       RSS(a, s, (sx, sy)) =
    #       = R(a) * S(s) * SHy(sy) * SHx(sx)
    #       = [ s*cos(a - sy)/cos(sy), s*(-cos(a - sy)*tan(x)/cos(y) - sin(a)), 0 ]
    #         [ s*sin(a + sy)/cos(sy), s*(-sin(a - sy)*tan(x)/cos(y) + cos(a)), 0 ]
    #         [ 0                    , 0                                      , 1 ]
    #
    # where R is a rotation matrix, S is a scaling matrix, and SHx and SHy are the shears:
    # SHx(s) = [1, -tan(s)] and SHy(s) = [1      , 0]
    #          [0, 1      ]              [-tan(s), 1]
    #
    # Thus, the inverse is M^-1 = C * RSS^-1 * C^-1 * T^-1

We may have some differences in how shear is parametrized.
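As a rough illustration of the quoted comment, the sketch below composes M^-1 = C * RSS^-1 * C^-1 * T^-1 for the simplest case only: angle 0, scale 1, zero translation, and a shear along x. It relies on the SHx definition quoted above and flattens the first two rows into the six-coefficient form that PIL's AFFINE transform takes. The helper names and the (16, 16) center for a 32x32 image are assumptions made for this example, not torchvision internals.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    """Homogeneous 3x3 translation matrix."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def inverse_shear_x_coeffs(shear_deg, center):
    """Six coefficients (a, b, c, d, e, f) of M^-1 for an x-shear about `center`."""
    cx, cy = center
    t = math.tan(math.radians(shear_deg))
    # SHx(s) = [[1, -tan(s)], [0, 1]], so its inverse is [[1, tan(s)], [0, 1]].
    rss_inv = [[1.0, t, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    m_inv = matmul(matmul(translation(cx, cy), rss_inv), translation(-cx, -cy))
    return tuple(m_inv[0] + m_inv[1])


shear_deg = math.degrees(0.7)  # the angle passed in the script from the report
coeffs_center = inverse_shear_x_coeffs(shear_deg, (16.0, 16.0))
coeffs_top_left = inverse_shear_x_coeffs(shear_deg, (0.0, 0.0))
```

In this simplified case the pivot only affects the offset term c, which works out to -tan(s) * cy: it vanishes for a top-left pivot, giving coefficients of the same shape as PIL's (1, b, 0, 0, 1, 0), and is nonzero for a center pivot.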

EDIT:
If we use the top-left corner as the pivot point, the result is the following:

We can see that the shear angle is still not exactly the same, due to the different parametrizations.
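The remaining mismatch can be sketched numerically. Assuming the SHx definition quoted above, a shear angle s contributes a coefficient of tan(s), whereas the PIL call in the report uses the magnitude directly as the coefficient. The variable names below are illustrative only.

```python
import math

magnitude = 0.7  # coefficient b in PIL's (1, b, 0, 0, 1, 0)

# The script in the report passes math.degrees(magnitude) as the shear angle,
# so under SHx the effective coefficient becomes tan(magnitude), not magnitude.
angle_as_in_script = math.degrees(magnitude)
effective_coefficient = math.tan(math.radians(angle_as_in_script))

# Choosing the angle as atan(magnitude) instead reproduces the PIL coefficient.
matching_angle = math.degrees(math.atan(magnitude))
matched_coefficient = math.tan(math.radians(matching_angle))
```

With magnitude = 0.7 the effective coefficient is roughly 0.84, so the torchvision result is slanted more steeply than the PIL one unless the angle is derived through atan.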

EDIT2:
Maybe, it makes sense to have the following matrix definition:

M = T^-1 C^-1 * RS * C * SHy(sy) * SHx(sx)

where RS is now only rotation and scale

vadimkantorov · 2022-01-18T11:20:42Z

Related: #5194


datumbox added the module: transforms label Jan 17, 2022

vadimkantorov mentioned this issue Jan 18, 2022

[docs] PIL image/enhance ; OpenCV; scikit-image ops <> torchvision transforms migration advice / summary table / test+diff images / comments in individual functions #5194

Open

vfdev-5 self-assigned this Jan 18, 2022

vfdev-5 mentioned this issue Jan 18, 2022

Added center arg to F.affine and RandomAffine ops #5208

Merged

vfdev-5 added a commit to vfdev-5/vision that referenced this issue Jan 26, 2022

Added center as top-left for shear X/Y ops for autoaugment

6f865b8

Fixes pytorch#5204

vfdev-5 mentioned this issue Jan 26, 2022

Added center as top-left for shear X/Y ops for autoaugment #5285

Merged

datumbox closed this as completed in #5285 Feb 2, 2022

datumbox added a commit that referenced this issue Feb 2, 2022

Added center as top-left for shear X/Y ops for autoaugment (#5285)

8eb9fb1

Fixes #5204 Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
