Quantize bayesian cifar runtime error #27

Closed
szhaoesat opened this issue Sep 19, 2023 · 4 comments

@szhaoesat

Hi,
I am trying to run the quantization script with the following command:

sh scripts/quantize_bayesian_cifar.sh

The error log is:

['resnet110', 'resnet20', 'resnet32', 'resnet44', 'resnet56']
Files already downloaded and verified
Files already downloaded and verified
Preparing model for quantization....
Calibrating...
Traceback (most recent call last):
  File "/esat/thalassa1/users/szhao/wrk/deep-learning/BNN/bayesian-torch/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py", line 618, in <module>
    main()
  File "/esat/thalassa1/users/szhao/wrk/deep-learning/BNN/bayesian-torch/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py", line 338, in main
    model_int8 = quantize(model, calib_loader, args)
  File "/esat/thalassa1/users/szhao/wrk/deep-learning/BNN/bayesian-torch/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py", line 577, in quantize
    _ = prepared_model(data)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 157, in forward
    raise RuntimeError("module must have its parameters and buffers "
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

@ranganathkrishnan
Contributor

ranganathkrishnan commented Sep 19, 2023

Hi @szhaoesat, you seem to be running model quantization in a CUDA-based PyTorch environment. Can you try quantization in a CPU-only PyTorch environment instead? For example: "conda install pytorch torchvision torchaudio cpuonly -c pytorch"
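
For context, the first traceback shows nn.DataParallel expecting its parameters on cuda:0 while finding one on the CPU. As an alternative to reinstalling, a minimal sketch (assuming the model may be wrapped in nn.DataParallel, as the traceback suggests) is to unwrap the model and move it to the CPU before calibration. The helper name `to_cpu_for_quantization` and the toy network are hypothetical and not part of bayesian-torch:

```python
import torch
import torch.nn as nn


def to_cpu_for_quantization(model: nn.Module) -> nn.Module:
    """Hypothetical helper: unwrap nn.DataParallel (if present) and
    return the underlying module on the CPU in eval mode."""
    if isinstance(model, nn.DataParallel):
        # DataParallel keeps the wrapped network in `.module`.
        model = model.module
    return model.cpu().eval()


# Toy stand-in for the real network, wrapped the way the traceback suggests.
net = nn.DataParallel(nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()))
cpu_net = to_cpu_for_quantization(net)

# The forward pass now runs entirely on the CPU, with no DataParallel checks.
with torch.no_grad():
    out = cpu_net(torch.randn(1, 3, 32, 32))
```

Whether this alone is enough for scripts/quantize_bayesian_cifar.sh is not confirmed in this thread; the CPU-only environment suggested above is what was actually tried.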

@szhaoesat
Author

Hi @ranganathkrishnan, thanks for your help! I tried the CPU version. Calibration now completes, but the script fails with a different error:

['resnet110', 'resnet20', 'resnet32', 'resnet44', 'resnet56']
Files already downloaded and verified
Files already downloaded and verified
Preparing model for quantization....
Calibrating...
Calibration complete....
Traceback (most recent call last):
  File "/esat/thalassa1/users/szhao/wrk/deep-learning/BNN/bayesian-torch/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py", line 618, in <module>
    main()
  File "/esat/thalassa1/users/szhao/wrk/deep-learning/BNN/bayesian-torch/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py", line 346, in main
    traced_model = torch.jit.trace(model_int8, data)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/jit/_trace.py", line 794, in trace
    return trace_module(
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/jit/_trace.py", line 1056, in trace_module
    module._c._create_method_from_trace(
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
    result = self.forward(*input, **kwargs)
  File ".conda-env/envs/bnn/lib/python3.10/site-packages/bayesian_torch/models/deterministic/resnet.py", line 122, in forward
    out = F.avg_pool2d(out, out.size()[3])
TypeError: avg_pool2d(): argument 'kernel_size' (position 2) must be tuple of ints, not Tensor

@ranganathkrishnan
Contributor

Hi @szhaoesat, the issue seems to come from JIT tracing the model. I have pushed a fix in f5c7126. Can you pull in this fix and try again?

Thanks!
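
For readers hitting the same TypeError: in the traceback, `out.size()[3]` reaches `F.avg_pool2d` as a Tensor while the model is being traced, and `kernel_size` must be an int or tuple of ints. The contents of f5c7126 are not shown in this thread, so the following is only a sketch of one common workaround, not necessarily what the commit does: pool to a 1x1 output with `F.adaptive_avg_pool2d`, which needs no kernel size taken from the tensor shape. The `Head` module and its sizes are hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Head(nn.Module):
    """Hypothetical classifier head: global average pool, then linear."""

    def __init__(self, channels: int = 16, num_classes: int = 10):
        super().__init__()
        self.linear = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output size 1 averages over the whole feature map, so no
        # kernel_size has to be read from x.size() during tracing.
        out = F.adaptive_avg_pool2d(x, 1)
        out = torch.flatten(out, 1)
        return self.linear(out)


head = Head().eval()
example = torch.randn(2, 16, 8, 8)
traced = torch.jit.trace(head, example)
```

For square feature maps this matches `F.avg_pool2d(x, x.size(3))` in eager mode. Wrapping the size as `int(out.size(3))` is another option, though tracing then bakes that value in as a constant.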

@szhaoesat
Author

Thanks for your help, now it works!
