
v0.4.0+ breaks compatibility with Nvidia Deep Learning Containers #892

Closed
tasercake opened this issue Oct 18, 2022 · 4 comments · Fixed by #919
Labels
bug Something isn't working

Comments

@tasercake
Contributor

tasercake commented Oct 18, 2022

Describe the bug

Nvidia's deep learning containers are a popular way to run machine learning workloads on top of Docker.

With diffusers 0.4.0+, I'm unable to import diffusers inside this container because torch.backends.mps doesn't exist.

The culprit appears to be this line in src/diffusers/utils/testing_utils.py:

if is_torch_higher_equal_than_1_12:
    torch_device = "mps" if torch.backends.mps.is_available() else torch_device

The torch installation in the container doesn't include the MPS backend, so importing diffusers raises the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/diffusers/__init__.py", line 1, in <module>
    from .utils import (
  File "/opt/conda/lib/python3.8/site-packages/diffusers/utils/__init__.py", line 43, in <module>
    from .testing_utils import floats_tensor, load_image, parse_flag_from_env, slow, torch_device
  File "/opt/conda/lib/python3.8/site-packages/diffusers/utils/testing_utils.py", line 23, in <module>
    torch_device = "mps" if torch.backends.mps.is_available() else torch_device
AttributeError: module 'torch.backends' has no attribute 'mps'

Reproduction

(assuming you have Docker installed & configured)

Start the deep learning container

docker run -it --rm --platform linux/amd64 nvcr.io/nvidia/pytorch:22.04-py3 bash

Inside the container:

# Install diffusers (>=0.4.0)
pip install diffusers==0.4.0

# Import diffusers from python
python -c 'import diffusers'

Workaround

Adding another condition to the MPS check seems to fix at least the import issue for me:

if is_torch_higher_equal_than_1_12:
    torch_device = "mps" if (hasattr(torch.backends, "mps") and torch.backends.mps.is_available()) else torch_device

Happy to contribute this as a PR if appropriate.

System Info

I've only tested this with version 22.04 of the deep learning container from Nvidia because it's the latest one that comes with torch==1.12.0.

Output from running diffusers-cli env inside the container:

  • diffusers version: 0.4.0 (also tested with 0.5.1)
  • Platform: Linux-4.19.121-linuxkit-x86_64-with-glibc2.10
  • Python version: 3.8.13
  • PyTorch version (GPU?): 1.12.0a0+bd13bc6 (False)
  • Huggingface_hub version: 0.10.1
  • Transformers version: 4.23.1
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No
@tasercake tasercake added the bug Something isn't working label Oct 18, 2022
@keturn
Contributor

keturn commented Oct 18, 2022

Odd. Why would the PyTorch API define a torch.backends.mps.is_built function if torch.backends.mps wasn't supposed to be available even on non-MPS builds?

#849 suggests diffusers is PyTorch 1.13 compatible. Does it work if you use the newest version of the NGC container instead?

@tasercake
Contributor Author

tasercake commented Oct 18, 2022

Odd indeed. Tested this on nvcr.io/nvidia/pytorch:22.09-py3, which ships with PyTorch 1.13, and I'm able to import diffusers just fine since torch.backends.mps is present.

Interestingly, the v1.12 docs for torch.backends don't mention mps in the header section (but the 1.13 docs do). Not sure if this was just an oversight.

@pcuenca
Member

pcuenca commented Oct 18, 2022

As far as I know, the mps backend was added in PyTorch 1.12: https://pytorch.org/docs/1.12/notes/mps.html. This looks like an issue with that container.

Feel free to open a PR to make it a bit more robust :)

@tasercake
Contributor Author

tasercake commented Oct 18, 2022

Ran a few tests:

# Pull image & run container
docker run -it --platform linux/amd64 nvcr.io/nvidia/pytorch:<version>-py3 bash

# Inside container, install & import diffusers
pip install diffusers
python -c 'import diffusers'

My results:

Nvidia container version | PyTorch version  | Did it work?
22.09                    | 1.13.0a0+d0d6b1f | Yes
22.06                    | 1.13.0a0+340c412 |
22.05                    | 1.12.0a0+8a1a93a |
22.04                    | 1.12.0a0+bd13bc6 | No

Will open a PR with the above workaround shortly.
