Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Pyright static type checking by replacing if-else imports with try-except #16578

Merged

Conversation

d-miketa
Copy link
Contributor

@d-miketa d-miketa commented Apr 4, 2022

What does this PR do?

Fixes #11642.

The Transformers library has a system for importing optional dependencies which replaces objects from missing libraries with "dummy" versions; this logic is implemented by a simple if-else statement in transformers/__init__.py. Unfortunately Pyright/Pylance, the type-hinting software behind VSCode and widely used in other editors, always assumes that the dummy objects are the ones being imported. This breaks diverse useful functionalities such as "Go to reference" (you are always taken to the dummy implementation), documentation (empty dummy docstrings are shown) and type hinting (the dummy's properties are assumed to hold).

This PR replaces the if-else import logic with slightly more verbose try-except(-else) - the form supported by Pyright/Pylance for this precise use case microsoft/pylance-release#2402.

WIP Status

utils/check_inits.py currently doesn't pass, but I've been able to modify it (locally, not in PR yet) to account for changes to the main __init__.py. Unfortunately this surfaced the fact that there are hundreds of other small changes that need to be made across the entire library. I can do this work, but it'll be a longer process. In the meantime I'm asking for guidance on the form of import blocks. I can see two options. The difference is in the location of the "true" (i.e. non-dummy) imports:

A

try:
    if not is_dependency_available():
          raise OptionalDependencyNotAvailableError()
    _import_structure["models.mymodel"].append("MyModel")
except OptionalDependencyNotAvailableError:
    _import_structure["utils.dummy_dependency_objects"] = ...

B

try:
    if not is_dependency_available(): 
        raise OptionalDependencyNotAvailableError()
except OptionalDependencyNotAvailableError:
    _import_structure["utils.dummy_dependency_objects"] = ...
else:
    _import_structure["models.mymodel"].append("MyModel")

B is slightly more verbose but seems to be preferred by official Python guidelines: https://docs.python.org/3.9/tutorial/errors.html#handling-exceptions. I'm currently leaning towards B but perhaps I've missed a more elegant solution.

Before submitting

Who can review?

Possibly @sgugger?

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Apr 4, 2022

The documentation is not available anymore as the PR was closed or merged.

@sgugger
Copy link
Collaborator

sgugger commented Apr 4, 2022

There is another way to fix the "Go to Reference" in VsCode that has been shown here. I would like to avoid changing something as critical at the main init.

@d-miketa
Copy link
Contributor Author

d-miketa commented Apr 4, 2022

There is another way to fix the "Go to Reference" in VsCode that has been shown here. I would like to avoid changing something as critical at the main init.

Unfortunately that switches out the entire language server (Pyright/Pylance for Jedi) and is in effect equivalent to saying that Huggingface doesn’t support Pyright. Many people prefer Pyright as it’s significantly snappier than Jedi-based LSPs and it’s often recommended over Jedi in IDEs that support both (notably Vim and Emacs).

@sgugger
Copy link
Collaborator

sgugger commented Apr 4, 2022

I wouldn't say Hugging Face does not support Pyright when it's Pyright that doesn't support a simple if/then/else. Honestly, to me, this is an issue that should be fixed at that level instead of asking package maintainers to add some hacky workarounds like the one you are suggesting (not your fault, I understand it's the only thing supported as of now ;-) ).

Pinging @LysandreJik and @patrickvonplaten for your advice, but I'd personally leave things as is and say that we don't support Pyright.

@d-miketa
Copy link
Contributor Author

d-miketa commented Apr 4, 2022

I fully understand the hesitation to take a scalpel to something as important as the init system. (Which is, by the way, extremely pleasant to use from a user ergonomics perspective. I have learnt a lot from HF's design decisions - thank you!) If Pyright were a niche tool I'd drop it, but it's genuinely one of the best and most popular ways to interact with Python code inside VSCode, Vim and Emacs.

Pyright has good reason for its current behaviour: as a static checker it has no way of knowing whether a given if-else condition will evaluate true or false at runtime. The authors made a decision at some point early in development to assume that the last import declaration is the one that counts - even if it's inside two branches of the same if-else block:

if cond:
    from a import X  # Pyright ignores this one...
else:
    from b import X  # ...and selects this one

They could've chosen otherwise, but there's no obvious reason why they should have. The problem with if-else is that there's no sense of which branch corresponds to correct or desirable behaviour, whereas try-except-else includes a kind of value judgment: try-else is the correct path. That makes try-except-else much easier to parse with a static type checker.

There may be an alternative resolution that fixes Pyright and leaves your init system intact, by the way. Many libraries (including Pandas and Numpy) now produce "type stubs", which are essentially what .h files are to .c[pp] files in C or C++. Pyright and other type checkers can then follow these type stubs as sources of type-hinting truth. I've never generated them and can't guarantee they'd fix "Go to definition" etc, but perhaps it's worth a shot.

@sgugger
Copy link
Collaborator

sgugger commented Apr 5, 2022

I've given it a bit more thought and I agree that this would be a welcome change. My main issues are around the necessary workarounds in the scripts we use to check the repo consistency.

I also think that syntax B is better in your suggestion.

Would we need to do this for each of the model subfolders init as well?

@d-miketa
Copy link
Contributor Author

d-miketa commented Apr 5, 2022

Yeah it needs to be done for model imports too. It’s probably best to automate it using something like awk or batch processing in vim. I had a stab at manual correction and there’s just too much.

I agree that checking for consistency is tricky. Does someone on your side have capacity to work on this change, perhaps to guide modifications to check_init, or should I finish this PR?

@sgugger
Copy link
Collaborator

sgugger commented Apr 5, 2022

I don't think anyone has the bandwidth to work on this presently, so if you have time to finish the PR, that would be best!

@d-miketa
Copy link
Contributor Author

That's fine, I'll finish the PR myself. :)

So far I've made two non-trivial decisions that I'd like to check with you.

  1. In your inits, TYPE_CHECKING blocks mirror blocks with _import_structure statements. This entire issue's caused by static file checkers, so it's in principle possible to fix it by touching only the code inside TYPE_CHECKING blocks. (The fix would consist of replacing if/else statements with try/except/else, as discussed above.) However, I suggest changing the _import_structure blocks, in the exact same way, to preserve their symmetry. It's strictly speaking not necessary - and touching that part of the code brings its risks - but I believe that keeping the two structures in sync improves readability and makes it easier to automatically compare them using a simpler check_inits.py CI/CD script.

  2. I am only touching code inside __init__.py files under the hypothesis that this will be enough to satisfy type checkers and minimise the amount of work necessary.

Are both fine with you?

@d-miketa d-miketa force-pushed the optional-imports-in-try-except-pyright-fix branch from c8bfb73 to 0a8e936 Compare April 10, 2022 18:51
@sgugger
Copy link
Collaborator

sgugger commented Apr 11, 2022

Completely agree with both point! Thanks a lot for tackling this!

@d-miketa d-miketa closed this Apr 16, 2022
@d-miketa d-miketa force-pushed the optional-imports-in-try-except-pyright-fix branch from 6800bc5 to 60d27b1 Compare April 16, 2022 18:18
@d-miketa d-miketa reopened this Apr 16, 2022
@d-miketa
Copy link
Contributor Author

Alright, I think the bulk of the work is done, but there are a few failing tests.

  1. run_examples_flax and run_tests_flax are, I believe, not my fault! jax-ml/jax@1af5c88
  2. "Add model like runner" is interesting: https://github.com/huggingface/transformers/runs/6049632592?check_suite_focus=true#step:5:15 is triggered because is_tokenizers_available is somehow, strangely, erased from the new model __init__.py file. I did add an from ...utils import OptionalDependencyNotAvailable, but I don't see from commands/add_new_model.py why that would be a problem. Would appreciate a second pair of eyes on this.
  3. run_tests_templates I can look into, though I'm not at all familiar with that part of the codebase. If you know someone who can at least vaguely gesture at what's going on, that'd be super helpful. :)

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rolling this out to all inits. I can help with the scripts once the PR is ready but you moved all imports in the submodules init before the copyright files, for some reason, which needs to be undone.

Once this is cleaned up what there is left to do is:

  • adapt the file to_replace.py in the add new model template to use the same format as the new init. The "start lines" for all the imports in the TYPE_CHECKING section should be adapted, this one and the next.
  • adapt the add new model like command file, in particular the functions clean_frameworks_in_init and add_model_to_main_init
  • fix the test in check_inits.py which is currently not testing anything anymore.

@@ -59,6 +59,7 @@
EntryNotFoundError,
ExplicitEnum,
ModelOutput,
OptionalDependencyNotAvailable,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add it here as this file only exists for backward compatibility reasons and this is a new object.

Comment on lines 1 to 13
from typing import TYPE_CHECKING

from ...utils import (
OptionalDependencyNotAvailable,
_LazyModule,
is_flax_available,
is_sentencepiece_available,
is_tf_available,
is_tokenizers_available,
is_torch_available,
)


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why more the imports before the copyright? They should go below as before.

Comment on lines 1 to 13
from typing import TYPE_CHECKING

from ...utils import (
OptionalDependencyNotAvailable,
_LazyModule,
is_flax_available,
is_sentencepiece_available,
is_tf_available,
is_tokenizers_available,
is_torch_available,
)


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why more the imports before the copyright? They should go below as before.


class OptionalDependencyNotAvailable(BaseException):
"""Internally used error class for signalling an optional dependency was not found."""

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change


class OptionalDependencyNotAvailable(BaseException):
"""Internally used error class for signalling an optional dependency was not found."""

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@d-miketa
Copy link
Contributor Author

Many thanks for the review and apologies for no update yet - this is very much on my mind, I've just been distracted by other priorities. Hope to look at it over one of the coming evenings.

@d-miketa d-miketa force-pushed the optional-imports-in-try-except-pyright-fix branch from b534ce0 to 26e21d6 Compare April 30, 2022 20:11
@d-miketa
Copy link
Contributor Author

Fixed imports - sorry, that was an artefact of my automated init fixer.

I haven't been able to do much about to_replace because it's not a simple replacement. In the original version it's enough to paste these lines right under if is_xxx_available(), whereas now there's a (variable) number of lines between if not is_xxx_available() and the corresponding else block, where these blocks should be placed.

Will tackle "add new model like" command next.

@sgugger
Copy link
Collaborator

sgugger commented May 2, 2022

For the changes in to_replace.py, if I'm not mistaken, all those if not is_xxx_available() prompts are for the main init. We can add a new comment (like the " # PyTorch models structure") we already have to flag where we want those new models.

@d-miketa d-miketa force-pushed the optional-imports-in-try-except-pyright-fix branch from 957e850 to 1b5ef4c Compare May 8, 2022 13:14
@d-miketa
Copy link
Contributor Author

d-miketa commented May 8, 2022

Alright, I think we're done here! :) All tests pass and Pyright picks up the correct classes.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terrific work on this, thanks a lot! Since this touches a lot of files, pinging @LysandreJik for a second review before we merge.

@@ -59,6 +59,7 @@
EntryNotFoundError,
ExplicitEnum,
ModelOutput,
OptionalDependencyNotAvailable,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add it here as this file only exists for backward compatibility reasons and this is a new object.

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, super impressive work! LGTM, thanks for your work @d-miketa!

@sgugger sgugger merged commit df735d1 into huggingface:main May 9, 2022
@d-miketa
Copy link
Contributor Author

d-miketa commented May 9, 2022

Thanks both! :) Happy to contribute back!

@d-miketa d-miketa changed the title [WIP] Fix Pyright static type checking by replacing if-else imports with try-except Fix Pyright static type checking by replacing if-else imports with try-except May 9, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
…ith try-except (huggingface#16578)

* rebase and isort

* modify cookiecutter init

* fix cookiecutter auto imports

* fix clean_frameworks_in_init

* fix add_model_to_main_init

* blackify

* replace unnecessary f-strings

* update yolos imports

* fix roberta import bug

* fix yolos missing dependency

* fix add_model_like and cookiecutter bug

* fix repository consistency error

* modify cookiecutter, fix add_new_model_like

* remove stale line

Co-authored-by: Dom Miketa <dmiketa@exscientia.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants