
Why do we have test cases? #8505

Closed
AlexWaygood opened this issue Aug 8, 2022 · 11 comments
Labels
project: policy Organization of the typeshed project

Comments

@AlexWaygood (Member) commented Aug 8, 2022

We currently have test cases only for our stdlib stubs, and don't have the infrastructure set up to allow us to add test cases for any of our third-party stubs. This has come up a few times now, so it would be good to figure out a way to get this to work.

(This issue relates to the test cases in our test_cases directory. Currently we test these samples with mypy and pyright.)

This issue was originally meant to discuss the mechanics of adding test cases for third-party stubs. The discussion quickly strayed away from that, so #8569 is now the issue for that discussion.

@AlexWaygood AlexWaygood added the project: infrastructure typeshed build, test, documentation, or distribution related label Aug 8, 2022
@AlexWaygood AlexWaygood self-assigned this Aug 8, 2022
@srittau

This comment was marked as off-topic.

@kkirsche (Contributor)

What kinds of things are you looking to test with these tests? I think there are a few approaches that could be used, but from previous discussions it hasn't been made clear, to me at least, what specifically is being tested for in these circumstances.

@AlexWaygood (Member, Author) commented Aug 18, 2022

> it hasn't been made clear what specifically is being tested for in these circumstances

I'm sorry to hear that. Did the revisions I made to the README in #8445 not help at all?

Essentially, the vast majority of changes to typeshed do not require test cases to be added. There's little point in simply repeating what's said in the stub in a test file; e.g. the following would be a fairly pointless test:

# STUB
def func() -> str: ...

# TEST
from typing import assert_type  # typing_extensions on Python < 3.11

assert_type(func(), str)

The cases where we want test cases broadly fall into the following categories:

  • Situations where the annotations are highly complex. For example, builtins.pow, which has 16 overloads. For builtins.pow, we want to make sure that the type checker picks the right overload for various calls.
  • Situations where we want to ensure that idiomatic usage of a function or class doesn't result in type checkers emitting false-positive errors. We might be especially concerned about false positives for a particular class because:
      - the class is unusual in some way (e.g. the tests in test_logging.py);
      - changes to typeshed that looked harmless have accidentally caused regressions in the past (e.g. the tests in test_object.py);
      - a change to the stub proposes to use a brand-new feature of the typing system, for which we expect there might be latent bugs in type checkers (the tests in test_all.py fall into this bucket).

@kkirsche (Contributor) commented Aug 18, 2022

Thanks for responding, Alex. It helps, but from seeing when tests are requested in practice, the criteria don't seem to be accurately defined, as we saw in the weakref disagreements related to your second bullet about false positives.

But, with that said, I realize I'm the outlier among this group as I agree with #7724 (comment) which states that the cautious approach taken by the team neuters part of the point of type checkers, which are intended in most cases to raise errors if they can't prove something to be the case.

I think I misunderstood the comment

So while it's well documented, as an outsider my limited experience is that the team's actual actions don't always match the written goal. There certainly are complexity checks for situations with a large volume of overloads, which makes perfect sense. But looking through various comments by the team, requests for these tests can (often?) state that the maintainers aren't as comfortable with those areas of the standard library or stub libraries, which is why they want the tests. As a result, as a contributor this reads as the team asking contributors to create documentation for areas of the standard library or stubs where the typeshed team isn't as comfortable, not actually performing type-related testing. That's fine, it's just not what's written.

As a result, I have felt that the written explanation is incongruous with the actions / requests. It's not a problem, but it's led me to not feel confident that the tests are actually for what was described.

I hope that doesn't come off in the wrong way as it's intended only as constructive feedback, not a criticism.

EDIT: fix haven't -> have
EDIT: unsure if I misunderstood the comment, so strikethrough of that

@JelleZijlstra (Member)

@kkirsche could you give examples of cases where we asked for tests that don't match the criteria Alex listed? Tests are a relatively new part of typeshed and we're still figuring out where they're effective. If you feel our docs don't match reality, that's important to know.

As for your struck paragraph, I think that's a separate issue from testing, but it's worth revisiting whether the policy to prefer false negatives still makes sense. Feel free to open a new issue if you want to discuss that. A related issue is python/typing#1096.

@hauntsaninja (Collaborator) commented Aug 19, 2022

kkirsche mentioned "weakref disagreements", which appears to be a reference to this PR: #8438.
I haven't yet read that thread, but +1 on linking to PRs where a test request has felt inappropriate.

@kkirsche (Contributor)

> @kkirsche could you give examples of cases where we asked for tests that don't match the criteria Alex listed? Tests are a relatively new part of typeshed and we're still figuring out where they're effective. If you feel our docs don't match reality, that's important to know.
>
> As for your struck paragraph, I think that's a separate issue from testing, but it's worth revisiting whether the policy to prefer false negatives still makes sense. Feel free to open a new issue if you want to discuss that. A related issue is python/typing#1096.

No worries. I'll need to go back and locate the instances I'm thinking of, as I didn't hold onto the references to them; the PR hauntsaninja referenced was the one I could point to most easily, as it was one I was involved with.

If for whatever reason I can't find them, I want to apologize in advance, as that would mean I must be mistaken and mixing up repos. The earliest I'll be free to locate them is next week, as today and this weekend are heavily booked for me.

Thanks for all your time, and sorry for mixing topics. My hope is to understand the point of the tests well enough to propose an equivalent for third-party libraries, where there could be less comfort compared to the standard library, and where idiomatic documentation may be more critical for the maintainability of those tests.

@AlexWaygood AlexWaygood changed the title Figure out a way to have test cases for third-party stubs Why do we have test cases? Aug 19, 2022
@AlexWaygood AlexWaygood removed their assignment Aug 19, 2022
@AlexWaygood AlexWaygood added project: policy Organization of the typeshed project and removed project: infrastructure typeshed build, test, documentation, or distribution related labels Aug 19, 2022
@AlexWaygood (Member, Author)

(I've changed the title of this issue, since we're pretty off-topic from its original purpose. We can carry on talking about what the purpose of the test cases is here; #8569 is now the issue for figuring out the technical difficulties of adding test cases for third-party stubs :)

@kkirsche (Contributor)

Thanks, Alex. Apologies for derailing the conversation like that

@AlexWaygood (Member, Author)

> Thanks, Alex. Apologies for derailing the conversation like that

No worries, it happens :) it's a useful discussion to have!

@AlexWaygood (Member, Author)

I think this discussion has run its course, and we've made further updates to the documentation, so I'll close this out for now. But further feedback in this vein is always welcome; anybody should feel free to open a new issue if they want to further discuss one of typeshed's current policies :)


5 participants