
Why do we have test cases? #8505

Closed
AlexWaygood opened this issue Aug 8, 2022 · 11 comments
Labels
project: policy Organization of the typeshed project

Comments

@AlexWaygood (Member) commented Aug 8, 2022

We currently have test cases only for our stdlib stubs, and don't have the infrastructure set up to allow us to add test cases for any of our third-party stubs. This has come up a few times now, so it would be good to figure out a way to get this to work.

(This issue relates to the test cases in our test_cases directory. Currently we test these samples with mypy and pyright.)

This issue was originally meant to discuss the mechanics of adding test cases for third-party stubs. The discussion quickly strayed away from that, so #8569 is now the issue for that discussion.

@AlexWaygood AlexWaygood added the project: infrastructure typeshed build, test, documentation, or distribution related label Aug 8, 2022
@AlexWaygood AlexWaygood self-assigned this Aug 8, 2022
@srittau

This comment was marked as off-topic.

@kkirsche (Contributor)

What kinds of things are you looking to test with these tests? I think there are a few approaches that could be used, but from previous discussions it hasn't been made clear, to me at least, what specifically is being tested for in these circumstances.

@AlexWaygood (Member, Author) commented Aug 18, 2022

> it hasn't been made clear what specifically is being tested for in these circumstances

I'm sorry to hear that. Did the revisions I made to the README in #8445 not help at all?

Essentially, the vast majority of changes to typeshed do not require test cases to be added. There's little point in simply repeating what's said in the stub in a test file; e.g. the following would be a fairly pointless test:

# STUB
def func() -> str: ...

# TEST
from typing import assert_type  # typing_extensions on Python < 3.11

assert_type(func(), str)

The cases where we want test cases broadly fall into the following categories:

  • Situations where the annotations are highly complex. For example, builtins.pow, which has 16 overloads. For builtins.pow, we want to make sure that the type checker picks the right overload for various calls.
  • Situations where we want to ensure that idiomatic usage of a function or class doesn't result in type checkers emitting false-positive errors. We might be especially concerned about false positives for a particular class because:
      - the class is unusual in some way (e.g. the tests in test_logging.py);
      - changes to typeshed that looked harmless have accidentally caused regressions in the past (e.g. the tests in test_object.py);
      - a change to the stub proposes to use a brand-new feature of the typing system, for which we expect there might be latent bugs in type checkers (the tests in test_all.py fall into this bucket).

@kkirsche (Contributor) commented Aug 18, 2022

Thanks for responding, Alex. It helps, but from seeing when tests are requested in practice, the criteria don't seem to be accurately defined, as we saw in the weakref disagreements related to your second bullet about false positives.

But, with that said, I realize I'm the outlier among this group as I agree with #7724 (comment) which states that the cautious approach taken by the team neuters part of the point of type checkers, which are intended in most cases to raise errors if they can't prove something to be the case.

I think I misunderstood the comment

So while it's well documented, as an outsider my limited experience is that the team's actual actions don't always match the written goal. There certainly are complexity checks for situations with a large volume of overloads, which makes perfect sense. But looking through various comments by the team, requests for these tests can (often?) state that the maintainers aren't as comfortable with those areas of the standard library or stub libraries, which is why they want the tests. As a result, as a contributor this reads as the team asking contributors to create documentation for areas of the standard library or stubs where the typeshed team isn't as comfortable, not actually performing type-related testing. That's fine, it's just not what's written.

As a result, I have felt that the written explanation is incongruous with the actions / requests. It's not a problem, but it's led me to not feel confident that the tests are actually for what was described.

I hope that doesn't come off in the wrong way as it's intended only as constructive feedback, not a criticism.

EDIT: fix haven't -> have
EDIT: unsure if I misunderstood the comment, so strikethrough of that

@JelleZijlstra (Member)

@kkirsche could you give examples of cases where we asked for tests that don't match the criteria Alex listed? Tests are a relatively new part of typeshed and we're still figuring out where they're effective. If you feel our docs don't match reality, that's important to know.

As for your struck paragraph, I think that's a separate issue from testing, but it's worth revisiting whether the policy to prefer false negatives still makes sense. Feel free to open a new issue if you want to discuss that. A related issue is python/typing#1096.

@hauntsaninja (Collaborator) commented Aug 19, 2022

kkirsche mentioned "weakref disagreements", which appears to be a reference to this PR: #8438.
I haven't yet read that thread, but +1 on linking to PRs where a test request has felt inappropriate.

@kkirsche (Contributor)

> @kkirsche could you give examples of cases where we asked for tests that don't match the criteria Alex listed? Tests are a relatively new part of typeshed and we're still figuring out where they're effective. If you feel our docs don't match reality, that's important to know.
>
> As for your struck paragraph, I think that's a separate issue from testing, but it's worth revisiting whether the policy to prefer false negatives still makes sense. Feel free to open a new issue if you want to discuss that. A related issue is python/typing#1096.

No worries. I'll need to go back and locate the instances I'm thinking of, as I didn't hold onto the references to them; the PR hauntsaninja referenced was the one I could point to most easily, as it was one I was involved with.

If for whatever reason I can't find them, I want to apologize in advance, as that would mean I must be mistaken and mixing up repos. The earliest I'll be free to locate them is next week, as today and this weekend are heavily booked for me.

Thanks for all your time, and sorry for mixing topics. My hope is to understand the point of the tests well enough to propose an equivalent for third-party libraries, where there could be less comfort compared to the standard library, and where idiomatic documentation may be more critical for the maintainability of those tests.

@AlexWaygood AlexWaygood changed the title Figure out a way to have test cases for third-party stubs Why do we have test cases? Aug 19, 2022
@AlexWaygood AlexWaygood removed their assignment Aug 19, 2022
@AlexWaygood AlexWaygood added project: policy Organization of the typeshed project and removed project: infrastructure typeshed build, test, documentation, or distribution related labels Aug 19, 2022
@AlexWaygood (Member, Author)

(I've changed the title of this issue, since we're pretty off-topic from its original purpose. We can carry on talking about what the purpose of the test cases is here; #8569 is now the issue for figuring out the technical difficulties of adding test cases for third-party stubs :)

@kkirsche (Contributor)

Thanks, Alex. Apologies for derailing the conversation like that

@AlexWaygood (Member, Author)

> Thanks, Alex. Apologies for derailing the conversation like that

No worries, it happens :) it's a useful discussion to have!

@AlexWaygood (Member, Author)

I think this discussion has run its course, and we've made further updates to the documentation, so I'll close this out for now. But further feedback in this vein is always welcome; anybody should feel free to open a new issue if they want to further discuss one of typeshed's current policies :)


5 participants