introduce per-file timing-stats #12639

huguesb · 2022-04-21T08:45:14Z

When profiling mypy over a large codebase, it can be useful
to know which files are slowest to typecheck.

Gather per-file timing stats and expose them through a new
(hidden) command line switch.

JukkaL

Cool, I've wanted to have this feature myself when debugging performance issues.

Can you add at least one integration test that ensures that mypy can be run with the flag successfully? Possibly also check the format of the generated file. Obviously the timings aren't repeatable, so they would have to be ignored.

Also, the builds were failing.

mypy/build.py

huguesb · 2022-04-22T09:40:28Z

> Can you add at least one integration test that ensures that mypy can be run with the flag successfully? Possibly also check the format of the generated file. Obviously the timings aren't repeatable, so they would have to be ignored.

Done. I added a way to specify test output files via regex to achieve that.

> Also, the builds were failing.

Yeah, I didn't realize mypy still supported Python 3.6. Fixed.
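
To illustrate the idea of checking the stats file format while ignoring the timings themselves, here is a minimal sketch. It is not mypy's test harness: the line format (module name, whitespace, integer time) and the name stats_format_ok are assumptions made for this example.

```python
import re

# Assumed line shape: "<dotted.module.name> <integer time>".
# The actual dump format is not spelled out in this thread.
STATS_LINE = re.compile(r"[\w.]+\s+\d+")


def stats_format_ok(text: str) -> bool:
    """Return True if the text is non-empty and every line matches the assumed shape."""
    lines = text.splitlines()
    return bool(lines) and all(STATS_LINE.fullmatch(line) for line in lines)
```

Matching on the shape of each line rather than on exact values keeps the test stable even though the measured times change on every run.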

JukkaL

Looks good, just a few optional comments.

JukkaL · 2022-04-26T17:10:09Z

mypy/build.py

        with self.wrap_context():
            self.type_checker().check_first_pass()
+        self.time_spent_us += int((time.perf_counter() - t0) * 1e6)


Since we do int((time.perf_counter() - t0) * 1e6) a lot, it could be a bit cleaner to refactor (most of) it to a helper function. E.g. something like this:

def time_us() -> int: return int(time.perf_counter() * 1e6)

What do you think?

huguesb

Sure, that seems fine. It can then be refactored to use perf_counter_ns once support for Python 3.6 is dropped.
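
A minimal sketch of the helper idea discussed in this thread, not the code that was eventually merged. The names elapsed_us and time_us_ns_based are illustrative; the last variant assumes Python 3.7 or newer, where time.perf_counter_ns exists.

```python
import time


def time_us() -> int:
    """Current perf_counter reading, truncated to whole microseconds."""
    return int(time.perf_counter() * 1e6)


def elapsed_us(t0: int) -> int:
    """Microseconds elapsed since an earlier time_us() reading."""
    return time_us() - t0


def time_us_ns_based() -> int:
    """Same kind of reading via the integer nanosecond clock (Python 3.7+)."""
    return time.perf_counter_ns() // 1000


t0 = time_us()
sum(range(100_000))  # stand-in for a type-checking phase
spent = elapsed_us(t0)  # value to add to a per-file counter
```

Keeping the readings as integers avoids repeating the float-to-int conversion at every call site, which is the cleanup the suggestion is after.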

JukkaL · 2022-04-26T17:39:23Z

mypy/build.py

+    Dump timing stats for each file in the given graph
+    """
+    with open(path, 'w') as f:
+        for k in sorted(graph.keys()):


What about sorting by the time spent in file instead of the module name? It will be easy to search by module name within the stats dump, but sorting by the time is slightly more involved.

huguesb

A straightforward sort -n -k2 on the output can give you time-sorting from the current format.

I originally thought the module order would be nice for getting diffs between runs, but the jitter is going to be high enough that such a diff would probably need a little extra massaging (e.g. only showing changes above a certain % threshold) to actually be helpful.

TL;DR: I don't think the sort order makes much of a difference in practice, so I would be fine going with either order, or even leaving the output unsorted. In a follow-up commit we can introduce a script similar to Go's benchstat to get more value out of the raw data.
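
As a rough sketch of the post-processing described above (time-sorting and a thresholded diff between two runs), something like the following could work. It is not an existing mypy script: the function names are hypothetical, and the two-column "module time" line format is an assumption inferred from the sort -n -k2 remark.

```python
from typing import Dict, List, Tuple


def parse_stats(text: str) -> Dict[str, int]:
    """Parse lines of the assumed form '<module> <time>' into a dict."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stats[parts[0]] = int(parts[1])
    return stats


def slowest(stats: Dict[str, int], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n modules with the largest recorded time, slowest first."""
    return sorted(stats.items(), key=lambda kv: kv[1], reverse=True)[:n]


def significant_changes(
    before: Dict[str, int], after: Dict[str, int], threshold_pct: float = 20.0
) -> List[Tuple[str, float]]:
    """Return (module, percent change) for modules whose time moved by at least threshold_pct."""
    changes: List[Tuple[str, float]] = []
    for module in sorted(before.keys() & after.keys()):
        old, new = before[module], after[module]
        if old == 0:
            continue  # a percentage change is undefined for a zero baseline
        pct = (new - old) * 100.0 / old
        if abs(pct) >= threshold_pct:
            changes.append((module, pct))
    return changes


# Made-up sample data, purely for illustration.
run_a = parse_stats("pkg.a 1200\npkg.b 90000\npkg.c 500\n")
run_b = parse_stats("pkg.a 1250\npkg.b 45000\npkg.c 900\n")
```

With the sample data, pkg.a changes by only a few percent and is filtered out as jitter, while pkg.b and pkg.c exceed the 20% threshold and are reported.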

github-actions · 2022-04-27T21:24:19Z

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

JukkaL

Thanks! Looks good.

97littleleaf11 · 2022-05-01T04:30:14Z

cool feature!

huguesb · 2022-10-11T09:21:54Z

On Thu, Apr 21, 2022, 7:54 AM Jukka Lehtosalo ***@***.***> wrote:

> ***@***.**** commented on this pull request.
>
> Cool, I've wanted to have this feature myself when debugging performance issues.
>
> Can you add at least one integration test that ensures that mypy can be run with the flag successfully? Possibly also check the format of the generated file. Obviously the timings aren't repeatable, so they would have to be ignored.

will do

> Also, the builds were failing.

Yeah, I didn't realize mypy still supported running under 3.6. I will need to use a different timing API.

> In mypy/build.py <#12639 (comment)>:
>
> > @@ -1808,6 +1810,9 @@ class State:
> >      fine_grained_deps_loaded = False
> > +    # Cumulative time spent on this file (for profiling stats)
>
> Include the unit of time in the name of the variable (e.g. time_spent_ns).
>
> In mypy/build.py <#12639 (comment)>:
>
> > @@ -3091,6 +3120,8 @@ def process_graph(graph: Graph, manager: BuildManager) -> None:
> >          manager.log("No fresh SCCs left in queue")
> > +
> > +
>
> These extra empty lines seem redundant.

JelleZijlstra · 2022-10-11T18:48:05Z

@huguesb did you comment on the wrong PR? For what it's worth we dropped 3.6 support.

huguesb · 2022-10-11T20:39:46Z

Uh, that's very weird. I did not intentionally send that... It looks relevant to the history of this PR, but not after its closure, and I don't even see this message as a "sent" item in my email client...

JelleZijlstra · 2022-10-11T20:48:51Z

Maybe some email queue in GitHub had a very long delay :)

introduce per-file timing-stats

e191a32

huguesb force-pushed the pr-timing-stats branch from baa880e to e191a32 Compare April 21, 2022 08:47

huguesb mentioned this pull request Apr 21, 2022

Bump minimum python version to 3.7 #12640

Closed

JukkaL reviewed Apr 21, 2022

hugues-aff added 3 commits April 22, 2022 02:35

address comments and use py3.6 compatible timer

6d09759

add test

30f19d9

fix flake8

e839c67

huguesb mentioned this pull request Apr 22, 2022

Release 0.950 planning #12579

Closed

huguesb requested a review from JukkaL April 26, 2022 04:03

JukkaL reviewed Apr 26, 2022

huguesb force-pushed the pr-timing-stats branch 2 times, most recently from edeed8d to caddce9 Compare April 27, 2022 20:17

Factor out time_spent computation

e2ce6c7

huguesb force-pushed the pr-timing-stats branch from caddce9 to e2ce6c7 Compare April 27, 2022 20:58

JukkaL approved these changes Apr 29, 2022

JukkaL merged commit d48d548 into python:master Apr 29, 2022

huguesb deleted the pr-timing-stats branch April 29, 2022 18:48
