Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add performance test + GH action to run it weekly #289

Conversation

BenjaminBossan
Copy link
Collaborator

@BenjaminBossan BenjaminBossan commented Feb 2, 2023

Resolves #275

Ready for review @skops-dev/maintainers

Description

This PR adds a GitHub action that is run on a weekly basis on Sunday. For all
tested sklearn estimators, it persists them once with pickle and once with skops,
and then records the timing difference. If the absolute difference is too large,
an error is raised and we can take a look at what went wrong.

For convenience, that 10 slowest estimators are printed to stdout. That way, we
can find out problematic candidates by looking at the log.

The persistence of an estimator is considered to be too slow when it takes more
than 1 sec longer to persist with skops than with pickle.

Comments

If someone has a good idea how to test that the GH action is specified correctly
without actually merging it, let me know. For reference, this action is based on
https://github.com/skorch-dev/skorch/pull/904/files, which works.

On top of running the action, we could also add a badge to the README. This
would show at a glance if the last run failed, but it's not strictly necessary.

Results

Running the script locally, I found these results:

10 largest differences:
                                             name  pickle (s)  skops (s)  abs_diff   rel_diff  too_slow
0            ExtraTreesClassifier(random_state=0)    0.003677   0.243817  0.240140  66.312771     False
1       GradientBoostingRegressor(random_state=0)    0.003622   0.243115  0.239493  67.118231     False
2      GradientBoostingClassifier(random_state=0)    0.003112   0.242550  0.239438  77.943117     False
3          RandomForestClassifier(random_state=0)    0.003590   0.236872  0.233282  65.975329     False
4            RandomTreesEmbedding(random_state=0)    0.003292   0.208397  0.205106  63.312033     False
5                 IsolationForest(random_state=0)    0.003295   0.207483  0.204188  62.967271     False
6             ExtraTreesRegressor(random_state=0)    0.003347   0.194232  0.190885  58.030454     False
7           RandomForestRegressor(random_state=0)    0.003112   0.184979  0.181867  59.431591     False
8  HistGradientBoostingClassifier(random_state=0)    0.001690   0.127146  0.125456  75.243695     False
9   HistGradientBoostingRegressor(random_state=0)    0.001283   0.125468  0.124185  97.791886     False

So the slowest absolute difference is 0.24 sec, which I think is more than
acceptable. Relative differences can be quite large (100x), which is maybe not
too surprising, given that pickle is highly optimized and skops performs some
extra checks.

According to the criterion described above, none of the estimators is
unacceptably slow, so the action would pass successfully.

@BenjaminBossan
Copy link
Collaborator Author

Out of curiosity, I did some profiling of the script (with slight modifications) using scalene to see if there are any bottlenecks in our persistence code. The result was not too enlightening, not sure if really everything that's relevant is being shown. At the very least, there doesn't appear to be a single function call that takes a big chunk of time.

I have attached a screenshot (can't upload the html here unfortunately).

image

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite nice!

Could you maybe add a bit of comments/basic docstrings to the methods in the python file? It took me a while the first time reading it to figure out what's doing what. Otherwise LGTM.

Comment on lines +62 to +63
time_pickle = timeit.timeit(run_pickle, number=number) / number
time_skops = timeit.timeit(run_skops, number=number) / number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably would take the median rather than the mean here, but that probably complicates things.

BenjaminBossan and others added 2 commits February 28, 2023 17:17
This should hopefully make it easier to understand what the functions
are doing.

Also had to update pre-commit dependencies because I was running into
this issue:

pre-commit/pre-commit#2782

There seems to have been something wrong with its dependencies. Pretty
sure it's unrelated to this PR.
@BenjaminBossan
Copy link
Collaborator Author

@adrinjalali I added docstrings and type annotations.

This should hopefully make it easier to understand what the functions
are doing.

Also had to update pre-commit dependencies because I was running into
this issue:

https://github.com/pre-commit/pre-commit/issues/2782

There seems to have been something wrong with its dependencies. Pretty
sure it's unrelated to this PR.

@adrinjalali adrinjalali merged commit 21b4c0a into skops-dev:main Feb 28, 2023
@BenjaminBossan BenjaminBossan deleted the issue-275-persistence-performance-tests branch March 3, 2023 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Have some kind of runtime performance check for skops persistence
2 participants