
Faster get lookup name #16

Closed
wants to merge 9 commits into from

Conversation

@mmatera (Contributor) commented Sep 24, 2021

This PR should provide a small improvement in the pattern matching mechanism. @rocky, benchmarks are coming...

@mmatera (Contributor, Author) commented Sep 29, 2021

Funny thing: as it is now, docpipeline took 4:48 vs 4:34 in master. But, if symbols.py is cythonized, this branch takes 4:28 while master takes 4:47.

@rocky (Member) commented Sep 29, 2021

> Funny thing: as it is now, docpipeline took 4:48 vs 4:34 in master. But, if symbols.py is cythonized, this branch takes 4:28 while master takes 4:47.

From the above, it sounds like master is somewhere between 4:34 and 4:47. The fact that the Cython run happened to get the slower number is immaterial, and the spread of 13 seconds may be in the noise range.

@TiagoCavalcante (Contributor) commented

Here is what I got:
[image: benchmark results]

@rocky (Member) commented Oct 1, 2021

@TiagoCavalcante - thanks. I am guessing that these results wobble around less since they are more targeted.

It would be good to have the reports somehow include the number of iterations, since the belief is that the larger the number of iterations, the more confidence we should have in the results.

@rocky (Member) commented Oct 1, 2021

@mmatera is this ready for final review or are you contemplating more work on this?

@TiagoCavalcante (Contributor) commented

> It would be good to have the reports somehow include the number of iterations, since the belief is that the larger the number of iterations, the more confidence we should have in the results.

@rocky what do you think of putting it in the title: "String_vs_StringQ - n iterations"? (I ran these benchmarks with 50 iterations.)

@rocky (Member) commented Oct 1, 2021

> It would be good to have the reports somehow include the number of iterations, since the belief is that the larger the number of iterations, the more confidence we should have in the results.

> @rocky what do you think of putting it in the title: "String_vs_StringQ - n iterations"? (I ran these benchmarks with 50 iterations.)

@TiagoCavalcante - that is a simple and low-tech way to address this. (Because it is not automated, it relies on a person getting the title correct and updating it should the number of iterations change.)

As I look again at both benchmarks, the results look marginal to me...

In the benchmark over all doctests, the improvement seems to be within the margin of wobble.

In the targeted benchmarks, the improvement is greatest in a construct we now know to avoid: using _String instead of ?StringQ swamps any performance improvement this PR provides. In other words, the change in the orange bars between ?StringQ and _String is much greater than the difference within each consecutive orange/blue pair, although overall there is a smaller improvement with this PR.

@mmatera and @TiagoCavalcante your reading and thoughts on the benchmarks?

@TiagoCavalcante (Contributor) commented

@rocky I didn't express myself well. The information would be grabbed from the result file (which already contains the number of iterations) and automatically put in the title.
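
A minimal sketch of that automation, assuming a result file in which each benchmark group stores an "iterations" field; the key names and JSON layout are guesses for illustration, not mathics-benchmark's actual schema:

```python
# Hypothetical sketch: read the iteration count out of a benchmark results
# file and build the chart title from it, so no one has to update the title
# by hand. The JSON structure below is an assumption.
import json

def chart_title(results_path: str, group: str) -> str:
    with open(results_path) as f:
        results = json.load(f)
    iterations = results[group]["iterations"]  # assumed key name
    return f"{group} - {iterations} iterations"

# e.g. chart_title("results.json", "String_vs_StringQ")
# -> "String_vs_StringQ - 50 iterations"
```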

@TiagoCavalcante (Contributor) commented

@rocky I used String_vs_StringQ because it seems to be the existing test that exercises pattern matching the most. But yes, it is probably not the greatest change.

@TiagoCavalcante (Contributor) commented

@rocky now I remember that each group can have its own number of iterations, so I guess this would just confuse people. Another idea is to show the time it took to run the benchmarks, since more iterations means more time.

@rocky (Member) commented Oct 1, 2021

> @rocky now I remember that each group can have its own number of iterations, so I guess this would just confuse people.

Yes, but often that is not the case, so in those common situations this could be used.

> Another idea is to show the time it took to run the benchmarks, since more iterations means more time.

Generally it is the number of iterations, not the time spent, that indicates the standard deviation or variation, which is what we want a sense of. Also, I don't want to have to do mental arithmetic.

There could be a legend for the number of iterations like there is for the colors. The color of a test, or maybe a box after the test, could indicate how many iterations were run for that test or test pair/group.

Which reminds me: if the legend were put outside the graph, there wouldn't be any chance of it obscuring boxes or running into boundaries.
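
A minimal matplotlib sketch of these two suggestions; the test names, timings, and iteration counts are made up, and this is not mathics-benchmark's actual plotting code:

```python
# Annotate each test group with its iteration count and place the legend
# outside the axes so it cannot obscure bars. All data here is fictitious.
import matplotlib.pyplot as plt
import numpy as np

tests = ["String_vs_StringQ", "Integer_vs_IntegerQ"]  # made-up names
master = [1.20, 0.95]   # fictitious timings in seconds
branch = [1.05, 0.90]
iterations = [50, 30]   # per-group iteration counts

x = np.arange(len(tests))
fig, ax = plt.subplots()
ax.bar(x - 0.2, master, width=0.4, label="master")
ax.bar(x + 0.2, branch, width=0.4, label="this PR")
ax.set_xticks(x)
ax.set_xticklabels(tests)
# per-test iteration count, printed just above the taller bar of each pair
for i, n in enumerate(iterations):
    ax.text(i, max(master[i], branch[i]) + 0.02, f"{n} iterations",
            ha="center")
# bbox_to_anchor parks the legend outside the plotting area
ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0))
fig.tight_layout()
plt.show()
```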

@mmatera (Contributor, Author) commented Nov 25, 2021

Update of benchmarks (mathics-benchmark):

[images: eight benchmark result charts]

I think this is now ready for review

mmatera marked this pull request as ready for review on November 25, 2021 at 20:52.
```diff
@@ -170,8 +170,7 @@ def is_machine_precision(self) -> bool:

     def get_lookup_name(self):
         "Returns symbol name of leftmost head"
-
-        return self.get_name()
+        return ""
```
@mmatera (Contributor, Author) commented on this diff:

In master, BaseExpression.get_lookup_name returns self.get_name(). That requires at least one function call that is ultimately useless: you cannot associate definitions with Atoms that are not Symbols.
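
A simplified sketch of the point being made; this is not the actual Mathics class hierarchy, only an illustration of why a non-Symbol atom can return "" directly:

```python
# Definitions can be attached only to Symbols, so for every other atom the
# get_name() call in the old code would be paid just to produce "" anyway.
class Atom:
    def get_name(self) -> str:
        return ""                 # non-Symbol atoms have no name

    def get_lookup_name(self) -> str:
        return ""                 # the change: skip the get_name() call

class Symbol(Atom):
    def __init__(self, name: str) -> None:
        self.name = name

    def get_name(self) -> str:
        return self.name

    def get_lookup_name(self) -> str:
        return self.name          # only Symbols can carry definitions
```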

@mmatera (Contributor, Author) commented Nov 28, 2021

> @TiagoCavalcante @mmatera I have been thinking about this name, and I think it should be changed.

@rocky: I do not understand: is this related to this PR or to ShowSteps? BTW, I call it ShowSteps because the idea came to me from someone who asked for an option, like in the Ruby Integrate routines and Wolfram Alpha, that allows following the steps of the evaluation as (I guess) a didactic tool. But sure, if you have a better name in mind, go ahead!

@rocky (Member) commented Nov 28, 2021

> @TiagoCavalcante @mmatera I have been thinking about this name, and I think it should be changed.

> @rocky: I do not understand: is this related to this PR or to ShowSteps? BTW, I call it ShowSteps because the idea came to me from someone who asked for an option, like in the Ruby Integrate routines and Wolfram Alpha, that allows following the steps of the evaluation as (I guess) a didactic tool. But sure, if you have a better name in mind, go ahead!

It was related to ShowSteps, not this PR.

Sure, I gathered it was the first thing that came to mind. The first thing, though, is sometimes not the best thing. PR #80 is now in.

@TiagoCavalcante (Contributor) commented

@mmatera what is the status of this? I think the slowdowns are just fluctuations; in Python, function calls are slow.

@mmatera (Contributor, Author) commented Dec 3, 2021

I think this PR is better implemented now in #86

mmatera closed this on Dec 3, 2021.