add some benchmarking guidance #267
CONTRIBUTE.md
Outdated
1. Test for a variety of sample sizes for most algorithms [1_000, 10_000, 20_000] will be sufficient
2. Test of variety feature dimensions
- Spatial algorithms: [3, 5, 8]
3 and 8 are representative enough.
CONTRIBUTE.md
Outdated
- Spatial algorithms: [3, 5, 8]
- Dimentionality reduction algorithms: [10, 20, 50]
- Others: [5, 10]
3. Use Criterion
Go into why we use Criterion over Iai.
CONTRIBUTE.md
Outdated
- Dimentionality reduction algorithms: [10, 20, 50]
- Others: [5, 10]
3. Use Criterion
4. Test various alg implementations
Be more specific and use examples like PLS
CONTRIBUTE.md
Outdated
3. Use Criterion
4. Test various alg implementations
5. Set a random seed for algorithm if applicable
6. Test multi-target case if algorithm supports it: [4 targets]
Mention that we generally only want to benchmark one or two target counts, and give a range, for example 3 to 8.
CONTRIBUTE.md
Outdated
- Others: [5, 10]
3. Use Criterion
4. Test various alg implementations
5. Set a random seed for algorithm if applicable
"For algorithms that require an RNG or random seed as input, use a constant seed for reproducibility"
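A minimal sketch of what the constant-seed suggestion could look like in a benchmark's setup code. It uses a small hand-written SplitMix64 generator only so the example has no dependencies; the names `SplitMix64`, `BENCH_SEED`, and `make_dataset` are hypothetical and not part of the project. A real benchmark would more likely seed whichever RNG the algorithm already accepts.

```rust
/// Tiny SplitMix64 generator, used here only to keep the sketch dependency-free.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical constant seed shared by all benchmark cases.
const BENCH_SEED: u64 = 42;

/// Builds a row-major `n_samples x n_features` dataset from a fixed seed,
/// so every benchmark run sees exactly the same input.
fn make_dataset(n_samples: usize, n_features: usize, seed: u64) -> Vec<f64> {
    let mut rng = SplitMix64::new(seed);
    (0..n_samples * n_features).map(|_| rng.next_f64()).collect()
}

fn main() {
    let first = make_dataset(1_000, 3, BENCH_SEED);
    let second = make_dataset(1_000, 3, BENCH_SEED);
    // Same seed, same data: timing differences come from the code, not the input.
    assert_eq!(first, second);
    println!("generated {} reproducible values", first.len());
}
```

The point of the design is that the seed is a named constant rather than something drawn from system entropy, so two runs of the same benchmark are comparable.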
CONTRIBUTE.md
Outdated
It is important to the project that we have benchmarks in place to evaluate the benefit of performance related changes. To make that process easier we provide some guidelines for writing benchmarks.

1. Test for a variety of sample sizes for most algorithms [1_000, 10_000, 20_000] will be sufficient
2. Test of variety feature dimensions
"Test for a variety of feature dimensions. Two is usually enough for most algorithms. The following are suggested feature dimensions to use for different types of algorithms:"
CONTRIBUTE.md
Outdated
It is important to the project that we have benchmarks in place to evaluate the benefit of performance related changes. To make that process easier we provide some guidelines for writing benchmarks.

1. Test for a variety of sample sizes for most algorithms [1_000, 10_000, 20_000] will be sufficient
For algorithms where it's not too slow, use 100k instead of 20k
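To make the size suggestions concrete, here is a rough sketch of how the sample sizes and feature dimensions discussed in this review could be combined into a benchmark grid. The type `AlgorithmKind` and the functions `sample_sizes`, `feature_dims`, and `bench_grid` are hypothetical names for illustration. The values come from the draft and the review comments (3 and 8 for spatial algorithms, 100k instead of 20k when the algorithm is not too slow) and may not match the final wording of the guideline.

```rust
/// Broad algorithm families named in the draft guideline.
#[derive(Clone, Copy, Debug, PartialEq)]
enum AlgorithmKind {
    Spatial,
    DimensionalityReduction,
    Other,
}

/// Sample sizes to benchmark; slow algorithms stop at 20k instead of 100k.
fn sample_sizes(is_slow: bool) -> [usize; 3] {
    if is_slow {
        [1_000, 10_000, 20_000]
    } else {
        [1_000, 10_000, 100_000]
    }
}

/// Suggested feature dimensions per algorithm family.
fn feature_dims(kind: AlgorithmKind) -> &'static [usize] {
    match kind {
        // The review suggests 3 and 8 are representative enough.
        AlgorithmKind::Spatial => &[3, 8],
        AlgorithmKind::DimensionalityReduction => &[10, 20, 50],
        AlgorithmKind::Other => &[5, 10],
    }
}

/// Every (n_samples, n_features) pair a benchmark would cover.
fn bench_grid(kind: AlgorithmKind, is_slow: bool) -> Vec<(usize, usize)> {
    let mut grid = Vec::new();
    for n_samples in sample_sizes(is_slow) {
        for &n_features in feature_dims(kind) {
            grid.push((n_samples, n_features));
        }
    }
    grid
}

fn main() {
    for (n_samples, n_features) in bench_grid(AlgorithmKind::Spatial, false) {
        println!("case: {} samples x {} features", n_samples, n_features);
    }
}
```

In a Criterion benchmark, each pair would typically become one parameterized case, for instance through `BenchmarkId::new` and `bench_with_input` on a benchmark group.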
Codecov Report
Base: 38.68% // Head: 38.75% // Increases project coverage by
Additional details and impacted files
@@ Coverage Diff @@
## master #267 +/- ##
==========================================
+ Coverage 38.68% 38.75% +0.06%
==========================================
Files 93 93
Lines 6087 6087
==========================================
+ Hits 2355 2359 +4
+ Misses 3732 3728 -4
CONTRIBUTE.md
Outdated
3. Use Criterion. Iai another popular benchmarking tool is not being actively maintained at the moment and at the time of this writing it hasn't been updated since Feb 25, 2021.
4. Test various alg implementations for instance Pls has the following algorithms: Nipals and Svd.
5. For algorithms that require an RNG or random seed as input, use a constant seed for reproducibility
6. In most cases we only want to benchmark 1D or 2D targets. When benchmarking 2D targets the 2nd axis should be within the following range: [2, 4].
Delete the first sentence. In the 2nd sentence put "multi-target" instead of "2D targets" and "target count" instead of "2nd axis".
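As a rough illustration of items 4 and 6 taken together, the sketch below enumerates one benchmark case per algorithm implementation and target count. The local enum `PlsAlgorithm` only mirrors the two algorithm names mentioned in the guideline and is not the library's own type; `MULTI_TARGET_COUNTS` and `case_labels` are likewise hypothetical. A target count of 4 is assumed because it falls inside the [2, 4] range quoted above.

```rust
/// Stand-in for the PLS algorithm choices named in the guideline.
#[derive(Clone, Copy, Debug)]
enum PlsAlgorithm {
    Nipals,
    Svd,
}

/// Assumed multi-target count(s) to benchmark; one value is usually enough.
const MULTI_TARGET_COUNTS: [usize; 1] = [4];

/// Builds one label per (algorithm, target count) combination.
fn case_labels(algorithms: &[PlsAlgorithm], target_counts: &[usize]) -> Vec<String> {
    let mut labels = Vec::new();
    for algorithm in algorithms {
        for &targets in target_counts {
            labels.push(format!("{:?}/targets={}", algorithm, targets));
        }
    }
    labels
}

fn main() {
    let algorithms = [PlsAlgorithm::Nipals, PlsAlgorithm::Svd];
    for label in case_labels(&algorithms, &MULTI_TARGET_COUNTS) {
        println!("{}", label);
    }
}
```

Keeping the target counts in one small constant makes it harder for the number of benchmark cases to grow unnoticed as more implementations are added.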
This PR adds documentation to CONTRIBUTE.md that provides guidance for writing benchmarks. At a later time, a benchmark will be referenced in the section to assist the community.
Resolves #265