Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add per-node scoring functions #127

Merged
merged 9 commits into from
Apr 8, 2024
Merged

Conversation

AndyM1098
Copy link
Contributor

@AndyM1098 AndyM1098 commented Jan 1, 2024

This PR is for the implementation of the per-node scoring functions. Right now there are four scoring functions. They take in a columnar joined thicket object, and a list of columns to apply a per-node scoring function to.

Note: This is an updated PR since this branch was migrated from LLNL's main repo to TauferLab's fork of this repo. Replaces #114

@AndyM1098 AndyM1098 added area-stats Issues and PRs related to Thicket's stats subpackage priority-normal Normal priority issues and PRs status-work-in-progress PR is currently being worked on type-feature Requests for new features or PRs which implement new features labels Jan 1, 2024
@AndyM1098 AndyM1098 requested a review from slabasan January 1, 2024 06:49
@AndyM1098 AndyM1098 self-assigned this Jan 1, 2024
@slabasan slabasan changed the title Andrew scoring (Migrated branch) Add per-node scoring functions Jan 16, 2024
@ilumsden ilumsden self-requested a review February 26, 2024 16:36
@Yejashi Yejashi force-pushed the andrew_scoring branch 4 times, most recently from dafc6a1 to dbe7558 Compare March 11, 2024 18:36
@Yejashi Yejashi added status-ready-for-review This PR is ready to be reviewed by assigned reviewers and removed status-work-in-progress PR is currently being worked on labels Mar 13, 2024
@slabasan slabasan marked this pull request as ready for review March 13, 2024 19:28
import thicket as th


def _scoring_1(means_1, means_2, stds_1, stds_2, num_nodes):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename scoring_{1,2..} to your more descriptive names? Maybe score_delta_mean_delta_stdnorm() calls score(..., _calc_delta_mean_delta_stdnorm).

(means_1[i] - means_2[i]) ** 2 / (stds_1[i] ** 2 + stds_2[i] ** 2)
)
except ZeroDivisionError:
# print("Score 3 std's: ", stds_1[i], stds_2[i], i)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove these debug statements if they aren't needed anymore.

(stds_1[i] - stds_2[i]) / (np.abs(means_1[i] - means_2[i]))
)
except RuntimeWarning:
print("Score 1 means's: ", means_1[i], means_2[i], i)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this print statement should be removed/commented out like the others below.


idx = list(combined_th.dataframe.columns.levels[0][0:2])
columns = [(idx[0], "Min time/rank"), (idx[1], "Min time/rank")]
output_column_name = "score_1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use longer descriptive name?

import numpy as np
from ..utils import verify_thicket_structures
import math
import thicket as th
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like following PEP8 for grouping imports:

Imports should be grouped in the following order:

Standard library imports.
Related third party imports.
Local application/library specific imports.

So let's update this list to be:

import math

import numpy as np

import thicket as th
from ..utils import verify_thicket_structures



def _scoring_2(means_1, means_2, stds_1, stds_2, num_nodes):

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the newline between the function definition and start of the function body as you've done below in your score() function.

@slabasan slabasan merged commit 441e5e3 into LLNL:develop Apr 8, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-stats Issues and PRs related to Thicket's stats subpackage priority-normal Normal priority issues and PRs status-ready-for-review This PR is ready to be reviewed by assigned reviewers type-feature Requests for new features or PRs which implement new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants