Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Loop in Unify #185

Merged
merged 6 commits into from
Jun 27, 2024
Merged

Conversation

michaelmckinsey1
Copy link
Collaborator

@michaelmckinsey1 michaelmckinsey1 commented Jun 26, 2024

This PR:

  1. Refactors _merge_dicts and _replace_graph_df_nodes by removing the functions and putting the code inline.
  2. Optimizes the double for loop that merges old_to_new and temp_dict from O(n^2) to O(n). This was an oversight during original implementation.

The performance improvements of this PR are documented in #170. #170 avoids needing to merge dictionaries as it only calls concat_thickets with 2 Thickets, however using the concat_thickets function with 3 or more Thickets will still benefit from this PR.

This PR is better for fewer files with many nodes, while #170 is better for many files since it avoids merging old_to_new and temp_dict.

@michaelmckinsey1 michaelmckinsey1 self-assigned this Jun 26, 2024
@michaelmckinsey1 michaelmckinsey1 added area-thicket Issues and PRs involving Thicket's core Thicket datastructure and associated classes priority-normal Normal priority issues and PRs status-work-in-progress PR is currently being worked on type-internal-cleanup PRs or Issues related to the structure of the codebase, directories, and refactors labels Jun 26, 2024
@michaelmckinsey1 michaelmckinsey1 changed the title Optimize Unify Optimize Loop in Unify Jun 26, 2024
@michaelmckinsey1 michaelmckinsey1 marked this pull request as ready for review June 26, 2024 21:31
@michaelmckinsey1 michaelmckinsey1 added status-ready-for-review This PR is ready to be reviewed by assigned reviewers and removed status-work-in-progress PR is currently being worked on labels Jun 26, 2024
Copy link
Collaborator

@ilumsden ilumsden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. Just one issue that needs to be addressed to ensure we don't run into unexpected (and possibly undefined) behavior.

for cur_id, cur_node in old_to_new.items():
new_id = id(cur_node)
if new_id in temp_dict:
old_to_new[cur_id] = temp_dict[new_id]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never update the thing you are iterating over. You may get lucky, but 9 times out of 10, this is at best undefined behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch I messed up when converting it from a function a1479b3

Copy link
Collaborator

@ilumsden ilumsden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ilumsden ilumsden requested a review from pearce8 June 26, 2024 23:29
@ilumsden
Copy link
Collaborator

@pearce8 this PR is ready for your review

@ilumsden ilumsden added status-approved No more revisions are required on this PR and it is ready for merge and removed status-ready-for-review This PR is ready to be reviewed by assigned reviewers labels Jun 26, 2024
@pearce8 pearce8 merged commit 5722de3 into LLNL:develop Jun 27, 2024
4 checks passed
michaelmckinsey1 added a commit to michaelmckinsey1/thicket that referenced this pull request Jul 9, 2024
* Optimize double for loop

* Remove functions

* Fix bug

* Improve variable names for clarity

* Change list to set

* Rm space
michaelmckinsey1 added a commit to michaelmckinsey1/thicket that referenced this pull request Jul 9, 2024
* Optimize double for loop

* Remove functions

* Fix bug

* Improve variable names for clarity

* Change list to set

* Rm space
Yejashi pushed a commit to TauferLab/thicket that referenced this pull request Sep 4, 2024
* Optimize double for loop

* Remove functions

* Fix bug

* Improve variable names for clarity

* Change list to set

* Rm space
@slabasan slabasan added this to the 2024.2.0 milestone Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-thicket Issues and PRs involving Thicket's core Thicket datastructure and associated classes priority-normal Normal priority issues and PRs status-approved No more revisions are required on this PR and it is ready for merge type-internal-cleanup PRs or Issues related to the structure of the codebase, directories, and refactors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants