Rework _unify #129

Merged 14 commits on Feb 15, 2024

@michaelmckinsey1 commented Jan 10, 2024

Summary

Rework the _unify function to use the old_to_new node map returned by graph.union() to track node updates during unification. Currently, we update everything afterwards using _sync_nodes_frame, which relies on assumptions about the structure of the data. These implicit assumptions have caused bugs for our users. With the node map returned by the union function, we no longer need those assumptions, which should reduce the number of bugs we encounter.
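To illustrate the idea, here is a minimal, self-contained sketch of remapping per-node data through an old_to_new map. It is not the code in this PR: the `Node` class, the toy `union` function, and `remap_rows` are hypothetical stand-ins, and the real graph.union() may build and expose its map differently. The point is only that a direct lookup replaces any guessing about how the data is structured.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class Node:
    """Toy call-tree node (hypothetical). eq=False keeps identity-based
    equality and hashing, so two nodes with the same path stay distinct."""

    path: tuple


def union(graph_a, graph_b):
    """Merge two node lists by call path (toy stand-in for a graph union).

    Returns the merged nodes plus an old_to_new map: id(old node) -> merged node.
    """
    merged = {}
    old_to_new = {}
    for node in [*graph_a, *graph_b]:
        if node.path not in merged:
            merged[node.path] = Node(node.path)
        old_to_new[id(node)] = merged[node.path]
    return list(merged.values()), old_to_new


def remap_rows(rows, old_to_new):
    """Re-key (node, profile) rows onto merged nodes by direct lookup."""
    return {
        (old_to_new[id(node)], profile): value
        for (node, profile), value in rows.items()
    }


# Two profiles that share the "main" call path.
a_main, a_solve = Node(("main",)), Node(("main", "solve"))
b_main, b_io = Node(("main",)), Node(("main", "io"))
rows = {
    (a_main, "A"): 1.0,
    (a_solve, "A"): 0.4,
    (b_main, "B"): 1.2,
    (b_io, "B"): 0.3,
}

merged, old_to_new = union([a_main, a_solve], [b_main, b_io])
unified = remap_rows(rows, old_to_new)
```

In this toy case both "main" nodes map to the same merged node, so the rows from profiles A and B end up keyed on one shared node without a separate sync pass.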

Changes

  • Rewrite the _unify function
  • Remove _sync_nodes and _sync_nodes_frame
  • Add unify unit test
  • Add literal reader for unit test

Performance Impact

Where n = len(thickets):
The current unify code is approximately O(n + n): n union steps, followed by a node-sync pass that takes approximately n. The new code adds a loop inside each union step, which makes the union block O(n^2). new-v2 and new-v3 are optimizations of the naive implementation, "new".
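The two growth rates can be compared with a rough step-counting model. This is a sketch under a stated assumption, not a measurement of the real code: it assumes that union step i also revisits the i thickets merged so far, which is one way a loop inside each union step would lead to quadratic growth. Both function names are hypothetical.

```python
def steps_sync_afterwards(n):
    """Model of the current approach: n union steps, then one sync pass over n."""
    return n + n


def steps_remap_each_union(n):
    """Model of the naive new approach.

    Assumption: union step i costs one union plus a pass over the
    i thickets already merged, giving n + n(n-1)/2 steps in total.
    """
    return sum(1 + i for i in range(n))
```

Under this model, n = 4 gives 8 steps for the current approach and 10 for the naive new one; the gap widens quadratically as n grows.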

Small study

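| files | develop | new  | new-v2 | new-v3 (latest commit) |
|-------|---------|------|--------|------------------------|
| 10    | 0.16s   | 0.22s | 0.2s  | 0.15s                  |
| 50    | 0.8s    | 2.2s | 1.2s   | 0.77s                  |
| 172   | 2.7s    | 20.1s | 8.2s  | 3.35s                  |
| 656   | 10.7s   | 4m   | 1m     | 22s                    |
| 1438  | 24.3s   | 21m  | 6m     | 1m21s                  |
| 1614  | 28s     | 26m  | 8m     | 1m39s                  |
| 1808  | 31.3s   | ?    | 9m     | 2m3s                   |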

The line of best fit for the current (develop) data is linear (Y = 0.07X - 1.73).
The line of best fit for the new data is n log(n) (Y = -123 + 0.5X*log_10(X)). Empirically this is n log(n), but if we had a worst-case input at each size, with n distinct graphs (improbable), I imagine it could actually fit n^2.

@michaelmckinsey1 changed the title from "Replace _sync_nodes and _sync_nodes_frame" to "Rework _unify" Jan 10, 2024
@michaelmckinsey1 marked this pull request as ready for review Jan 10, 2024 20:47
@michaelmckinsey1 added the labels area-tests, area-thicket, priority-normal, status-ready-for-review, and type-bug Jan 10, 2024
@michaelmckinsey1

Resolves #16

@michaelmckinsey1 added status-work-in-progress and removed status-ready-for-review Jan 16, 2024
@michaelmckinsey1 added status-ready-for-review and removed status-work-in-progress Feb 12, 2024
@slabasan merged commit 33827a8 into LLNL:develop Feb 15, 2024
4 checks passed
@michaelmckinsey1 deleted the fix-new_unify branch Feb 29, 2024 22:02
Yejashi pushed a commit to TauferLab/thicket that referenced this pull request Mar 6, 2024
* Replace _sync_nodes and _sync_nodes_frame

* Perform union every time to correctly keep nodes up-to-date

* Add unit test and add literal reader for test

* Add docstring
Labels
  • area-tests: Issues and PRs involving Thicket's automated tests
  • area-thicket: Issues and PRs involving Thicket's core Thicket datastructure and associated classes
  • priority-normal: Normal priority issues and PRs
  • status-ready-for-review: This PR is ready to be reviewed by assigned reviewers
  • type-bug: Identifies bugs in issues and identifies bug fixes in PRs