feat: add separate DCE pass #1902

acl-cqc · 2025-02-03T18:45:28Z

Still non-breaking, so we leave Constant Folding as assuming inputs apply to main: the DCE pass allows explicitly specifying entry points, constant folding specifies main if appropriate. (I could add a flag to ConstantFoldPass = preserve all FuncDefns/FuncDecls?)

The callback mechanism generalizes the previous might_diverge mechanism but is significantly more general. (Too general??)

closes #1807

…Module

…in if necessary

…in DFGs

codecov · 2025-02-04T15:52:36Z

Codecov Report

Attention: Patch coverage is 93.02326% with 15 lines in your changes missing coverage. Please review.

Project coverage is 83.71%. Comparing base (5e5cfcc) to head (9dd3ddf).

Files with missing lines	Patch %	Lines
hugr-passes/src/dead_code.rs	92.73%	13 Missing ⚠️
hugr-passes/src/const_fold.rs	95.83%	0 Missing and 1 partial ⚠️
hugr-passes/src/lib.rs	90.90%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1902      +/-   ##
==========================================
+ Coverage   83.70%   83.71%   +0.01%     
==========================================
  Files         196      198       +2     
  Lines       37543    37673     +130     
  Branches    34356    34486     +130     
==========================================
+ Hits        31425    31539     +114     
- Misses       4335     4351      +16     
  Partials     1783     1783

Flag	Coverage Δ
python	`92.34% <ø> (ø)`
rust	`82.92% <93.02%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

acl-cqc · 2025-02-05T09:58:40Z

hugr-passes/src/dead_code.rs

+pub type PreserveCallback = dyn Fn(&Hugr, Node) -> PreserveNode;
+
+/// Signal that a node must be preserved even when its result is not used
+pub enum PreserveNode {


The callback mechanism may be a sledgehammer to crack a nail, I'm not sure if there are alternatives.

One might just say that every node that shouldn't be removed should just be added as an entry-point (i.e. including Calls, TailLoops, etc.)

I'm not quite clear how this interacts with hierarchy, but that probably works ok

However, the issue here is that we want the default to be conservative, so we should assume Call/CFG/TailLoop are impure unless we have evidence otherwise. This would mean the default would require adding every Call/CFG/TailLoop as an entrypoint, which sounds like far too much work to inflict on the client. Instead we'd have to have an add_removable or something (and the client could list such nodes).

As it stands this enum does have "everything I could think of", so it might be appropriate to make the enum smaller and add [non_exhaustive]

I agree we shouldn't demand the user specify every Call/CFG/TailLoop as an entry point; the interface does seem a bit awkward but I don't have a better suggestion.

Ok so the alternative is two methods on DeadCodeElimPass:

force_keep(&mut self, Node)

allow_remove(&mut self, Node)
I guess we could add a bool flag to the latter to determine whether it allows removing a node without checking its descendants, or only if its descendants can also be removed.

This feels simpler in many ways but the drawback (IMHO) is that this approach requires any client to understand the defaults (i.e., currently, that Call + TailLoop + CFG default to being non-removable, nothing else). Potentially those defaults might change (e.g. if we change the DCEPass to look for acyclic CFGs), too. Seems to me that the enum PreserveNode + callback makes it easier for clients to replace the defaults (or fall back to them) without understanding what they are.

That might be illusory, though; one can implement any policy with either approach, e.g. making sure to call either force_keep or allow_remove for every Node. Thoughts @cqc-alec ?

That does feel a bit simpler. But I don't know if it would be in practice. With the callback mechanism, one can define a DeadCodeElimPass that can be used on multiple hugrs; whereas with these methods one would have to define a new one for each hugr. On balance I think I prefer the callbacks.

That's a really good point about using it on different Hugrs. Ok, will keep callbacks...

cqc-alec · 2025-02-05T13:28:03Z

hugr-passes/src/dead_code.rs

+#[derive(Clone, Default)]
+pub struct DeadCodeElimPass {
+    entry_points: Vec<Node>,
+    preserve_callback: Option<Arc<PreserveCallback>>,


There should be some documentation of what these are, especially preserve_callback.

cqc-alec · 2025-02-05T13:43:12Z

hugr-passes/src/dead_code.rs

+pub type PreserveCallback = dyn Fn(&Hugr, Node) -> PreserveNode;
+
+/// Signal that a node must be preserved even when its result is not used
+pub enum PreserveNode {


I agree we shouldn't demand the user specify every Call/CFG/TailLoop as an entry point; the interface does seem a bit awkward but I don't have a better suggestion.

cqc-alec · 2025-02-05T13:48:36Z

hugr-passes/src/dead_code.rs

+    /// Mark some nodes as entry-points to the Hugr.
+    /// The root node is assumed to be an entry point;
+    /// for Module roots the client will want to mark some of the FuncDefn children
+    /// as entry-points too.


Is "entry point" hyphenated or not? (I think not.)

Indeed not. I think this might be slightly harder to read/parse (how long to realize that entry here is an adjective), but ok.

cqc-alec · 2025-02-05T13:54:47Z

hugr-passes/src/dead_code.rs

+            .run_validated_pass(hugr, |h, _| self.run_no_validate(h))
+    }
+
+    fn run_no_validate(&self, hugr: &mut impl HugrMut) -> Result<(), ValidatePassError> {


Does the return type need to include ValidatePassError?

Well, it's a private helper method...so we could do it in the closure passed to run_validated_pass.....ok, it's clearer without the Result, indeed

cqc-alec · 2025-02-05T13:56:28Z

hugr-passes/src/dead_code.rs

+    }
+
+    // "Diverge" aka "never-terminate"
+    // TODO would be more efficient to compute this bottom-up and cache (dynamic programming)


Make an issue?

Now we have a proper DCE pass....it really wasn't that hard....

cqc-alec · 2025-02-05T15:23:32Z

hugr-passes/src/dead_code.rs

+                match h.get_optype(n) {
+                    OpType::CFG(_) => {
+                        // TODO if the CFG has no cycles (that are possible given predicates)
+                        // then we could say it definitely terminates (i.e. return false)


I'm probably missing something, but I'm not sure why we don't want to remove nodes that we can't prove terminate. It's true that doing so would change runtime behaviour (as is the case for any optimization), but would it change program semantics?

Found a discussion of this question here: https://stackoverflow.com/questions/2178115/are-compilers-allowed-to-eliminate-infinite-loops

From reading that I am less sure than I was but OTOH I think this could just be the C "undefined behaviour" brigade. (That is, the school of thought that we can make your programs "faster" by making more of them do things you didn't expect and making it more and more difficult to write programs that do do what you want.) I think I would still say - changing nontermination into termination changes observable behaviour, in that, writing to files (including stdout) is observable, and a program that does that is different from a program that doesn't. So by one definition there, interpret "sequence point" and "volatile" as including system calls and data.

IOW - here are two ways of writing the same thing:

while b(): sleep() launch_missiles()

and

while True: if b(): sleep() else launch_missiles() break()

I think the "you can optimize away infinite loops" means you can turn the first into launch_missiles (at least, in the absence of "state edges" and other such extra inputs/outputs to b()), but not the second, which to me is a bit of a false dichotomy. I guess the counterargument is then that the second has a "control dependence" of launch_missiles() upon b() and we should add a representation of that into the first, i.e. there is some (hidden-in-the-source) probably-linear loop-variant mutated by b and then passed from the loop to launching. In which case maybe we need to put off this PR until we've thoroughly investigated ordering and effect systems https://github.com/quantinuum-dev/hm25/issues/62 #1813

In my view it's distinct from UB. The example using sleep() maybe muddies the waters a bit because it implies a linear time type (or something similar) passing through so that the operation is effectful (and therefore can't be removed). But if we had noop there instead of sleep() then I'd be heading for the nearest bunker! We don't make any claim about the time taken to execute a noop...

writing to files (including stdout) is observable, and a program that does that is different from a program that doesn't

Dunno, I would agree if "a program that doesn't" is replaced by "a terminated program that didn't" ...

Not arguing this so much for philosophical reasons as to suggest we can make the optimizer simpler and better by not worrying about it. But indeed it may depend on exactly how we are handling states and effects.

Yeah on balance I think the best thing to do here is to leave this existing behaviour unchanged - and this PR is a non-breaking change! - and then do something different when we resolve handling of effects. (Indeed I am heading towards, removing such nodes unless they have outgoing edges, but there's a significant semantic change to the Hugr there.)

If I address the rest, does that sound good?

Sounds good to me.

acl-cqc · 2025-02-26T23:18:08Z

Thanks, Alec - I've rejigged the callback a little (one less enum element, made default_for a function) and hopefully addressed the others :)

hugrbot · 2025-02-27T07:51:16Z

This PR contains breaking changes to the public Rust API.
Please deprecate the old API instead (if possible), or mark the PR with a ! to indicate a breaking change.

cargo-semver-checks summary

cqc-alec

Looks good, just a query on the names. And it seems there is a new CI issue that is unrelated.

cqc-alec · 2025-02-27T09:25:28Z

hugr-passes/src/dead_code.rs

+    RemoveIgnoreChildren,
+    /// The node may be removed if all of its children can
+    /// (must be kept if, and only if, any of its children must be kept)
+    RemoveIfChildren,


I wonder if these names are quite descriptive enough. Maybe something like CanRemoveIgnoringChildren and CanRemoveIfChildrenRemoved ? Bit of a mouthful I know but maybe less room for confusion?

CanRemoveIgnoringChildren seems fair - perhaps DespiteChildren? ("Regardless" seems even longer)

CanRemoveIfChildrenRemovable I guess would be correct, but perhaps the point to note here is - this node makes no requirements (MustKeep is the requirement) - so how about just DeferToChildren, AccordingToChildren, AsChildren ??

Yes, DeferToChildren seems good!

acl-cqc added 12 commits January 29, 2025 16:42

Add failing test: main function is removed

1333b41

Fix find_needed_nodes to include main and any AliasDe(cl|fn) beneath …

b264604

…Module

Fix test_module ExtensionSet

9d289f6

Test top-level constant outside main

98d022f

fmt

6b53e06

Test: force keeping outer constant, allow removing 2nd

9535ef5

Make test less fragile

0479b89

Add hugr-passes/src/lib.rs find_main (private)

eba2d7f

Add separate DCE pass with list of entry points; Const-fold passes ma…

d1968e7

…in if necessary

comments

92867c6

WIP Conditional-based test, failing due to bug

e7b9b4a

Amend test to include AliasDecl+AliasDefn

297a84c

Base automatically changed from acl/fix_const_fold to main February 4, 2025 14:18

acl-cqc added 5 commits February 4, 2025 14:55

Merge remote-tracking branch 'origin/main' into acl/fix_const_fold

7fc80f0

Merge branch 'acl/fix_const_fold' into acl/separate_dce

6f55abe

WIP CFG test

9aec6c7

Complete test; fix algorithm to apply callback to all nodes not just …

81a3883

…in DFGs

Mass renaming+doc - diverge => preserve

10db257

fmt,clippy

37e2e54

acl-cqc marked this pull request as ready for review February 4, 2025 16:16

acl-cqc requested a review from a team as a code owner February 4, 2025 16:16

acl-cqc requested a review from cqc-alec February 4, 2025 16:16

acl-cqc commented Feb 5, 2025

View reviewed changes

cqc-alec reviewed Feb 5, 2025

View reviewed changes

acl-cqc added 5 commits February 26, 2025 17:10

Remove PreserveNode::UseDefault, add default_for(&Hugr,Node)

14b5f3b

Option<Arc> -> Arc

93614d8

docs

c40bc30

run_no_validate does not return a Result

ee139d0

RemoveIfChildren/RemoveIgnoreChildren

e7e84c8

acl-cqc added 3 commits February 26, 2025 21:09

fix bad doclink

e47e957

Do the dynamic programming thing for must_preserve

7812a8b

Heh, really do the dynamic programming

5efdaa9

acl-cqc requested a review from cqc-alec February 26, 2025 23:18

Merge remote-tracking branch 'origin/main' into acl/separate_dce

9dd3ddf

cqc-alec reviewed Feb 27, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add separate DCE pass #1902

feat: add separate DCE pass #1902

acl-cqc commented Feb 3, 2025 •

edited

Loading

codecov bot commented Feb 4, 2025 •

edited

Loading

acl-cqc Feb 5, 2025

cqc-alec Feb 5, 2025

acl-cqc Feb 24, 2025

cqc-alec Feb 24, 2025

acl-cqc Feb 26, 2025

cqc-alec Feb 5, 2025

cqc-alec Feb 5, 2025

cqc-alec Feb 5, 2025

acl-cqc Feb 26, 2025

cqc-alec Feb 5, 2025

acl-cqc Feb 26, 2025

cqc-alec Feb 5, 2025

acl-cqc Feb 26, 2025

cqc-alec Feb 5, 2025

acl-cqc Feb 24, 2025

cqc-alec Feb 24, 2025

acl-cqc Feb 26, 2025

cqc-alec Feb 26, 2025

acl-cqc commented Feb 26, 2025

hugrbot commented Feb 27, 2025

cqc-alec left a comment

cqc-alec Feb 27, 2025

acl-cqc Feb 28, 2025

cqc-alec Feb 28, 2025

feat: add separate DCE pass #1902

Are you sure you want to change the base?

feat: add separate DCE pass #1902

Conversation

acl-cqc commented Feb 3, 2025 • edited Loading

codecov bot commented Feb 4, 2025 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

acl-cqc commented Feb 26, 2025

hugrbot commented Feb 27, 2025

cqc-alec left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

acl-cqc commented Feb 3, 2025 •

edited

Loading

codecov bot commented Feb 4, 2025 •

edited

Loading