Port the full VF2Layout pass to Rust #14056

mtreinish · 2025-03-19T17:22:48Z

Summary

This commit provides a full path Rust implementation of the VF2Layout pass that performs the full pass logic in Rust. We had most of the pass in Rust already by leveraging rustworkx for its vf2 implementation and the scoring logic in Rust. So all that we did in Python was create the graphs as inputs to vf2, building the error map and other rust view objects (since the scoring in Rust predates our rust data model in Qiskit) and then process each found layout for final output. This means the performance improvement expected by this commit is not large since most of the heavy lifting was already in Rust. Instead what this really enables is running the pass in a Python-free context, mainly in a C API for executing this pass which will come in a follow up.

This rust implementation does not cover the full feature set of what the Python implementation offers. Mainly in the case of no target, coupling graph randomization (which is a shame we forgot to remove in 2.0, since it almost always produces worse results), or a user provided error map we rely on the existing Python implementation. This is a tradeoff because these use cases are non-standard and having a Rust interface for running in that mode provides little benefit for the interfaces we're building.

Details and comments

Fixes #12277

TODO:

Fix panics in unittests
Deduplicate implementation around generics for directed vs undirected
Benchmark and tune (and write release note if there is a performance improvement)

This commit provides a full path Rust implementation of the VF2Layout pass that performs the full pass logic in Rust. We had most of the pass in Rust already by leveraging rustworkx for its vf2 implementation and the scoring logic in Rust. So all that we did in Python was create the graphs as inputs to vf2, building the error map and other rust view objects (since the scoring in Rust predates our rust data model in Qiskit) and then process each found layout for final output. This means the performance improvement expected by this commit is not large since most of the heavy lifting was already in Rust. Instead what this really enables is running the pass in a Python-free context, mainly in a C API for executing this pass which will come in a follow up. This rust implementation does not cover the full feature set of what the Python implementation offers. Mainly in the case of no target, coupling graph randomization (which is a shame we forgot to remove in 2.0, since it almost always produces worse results), or a user provided error map we rely on the existing Python implementation. This is a tradeoff because these use cases are non-standard and having a Rust interface for running in that mode provides little benefit for the interfaces we're building. Fixes Qiskit#12277

qiskit-bot · 2025-03-19T17:22:53Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core

This commit address some of the panics and failures seen in scheduling tests. The source of the errors was two fold, the first was the layout definition coming from the mapping is sparse but we were using nlayout which doesn't handle that well so we ended up with several sentinel values persiting to scoring when scoring the layout which caused errors. The second issue was we were missing the special handling of free qubits that was added as a performance optimization in Qiskit#9148 which was causing other failures since some of the code was expecting that.

This commit fixes several issues all stemming from the same root cause. There are multiple mappings of indices to qubits to keep track of and the previous commits in this PR branch were not leveraging the type system to keep these sane. This resulted in getting the mapping between representations incorrect at various points that was causing failures and/or panics when the mappings were not resolved correctly. This commit fixes the issue by moving to a typed representation of the layout as soon as posible. This forces all the places that reference a potential layout to reason about the various mappings correctly.

Now that the pass runs in rust we don't emit python logs for the running of the pass. This commit removes the test log assertions since they're no longer present.

This commit fixes the logic in the free qubit mapping and also skips running vf2 if there are no qubits in the interaction graph and there are free qubits.

coveralls · 2025-03-20T21:13:08Z

Pull Request Test Coverage Report for Build 14313701929

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

404 of 483 (83.64%) changed or added relevant lines in 4 files are covered.
23 unchanged lines in 5 files lost coverage.
Overall coverage increased (+0.08%) to 88.147%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/passes/layout/vf2_layout.py	14	17	82.35%
crates/accelerate/src/vf2_layout.rs	385	461	83.51%

Files with Coverage Reduction	New Missed Lines	%
crates/accelerate/src/unitary_synthesis.rs	1	94.79%
qiskit/transpiler/passes/layout/vf2_utils.py	3	95.14%
crates/qasm2/src/lex.rs	6	92.23%
crates/qasm2/src/parse.rs	6	96.68%
qiskit/transpiler/passes/layout/vf2_layout.py	7	91.38%

Totals
Change from base Build 14312020200:	0.08%
Covered Lines:	73451
Relevant Lines:	83328

💛 - Coveralls

The earlier approaches were querying the target for each gate while iterating over all the trials. This ends up doing far more hashmap lookups (2x for each gate on a node or edge for each layout) this ends up dominating the runtime. This also fixes an inconsistency in how the scoring worked because we weren't correctly accounting for the number of interactions in the circuit.

This commit fixes a bug that slipped in during an earlier fix where the sort order for the free qubits mapping was accidentally flipped. The intent is to put the lowest error free qubit at the end of the vec so it's the first popped from the stack when mapping it to a qubit with solely single qubit operations. However the comparison got reversed accidentally and that was causing the highest error rate qubits to be used instead. This commit fixes this oversight and fixes the one fialing visualization test that was using the wrong qubit compared to the expected one.

mtreinish · 2025-03-21T03:27:54Z

The one test that is failing: test.python.compiler.test_transpiler.TestTranspile.test_scheduling_dt_constraints looks to be hitting a bug elsewhere in Qiskit. The underlying issue is that the pass manager is constructed with: transpile(qc, backend=backend_v2, scheduling_method="asap", dt=backend.target.dt / 2) and the error rates from the target seem to disappear when vf2layout is passed the target object. This is causing the test to fail because it doesn't have a metric to pick a qubit from in the updated pass in rust.

I think what is happening is that there is a Target.from_configuration call to override the instruction durations with a custom dt and that doesn't know about the error rates from the original target and it's simply omitting them.

…ctly anymore

mtreinish · 2025-03-21T04:35:52Z

The performance here is looking pretty good without any real tuning, especially considering most of the heavy lifting was in rust already,

Benchmarks that have improved:

| Change   | Before [e0da163e] <vf2-rs~3^2>   | After [daec870e] <vf2-rs>   |   Ratio | Benchmark (Parameter)                                   |
|----------|----------------------------------|-----------------------------|---------|---------------------------------------------------------|
| -        | 7.55±0.05ms                      | 6.32±0.06ms                 |    0.84 | utility_scale.UtilityScaleBenchmarks.time_bvlike('ecr') |
| -        | 7.61±0.02ms                      | 6.31±0.07ms                 |    0.83 | utility_scale.UtilityScaleBenchmarks.time_bvlike('cx')  |
| -        | 7.57±0.08ms                      | 6.32±0.07ms                 |    0.83 | utility_scale.UtilityScaleBenchmarks.time_bvlike('cz')  |

Benchmarks that have stayed the same:

| Change   | Before [e0da163e] <vf2-rs~3^2>   | After [daec870e] <vf2-rs>   | Ratio   | Benchmark (Parameter)                                                         |
|----------|----------------------------------|-----------------------------|---------|-------------------------------------------------------------------------------|
|          | 0                                | 0                           | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cx')                 |
|          | 0                                | 0                           | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('cz')                 |
|          | 0                                | 0                           | n/a     | utility_scale.UtilityScaleBenchmarks.track_bvlike_depth('ecr')                |
|          | 94.2±0.9ms                       | 95.0±1ms                    | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cz')                |
|          | 30.4±0.2ms                       | 30.7±0.1ms                  | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cz')  |
|          | 30.1±0.1ms                       | 30.5±0.2ms                  | 1.01    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('ecr') |
|          | 8.78±0.06ms                      | 8.76±0.1ms                  | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cx')               |
|          | 8.80±0.04ms                      | 8.80±0.06ms                 | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('cz')               |
|          | 95.1±0.5ms                       | 95.3±0.8ms                  | 1.00    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('ecr')               |
|          | 387                              | 387                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cx')                 |
|          | 402                              | 402                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('cz')                 |
|          | 406                              | 406                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_bv_100_depth('ecr')                |
|          | 300                              | 300                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cx')                |
|          | 300                              | 300                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('cz')                |
|          | 300                              | 300                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_circSU2_depth('ecr')               |
|          | 1727                             | 1727                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cx')                   |
|          | 1734                             | 1734                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('cz')                   |
|          | 1734                             | 1734                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qaoa_depth('ecr')                  |
|          | 1982                             | 1982                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cx')                    |
|          | 1982                             | 1982                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('cz')                    |
|          | 1980                             | 1980                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qft_depth('ecr')                   |
|          | 2694                             | 2694                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cx')                     |
|          | 2694                             | 2694                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('cz')                     |
|          | 2694                             | 2694                        | 1.00    | utility_scale.UtilityScaleBenchmarks.track_qv_depth('ecr')                    |
|          | 558                              | 558                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cx')      |
|          | 558                              | 558                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('cz')      |
|          | 558                              | 558                         | 1.00    | utility_scale.UtilityScaleBenchmarks.track_square_heisenberg_depth('ecr')     |
|          | 818±20ms                         | 812±20ms                    | 0.99    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cz')                       |
|          | 826±20ms                         | 814±20ms                    | 0.99    | utility_scale.UtilityScaleBenchmarks.time_circSU2('ecr')                      |
|          | 95.1±0.4ms                       | 94.3±1ms                    | 0.99    | utility_scale.UtilityScaleBenchmarks.time_parse_qft_n100('cx')                |
|          | 30.9±0.2ms                       | 30.7±0.2ms                  | 0.99    | utility_scale.UtilityScaleBenchmarks.time_parse_square_heisenberg_n100('cx')  |
|          | 528±2ms                          | 523±2ms                     | 0.99    | utility_scale.UtilityScaleBenchmarks.time_qft('ecr')                          |
|          | 105±2ms                          | 104±0.7ms                   | 0.99    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('ecr')            |
|          | 73.7±0.9ms                       | 71.8±1ms                    | 0.98    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cz')                        |
|          | 73.5±0.6ms                       | 71.8±0.4ms                  | 0.98    | utility_scale.UtilityScaleBenchmarks.time_bv_100('ecr')                       |
|          | 9.00±0.06ms                      | 8.81±0.08ms                 | 0.98    | utility_scale.UtilityScaleBenchmarks.time_parse_qaoa_n100('ecr')              |
|          | 271±4ms                          | 265±1ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cz')                          |
|          | 408±2ms                          | 400±2ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qft('cx')                           |
|          | 512±5ms                          | 500±3ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qv('cx')                            |
|          | 619±4ms                          | 604±8ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qv('cz')                            |
|          | 621±9ms                          | 606±2ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_qv('ecr')                           |
|          | 105±1ms                          | 103±1ms                     | 0.98    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cz')             |
|          | 821±20ms                         | 795±30ms                    | 0.97    | utility_scale.UtilityScaleBenchmarks.time_circSU2('cx')                       |
|          | 203±1ms                          | 197±3ms                     | 0.97    | utility_scale.UtilityScaleBenchmarks.time_qaoa('cx')                          |
|          | 257±1ms                          | 250±3ms                     | 0.97    | utility_scale.UtilityScaleBenchmarks.time_qaoa('ecr')                         |
|          | 70.9±0.7ms                       | 67.7±0.4ms                  | 0.96    | utility_scale.UtilityScaleBenchmarks.time_bv_100('cx')                        |
|          | 515±4ms                          | 495±5ms                     | 0.96    | utility_scale.UtilityScaleBenchmarks.time_qft('cz')                           |
|          | 92.0±1ms                         | 88.5±0.8ms                  | 0.96    | utility_scale.UtilityScaleBenchmarks.time_square_heisenberg('cx')             |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

mtreinish · 2025-03-25T21:22:36Z

I took this out of WIP because everything is working now and I think it's ready for review. There is still some internal cleanup to the code I think we can do and I'll work on this but I don't want to block on internal code organization because I expect there will be a long tail of changes built on top of this (vf2 post layout, other optimizations, etc).

This commit ports the disjoint layout utils into rust. The eventual goal of this is to enable porting the full DenseLayout and SabreLayout pass to rust so it's a standalone function. This will enable us exposing these passes as a C function. Since the eventual goal of this functionality is to expose it as a C function this attempts to do as much as possible without a py token. A standalone rust entrypoint is added which will serve as what is called from the standalone rust passes in the future, this is then wrapped in a python entrypoint. One minor note is the visibility of the run pass over components entrypoint is `pub(crate)`, but the intent is for it to be `pub`. This visibility was limited here to avoid changing the target's visbility level (which is `pub(crate)` for some reason). This is already being fixed in Qiskit#14056 and when that merges we can update this PR.

alexanderivrii

I have spent quite a bit of time looking at the code, and it makes good sense to me. The only thing that I still don't fully understand is that the free qubits are treated a bit differently in the directed and the undirected cases. Could you please explain the reason for that?

alexanderivrii · 2025-04-07T12:33:36Z

crates/accelerate/src/vf2_layout.rs

+fn build_average_error_map(target: &Target) -> ErrorMap {
+    let qargs_count = target.qargs().unwrap().count();
+    let mut error_map = ErrorMap::new(Some(qargs_count));
+    for qargs in target.qargs().unwrap().flatten() {


Is it always safe to call target.qargs().unwrap() here?

It should be, if .qargs() returns None we handle that before this function is ever called. It's done in build_coupling_map() at L262 it returns the None with the ? and that short circuit returns at L371 or L422 (depending on the graph type).

alexanderivrii · 2025-04-07T12:58:27Z

crates/accelerate/src/vf2_layout.rs

+            return Ok(map_free_qubits(
+                im_graph_data.free_nodes,
+                HashMap::new(),
+                &im_graph_data.reverse_im_graph_node_map,
+                &avg_error_map,
+                target,
+            ));


Why is map_free_qubits only called in the undirected case?

In VF2 and VF2Post there are 3 modes of operations with code shared between the two passes.

Undirected Graph w/ average error rates

Directed Graph w/ average error rates (VF2Layout only)

Directed Graph using target error rates (VF2PostLayout only)

in mode 3 we can't use the free qubits handling because the vf2_mapping function is passed a node and edge matcher that validates the target supports the circuit's operations on a virtual qubit. This is required because in that mode we validate that the new layout is valid per the target and we don't watch vf2 to consider a match if it leaves qubits free that don't support the operations that the free circuit qubits use.

When it comes to mode 2 we could in theory support this optimization but the code doesn't because the directed interaction graph generation is shared between modes 2 and 3. Also mode 2 in general is a bit weird and we don't ever use it by default. So I didn't implement it here because I didn't want to change the behavior from Python for this PR.

Fwiw, in my follow-on patch series, I have done this free-qubit mapping for variant 2, because it let me deduplicate some logic (though it's not all that hard to reinstate it).

Thanks @mtreinish for the explanation, seeing the bigger picture definitely helps.

qiskit/transpiler/passes/layout/vf2_layout.py

jakelishman · 2025-04-07T15:54:24Z

qiskit/transpiler/passes/layout/vf2_layout.py

+        # We can't use the rust fast path because we have a seed set, or no target so continue with
+        # the python path


I feel like this is actually possible certainly for the seed, and (with greater annoyance) also for the coupling map, but I wouldn't necessarily bother in this PR now. It'd be good to remove all this extra Python code at some point, though.

TBH, I was thinking instead it would be better to deprecate these options on the pass and support target only moving forward and no randomization moving forward. For VF2Layout I could see the argument for maybe still supporting coupling map but for VF2PostLayout it doesn't really make any sense.

I only wanted to concentrate on the common path that we use in the preset pass managers for this PR though.

Yeah, I generally agree. My main concern is getting rid of the extra Python code, mostly - it's unpleasant that we're maintaining two separate paths through VF2. I don't feel like we need to do anything about it right now.

jakelishman · 2025-04-07T15:57:53Z

Btw (for those who aren't Matt, who already knows this): I've been working on a follow-on patch series that goes on top of this PR, and reworks a lot of how we do VF2 from quite a low level. From what I saw of working with it, the Rust-space code here is a faithful port of the Python-space code, and while there's potentially ways to reduce some duplication in the constructions of the interactions and couplings, what's here works fine.

alexanderivrii

Thanks @mtreinish for fully porting the VF2Layout pass to Rust, I completely agree with @jakelishman that this is a faithful port of the Python code, approving!

mtreinish added on hold Can not fix yet Changelog: None Do not include in changelog Rust This PR or issue is related to Rust code in the repository mod: transpiler Issues and PRs related to Transpiler labels Mar 19, 2025

mtreinish added this to the 2.1.0 milestone Mar 19, 2025

mtreinish requested a review from a team as a code owner March 19, 2025 17:22

mtreinish added 6 commits March 20, 2025 09:10

Fix circuit->dag conversion to preserve vars

a3f5f50

Remove log assertions for rust code path

728a68e

Now that the pass runs in rust we don't emit python logs for the running of the pass. This commit removes the test log assertions since they're no longer present.

Fix free qubit mapping

cab8a78

This commit fixes the logic in the free qubit mapping and also skips running vf2 if there are no qubits in the interaction graph and there are free qubits.

Make Target public

151b2c0

mtreinish added 4 commits March 20, 2025 19:37

Don't panic in free qubits if there are no error rates

3b0718f

Merge branch 'main' into vf2-rs

774b82f

mtreinish added 2 commits March 20, 2025 23:49

Rename internal scoring function since it doesn't use the target dire…

d7ac4c1

…ctly anymore

Stop caching sort key in free qubit mapping

daec870

ElePT mentioned this pull request Mar 21, 2025

Fix ignored errors/durations in generate_preset_pass_manager if dt is set #14065

Merged

mergify bot mentioned this pull request Mar 25, 2025

Fix ignored errors/durations in generate_preset_pass_manager if dt is set (backport #14065) #14098

Merged

Merge branch 'main' into vf2-rs

49ea728

mtreinish removed the on hold Can not fix yet label Mar 25, 2025

mtreinish changed the title ~~[WIP] Port the full VF2Layout pass to Rust~~ Port the full VF2Layout pass to Rust Mar 25, 2025

Cryoris mentioned this pull request Mar 27, 2025

Add qk_obs_apply_layout plus necessary tooling #14106

Open

mtreinish assigned alexanderivrii Mar 31, 2025

mtreinish mentioned this pull request Apr 2, 2025

unnecessary SWAP on ibm_torino #14162

Open

mtreinish mentioned this pull request Apr 4, 2025

Port disjoint layout utils in rust #14177

Open

1 task

alexanderivrii reviewed Apr 7, 2025

View reviewed changes

mtreinish added 2 commits April 7, 2025 11:50

Fix typ in comment

8c2f3ea

Merge branch 'main' into vf2-rs

1fcd440

jakelishman reviewed Apr 7, 2025

View reviewed changes

alexanderivrii approved these changes Apr 7, 2025

View reviewed changes

alexanderivrii added this pull request to the merge queue Apr 7, 2025

Merged via the queue into Qiskit:main with commit 2e18f9b Apr 7, 2025
24 checks passed

mtreinish deleted the vf2-rs branch April 7, 2025 20:48

jakelishman mentioned this pull request Apr 16, 2025

Handle global target overridden by coupling map in Rust VF2 #14214

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Port the full VF2Layout pass to Rust #14056

Port the full VF2Layout pass to Rust #14056

mtreinish commented Mar 19, 2025 •

edited

Loading

qiskit-bot commented Mar 19, 2025

coveralls commented Mar 20, 2025 •

edited

Loading

mtreinish commented Mar 21, 2025 •

edited

Loading

mtreinish commented Mar 21, 2025

mtreinish commented Mar 25, 2025

alexanderivrii left a comment

alexanderivrii Apr 7, 2025

mtreinish Apr 7, 2025

alexanderivrii Apr 7, 2025

mtreinish Apr 7, 2025

jakelishman Apr 7, 2025

alexanderivrii Apr 7, 2025

jakelishman Apr 7, 2025

mtreinish Apr 7, 2025

jakelishman Apr 7, 2025

jakelishman commented Apr 7, 2025

alexanderivrii left a comment

		# We can't use the rust fast path because we have a seed set, or no target so continue with
		# the python path

Port the full VF2Layout pass to Rust #14056

Port the full VF2Layout pass to Rust #14056

Conversation

mtreinish commented Mar 19, 2025 • edited Loading

Summary

Details and comments

qiskit-bot commented Mar 19, 2025

coveralls commented Mar 20, 2025 • edited Loading

Pull Request Test Coverage Report for Build 14313701929

Warning: This coverage report may be inaccurate.

Details

💛 - Coveralls

mtreinish commented Mar 21, 2025 • edited Loading

mtreinish commented Mar 21, 2025

mtreinish commented Mar 25, 2025

alexanderivrii left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jakelishman commented Apr 7, 2025

alexanderivrii left a comment

Choose a reason for hiding this comment

mtreinish commented Mar 19, 2025 •

edited

Loading

coveralls commented Mar 20, 2025 •

edited

Loading

mtreinish commented Mar 21, 2025 •

edited

Loading