Introducing Boxes in Qiskit #76

Open
wants to merge 57 commits into
base: master

SamFerracin

Summary

This RFC describes a new Box instruction that would be added to qiskit as a way to express groupings of instructions that can have data attached, can be sent up and down the stack, and can pass through transpilation. It provides details of its implementation and its interaction with other qiskit features, and it discusses the benefits of Box in key contexts such as twirling and mitigation

SamFerracin and others added 30 commits September 5, 2024 16:53
Co-authored-by: Ian Hincks <ian.hincks@gmail.com>
Co-authored-by: joshuasn <53916441+joshuasn@users.noreply.github.com>
This fleshes out a lot more details about how a generic `box` would work
in Qiskit SDK, and how arbitrary downstream annotations might work in
the context of an entire compiler framework, where annotation-creating
passes may be injected in between standard Qiskit passes, or the user may
write into their circuit, and the transpiler might want to verify before
submission to the quantum computer.

The case study on how `pec-runtime` will use this isn't fleshed out - I
mostly just moved it into a section at the end and left it.
SamFerracin and others added 14 commits December 6, 2024 09:18
add section about dynamical decoupling with boxes
These were hypothetical anyway, and should not be part of a new
`BackendV3`; there is no necessity, and it would be unacceptable API
whiplash for users anyway.
Comment on lines +160 to +163
> [!NOTE]
> This section is definitely not complete yet; the text of it just represents a starting point for the thinking.

### Semantics of custom
Member

This is meant to say "custom annotations". The section not being fully formalised doesn't need to hold up the rest of the RFC - a fair amount of this will be Qiskit-specific design. It will matter to pass authors, but large amounts of it are going to be about what's technically feasible to Qiskit, especially for a first MVP, and we might need to evolve a bit from there.


A `box` can be seen as the trivial case of a control-flow operation, equivalent to an `if (true)` block in a control-flow graph.
Box, however, additionally has the semantics that it is not a valid optimisation to simply remove the box.
A `box` is a non-reusable grouping of instructions; it is not a function call that can be called multiple times (this is still a useful concept, it's just separate to `box` and not addressed here).
Contributor

Does this imply that I cannot compose a boxed set of instructions into a circuit multiple times?

Member

The wording can be improved - you can compose them in multiple times, the point I'm just really trying to say is that box isn't a function call. The language is probably unclear because it's more addressing an issue we had in a very early draft of the RFC that's since been deleted, rather than addressing somebody new.

Contributor

Ok, so boxes are re-usable, then.

Member

Yes, in the sense that you could do:

from qiskit import QuantumCircuit

has_box = QuantumCircuit(2)
with has_box.box():
    has_box.h(0)
    has_box.cx(0, 1)

and then have an actual main circuit that you construct like

main = QuantumCircuit(4)
main.compose(has_box, [0, 1], inplace=True)
main.compose(has_box, [1, 2], inplace=True)
main.compose(has_box, [2, 3], inplace=True)

The point I was trying to drive at is that even though the exact same Python-space Box instruction was composed on several times to the same circuit, the compiler will reason about them as three completely separate boxes: it largely has to be able to, because what's valid for compilation on qubits (0, 1) may well not be valid for (2, 3). So my compose example is functionally the same as if I'd written

main = QuantumCircuit(4)
with main.box():
    main.h(0)
    main.cx(0, 1)
with main.box():
    main.h(1)
    main.cx(1, 2)
with main.box():
    main.h(2)
    main.cx(2, 3)

and both are equivalent to the OQ3 program

OPENQASM 3.0;
include "stdgates.inc";
qubit[4] q;

box { h q[0]; cx q[0], q[1]; }
box { h q[1]; cx q[1], q[2]; }
box { h q[2]; cx q[2], q[3]; }

What I'm trying to say (poorly) in the text is that that's not necessarily the same as the OQ3 program

OPENQASM 3.0;
include "stdgates.inc";
qubit[4] q;

// This subroutine is expected to be compiled once, and
// reused with some calling convention for moving qubits
// into known registers.
def with_box(qubit a, qubit b) {
    box { h a; cx a, b; }
}

with_box(q[0], q[1]);
with_box(q[1], q[2]);
with_box(q[2], q[3]);

because Qiskit doesn't have the concept of a re-usable subroutine with a calling convention - it only has universal inlining.

An early draft of this RFC had an open question about whether "box" should be re-usable in the "not inlined" sense, and I'm trying to expressly forbid that (for now) because Qiskit doesn't have any concept of a quantum calling convention [1].

Footnotes

  1. I'm really interested in developing one, but that feels like an entire compiler research project unto itself.


Certain hardware providers may also have direct-access APIs that do not require submission of the job via Qiskit.
Custom serialisers/deserialisers of annotations should be aware of this as well; it is effectively the same problem as serialisation to and from OpenQASM 3.
Contributor

What if, like OpenQASM, annotations were just text (i.e. str type)?

Member

In practice, passes will almost certainly want programmatic data structures in their annotations - the "case study" from the primitives team at the bottom gives one or two examples. If we say "annotations are just text", then every pass that needs to examine the annotation (which could well be every pass, even if just to know it can ignore it) needs to know how to parse at least some of the text. The aim here is to make it not a problem until we hit the only boundaries where the data structures matter, and to move the sources of truth for encoding / decoding to the places that define them.

Contributor

Ok. I was just wondering if the concept of vendor prefixes could be used here.

Member

In practice, I'm sure that the OQ3 part of the custom serialisation/deserialisation absolutely will use vendor prefixes, yeah.
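
To make the trade-off in this thread concrete, the following is a minimal sketch in plain Python, not Qiskit API. It assumes a hypothetical `Twirl` annotation, a hypothetical `vendor.twirl` namespace, and a hypothetical `SERIALISERS` registry. Passes would work with the structured object, and text only appears at the serialisation boundary, where the vendor-style prefix selects the decoder. The `@namespace payload` line format is loosely modelled on OpenQASM 3 annotation lines.

```python
from dataclasses import dataclass


# Hypothetical annotation type; the name, field and namespace are illustrative only.
@dataclass(frozen=True)
class Twirl:
    group: str = "pauli"
    # Plain class attribute (no type annotation), so it is not a dataclass field.
    namespace = "vendor.twirl"


def dump_twirl(annotation):
    """Encode the payload of a Twirl annotation as text."""
    return annotation.group


def load_twirl(payload):
    """Decode the payload of a Twirl annotation from text."""
    return Twirl(group=payload)


# Hypothetical registry: the owner of each namespace supplies its own
# encoder and decoder, so it remains the source of truth for the format.
SERIALISERS = {"vendor.twirl": (dump_twirl, load_twirl)}


def dump(annotation):
    """Serialise an annotation to a single prefixed line of text."""
    dump_fn, _ = SERIALISERS[annotation.namespace]
    return f"@{annotation.namespace} {dump_fn(annotation)}"


def load(line):
    """Parse a prefixed line back into a structured annotation."""
    head, _, payload = line.partition(" ")
    namespace = head[1:]  # drop the leading "@"
    _, load_fn = SERIALISERS[namespace]
    return load_fn(payload)


text = dump(Twirl("clifford"))
restored = load(text)
```

In this toy model, a pass that does not know `vendor.twirl` never needs to parse anything; only the serialisation boundary consults the registry.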



### Communication of "allowed" annotations from a backend
Contributor

Elsewhere we have been working to make the Target the complete representation of the data the transpiler needs to do its work. Why not extend the Target to represent the allowed annotations?

Member

We get into this point more further down, in the discussion of the split of responsibilities and what constitutes a "full amount of work" for the transpiler.

Member

Oh no sorry, this is the section I had in mind as "further down".

Then to clarify what we were aiming at: Target is a complete representation of the data that the transpiler needs to do its work right now, and the transpiler right now is always compiling for a QPU - it doesn't know anything about primitives. We're trying to position an annotation here as something that is handled by a "quantum computer", before hitting a QPU - it's an instruction for a further compilation effort.

Let's fast-forward a hypothetical year to a time when Qiskit compiles "estimator-like inputs" into "collection of EstimatorPub", and the user has tagged a box in their circuit with a NoiseLearning(...) annotation. The aim is to know whether a given primitive backend target can handle this NoiseLearning instruction, or whether the transpiler should fail with a message like "you've included some noise-learning metadata, but there are no configured passes to resolve that to pubs and the quantum computer doesn't say it can do it itself". That's roughly akin to BasisTranslator saying "I see you've given me a my_special_u gate, but you haven't told me any equivalences for it, and the QPU doesn't say it can do it itself".
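
The check described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: annotations are modelled as plain names, and `pass_handled` and `backend_handled` are hypothetical sets of names that the configured passes and the backend say they can resolve. None of this is existing Qiskit API.

```python
def unresolved_annotations(circuit_annotations, pass_handled, backend_handled):
    """Return the annotation names that neither a pass nor the backend handles."""
    handled = set(pass_handled) | set(backend_handled)
    return sorted(set(circuit_annotations) - handled)


def require_handled(circuit_annotations, pass_handled, backend_handled):
    """Fail early, in the spirit of the error message sketched in the comment."""
    missing = unresolved_annotations(circuit_annotations, pass_handled, backend_handled)
    if missing:
        raise ValueError(
            "no configured pass resolves these annotations and the backend "
            "does not say it can handle them: " + ", ".join(missing)
        )


# The user tagged boxes with two annotations; only "Twirl" has a configured pass.
missing = unresolved_annotations(["NoiseLearning", "Twirl"], {"Twirl"}, set())

# If the backend advertises "NoiseLearning", the same circuit is accepted.
require_handled(["NoiseLearning", "Twirl"], {"Twirl"}, {"NoiseLearning"})
```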

Member

We're trying to leave in place the idea that Target is a representation of a QPU's ISA, which is what it is right now, and what it was designed to be. We don't have a hardware-description object that's designed to represent the pre-processing capabilities of a quantum computer, but we need one in some form. If we try to make the Target that as well, we'll have the Target mean different things in different contexts, and risk compromising what it's already very effective at representing.

Member

pedrorrivero left a comment

I really like this RFC—congratulations to everyone involved!

Strong points:

  • Clear and well-defined semantics
  • Introduction of the noop operation
  • Versatility and flexibility of the annotations

That said, I have a couple of questions that I might have missed or misunderstood:

  1. I didn’t see a clear way to declare nested box structures without potentially running into "indentation hell." Are there any plans or thoughts on addressing this?
  2. Is it intended that annotations can enforce checks within the box? For instance, if I want to ensure that my box contains at most one operation per qubit, is there a mechanism to enforce such constraints?

@ihincks
Contributor

ihincks commented Jan 20, 2025

Thanks for the feedback @pedrorrivero !

> I didn’t see a clear way to declare nested box structures without potentially running into "indentation hell." Are there any plans or thoughts on addressing this?

I'm wondering if you might be able to provide an example to guide us here? I can try to give a general answer, but it might not touch on what you want. Just like other existing instructions that own one or more blocks with their own scopes (switch, if, for, etc.), yes, the QuantumCircuit object will allow the construction of arbitrary nestings so long as everything is well defined. However, this doesn't imply that any particular execution agent will be willing to interpret and execute such circuits. Just as the runtime primitives have validation for various things today, I expect they won't accept nested boxes of certain flavours.

> Is it intended that annotations can enforce checks within the box? For instance, if I want to ensure that my box contains at most one operation per qubit, is there a mechanism to enforce such constraints?

Something will enforce these constraints, but this RFC doesn't specify whether an annotation will declare its own validation method, or whether some other entity will enforce these constraints pre-submission.

@jakelishman
Member

On point 1:

Being a ControlFlowOp, you'll be able to construct a Box instruction object manually with Box(body_qc, annotations=[]), and then append it to a circuit with qc.append(box, qubits=[...], clbits=[...]). When the control-flow operations were introduced to Qiskit, that was the only way, and in practice, it's super fiddly to do and get right. That's why the control-flow builder interface was introduced, but the old method is still available. You can of course also use all the other tools to construct QuantumCircuit - building components of the circuit and calling compose onto a larger one, writing the circuit (or parts of it) in OpenQASM 3 and loading it in with `qasm3.load`, etc.

On another note: you say "indentation hell", but a box is a logical scope - in all programming languages I'm familiar with, scopes are conventionally (or mandatorily) indicated by a layer of indentation, which is the same as what `with qc.box(): ...` introduces. Is there something more than that that you're worried about?

On point 2:

Not all annotation validations might be able to work knowing only the local scope of the box that's just been created, so it would be limiting to mix in validation concerns along with construction ones. The validity of an annotation might depend on other boxes in the circuit, on the capabilities of the backend it's targeted for, etc, so I would suggest that "at construction time" isn't the best place for the checking. Part of the "backend extension" section above is getting towards the idea that it could one day be the domain of the transpiler and custom transpiler passes (like how the transpiler validates basis gates, coupling, etc), but we don't go all the way in this RFC - there's no need to try and legislate too far in advance yet.
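
As an illustration of checking after construction rather than during it, here is a small sketch of the "at most one operation per qubit" constraint from the question, written as a standalone validation step. It is a toy model, not Qiskit API: a box body is assumed to be a list of `(name, qubits)` tuples, and the function name is hypothetical.

```python
from collections import Counter


def qubits_used_more_than_once(box_body):
    """Return the qubits touched by more than one operation in a box body.

    `box_body` is a toy stand-in for a box's contents: an iterable of
    (operation_name, qubit_indices) tuples.
    """
    counts = Counter(q for _, qubits in box_body for q in qubits)
    return sorted(q for q, n in counts.items() if n > 1)


# One layer of single-qubit gates: satisfies the constraint.
layer = [("h", (0,)), ("x", (1,))]

# An H followed by a CX on the same qubit: qubit 0 is used twice.
bell = [("h", (0,)), ("cx", (0, 1))]

layer_violations = qubits_used_more_than_once(layer)
bell_violations = qubits_used_more_than_once(bell)
```

Because the check takes the finished body as input, the same function could be run late, once the rest of the circuit and the target are known, which is the point made above about not tying validation to construction.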

@sbrandhsn

I like this RFC! The only point that popped up while reading the RFC was on equivalency. I see that a user can specify an annotation that uniquely identifies a box by using e.g. uuid. I was wondering whether we wanted to enforce some kind of default behaviour for boxes that would allow follow-up transpiler passes to immediately identify equivalent boxes instead of relying on the user to correctly provide uuids. A user may define a box and append it to a circuit multiple times. If we have an efficient way of checking the equivalency of two boxes in the circuit, a transpiler pass may be able to reuse any optimizations performed for the first occurrence of a box. Without a default box identifier, a transpiler pass may first need to establish box equivalency by performing e.g. graph isomorphism checks which may offset the benefits of using cached optimizations.

@jakelishman
Member

Sebastian: the RFC is deliberately trying not to do that, with the whole "boxes can't be re-used verbatim" thing. It can very very occasionally work that the compilation of a virtual-circuit box could be re-used verbatim for another instance of the virtual-circuit box, but that depends on a whole bunch of things that aren't local to the box: the layout of the circuit, the routing, the instruction set of the backend target, etc. QuantumCircuit and the core transpiler need to be conservative, so an automatic marking trying to get at "this is the same box and must be compiled the same every time" would be nearly unusable - it'd have to be identical for every modification the transpiler might make, and that's not something we have algorithmic support for. Routing is the big one here, especially because it's non-local effects from the boxes that make them incompatible, but that then ties into your wants for optimisation: an optimisation isn't valid to be re-used if the box doesn't take place on the same hardware qubits.

The idea of "reusable optimisation" is explicitly punted from this RFC - that would be a separate "function call" sort of instruction. That's obviously really really interesting, but "reusability" is a separate concern to "box".

@sbrandhsn

My understanding of this RFC was that boxes are reusable?

I think it would be up to the transpiler pass to decide what it does with equivalent boxes. Granted, a transpiler pass would often have to consider things outside of the box definition to make that decision but e.g. for synthesis on a homogeneous gate set, you could potentially incur a large runtime speedup when encountering m equivalent n-qubit boxes in your quantum circuit. Reusing peephole optimizations (or many other passes in the init stage of the default passmanager) within a box is another example that appears to be workable in the future.

On the other hand, I don't want to suggest any kind of scope creep and it appears that boxes are useful outside of these kinds of use cases.

@jakelishman
Member

jakelishman commented Jan 22, 2025

There's some more discussion up here: #76 (comment).

> I think it would be up to the transpiler pass to decide what it does with equivalent boxes.

There are built-in transpiler passes that (almost) always run and will immediately break your definition of "equivalent" from the virtual-circuit perspective. I totally agree that it would be great if we could re-use optimisations, but it's very much not trivial to do this.

I think you're thinking of equivalence and optimisations in quite high-level abstract terms, where we reason about the quantum hardware in very homogeneous terms. Having an "auto equivalence" might be useful here, but that's not related to box - it's about any high-level construct, like a multiplexor or a high-arity QFT as well. We already have a mechanism to mark a composite instruction as reused - it's to make a custom subclass of Instruction or the like. We don't do anything with that information yet, but the principle is there for future expansion, and then it'd auto apply to other high-arity / complex instructions too.

Once we've moved past abstract optimisations, the next thing that happens is layout and routing. Both of these map all instructions from the virtual, nicely homogeneous "algorithm designer's" space down to the physical hardware, where we lose almost all the homogeneity pretty immediately. This doesn't even require a heterogeneous basis set in the sense of "different 2q operations on different links" - the connectivity graph of the physical qubits and their error rates are already significant inhomogeneities for both routing and all subsequent optimisations. So now the equivalence of a group of instructions is that they have to have been routed onto the same hardware qubits in the same orders (to match error rates), or if errors aren't considered, then they still need to have been mapped to an isomorphic subgraph with equivalent swaps. We don't have any routing algorithm that can enforce that, so if reuse of the box structure implied that equivalence was required to this level, then the only thing box re-use could be for would be super low-level applications where you're putting the box on the same physical qubits, and so you're already likely to have done the vast majority of the work that you might want from the transpiler.

I think there is a use for the latter thing, including potential re-use, but I don't think it can work as the default setting. I'm interested in making it opt-in via something like built-in annotations, something like Verbatim (fail if layout/routing is required, no optimisation within), OptimizationLevel(x) (apply a different optimization level within the block), etc. That bit of the design isn't well-thought out in my mind (and doesn't need to be in MVP0), but it's definitely an area for future expansion.

@ihincks
Contributor

ihincks commented Jan 23, 2025

> I see that a user can specify an annotation that uniquely identifies a box by using e.g. uuid.

I think I may have unnecessarily injected some confusion into the PR by naming that annotation "uuid". The goal there is more narrow than to annotate the box and its contents as being uniquely universally identifiable, implying any two boxes sharing the uuid must be exactly equivalent.

The idea instead is that the annotation itself should be a uuid, or maybe more properly a uid, so that it can be used as a marker at execution time to attach information external to the circuit, such as noise model injection. For this to work, there need not be a promise that the contents of all boxes sharing a uuid must be equal, only that each box with a particular uuid must be compatible with whatever is attached to it. In the cases I'm thinking of these compatibility constraints come down to something easy like qubit count, and can be done at validation time. It's true that built-in tooling and typical workflows will always have the behaviour of, say, only attaching a particular noise model to boxes with identical contents. But to demand this by (difficult to implement) contract would be overbearing.
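
A minimal sketch of that weaker contract, in plain Python rather than Qiskit API: boxes are modelled by a hypothetical `BoxInfo` record, and noise models by nothing more than the qubit count they expect. The validation only checks compatibility per uid and never compares box contents.

```python
from dataclasses import dataclass


# Hypothetical summary of a box for validation purposes.
@dataclass(frozen=True)
class BoxInfo:
    uid: str
    num_qubits: int


def incompatible_uids(boxes, noise_model_qubits):
    """Return uids whose attached noise model has the wrong qubit count.

    `noise_model_qubits` maps a uid to the number of qubits that the
    externally attached noise model expects. Boxes whose uid has nothing
    attached are skipped.
    """
    bad = set()
    for box in boxes:
        expected = noise_model_qubits.get(box.uid)
        if expected is not None and expected != box.num_qubits:
            bad.add(box.uid)
    return sorted(bad)


boxes = [
    BoxInfo("a", 2),  # two boxes share uid "a"; their contents may differ
    BoxInfo("a", 2),
    BoxInfo("b", 3),
    BoxInfo("c", 1),  # nothing attached to "c"
]
bad_uids = incompatible_uids(boxes, {"a": 2, "b": 2})
```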

@jlapeyre
Contributor

This looks super useful.

Mutability: I think the data in an annotation should be immutable. In Python, you'd have trouble enforcing this. But in any case, it should be made clear whether the data is immutable throughout Qiskit transpilation and perhaps other layers of the stack.

noop: I wonder if a better solution than noop could be found. What you want to convey is that the box includes certain resources. This data could also be passed explicitly when constructing the box. Then, no pass or other analysis tool would have to search for the noops. If this data is already stored in the box structure behind the scenes upon construction, all the more reason to have a direct way to specify it.

The RFC presumably allows noop to be available for any other purpose that users find. If this is really what we want, fine. But I'm uncomfortable with it happening as a byproduct of specifying the edges of a box.

Versions: Boxes and annotations may be produced and consumed in different places and times. And the annotation carries some data of non-trivial complexity. I expect that the format of some annotations will be modified after they are introduced. Making some allowance for versioning of boxes (or the annotations, really) would complicate the RFC. But putting off questions of versioning will cause headaches down the road.

Reusability. It looks like a box carries two substructures, one for the circuit data and one for "annotations". It's clear that the box, including its circuit data, cannot be reused. But I imagine some circuits might repeat a box with the same "annotations" many times. In Python, the natural way to do this, for convenience as well as memory efficiency, is to assign the annotation structure to a variable and then use this variable in each box to refer to the data. In the RFC, the word "self-contained" is used. But references to annotations are not really self-contained. Note that the question of immutability is important here. Typically, in programming languages, immutable elements can be deduplicated as an optimization without changing semantics.

The RFC already mentions identifying an annotation with a uid, for other purposes. If the annotation is immutable and may be specified by a reference (in Python, this is something like the id) then the uid is redundant. Maybe this is ok.
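
The link between immutability, sharing by reference and deduplication can be shown with a small stdlib-only sketch. The `Annotation` class and `intern_annotation` helper are hypothetical stand-ins, not part of the RFC or of Qiskit. A frozen dataclass only guards against attribute assignment, which reflects the point above that immutability is hard to enforce fully in Python.

```python
from dataclasses import dataclass, FrozenInstanceError


# Hypothetical immutable annotation: frozen, with a tuple for its parameters
# so that instances are hashable and compare by value.
@dataclass(frozen=True)
class Annotation:
    name: str
    params: tuple = ()


def intern_annotation(annotation, table):
    """Return the canonical instance for an annotation, storing it if new."""
    return table.setdefault(annotation, annotation)


table = {}
first = Annotation("twirl", ("pauli",))
second = Annotation("twirl", ("pauli",))  # equal value, distinct object

canonical_first = intern_annotation(first, table)
canonical_second = intern_annotation(second, table)

# Attribute assignment on a frozen instance is rejected.
try:
    first.name = "changed"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Since equal annotations hash alike and cannot be reassigned, replacing `second` with the canonical `first` does not change what any box means, which is the deduplication property mentioned above.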

At some point, it might be useful to implement hierarchical "grouping", like drawing programs do. A group contains an arbitrary collection of elements. But I can't think of a use off the top of my head.

Control flow semantics: Do we really want methods like replace_blocks?

Python-centrism: In practice, annotations containing fields such as strings and lists of floats are straightforward in most languages and platforms. But allowing other Python objects in annotations would complicate this picture. Restricting annotations a bit might make the serialization story simpler and more secure, as well. Also, ControlFlowOp will be implemented in Rust eventually. So it's best to think about compatibility now.

I'm very much onboard with the client-side validation. And more broadly, I'd consider specifying some kind of structure and semantics for annotation "metadata" (or "tags") as opposed to the more free-form annotation data. I mean things like "cannot ignore" that are discussed in the RFC.

8 participants