TopQAD's Tools
Compiler
The Compiler automates the compilation of circuits from an intermediate representation (IR) format such as OpenQASM down to the scheduling of lattice surgeries (i.e., stabilization instructions) to be implemented in the core processor. Note that the final machine-level instructions for executing a program on a QPU also include the schedule of lattice surgeries (stabilizations) in modules other than the core processor (e.g., MSFs); the complete lattice surgery schedule is an output of the (lower-level) Assembler.
Compilation in TopQAD begins with circuit synthesis, the process of converting¹ all gates in a programmed algorithm into the ISA gate set. During this process, some gates must be replaced by approximations, which incurs synthesis error. The Compiler ensures that the accumulated synthesis error remains under the user-specified target error budget.
For the example of the Pauli product rotations ISA, arbitrary-angle rotation gates in the programmed algorithm are approximated by gates in the Clifford+T gate set; TopQAD currently employs the Solovay–Kitaev algorithm [6] and its extensions [7, 8]. This step incurs synthesis error. The resulting Clifford+T gates are then converted into Pauli product rotations as per the ISA without incurring further synthesis error.
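The budget bookkeeping described above can be sketched as follows. This is a minimal illustration, not TopQAD's implementation: it assumes that per-gate approximation errors add up (a union-bound style accounting) and that the budget is split evenly across rotations. The function names `per_gate_tolerance` and `within_budget` are hypothetical.

```python
import math


def per_gate_tolerance(synthesis_budget: float, num_rotations: int) -> float:
    """Split a synthesis error budget evenly over the rotations to approximate.

    Assumption: errors of individual approximations accumulate additively.
    """
    if num_rotations <= 0:
        raise ValueError("num_rotations must be positive")
    return synthesis_budget / num_rotations


def within_budget(gate_errors: list[float], synthesis_budget: float) -> bool:
    """Check that the summed per-gate approximation errors stay under the budget."""
    return math.fsum(gate_errors) <= synthesis_budget


if __name__ == "__main__":
    # Hypothetical numbers: 200 arbitrary-angle rotations, budget of 1e-3.
    print(per_gate_tolerance(1e-3, 200))
    print(within_budget([1e-6] * 100, 1e-3))
```

An even split is only the simplest choice; a synthesizer could instead give a looser tolerance to rotations that are expensive to approximate, as long as the total stays under the budget.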
TopQAD’s Compiler can optimize a circuit into a more compact one by commuting Clifford gates to the end of the circuit, where they are absorbed by the final qubit measurements. The resulting circuit contains only non-Clifford gates, which significantly reduces the number of operations. However, this process makes the algorithm less parallelizable because it creates higher-weight and longer-ranged lattice surgeries [9].
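To see why commuting Clifford gates past rotations raises the weight of the remaining operations, consider the sketch below. It tracks only which Pauli each qubit carries when a Clifford gate (H, S, or CNOT) is pushed past a Pauli product rotation; signs, and hence rotation directions, are deliberately ignored. The function names are hypothetical and this is not the Compiler's actual routine.

```python
def conjugate_support(pauli: str, gate: str, qubits: tuple[int, ...]) -> str:
    """Pauli string (sign ignored) of a rotation after `gate` is pushed past it.

    `pauli` is a string over I, X, Y, Z with one character per qubit.
    """
    x = [c in "XY" for c in pauli]  # X component per qubit
    z = [c in "ZY" for c in pauli]  # Z component per qubit
    if gate == "H":
        (q,) = qubits
        x[q], z[q] = z[q], x[q]      # H exchanges X and Z
    elif gate == "S":
        (q,) = qubits
        z[q] ^= x[q]                 # S exchanges X and Y (up to sign)
    elif gate == "CNOT":
        control, target = qubits
        x[target] ^= x[control]      # X on the control spreads to the target
        z[control] ^= z[target]      # Z on the target spreads to the control
    else:
        raise ValueError(f"unsupported gate: {gate}")
    return "".join("IXZY"[xi + 2 * zi] for xi, zi in zip(x, z))


def weight(pauli: str) -> int:
    """Number of qubits on which the Pauli string acts non-trivially."""
    return sum(c != "I" for c in pauli)


if __name__ == "__main__":
    # A single-qubit Z rotation on qubit 1, preceded by CNOT(0, 1).
    moved = conjugate_support("IZ", "CNOT", (0, 1))
    print(moved, weight(moved))
```

In this example a weight-1 rotation becomes a weight-2 rotation once the CNOT is moved behind it, which illustrates how removing Clifford gates trades gate count for wider lattice surgeries.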
Next, qubits defined in the synthesized circuit are allocated on the logical microarchitecture and the Compiler creates a schedule, which defines when and how (e.g., using which bus patches) each operation will be executed. TopQAD solves this scheduling problem by optimizing the resource usage (in terms of space and time) for the operations required to execute the algorithm. In particular, it minimizes the circuit depth by searching for operations that can be performed in parallel, thereby reducing resource requirements.
The Compiler selects a core processor microarchitecture that is designed to leverage the parallelization potential of gates within the synthesized circuit. This parallelization potential is measured as the ratio of the circuit depth, determined by gate commutativity, to the circuit length, which represents the total number of gates in the circuit. TopQAD may select different core processor layouts depending on the parallelizability of the circuit, two examples of which are given in Fig. 2.
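The depth-to-length ratio can be illustrated with a small sketch. It assumes the circuit is a list of Pauli product rotations given as Pauli strings, and that a rotation depends on every earlier rotation it does not commute with; the helper names are hypothetical and the Compiler's actual depth computation may differ.

```python
def commute(p: str, q: str) -> bool:
    """Pauli strings commute iff they clash on an even number of qubits.

    A clash is a qubit where both strings are non-identity and different.
    """
    clashes = sum(a != "I" and b != "I" and a != b for a, b in zip(p, q))
    return clashes % 2 == 0


def circuit_depth(rotations: list[str]) -> int:
    """Length of the longest chain of non-commuting rotations in program order."""
    level: list[int] = []
    for j, pj in enumerate(rotations):
        earlier = [level[i] for i in range(j) if not commute(rotations[i], pj)]
        level.append(1 + max(earlier, default=0))
    return max(level, default=0)


def depth_to_length_ratio(rotations: list[str]) -> float:
    """Circuit depth divided by the total number of gates."""
    if not rotations:
        raise ValueError("empty circuit")
    return circuit_depth(rotations) / len(rotations)


if __name__ == "__main__":
    circuit = ["ZI", "IZ", "XI", "IX"]  # two independent two-gate chains
    print(circuit_depth(circuit), depth_to_length_ratio(circuit))
```

Under this definition a ratio of 1 corresponds to a fully serial circuit, while a smaller ratio indicates that more gates can share a layer.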


Figure 2. Example logical microarchitecture layouts for the memory zone of the core processor, for the Pauli product rotations ISA implemented using a rotated surface code microarchitecture. Both layouts follow conventions detailed in the legend of Fig. 1. Arrows represent access points to auto-correction zones. The logical microarchitecture in (a) has multiple access points for the auto-correction zone and an expanded quantum bus size that increases the chances of finding feasible paths for the lattice surgeries to be performed in parallel. The correction qubit patches are used for Clifford gates. TopQAD determines the number of required auto-correction units and correction qubit patches for the core processor based on the parallelization potential. This logical microarchitecture is preferable for a circuit with high parallelization potential. The logical microarchitecture in (b) has a reduced quantum bus size to save space. Only one access point to an auto-correction unit is created. This enforces serial execution of the gates and is preferable for circuits with reduced parallelization potential. Clifford gates must have been removed during circuit synthesis for this memory zone to be selected, due to the absence of correction qubit patches.
The scheduling problem is solved at discrete time steps determined by the logical cycle, which is defined as the time needed to perform a parallel set of logical gates in the algorithm.
A key part of the scheduling problem is determining when gates can be executed in parallel. Parallel operations occur when multiple logical gates are scheduled within the same logical cycle. Parallelization can reduce the overall quantum runtime, as it allows for better utilization of available resources. Additionally, parallelizing operations can help minimize computational errors: idling logical qubits, that is, those not used for operations in a given logical cycle, go through the same stabilization cycles as the ones involved in the logical gates, so a shorter schedule exposes every qubit to fewer error-prone cycles.
TopQAD solves the scheduling problem using a decomposition approach. First, the logical relations between the logical gates are determined using a trivial commutation rule and then mapped onto a dependency graph. Similarly, the core processor layout is mapped onto an adjacency graph. The decomposition approach uses an earliest-available-first (EAF) policy, where operations are tentatively scheduled as they become available. See Ref. [9] for further details.
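An earliest-available-first policy can be sketched as a simple list scheduler over the dependency graph. The sketch below is an illustration under strong simplifications: routing on the adjacency graph is replaced by a hypothetical `capacity` parameter that caps the number of gates per logical cycle, and ties are broken by gate index. It is not the scheduler described in Ref. [9].

```python
def eaf_schedule(deps: dict[int, set[int]], capacity: int) -> list[list[int]]:
    """Group gates into logical cycles, scheduling each as early as possible.

    deps[g] is the set of gates that must complete before gate g can start.
    capacity is a stand-in for the routing constraints of the layout.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    remaining = {g: set(d) for g, d in deps.items()}
    done: set[int] = set()
    cycles: list[list[int]] = []
    while remaining:
        # Gates whose dependencies have all been executed are available.
        ready = sorted(g for g, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        batch = ready[:capacity]
        cycles.append(batch)
        done.update(batch)
        for g in batch:
            del remaining[g]
    return cycles


if __name__ == "__main__":
    deps = {0: set(), 1: set(), 2: {0}, 3: {1}, 4: {2, 3}}
    print(eaf_schedule(deps, capacity=2))
    print(eaf_schedule(deps, capacity=1))
```

Reducing `capacity` from 2 to 1 in this example lengthens the schedule from three cycles to five, mirroring the trade-off between bus size and serial execution shown in Fig. 2.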
By solving the scheduling problem, the Compiler is able to provide a detailed estimate of logical resources used within the core processor when logical gates are being executed, indicating when and which patches of the core processor are active. In addition to the logical resource estimate, the Compiler generates metrics such as the expected number of active logical cycles of the core processor and statistics for the sizes of buses needed to perform each operation. For a more detailed discussion of how to interact with the Compiler, including its inputs and outputs, see the Compiler service page. For a comprehensive explanation of resource requirements that extends beyond the core processor—such as those involving physical resource estimation—please refer to the section on TopQAD’s QRE service.
Assembler
The Assembler is a low-level compilation tool designed to convert a compiled circuit, such as one produced by TopQAD’s Compiler, into sequences of stabilization instructions for execution by QPU controllers and decoders. The assembly process depends on the ISA, the microarchitecture, and the noise profile of the QPUs of the computer. Therefore, TopQAD’s Assembler receives inputs from both the Compiler and the Noise Profiler.
The QPU noise profile, in particular, the performance of various fault-tolerant protocols—e.g., quantum memory, magic state preparation, magic state distillation, code growth, and logical operations—is used by the Assembler to determine how to allocate appropriate physical space for various microarchitecture modules and their interconnects such that the compiled program can be executed within a user-defined error budget. This budget covers errors that might occur during the tasks of running logical operations and producing the states they require. These error rates are predicted using mathematical models or by using data from simulations or experiments, such as those provided by the Noise Profiler.
The Assembler uses specific features of the scheduled program (such as the amount of parallelization or the structure of various segments of the compiled program) to optimize the space–time trade-offs in the execution of the quantum algorithm. With these inputs, the Assembler determines the size of the required microarchitecture modules (e.g., the MSF hierarchy and the core processor) and the optimal QEC settings. In this version of TopQAD, the Assembler provides a time-optimal microarchitecture, that is, the MSF contains enough redundant distillation units to provide a balanced supply of magic states for consumption in the core processors.
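The idea of a balanced magic state supply can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a single distillation level, a constant consumption rate, and a fixed success probability per distillation round; all parameter names are hypothetical and the Assembler's actual sizing procedure is more involved.

```python
import math


def distillation_units_needed(
    states_per_cycle: float,
    cycles_per_round: int,
    states_per_round: int = 1,
    success_probability: float = 1.0,
) -> int:
    """Units required so that average supply keeps up with consumption.

    states_per_cycle: magic states consumed by the core per logical cycle.
    cycles_per_round: logical cycles one unit needs for a distillation round.
    states_per_round: magic states output by a successful round.
    success_probability: chance that a round is not discarded.
    """
    if not 0.0 < success_probability <= 1.0:
        raise ValueError("success_probability must be in (0, 1]")
    supply_per_unit = states_per_round * success_probability / cycles_per_round
    return math.ceil(states_per_cycle / supply_per_unit)


if __name__ == "__main__":
    # Hypothetical numbers: 2 states consumed per cycle, 6-cycle rounds.
    print(distillation_units_needed(2, 6, success_probability=0.5))
```

With fewer units than this estimate, the core processor would stall waiting for magic states; with more, the extra physical qubits would bring no runtime benefit under these assumptions.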
Error Modelling
The Assembler uses an algorithm that designs and optimizes a logical microarchitecture so that the quantum algorithm it executes stays within an input error budget. Its outputs are a logical microarchitecture layout that balances space (the number of physical qubits used) and time (the expected runtime) costs, and the assembled machine-level (stabilization) instructions for that microarchitecture. The Assembler models this as a bi-objective optimization problem, deciding the number of distillation levels required in the MSF, the number of preparation and distillation units at each level, and the code distances required at each level and in the core processor.
Both the compilation and physical execution of a quantum circuit contribute to the total computational error. The Assembler ensures that the assembled program meets an error budget $\epsilon$ according to

$$\epsilon_{\mathrm{syn}} + \sum_i \epsilon_i \le \epsilon,$$

where $\epsilon_{\mathrm{syn}}$ is the synthesis error produced by compilation, and each $\epsilon_i$ is the error generated in the $i$-th module during execution. The value $\epsilon_{\mathrm{syn}}$ can be determined by TopQAD’s Compiler and provided to the Assembler. However, each $\epsilon_i$ depends on the architecture of the invoked modules (e.g., core processor, MSF hierarchies, and QROM), and is therefore determined by the Assembler itself. The Assembler proposes an initial logical microarchitecture and iterates on its layout until the error falls within the error budget.
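The iteration toward the error budget can be pictured with the following sketch. It rests on several assumptions that are not taken from TopQAD: the budget condition is additive (synthesis error plus per-module errors must not exceed the budget), every module uses the same code distance, each module's error is its space-time volume times a per-operation logical error rate, and that rate follows an exponential suppression model with hypothetical parameters `mu` and `lam`.

```python
def smallest_distance(
    eps_syn: float,
    budget: float,
    module_volumes: list[float],
    mu: float = 0.1,
    lam: float = 10.0,
    d_max: int = 99,
) -> tuple[int, float]:
    """Smallest odd code distance whose total error fits the budget.

    module_volumes: assumed number of logical operations per module.
    mu, lam: assumed parameters of the per-operation logical error model
             mu * lam ** (-(d + 1) / 2).
    Returns the distance and the resulting total error.
    """
    for d in range(3, d_max + 1, 2):
        per_operation = mu * lam ** (-(d + 1) / 2)
        total = eps_syn + sum(v * per_operation for v in module_volumes)
        if total <= budget:
            return d, total
    raise ValueError("no distance up to d_max satisfies the budget")


if __name__ == "__main__":
    # Hypothetical workload: two modules with 1e6 and 1e4 logical operations.
    print(smallest_distance(1e-3, 1e-2, [1e6, 1e4]))
```

The real Assembler adjusts more than a single distance (distillation levels, unit counts, and per-level distances), so this loop only conveys the propose-and-check structure.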
Noise Profiler
In a fault-tolerant quantum computer, QECCs are used to protect logical quantum states from physical errors. Logical operations on logical states protected in this manner are performed using FTQC protocols. These protocols, consisting of physical operations and classical computation, are designed such that a given logical operation succeeds with high probability. Generally, the probability of failure, or logical error, depends on the distance of the code in which the quantum state is encoded, and also on other protocol parameters such as the number of stabilization cycles. Formally, an FTQC protocol, for a given set of protocol parameters, is described by a quantum circuit operating on physical qubits that possibly includes some classical conditional logic based on measurement outcomes, and a decoder for the underlying QECC. The Noise Profiler includes routines for generating such protocols.
Estimating the performance of a protocol on a quantum computer requires knowledge of the physical noise experienced by the qubits and gates of which it is composed. This information can be obtained experimentally by characterization techniques, for example, randomized benchmarking, resulting in a set of qubit and gate characterization parameters, as outlined below. To assess the impact of these parameters on the performance of a protocol, the Noise Profiler simulates noisy quantum channels representing the physical circuit of the FTQC protocol. An example noise model supported by the Noise Profiler is the depolarizing noise model, details about which can be found in Appendix E of Ref. [10].
The protocol circuit with added noise can be simulated with the help of a Clifford simulator [11, 12] coupled to a fast and accurate decoder [13]. Given the probabilistic nature of errors, Monte Carlo sampling is used to estimate the protocol’s performance metrics, such as the logical error rate, the post-selection rate, or the error-suppression rate.
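The Monte Carlo estimation step can be illustrated with a deliberately simple stand-in. The sketch below samples independent bit-flip errors on a classical repetition code and decodes by majority vote; it uses neither a Clifford simulator nor a surface code decoder, and the function name is hypothetical. It only shows how a logical error rate is estimated as the fraction of failed shots.

```python
import random


def logical_error_rate(distance: int, p: float, shots: int, seed: int = 0) -> float:
    """Estimate the logical error rate of a repetition code by sampling.

    distance: number of physical bits (odd).
    p: independent bit-flip probability per physical bit.
    shots: number of Monte Carlo samples.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(shots):
        flips = sum(rng.random() < p for _ in range(distance))
        # Majority vote fails when more than half of the bits are flipped.
        if flips > distance // 2:
            failures += 1
    return failures / shots


if __name__ == "__main__":
    for d in (3, 5, 7):
        print(d, logical_error_rate(d, p=0.1, shots=20000))
```

The statistical uncertainty of such an estimate shrinks only with the square root of the number of shots, which is one reason why directly sampling very low logical error rates at high distance becomes impractical.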
To model and predict the logical error rates of high-distance fault-tolerant protocols, the Noise Profiler combines Monte Carlo simulations with theoretical models and numerical regressions, fitting a smaller number of model parameters to reflect realistic error behaviour. For example, the logical error rates of the memory protocol can be modelled as a function of the code distance using two fitted parameters:
- Error prefactor: captures the baseline error rate based on the physical characteristics of the quantum processor; and
- Error suppression rate: describes how quickly error rates decrease as the code distance increases.
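A two-parameter fit of this kind can be sketched as a log-linear regression. The functional form used below, p_L(d) = mu * lam ** (-(d + 1) / 2), is a commonly used exponential suppression ansatz and is an assumption here, as are the names `mu` (error prefactor) and `lam` (error suppression rate); the Noise Profiler's actual model may differ.

```python
import math


def fit_mu_lambda(distances: list[int], rates: list[float]) -> tuple[float, float]:
    """Least-squares fit of log(p_L) = log(mu) - ((d + 1) / 2) * log(lam)."""
    xs = [(d + 1) / 2 for d in distances]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), math.exp(-slope)


def predict(mu: float, lam: float, distance: int) -> float:
    """Extrapolate the assumed model to a distance that was not simulated."""
    return mu * lam ** (-(distance + 1) / 2)


if __name__ == "__main__":
    # Synthetic data generated from the assumed model, for illustration only.
    ds = [3, 5, 7, 9]
    data = [0.05 * 8.0 ** (-(d + 1) / 2) for d in ds]
    mu, lam = fit_mu_lambda(ds, data)
    print(mu, lam, predict(mu, lam, 21))
```

Fitting at low distances, where Monte Carlo sampling is affordable, and extrapolating with the fitted parameters is what allows high-distance error rates to be predicted without simulating them directly.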
In what follows, the collection of all the data input to the Noise Profiler and the logical error rates of fault-tolerant protocols predicted by it is called the noise profile of the QPU.
Footnotes
1. A quantum algorithm may be described by a sequence of operations, referred to as gates, belonging to a gate set, such as the universal Clifford+T set. These operations are followed by a final sequence of qubit measurements to extract the results of the computation. For a given QECC, which underlies the microarchitecture, the gates that can be directly applied are limited to the ISA gate set that the microarchitecture implements. Thus, conversion from a programmed quantum algorithm to the ISA gate set is necessary.