Sample-based quantum diagonalization with Fulqrum¶
In this tutorial, we will use Fulqrum to find an approximation to the ground state of a chemistry Hamiltonian following a sampled-based quantum diagonalization (SQD) approach.
We will use different blocks from Fulqrum to construct a full end-to-end workflow to estimate molecular Hamiltonian ground state energy using configuration recovery.
Convert a molecular Hamiltonian from one- and two-body integrals to a Fulqrum compatible operator.
Construct a subspace from bitstrings (configurations or Slater Determinants) and project the operator on to the subsapce as a sparse matrix in the CSR format.
Eigensolve the sparse matrix with a choice of an eigensolver.
Compute electron orbital occupancies from the eigenvector and subspace bitstrings and use them to recover configurations.
Use recovered configurations along with carryover bitstrings in the next iteration to refine ground state estimation.
This tutorial requires following two packages in addition to Fulqrum:
PySCF: To define the molecule and compute the one- and two-body integrals.Openfermion: To convert the one- and two-body integrals into a Fulqrum compatible operator internally.
Define Molecule¶
In this tutorial, we will approximate the ground state energy of an \(N_2\) molecule in the 6-31g basis. First, we will specify the molecule and its properties. Then, we will compute one- and two-body integrals using the PySCF package.
[1]:
import warnings
import pyscf
import pyscf.mcscf
warnings.filterwarnings("ignore")
# Build N2 molecule
mol = pyscf.gto.Mole()
mol.build(
atom=[["N", (0, 0, 0)], ["N", (1.0, 0, 0)]],
basis="6-31g",
symmetry="Dooh",
)
# Define active space
n_frozen = 2
active_space = range(n_frozen, mol.nao_nr())
# Get molecular integrals
scf = pyscf.scf.RHF(mol).run()
num_orbitals = len(active_space)
n_electrons = int(sum(scf.mo_occ[active_space]))
num_elec_a = (n_electrons + mol.spin) // 2
num_elec_b = (n_electrons - mol.spin) // 2
cas = pyscf.mcscf.CASCI(scf, num_orbitals, (num_elec_a, num_elec_b))
mo = cas.sort_mo(active_space, base=0)
hcore, nuclear_repulsion_energy = cas.get_h1cas(mo) # hcore: one-body integrals
eri = pyscf.ao2mo.restore(1, cas.get_h2cas(mo), num_orbitals) # eri: two-body integrals
converged SCF energy = -108.835236570774
Convert to the Fulqrum operator¶
Next, we will convert the one- and two-body integrals into a Fulqrum FermionicOperator using the integrals_to_fq_fermionic_op() converter available from Fulqrum. Next, we will perform an extended Jordan-Wigner transformation using extended_jw_transform() to convert it to a Fulqrum QubitOperator. The QubitOperator will be used in the subsequent projection step.
[2]:
from fulqrum.convert.integrals import integrals_to_fq_fermionic_op
fermionic_op = integrals_to_fq_fermionic_op(
one_body_integrals=hcore, two_body_integrals=eri
)
print(f"Num terms operator: {fermionic_op.num_terms}")
fulqrum_operator = fermionic_op.extended_jw_transformation()
print(f"Num terms operator [after JW transform]: {fulqrum_operator.num_terms}")
Num terms operator: 36960
Num terms operator [after JW transform]: 13280
Load random bitstrings¶
Now, we load some pre-generated random bitstrings for the subspace construction and projection.
[3]:
import json
with open("./data/random_counts_n2_10000_samples.json", "r") as jf:
counts = json.load(jf)
print(f"Num bitstrings in raw counts: {len(counts)}")
Num bitstrings in raw counts: 10000
Next, we will separate bitstrings and their respective probabilities into two separate lists so that it becomes easier to use them later.
[4]:
import numpy as np
raw_bitstrings = []
raw_probs = []
for bs, cnt in counts.items():
raw_bitstrings.append(bs)
raw_probs.append(cnt)
raw_probs = np.array(raw_probs, dtype=float)
raw_probs /= np.linalg.norm(raw_probs)
Configuration recovery loop¶
Now, we are ready to run the full sample-based quantum diagonalization (SQD) routine with iterative configuration recovery to refine our estimation of the ground state energy of \(N_2\) molecule. Fulqrum has an sqd module that uses the qiskit-addon-sqd-hpc as a submodule, which provides postselection, subsampling, and configuration recovery capabilities. Each SQD loop has the following steps:
Construct alpha (beta) half strings by combining Hartree-Fock state, carryover configurations from previous round, and recovered configurations from raw counts - in that order.
Construct a Fulqrum
Subspaceusing the spin-up (alpha) and spin-down (beta) half strings. For this \(N_2\) example, both will be same. FulqrumSubspaceinternally sorts the half strings and takes a Cartesian product to construct full strings.Project the
fulqrum_operatoron to this subspace and construct a CSR-format sparse matrix representation of the projected Hamiltonian. The CSR matrix will be wrapped into a LinearOperator for faster eigensolving.Prepare a simple initial guess vector \(v0\) for the eigensolver. While it is optional, it can speed up eigensolving.
Eigensolve the projected Hamiltonian matrix to get the minimum eigenvalue (ground state energy estimate) and the corresponding eigenvector.
Use the eigenvector to compute electron orbital occupancies that will be used to recover configurations in the next round and also to select important configurations to carryover to the next round.
[5]:
from collections import OrderedDict
def unique_alpha_beta_combined(bitstrings):
"""A utility function for getting unique alpha and beta halves from full bitstrings.
Args:
bitstrings (list[str]): A list of full bitstrings consisting of both alpha and beta parts.
Returns:
A dictionary with combined unique ``alpha`` and ``beta`` half strings.
"""
if not bitstrings:
return OrderedDict()
unique_ab = OrderedDict()
num_spatial_orb = len(bitstrings[0]) // 2
for bs in bitstrings:
a = bs[num_spatial_orb:]
b = bs[:num_spatial_orb]
unique_ab[a] = 1
unique_ab[b] = 1
return unique_ab
[6]:
import time
import scipy.sparse.linalg as spla
import fulqrum as fq
from fulqrum.core.sqd import (
postselect_by_hamming_right_and_left,
subsample,
recover_configurations,
get_carryover_full_strs,
)
# options
tol = 1e-5
num_batches = 1
carryover_threshold = 1e-4
seed = 0
nelec = (num_elec_a, num_elec_b)
samples_per_batch = 500
current_occupancies = None
max_iterations = 5
carryover_full_strs = []
S = None # Subspace
for ni in range(max_iterations):
iter_start = time.perf_counter()
print(f"ITERATION {ni}")
if current_occupancies is None:
# If we don't have average orbital occupancy information, simply postselect
# bitstrings with the correct numbers of spin-up and spin-down electrons
bitstrings, probs = postselect_by_hamming_right_and_left(
raw_bitstrings, raw_probs, num_elec_a, num_elec_b
)
else:
# If we do have average orbital occupancy information, use it to refine the
# full set of noisy configurations
bitstrings, probs = recover_configurations(
raw_bitstrings,
raw_probs,
current_occupancies[0], # occs_a
current_occupancies[1], # occs_b
num_elec_a,
num_elec_b,
seed,
)
subsamples = [
subsample(bitstrings, probs, min(len(bitstrings), len(bitstrings)), seed)
for _ in range(num_batches)
]
for batch in subsamples:
hs_start = time.perf_counter()
# In this tutorial, we include half strings with the following precedence
# Hartree-Fock state (always include) > carryover (from previous round) > recovered samples.
# We add bitstrings in a dictionary as it preserves order and also filters out duplicates.
half_strs_dict = OrderedDict()
# Add hartree-fock state
hartree_fock_state = "0" * (num_orbitals - num_elec_a) + "1" * num_elec_a
half_strs_dict[hartree_fock_state] = 1
# Add carry-overs
# First, we split carryover full strs into unique alpha and beta halves
unique_ab_co = unique_alpha_beta_combined(carryover_full_strs)
for bs in unique_ab_co.keys():
half_strs_dict[bs] = 1
# Add recovered bitstrings
unique_ab_recovered = unique_alpha_beta_combined(batch)
for bs in unique_ab_recovered.keys():
half_strs_dict[bs] = 1
half_strs = list(half_strs_dict.keys())
print(" num total half strs: ", len(half_strs))
# We will use a subset of accumulated half strings in the Subspace construction
# defined by the `samples_per_batch` parameter. For this tutorial, the subspace
# dimension will be samples_per_batch x samples_per_batch
half_strs = half_strs[:samples_per_batch]
print(" num selected half strs: ", len(half_strs))
half_strs.sort() # fq.Subspace() also sorts bitstrings, so can be skipped.
hs_end = time.perf_counter()
print(f" Half strs construction took: {hs_end - hs_start:.4f} seconds")
s_start = time.perf_counter()
S = fq.Subspace([half_strs, half_strs])
s_end = time.perf_counter()
print(f" Subspace construction took: {s_end - s_start:4f} seconds")
print(
f" Subspace dimension: {len(half_strs)} x {len(half_strs)} = {len(half_strs) ** 2:_}"
)
proj_start = time.perf_counter()
Hsub = fq.SubspaceHamiltonian(fulqrum_operator, S)
# Convert the SubspaceHamiltonian into a CSR format sparse matrix,
# which will also be wrapped in a LinearOperator for faster eigensolving.
Hsub_csr_linop = Hsub.to_csr_linearoperator_fast(verbose=False)
proj_end = time.perf_counter()
print(f" Operator projection took: {proj_end - proj_start:4f} seconds")
total_bytes = Hsub_csr_linop.memory_size
total_mega_bytes = total_bytes / (1024 * 1024)
print(f" CSR matrix memory: {total_mega_bytes:.6f} MBs")
# Constructing a simple initial guess vector
v0_start = time.perf_counter()
diag_vec = Hsub.diagonal_vector()
min_idx = np.where(diag_vec == diag_vec.min())[0]
v0 = np.zeros(len(S), dtype=Hsub.dtype)
v0[min_idx] = 1
v0_end = time.perf_counter()
print(
f" Initial guess vector v0 construction took: {v0_end - v0_start:.4f} seconds"
)
print(" Starting eigensolving ...")
start = time.perf_counter()
eigvals, eigvecs = spla.eigsh(
Hsub_csr_linop, # use `Hsub` for the matrix-free mode. Memory-efficient but slower.
k=1,
which="SA",
tol=tol,
v0=v0,
)
end = time.perf_counter()
print(f" Eigensolving took: {end - start:.4f} seconds")
print(f" Electronic Energy: {eigvals}")
print(f" Total Energy: {eigvals + nuclear_repulsion_energy}")
eigvecs = eigvecs.ravel()
# Subspace class has a method named `get_orbital_occupancies()` to compute
# electron orbital occupancies from the eigenvector and subspace bitstrings.
# The method accepts probabilities, and thus, we must do |amplitude| ^ 2.
# Returns alpha and beta occupanices as a length-2 tuple in the order
# ([a0 ... aN], [b0 ... bN]).
current_occupancies = S.get_orbital_occupancies(
probs=np.abs(eigvecs) ** 2, norb=num_orbitals
)
# Next, we get important bitstrings to include (carryover to) in the next round.
# Important bitstrings are the ones with |amplitude| > threshold.
# It is a must to provide |amplitude|, i.e., np.abs(eigenvector) to the function.
# The function returns a list of length-2 tuples, where the first element of the tuple
# is the full bitstrings, and the second element is its weight (|amplitude|).
# The list is also sorted in the descending order of |amplitude|.
carryover_full_strs_and_weights = get_carryover_full_strs(
S, np.abs(eigvecs), carryover_threshold
)
carryover_full_strs = [item[0] for item in carryover_full_strs_and_weights]
print(f" num carryover full strs: {len(carryover_full_strs)}")
iter_end = time.perf_counter()
print(f"Iter {ni} took: {iter_end - iter_start:.4f} seconds\n")
ITERATION 0
num total half strs: 75
num selected half strs: 75
Half strs construction took: 0.0001 seconds
Subspace construction took: 0.001573 seconds
Subspace dimension: 75 x 75 = 5_625
Operator projection took: 0.064944 seconds
CSR matrix memory: 0.287434 MBs
Initial guess vector v0 construction took: 0.0000 seconds
Starting eigensolving ...
Eigensolving took: 0.0055 seconds
Electronic Energy: [-32.60417101]
Total Energy: [-108.83527355]
num carryover full strs: 5
Iter 0 took: 0.0776 seconds
ITERATION 1
num total half strs: 3867
num selected half strs: 500
Half strs construction took: 0.0031 seconds
Subspace construction took: 0.069567 seconds
Subspace dimension: 500 x 500 = 250_000
Operator projection took: 2.525665 seconds
CSR matrix memory: 115.867283 MBs
Initial guess vector v0 construction took: 0.0003 seconds
Starting eigensolving ...
Eigensolving took: 0.2498 seconds
Electronic Energy: [-32.73355104]
Total Energy: [-108.96465358]
num carryover full strs: 2229
Iter 1 took: 3.0796 seconds
ITERATION 2
num total half strs: 3884
num selected half strs: 500
Half strs construction took: 0.0035 seconds
Subspace construction took: 0.071565 seconds
Subspace dimension: 500 x 500 = 250_000
Operator projection took: 2.714844 seconds
CSR matrix memory: 161.923939 MBs
Initial guess vector v0 construction took: 0.0014 seconds
Starting eigensolving ...
Eigensolving took: 0.3775 seconds
Electronic Energy: [-32.78512381]
Total Energy: [-109.01622635]
num carryover full strs: 4765
Iter 2 took: 3.4013 seconds
ITERATION 3
num total half strs: 3884
num selected half strs: 500
Half strs construction took: 0.0042 seconds
Subspace construction took: 0.068180 seconds
Subspace dimension: 500 x 500 = 250_000
Operator projection took: 2.727803 seconds
CSR matrix memory: 196.800907 MBs
Initial guess vector v0 construction took: 0.0004 seconds
Starting eigensolving ...
Eigensolving took: 0.4339 seconds
Electronic Energy: [-32.79179802]
Total Energy: [-109.02290056]
num carryover full strs: 6098
Iter 3 took: 3.4597 seconds
ITERATION 4
num total half strs: 3874
num selected half strs: 500
Half strs construction took: 0.0044 seconds
Subspace construction took: 0.069365 seconds
Subspace dimension: 500 x 500 = 250_000
Operator projection took: 2.874993 seconds
CSR matrix memory: 225.126499 MBs
Initial guess vector v0 construction took: 0.0003 seconds
Starting eigensolving ...
Eigensolving took: 0.4042 seconds
Electronic Energy: [-32.79448011]
Total Energy: [-109.02558265]
num carryover full strs: 7016
Iter 4 took: 3.5714 seconds