Sample from state vectors

This guide shows how to use the ffsim.sample_state_vector function to sample configurations (bitstrings) from fermionic state vectors. We will cover:

  • Spinful vs spinless systems

  • Sampling all orbitals vs a subset of orbitals

  • Different bitstring output types

  • Using StateVector objects vs raw NumPy arrays

Spinful example: alpha–beta sectors

We consider a spinful system with separate alpha and beta electrons. Here:

  • norb is the number of spatial orbitals.

  • nelec is a tuple (n_alpha, n_beta).

The state vector lives in the tensor product of alpha and beta subspaces, each with fixed particle number. ffsim.sample_state_vector can return bitstrings either:

  • concatenated as s_b s_a (beta on the left, alpha on the right), or

  • split as a pair (strings_a, strings_b).

[1]:
import ffsim

# Spinful example: 3 spatial orbitals, 2 alpha and 1 beta electron
norb = 3
nelec = (2, 1)

# Hartree–Fock state in this subspace
vec_spinful = ffsim.hartree_fock_state(norb, nelec)

print("Hartree–Fock state (first few amplitudes):")
print(vec_spinful[:5])

# Sample 8 bitstrings with default concatenation (s_b s_a)
samples_concat = ffsim.sample_state_vector(
    vec_spinful,
    norb=norb,
    nelec=nelec,
    shots=8,
)
print("Concatenated samples (s_b s_a):", samples_concat)

# Sample again, but keep alpha and beta strings separate
samples_a, samples_b = ffsim.sample_state_vector(
    vec_spinful,
    norb=norb,
    nelec=nelec,
    shots=8,
    concatenate=False,
)
print("Alpha strings:", samples_a)
print("Beta  strings:", samples_b)
Hartree–Fock state (first few amplitudes):
[1.+0.j 0.+0.j 0.+0.j 0.+0.j 0.+0.j]
Concatenated samples (s_b s_a): ['001011', '001011', '001011', '001011', '001011', '001011', '001011', '001011']
Alpha strings: ['011', '011', '011', '011', '011', '011', '011', '011']
Beta  strings: ['001', '001', '001', '001', '001', '001', '001', '001']

For spinful systems:

  • Each alpha configuration is a bitstring of length norb.

  • Each beta configuration is also a bitstring of length norb.

By default (concatenate=True), the output for each sample is a single bitstring "s_b s_a":

  • The beta string s_b appears on the left.

  • The alpha string s_a appears on the right.

If you set concatenate=False, sample_state_vector returns two lists:

  • strings_a: one alpha bitstring for each shot,

  • strings_b: one beta bitstring for each shot.

This is often convenient when post-processing alpha and beta sectors separately.
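
If you already have concatenated samples, the layout described above means the two sectors can be recovered by slicing. The following is a minimal sketch in plain Python; the sample values are made up for illustration and are not output from ffsim.

```python
# Split concatenated samples "s_b s_a" into separate alpha and beta strings.
norb = 3
samples_concat = ["001011", "001011", "010101"]  # illustrative values only

# Beta occupies the left norb characters, alpha the right norb characters.
samples_b = [s[:norb] for s in samples_concat]
samples_a = [s[norb:] for s in samples_concat]

print("Alpha strings:", samples_a)
print("Beta  strings:", samples_b)
```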

Spinless example: sampling all orbitals

We now look at a spinless system, where we track only one kind of fermion. As before, norb is the number of spatial orbitals, but nelec is now a single integer rather than a tuple, and the state vector lives in the subspace with that fixed particle number. Given norb and nelec, we generate a random normalized state vector in this subspace and then call ffsim.sample_state_vector to draw bitstrings from it.
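
The size of a fixed-particle-number subspace is a binomial coefficient: there are C(norb, nelec) ways to place nelec fermions in norb orbitals, and in the spinful case the alpha and beta counts multiply. The sketch below computes this with the standard library. The helper fixed_particle_dim is a hypothetical name introduced here for illustration; it is expected to agree with ffsim.dim but is not part of ffsim.

```python
from math import comb


def fixed_particle_dim(norb, nelec):
    """Dimension of the fixed-particle-number subspace (illustrative helper)."""
    if isinstance(nelec, int):
        # Spinless: choose which orbitals are occupied.
        return comb(norb, nelec)
    # Spinful: independent choices for the alpha and beta sectors.
    n_alpha, n_beta = nelec
    return comb(norb, n_alpha) * comb(norb, n_beta)


print(fixed_particle_dim(4, 2))       # spinless, 4 orbitals, 2 fermions
print(fixed_particle_dim(3, (2, 1)))  # spinful, 3 orbitals, 2 alpha and 1 beta
```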

[2]:
# Spinless example: 4 orbitals, 2 fermions
norb = 4
nelec = 2  # spinless: just one integer

# Dimension of the fixed-particle-number space
dim = ffsim.dim(norb, nelec)
print(f"Dimension (spinless, norb={norb}, nelec={nelec}): {dim}")

# Random normalized state vector
vec = ffsim.random.random_state_vector(dim, seed=123)

# Sample 8 bitstrings from this state
samples = ffsim.sample_state_vector(
    vec,
    norb=norb,
    nelec=nelec,
    shots=8,
)
print("Samples (strings):", samples)
Dimension (spinless, norb=4, nelec=2): 6
Samples (strings): ['1100', '0110', '0110', '0110', '0011', '1010', '0101', '1100']

Each sampled bitstring has length norb in the spinless case. For example, for norb = 4:

  • "0101" means orbitals 0 and 2 are occupied (reading from right to left),

  • "1100" means orbitals 2 and 3 are occupied, etc.
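
The right-to-left reading convention can be made explicit with a small helper. This is a plain-Python sketch; occupied_orbitals is a hypothetical name used only for illustration.

```python
def occupied_orbitals(bitstring):
    """Return the occupied orbital indices, reading the string right to left."""
    return [i for i, bit in enumerate(reversed(bitstring)) if bit == "1"]


print(occupied_orbitals("0101"))  # [0, 2]
print(occupied_orbitals("1100"))  # [2, 3]
```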

Internally, ffsim.sample_state_vector:

  1. Computes probabilities $p_i = |\psi_i|^2$ from the state vector.

  2. Uses a pseudorandom sampler (NumPy’s default_rng) to pick indices according to these probabilities.

  3. Converts indices to bitstrings in the same ordering convention as PySCF’s FCI module.
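
The three steps above can be sketched with NumPy alone for the spinless case. This is an illustration of the idea, not ffsim's actual implementation. It assumes that the configurations are indexed in ascending order of their integer value, which is how PySCF's FCI module enumerates occupation strings; the variable names are chosen here for illustration.

```python
from itertools import combinations

import numpy as np

norb, nelec = 4, 2

# Enumerate all occupation patterns as integers, in ascending order.
# Assumption: this matches the index ordering of the state vector.
configs = sorted(
    sum(1 << i for i in occ) for occ in combinations(range(norb), nelec)
)

# Build a random normalized state vector over those configurations.
rng = np.random.default_rng(1234)
vec = rng.standard_normal(len(configs)) + 1j * rng.standard_normal(len(configs))
vec /= np.linalg.norm(vec)

# Step 1: probabilities from amplitudes.
probs = np.abs(vec) ** 2

# Step 2: draw indices according to those probabilities.
indices = rng.choice(len(vec), size=8, p=probs)

# Step 3: convert each index to a bitstring of length norb.
samples = [format(configs[i], f"0{norb}b") for i in indices]
print(samples)
```

Every sampled string has exactly nelec ones, because only configurations in the fixed-particle-number subspace are enumerated.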

Sampling a subset of orbitals

Sometimes, we only want measurement outcomes on a subset of spin-orbitals: for example, when we are only interested in a local region or a fragment.

This is controlled by the orbs argument.

  • Spinless:

    • orbs is a list of spatial orbital indices (0 to norb - 1).

  • Spinful:

    • orbs is a pair (orbs_a, orbs_b) of lists for alpha and beta.
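
Conceptually, restricting to a subset means keeping only the bits of the selected orbitals from each full bitstring. The sketch below shows this for a single spinless string. It assumes that orbital i is the i-th character from the right and that the restricted string keeps the same right-to-left convention for the selected orbitals; check the ffsim documentation for the exact output ordering. The helper restrict_bitstring is hypothetical.

```python
def restrict_bitstring(bitstring, orbs):
    """Keep only the bits for the orbitals in `orbs` (illustrative helper).

    Assumes orbital i is the i-th character from the right, and that the
    first entry of `orbs` ends up as the rightmost output character.
    """
    kept = [bitstring[-1 - i] for i in orbs]
    return "".join(reversed(kept))


print(restrict_bitstring("1001", [0, 2]))  # orbital 0 occupied, orbital 2 empty
print(restrict_bitstring("1100", [0, 2]))  # orbital 0 empty, orbital 2 occupied
```

Because the unselected orbitals are discarded, a restricted string can contain fewer ones than nelec.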

[3]:
# Spinless again: 4 orbitals, 2 fermions
norb = 4
nelec = 2
dim = ffsim.dim(norb, nelec)
vec = ffsim.random.random_state_vector(dim, seed=42)

print("All-orbital samples:")
samples_all = ffsim.sample_state_vector(vec, norb=norb, nelec=nelec, shots=5)
print(samples_all)

print("\nSamples restricted to orbitals [0, 2]:")
samples_02 = ffsim.sample_state_vector(
    vec,
    norb=norb,
    nelec=nelec,
    orbs=[0, 2],  # keep only these spatial orbitals
    shots=5,
)
print(samples_02)
All-orbital samples:
['1001', '1001', '1001', '1100', '1100']

Samples restricted to orbitals [0, 2]:
['10', '11', '00', '00', '00']
[4]:
# Spinful subset example
norb = 3
nelec = (2, 1)
vec_spinful = ffsim.random.random_state_vector(ffsim.dim(norb, nelec), seed=7)

# Define which spatial orbitals to keep for alpha and beta
orbs_a = [0, 2]  # alpha sector
orbs_b = [1, 2]  # beta sector

samples_a, samples_b = ffsim.sample_state_vector(
    vec_spinful,
    norb=norb,
    nelec=nelec,
    orbs=(orbs_a, orbs_b),
    shots=5,
    concatenate=False,
)
print("Restricted alpha strings:", samples_a)
print("Restricted beta  strings:", samples_b)
Restricted alpha strings: ['11', '01', '10', '10', '11']
Restricted beta  strings: ['00', '00', '10', '00', '10']

Choosing the bitstring type

The bitstring_type argument controls how bitstrings are represented. It takes a member of ffsim.BitstringType, which has three options:

  • STRING: lists of strings like "0101" (default),

  • INT: lists of integers, where each integer is the bitstring interpreted in binary (e.g. "0101" → 5),

  • BIT_ARRAY: 2D NumPy arrays of booleans, where each row is one sample.

This is useful when interfacing with different libraries or when you prefer a specific format for analysis.
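
The three representations carry the same information, so it is easy to convert between them by hand. The following plain Python and NumPy sketch illustrates the relationship. It assumes that the columns of the bit array follow the same left-to-right order as the characters of the string; the values are illustrative, not ffsim output.

```python
import numpy as np

strings = ["0101", "1100"]  # illustrative values only

# String -> integer: interpret the characters as a binary number.
ints = [int(s, 2) for s in strings]

# String -> boolean array: one row per sample, one column per character.
# Assumption: columns follow the left-to-right order of the string.
bit_array = np.array([[c == "1" for c in s] for s in strings])

# Integer -> string: zero-pad back to the original length.
round_trip = [format(n, f"0{len(strings[0])}b") for n in ints]

print(ints)
print(bit_array.astype(int))
print(round_trip)
```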

[5]:
norb = 4
nelec = 2
dim = ffsim.dim(norb, nelec)
vec = ffsim.random.random_state_vector(dim, seed=99)

# Default: STRING
samples_str = ffsim.sample_state_vector(
    vec,
    norb=norb,
    nelec=nelec,
    shots=4,
    bitstring_type=ffsim.BitstringType.STRING,
)
print("STRING:", samples_str)

# Integers
samples_int = ffsim.sample_state_vector(
    vec,
    norb=norb,
    nelec=nelec,
    shots=4,
    bitstring_type=ffsim.BitstringType.INT,
)
print("INT   :", samples_int)

# Bit arrays
samples_array = ffsim.sample_state_vector(
    vec,
    norb=norb,
    nelec=nelec,
    shots=4,
    bitstring_type=ffsim.BitstringType.BIT_ARRAY,
)
print("BIT_ARRAY shape:", samples_array.shape)
print(samples_array.astype(int))
STRING: ['1010', '1100', '1010', '1010']
INT   : [10, 10, 12, 12]
BIT_ARRAY shape: (4, 4)
[[1 0 1 0]
 [1 1 0 0]
 [1 1 0 0]
 [1 1 0 0]]

None of the three calls passes a seed, so each one draws its own samples. The outputs therefore illustrate the three formats rather than three views of the same four samples.

Seeding and reproducibility

The seed argument lets you control the randomness. It accepts anything that can be passed to np.random.default_rng, such as an integer or an existing Generator instance.

Using a fixed seed makes your sampling results reproducible, which is important for testing and documentation.
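
The difference between an integer seed and an existing Generator is worth seeing in isolation. The sketch below uses only NumPy and a made-up probability vector. An integer seed creates a fresh generator each time, so the draws repeat, whereas a shared Generator advances its state between calls, so successive draws generally differ. Since np.random.default_rng returns an existing Generator unchanged, the same is expected when one Generator is passed as seed to several sampling calls.

```python
import numpy as np

probs = [0.5, 0.3, 0.2]  # illustrative probabilities

# Integer seed: each call builds a fresh generator, so the draws repeat.
draws1 = np.random.default_rng(2025).choice(3, size=6, p=probs)
draws2 = np.random.default_rng(2025).choice(3, size=6, p=probs)
print(np.array_equal(draws1, draws2))

# Shared Generator: the state advances, so the second call continues the stream.
rng = np.random.default_rng(2025)
first = rng.choice(3, size=6, p=probs)
second = rng.choice(3, size=6, p=probs)
print(first, second)

# default_rng hands back an existing Generator unchanged.
print(np.random.default_rng(rng) is rng)
```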

[6]:
norb = 3
nelec = 2
dim = ffsim.dim(norb, nelec)
vec = ffsim.random.random_state_vector(dim, seed=123)

# Same seed → same samples
samples1 = ffsim.sample_state_vector(vec, norb=norb, nelec=nelec, shots=6, seed=2025)
samples2 = ffsim.sample_state_vector(vec, norb=norb, nelec=nelec, shots=6, seed=2025)

print("Samples 1:", samples1)
print("Samples 2:", samples2)
Samples 1: ['110', '101', '110', '110', '110', '011']
Samples 2: ['110', '101', '110', '110', '110', '011']

Using StateVector objects directly

Many parts of ffsim work with a StateVector dataclass that stores:

  • the raw vector coefficients,

  • the number of spatial orbitals norb,

  • the electron numbers nelec.

sample_state_vector accepts either:

  • a raw np.ndarray plus explicit norb and nelec, or

  • a StateVector object, in which case norb and nelec must be omitted.

This behavior is implemented by the helper canonicalize_vec_norb_nelec, which unifies the inputs into a canonical form before sampling.
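
To make the two calling conventions concrete, here is a toy sketch of what such input canonicalization might look like. It is not ffsim's code: ToyStateVector and toy_canonicalize are hypothetical stand-ins written for this illustration, and the real helper may differ in its checks and error messages.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyStateVector:
    """Hypothetical stand-in for a state vector container."""

    vec: np.ndarray
    norb: int
    nelec: "int | tuple[int, int]"


def toy_canonicalize(vec, norb=None, nelec=None):
    """Return (array, norb, nelec) from either calling convention."""
    if isinstance(vec, ToyStateVector):
        # The object already carries norb and nelec, so reject duplicates.
        if norb is not None or nelec is not None:
            raise ValueError("omit norb and nelec when passing a state vector object")
        return vec.vec, vec.norb, vec.nelec
    # A raw array carries no metadata, so both values are required.
    if norb is None or nelec is None:
        raise ValueError("norb and nelec are required when passing a raw array")
    return vec, norb, nelec


raw = np.ones(6) / np.sqrt(6)
print(toy_canonicalize(raw, norb=4, nelec=2)[1:])
print(toy_canonicalize(ToyStateVector(raw, 4, 2))[1:])
```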

[7]:
from ffsim.states import StateVector

norb = 4
nelec = (2, 2)
dim = ffsim.dim(norb, nelec)

# Raw state vector
vec_array = ffsim.random.random_state_vector(dim, seed=314)

# Wrap into a StateVector
state = StateVector(vec=vec_array, norb=norb, nelec=nelec)

# Case 1: raw NumPy array → need norb and nelec
samples_array = ffsim.sample_state_vector(
    vec_array,
    norb=norb,
    nelec=nelec,
    shots=4,
)
print("Samples from raw array:", samples_array)

# Case 2: StateVector → norb and nelec taken from the object
samples_state = ffsim.sample_state_vector(
    state,
    shots=4,
)
print("Samples from StateVector:", samples_state)
Samples from raw array: ['00111001', '10101010', '10101010', '01101010']
Samples from StateVector: ['01101010', '00111010', '11000110', '10100101']

Because neither call passes a seed, the two calls draw independent samples, so the outputs differ even though both sample the same state.

Summary

In this guide we showed how to use ffsim.sample_state_vector to:

  • sample from spinless and spinful state vectors,

  • restrict sampling to subsets of orbitals via orbs,

  • choose different bitstring output formats via bitstring_type,

  • control randomness with seed,

  • work with both raw NumPy arrays and StateVector objects.

These tools are useful whenever you want to emulate projective measurements directly from a fermionic state vector, for example when:

  • post-processing states obtained from Trotterized time evolution,

  • analyzing variational ansätze,

  • or integrating with Qiskit-based workflows that consume bitstrings.
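
Workflows that consume bitstrings often expect a dictionary of counts rather than a list of shots. A minimal sketch of that conversion, using only the standard library and the spinless samples printed earlier in this guide:

```python
from collections import Counter

# Samples copied from the spinless example above (norb=4, nelec=2).
samples = ["1100", "0110", "0110", "0110", "0011", "1010", "0101", "1100"]

# Tally how often each bitstring was observed.
counts = Counter(samples)

# Empirical frequencies, which estimate the probabilities |psi_i|^2.
frequencies = {s: n / len(samples) for s, n in counts.items()}

print(dict(counts))
print(frequencies)
```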