Rerandomization, accelerated

FastRerandomize

FastRerandomize unites batched candidate generation, key-only storage, and design-respecting inference in a minimal, hardware-accelerated workflow. That design delivers order‑of‑magnitude speedups and enables tighter covariate balance in high-dimensional experiments—yielding more precise causal estimates at lower cost.

>50x
Speedup
100k+
Units
CPU + GPU + TPU
Unified compute

Publication

FastRerandomize in SoftwareX (2026)

FastRerandomize pairs accelerated hardware with a minimal software interface, making rigorous rerandomization feasible for large-scale experiments without extra complexity.

The paper documents three advances: accelerated balance checks, key-only storage, and design-respecting inference. The package uses a two-layer design: an R front-end for ergonomics and a JAX/XLA backend for batched kernels on CPU, GPU, or TPU.

Access the full manuscript on arXiv; the formal SoftwareX citation is listed below for reference.

What is rerandomization?

Rerandomization is a design-stage procedure: repeatedly draw treatment assignments and keep only those that meet a pre-specified covariate-balance rule.

For example, one rule accepts an assignment when the distance, M, between treatment and control covariate means is below a cutoff, a: accept if M ≤ a.

The acceptance probability q is the fraction of random assignments that would pass this rule; smaller q means more stringent balance.
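The acceptance rule and the empirical acceptance probability q can be sketched in a few lines. This is a minimal NumPy illustration using the Mahalanobis distance between group means (the package itself implements this with accelerated JAX kernels; the cutoff a = 3.0 and the 2,000 draws here are arbitrary choices for the sketch):

```python
import numpy as np

def mahalanobis_balance(X, w):
    """Mahalanobis distance M between treated and control covariate means."""
    n1, n0 = int(w.sum()), int((1 - w).sum())
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    # Covariance of the mean difference under complete randomization
    cov = (1.0 / n1 + 1.0 / n0) * np.cov(X, rowvar=False)
    return float(diff @ np.linalg.solve(cov, diff))

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))

# Rerandomization: draw assignments, keep only those with M <= a
a = 3.0
accepted = []
for _ in range(2000):
    w = np.zeros(n, dtype=int)
    w[rng.choice(n, n // 2, replace=False)] = 1
    if mahalanobis_balance(X, w) <= a:
        accepted.append(w)

q_hat = len(accepted) / 2000   # empirical acceptance probability q
```

Lowering a shrinks q_hat, i.e., makes the design more stringent.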

Design-aware inference

Design changes the randomization distribution, so inference must match the design.

  1. Design: choose covariates + a balance rule (e.g., M ≤ a, equivalently q).
  2. Generate: draw many candidate assignments and keep the accepted set.
  3. Infer: randomization tests must resample from the accepted set, not from all possible assignments.
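The three steps above can be sketched in miniature. This is a hedged NumPy toy using a plain difference-in-means statistic under the sharp null of no effect; the accepted set here is just 500 unrestricted draws for brevity, and the package's randomization_test() provides the accelerated, full-featured version:

```python
import numpy as np

def diff_in_means(y, w):
    return y[w == 1].mean() - y[w == 0].mean()

def randomization_test(y_obs, w_obs, accepted):
    """Design-respecting p-value: resample from the accepted assignments only.

    Under the sharp null of no effect, outcomes are fixed, so we recompute
    the statistic for every accepted assignment and count how often it is
    at least as extreme as the observed value."""
    t_obs = abs(diff_in_means(y_obs, w_obs))
    t_null = np.array([abs(diff_in_means(y_obs, w)) for w in accepted])
    return float((t_null >= t_obs).mean())

# Toy accepted set (in practice this comes from the rerandomization step)
rng = np.random.default_rng(1)
n = 100
accepted = []
for _ in range(500):
    w = np.zeros(n, dtype=int)
    w[rng.choice(n, n // 2, replace=False)] = 1
    accepted.append(w)

w_obs = accepted[0]
y_obs = rng.normal(size=n) + 1.0 * w_obs   # simulated outcome, true effect 1.0
p = randomization_test(y_obs, w_obs, accepted)
```

Because the null distribution is built from the accepted set only, the test matches the design; resampling from all possible assignments would use the wrong reference distribution.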

Features

FastRerandomize is designed for clarity and speed across experimental scales.

Sharper balance, better precision

Keep only assignments passing a balance threshold (M ≤ a), improving precision when covariates predict outcomes. The stronger the covariate–outcome relationship (higher R²), the larger the precision gain.
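A toy simulation makes the precision claim concrete. In this hedged sketch the true effect is zero, the outcome is strongly driven by covariates (high R²), and we compare the spread of the difference-in-means estimator across all draws versus the best-balanced 10% (q = 0.1); the coefficients and q are arbitrary illustration choices, not package defaults:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, reps = 100, 5, 2000
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + rng.normal(size=n)   # covariates predict outcomes (high R^2)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))

def draw():
    w = np.zeros(n, dtype=int)
    w[rng.choice(n, n // 2, replace=False)] = 1
    return w

def balance(w):   # Mahalanobis-style balance statistic
    diff = X[w == 1].mean(0) - X[w == 0].mean(0)
    return diff @ S_inv @ diff

def dim(w):       # difference-in-means estimate (true effect is zero here)
    return y[w == 1].mean() - y[w == 0].mean()

draws = [draw() for _ in range(reps)]
ests = np.array([dim(w) for w in draws])
bals = np.array([balance(w) for w in draws])

# Rerandomization: keep the best-balanced 10% of draws (q = 0.1)
keep = bals <= np.quantile(bals, 0.10)
var_reduction = 1.0 - ests[keep].var() / ests.var()
```

With covariates this predictive, the accepted draws show substantially less estimator variance, which is exactly the precision gain rerandomization buys.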

Practical at modern scale

Handles large samples and high-dimensional covariates so rerandomization remains usable in real-world, data-rich experiments.

Accelerated balance checks

Evaluate balance for large batches efficiently (CPU/GPU/TPU) via batched, auto-vectorized kernels with just-in-time compilation. This makes stringent q (very selective acceptance) feasible even at large scale.
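The batching idea can be illustrated with a single vectorized expression that scores thousands of candidate assignments at once. NumPy stands in here for the package's JAX/XLA backend (which additionally JIT-compiles and dispatches these kernels to GPU/TPU); the sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, B = 200, 10, 5000            # units, covariate dimension, batch size

X = rng.normal(size=(n, d))
n1 = n // 2
S_inv = np.linalg.inv(np.cov(X, rowvar=False))

# One batch of B candidate assignments, each with n/2 treated units
W = np.zeros((B, n))
for b in range(B):
    W[b, rng.choice(n, n1, replace=False)] = 1

# Vectorized treated-minus-control mean differences for the whole batch: (B, d)
diffs = (W @ X) / n1 - ((1 - W) @ X) / (n - n1)

# Batched Mahalanobis statistics via one einsum instead of a Python loop
scale = 1.0 / (1.0 / n1 + 1.0 / (n - n1))
M = scale * np.einsum('bi,ij,bj->b', diffs, S_inv, diffs)

# Implement acceptance probability q directly via the empirical quantile
q = 0.05
a = np.quantile(M, q)
accepted_W = W[M <= a]
```

Because the expensive step is one batched matrix expression, making q more stringent only means drawing more batches, not changing the kernel.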

Technical details

Key-only storage

Store compact PRNG keys and regenerate full assignments on demand—dramatically reducing memory requirements. This allows massive accepted pools without memory blowups.
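The key-only idea is easy to sketch: because the generator is deterministic given its key, an accepted assignment can be stored as a single integer and rebuilt exactly when needed. This NumPy illustration uses integer seeds as stand-ins for the package's PRNG keys; the specific key values are placeholders:

```python
import numpy as np

n, n_treated = 1000, 500

def assignment_from_key(key):
    """Deterministically regenerate a full assignment vector from its PRNG key."""
    rng = np.random.default_rng(key)
    w = np.zeros(n, dtype=np.int8)
    w[rng.choice(n, n_treated, replace=False)] = 1
    return w

# Store only the integer keys of accepted draws (a few bytes each),
# instead of full n-length assignment vectors
accepted_keys = [11, 42, 2024]     # stand-ins for keys that passed the balance check

# Regenerate a full assignment on demand; reconstruction is exact and repeatable
w = assignment_from_key(accepted_keys[0])
```

Storage per accepted draw shrinks from O(n) to O(1), which is what lets an accepted pool grow to millions of entries without exhausting memory.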

Performance

CPU vs GPU scaling

Benchmark results from the FastRerandomize paper, highlighting accelerated performance as sample sizes grow.

In a representative benchmark (n = 100, d = 100, 2×10^5 draws), the GPU backend completes pool generation in about 5 s, versus about 112 s for a baseline R workflow (roughly a 22x speedup). At n = 1000 and d = 1000, runtime drops from about 91 s on CPU to about 7 s on GPU, a roughly 92% reduction; peak speedups reach ~42x in high-dimensional, stringent settings.

Benchmark note: speedups depend on n, covariate dimension d, acceptance probability q, the number of draws, and the hardware. See the paper for exact configurations.
Figure: performance benchmarks, n = 100.
Figure: performance benchmarks, n = 1000.

Minimal API example

Acceptance probability q sets stringency (how selective the acceptance rule is; lower q is more stringent). Holding q fixed, drawing more candidates does not change expected balance (the rule hasn't changed). It mainly (i) increases the size of the accepted pool used for design-respecting inference and (ii) yields finer p-value resolution because randomization-test p-values are discrete when based on a finite accepted set.

At q = 1%, 10^5 draws yields about 1,000 accepted assignments (min p about 0.0010), while 2×10^5 draws yields about 2,000 accepted (min p about 0.0005). Use diagnose_rerandomization() to choose q based on n, d, R², σ, and a target effect size.
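The arithmetic behind those pool sizes is a one-liner, assuming the minimum attainable p-value is roughly one over the accepted-pool size:

```python
# Back-of-envelope check: expected accepted-pool size and p-value resolution
q = 0.01
for draws in (10**5, 2 * 10**5):
    expected_accepted = round(q * draws)    # expected size of the accepted pool
    min_p = 1.0 / expected_accepted         # coarsest attainable p-value step
    print(draws, expected_accepted, min_p)
```

So doubling the draws at fixed q doubles the accepted pool and halves the p-value granularity, without changing expected balance.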

# Step 1: Set up the environment
library(fastrerandomize)

# Build the backend with Python dependencies
build_backend()

# Step 2: Create some example covariate data
set.seed(999L)
n <- 1000
X <- matrix(rnorm(n * 5), n, 5)

# Step 3: Generate balanced treatment assignments using the main function
result <- generate_randomizations(
  n_units = n,
  n_treated = n / 2,                  # Number of treated units (50%)
  X = X,                              # Covariate matrix
  randomization_accept_prob = 0.01,   # Acceptance probability
  randomization_type = "monte_carlo"  # Use Monte Carlo sampling
)

# Examine the results
head(result$randomizations)  # Treatment assignments
summary(result$balance)      # Balance statistics
plot(result)                 # Plot distribution of balance measures
# Advanced Monte Carlo batching
library(fastrerandomize)

# Create example data with many covariates
set.seed(987)
n <- 5000
p <- 20
X <- matrix(rnorm(n * p), n, p)

# For large datasets, use monte_carlo explicitly
result_mc <- generate_randomizations_mc(
  n_units = n,
  n_treated = n / 2,
  X = X,
  randomization_accept_prob = 0.001,  # Stricter balance criterion
  max_draws = 1e6,                    # Maximum number of randomizations to draw
  batch_size = 10000,                 # Process in batches for memory efficiency
  approximate_inv = TRUE              # Use diagonal approximation for speed
)

# For faster computation with GPU acceleration,
# especially useful for large datasets
gpu_result <- generate_randomizations(
  n_units = n,
  n_treated = n / 2,
  X = X,
  randomization_accept_prob = 0.0001, # Very stringent
  max_draws = 1e6,
  batch_size = 10000,
  randomization_type = "monte_carlo",
  verbose = TRUE                      # Show progress information
)
# Using the randomization test functionality
library(fastrerandomize)

# 1. Generate covariates
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)

# 2. Generate candidate randomizations
randomizations <- generate_randomizations(
  n_units = n,
  n_treated = n / 2,
  X = X,
  randomization_accept_prob = 0.05,
  randomization_type = "monte_carlo",
  max_draws = 10000,
  batch_size = 1000
)

# 3. Generate a simulated outcome with a known effect
obsW <- randomizations$randomizations[1, ]  # Use the first randomization
true_effect <- 0.5                          # True treatment effect
obsY <- rnorm(n) + obsW * true_effect

# 4. Conduct the randomization test
test_result <- randomization_test(
  obsW = obsW,
  obsY = obsY,
  candidate_randomizations = randomizations$randomizations,
  findFI = TRUE,  # Calculate fiducial interval
  alpha = 0.05    # Significance level
)

# 5. Examine the test results
print(test_result)
plot(test_result)  # Visualize effect and fiducial interval

Citation

How to cite FastRerandomize

Connor T. Jerzak, Rebecca Goldstein, Aniket Kamat, and Fucheng Warren Zhu. FastRerandomize: An R Package for Fast Rerandomization Using Accelerated Computing. SoftwareX, 2026. Also available on arXiv.

@article{jerzak2025fastrerandomize,
  title={FastRerandomize: An R Package for Fast Rerandomization Using Accelerated Computing},
  author={Jerzak, Connor T. and Goldstein, Rebecca and Kamat, Aniket and Zhu, Fucheng Warren},
  journal={SoftwareX},
  year={2026}
}

Try fastrerandomize:

This interactive capsule reproduces the paper's workflow and performance comparisons in a browser-friendly environment. It mirrors the same accelerated backend used in the package.