FastRerandomize Package Tutorial
2025-01-22
Introduction
This vignette demonstrates how to use the fastrerandomize package for generating and testing experimental designs in an agricultural study setting based on:
- Burchardi, K.B., Gulesci, S., Lerva, B. and Sulaiman, M., 2019. Moral hazard: Experimental Evidence from Tenancy Contracts. The Quarterly Journal of Economics, 134(1), pp.281-347.
We will walk through:
- Generating synthetic covariate data
- Creating candidate randomizations
- Exploring the S3 methods for printing, summarizing, and plotting randomization objects
- Conducting a randomization-based inference test
Note: This tutorial assumes you have installed the fastrerandomize package, either from source or via GitHub:
# If you haven't installed or set up the package:
# devtools::install_github("cjerzak/fastrerandomize-software/fastrerandomize")
# (Done once) Optionally build the JAX backend if needed
# fastrerandomize::build_backend()
# note that this vignette can be found like so:
# vignette(package = "fastrerandomize")
1. Obtain Pre-treatment Covariate Data
For illustration, we use several pre-treatment covariates from the original experiment.
library(fastrerandomize)
# Obtain pre-treatment covariates
data(QJEData, package = "fastrerandomize")
myCovariates <- c("children","married","hh_size","hh_sexrat")
QJEData <- QJEData[!is.na(rowSums(QJEData[,myCovariates])),]
X <- QJEData[,myCovariates]
# Analysis parameters
n_units <- nrow(X)
n_treated <- round(nrow(X)/2)
2. Generate Randomizations
We now create our candidate randomizations using the function
generate_randomizations()
from fastrerandomize. Depending
on the scale and complexity of your experiment, you may choose either
exact enumeration (“exact”) or Monte Carlo (“monte_carlo”)
randomization.
In this example, we’ll use the Monte Carlo approach. Note that
randomization_accept_prob
is set very low (0.0001) purely
for demonstration.
RunMainAnalysis <- (!is.null(check_jax_availability()))
# if FALSE, consider calling build_backend()
if(RunMainAnalysis){
CandRandomizations <- generate_randomizations(
n_units = n_units,
n_treated = n_treated,
X = X,
randomization_type = "monte_carlo",
max_draws = 100000L,
batch_size = 1000L,
randomization_accept_prob = 0.0001
)
}
## Using monte carlo randomization
## Starting batch processing with 100 batches.
## Skipping to batch 100 of 100
## On batch 100 of 100
## MC Loop Time (s): 1.2409
4. S3 Methods: Print, Summary, and Plot
generate_randomizations()
returns an S3 object of class
fastrerandomize_randomizations
. We can use its associated
print, summary, and plot methods to inspect the candidate
randomizations.
if(RunMainAnalysis){
# 4a. Print the object
print(CandRandomizations)
# 4b. Summary
summary(CandRandomizations)
# 4c. Plot the balance distribution
plot(CandRandomizations)
}
## Object of class 'fastrerandomize_randomizations'
##
## Call:
## generate_randomizations(n_units = n_units, n_treated = n_treated,
## X = X, randomization_accept_prob = 1e-04, max_draws = 100000L,
## batch_size = 1000L, randomization_type = "monte_carlo")
##
## Number of candidate randomizations: 10
## Number of units (columns): 302
## Balance statistics available for each randomization.
## Summary of 'fastrerandomize_randomizations' object:
##
## Balance statistics (summary):
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.003574 0.008231 0.008746 0.009835 0.013152 0.017181
3. Randomization Test
Next, we showcase a randomization-based inference procedure using fastrerandomize.
3a. Setup Simulated Outcomes
We simulate an outcome Y
that depends on our covariates
X
, the chosen randomization, and some true treatment
effect τ. We pick the first acceptable randomization from our set as
our “observed” assignment Wobs
.
if(RunMainAnalysis){
CoefY <- rnorm(ncol(X)) # Coefficients for outcome model
Wobs <- CandRandomizations$randomizations[1, ] # 1st acceptable randomization
tau_true <- 1 # true treatment effect
# Generate Y = X * Coef + W*tau + noise
Yobs <- as.numeric(as.matrix(X) %*% as.matrix(CoefY) +
Wobs * tau_true + rnorm(n_units, sd = 0.1))
}
3b. Run the Randomization Test
We pass our observed treatment assignment (obsW
),
observed outcomes (obsY
), and the full matrix of
candidate randomizations
(candidate_randomizations
) to
randomization_test()
. This test:
- Computes the observed difference in means
- Permutes the treatment assignment across our candidate randomizations
- Estimates the p-value by comparing the observed statistic to its permutation distribution
if(RunMainAnalysis){
randomization_test_results <- randomization_test(
obsW = Wobs,
obsY = Yobs,
candidate_randomizations = CandRandomizations$randomizations,
findFI = TRUE
)
# Inspect results
print(randomization_test_results)
summary(randomization_test_results)
plot(randomization_test_results)
}
## Object of class 'fastrerandomize_test'
##
## Call:
## randomization_test(obsW = Wobs, obsY = Yobs, candidate_randomizations = CandRandomizations$randomizations,
## findFI = TRUE)
##
## P-value: 0.1
## Observed effect (tau_obs): 0.9888765
## Fiducial interval: [-547.2352, 498.731]
## Summary of 'fastrerandomize_test' object:
##
## Call:
## randomization_test(obsW = Wobs, obsY = Yobs, candidate_randomizations = CandRandomizations$randomizations,
## findFI = TRUE)
##
## P-value:
## 0.1
##
## Observed effect (tau_obs):
## 0.9888765
##
## Fiducial Interval:
## [-547.2352, 498.731]
The findFI = TRUE
flag attempts to locate a fiducial
interval (FI) for the treatment effect.
Conclusion
This tutorial demonstrated a basic workflow with fastrerandomize:
- Generating synthetic or real covariate data
- Producing a set of acceptable treatment assignments via randomization
- Visualizing and summarizing these assignments
- Testing for treatment effects with a randomization-based approach
We encourage you to consult the package documentation for more advanced functionalities, including GPU-accelerated computations via JAX and flexible definitions of balance criteria.