Bootstrapping is predicated on using the sample data as a plug-in estimator for the population. Inference is then conducted in this “bootstrap world,” wherein the population of interest is identical to the observed sample. Because the population is known there, repeated sampling can characterize the desired sampling distribution up to Monte Carlo error, which in turn supports exact confidence intervals and relevant hypothesis tests within the bootstrap world. These bootstrap-world intervals and p-values can then be treated as estimates of their real-world counterparts.
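The resampling idea above can be sketched in a few lines: treat the observed sample as the bootstrap-world population, draw repeated samples with replacement, and read a percentile confidence interval off the resulting Monte Carlo sampling distribution. The function name, sample values, and default settings here are illustrative, not part of the briefing.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample the observed data with replacement,
    recompute the statistic each time, and take quantiles of the results."""
    rng = random.Random(seed)
    n = len(sample)
    # Monte Carlo approximation to the bootstrap-world sampling distribution
    boot_stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo = boot_stats[int((alpha / 2) * n_boot)]
    hi = boot_stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical observed sample standing in for operational test data
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.9, 5.1, 4.4, 6.2, 5.0]
lo, hi = bootstrap_ci(sample)
```

Passing `stat=statistics.median` instead of the mean illustrates why the approach is attractive: the same resampling loop serves statistics whose sampling distributions have no convenient closed form.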
This briefing provides an outline of this approach and includes examples applying these principles to synthetic data sets generated to mimic operational test data. The role of the sampling distribution in statistical inference is described, and bootstrapping is motivated intuitively using the metaphor of the bootstrap world introduced above. Examples include confidence intervals for sample means and medians, application of the bootstrap to complex statistics involving random variables from multiple distributions (such as Availability calculations), and hypothesis testing via the bootstrap.
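For a statistic built from several samples, such as an availability estimate combining uptimes and downtimes, the same recipe applies: resample each constituent data set independently and recompute the combined statistic on each replicate. The sketch below assumes the common point estimate mean-uptime / (mean-uptime + mean-downtime); the data values and function names are hypothetical.

```python
import random
import statistics

def availability(uptimes, downtimes):
    # Assumed estimator: mean time up / (mean time up + mean time down)
    mu, md = statistics.mean(uptimes), statistics.mean(downtimes)
    return mu / (mu + md)

def bootstrap_two_sample_ci(xs, ys, stat, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a statistic of two independent samples:
    resample each sample separately, then recombine."""
    rng = random.Random(seed)
    boot = sorted(
        stat(rng.choices(xs, k=len(xs)), rng.choices(ys, k=len(ys)))
        for _ in range(n_boot)
    )
    return boot[int((alpha / 2) * n_boot)], boot[int((1 - alpha / 2) * n_boot) - 1]

# Hypothetical times between failures and repair times (hours)
up = [120.0, 95.0, 210.0, 160.0, 80.0, 140.0, 190.0, 110.0]
down = [4.0, 6.5, 3.0, 8.0, 5.0, 2.5]
lo, hi = bootstrap_two_sample_ci(up, down, availability)
```

Resampling the two samples independently mirrors the assumption that uptimes and downtimes arise from separate distributions; no analytic sampling distribution for the ratio is needed.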