In the world of DoD test and evaluation, collecting sufficient data to evaluate system performance against operationally realistic threats is often not possible due to cost and resource constraints, safety concerns, or a lack of adequate or representative threats. Thus, modeling and simulation (M&S) tools are frequently used to augment live testing and enable a more complete evaluation of performance. When M&S is used as part of an operational evaluation, the M&S capability should first be rigorously validated to ensure it represents the real world adequately for the intended use. Specifically, the usefulness and limitations of the M&S should be well characterized, and its uncertainty quantified to the extent possible. Many statistical techniques are available to rigorously compare M&S output with live test data. This document will describe some of these methodologies and present recommendations for a variety of data types and sizes. We will show how design for computer experiments can be used to efficiently cover the simulation domain and inform live testing. Experimental design and corresponding statistical analysis techniques for comparing live and simulated data will be discussed and compared. A simulation study shows that regression analysis provides the most powerful comparison when experimental design techniques are used, while more robust non-parametric techniques provide widely applicable solutions for the comparison.
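As a minimal illustrative sketch of the two ideas named above, the following Python example builds a space-filling (Latin hypercube) design over a hypothetical three-factor simulation domain and then applies a robust nonparametric comparison (the two-sample Kolmogorov-Smirnov test) between simulated and live responses. The factor bounds, response function, and sample sizes are all invented for illustration; they are not drawn from this document.

```python
# Hedged sketch: space-filling design + nonparametric live-vs-simulation
# comparison. All factor names, bounds, and responses are hypothetical.
import numpy as np
from scipy.stats import qmc, ks_2samp

rng = np.random.default_rng(42)

# Latin hypercube design covering a 3-factor simulation domain, then scaled
# from the unit cube to the (assumed) physical factor ranges.
sampler = qmc.LatinHypercube(d=3, seed=42)
design = qmc.scale(sampler.random(n=30), l_bounds=[0, 0, 0], u_bounds=[1, 10, 5])

# Hypothetical responses: 30 simulation runs, plus 10 live test points at a
# subset of the same conditions with larger (live-test) noise.
sim_response = design.sum(axis=1) + rng.normal(0.0, 0.1, size=30)
live_response = design[:10].sum(axis=1) + rng.normal(0.0, 0.3, size=10)

# Two-sample Kolmogorov-Smirnov test: a widely applicable nonparametric
# check that the live and simulated response distributions agree.
stat, pvalue = ks_2samp(live_response, sim_response[:10])
print(f"KS statistic = {stat:.3f}, p-value = {pvalue:.3f}")
```

In practice, a regression-based comparison fit jointly to the live and simulated runs (e.g., testing a live/sim indicator term) would exploit the experimental design and offer more statistical power, at the cost of the distributional assumptions the nonparametric test avoids.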