Abstract: Intentional Missing Data in a Single Mediator Model (Society for Prevention Research 21st Annual Meeting)

64 Intentional Missing Data in a Single Mediator Model

Schedule:
Wednesday, May 29, 2013
Pacific N/O (Hyatt Regency San Francisco)
* noted as presenting author
Amanda Neeche Baraldi, MS, Graduate Student, Arizona State University, Tempe, AZ
David Peter MacKinnon, PhD, Professor, Arizona State University, Tempe, AZ
Introduction: Prevention researchers often use mediation analyses to understand the causal chain of relations among three variables of interest. In terms of data collection, researchers must choose appropriate measures to evaluate a particular construct and determine how many participants to include in the sample. Because researchers have finite resources (time and money), it is important to determine the best way to allocate these resources to answer a research question. Researchers may choose between collecting a large sample of data with an unreliable but inexpensive measure or collecting a smaller sample of data with a more reliable but costly measure. A design with purposeful missing data may maximize both power and accuracy by “borrowing” information from a large sample measured with an unreliable mediator and a small subsample measured with a reliable mediator. This poster evaluates a simplified version of the two-method measurement design described by Graham et al. (2006).

Method: A Monte Carlo computer simulation was used to evaluate the performance of two maximum likelihood approaches to estimating a mediation model. In the first approach, the Auxiliary Variable Model, the reliable measure serves as the mediator in a single mediator model and the unreliable measure is used as an auxiliary variable to increase power and accuracy. In the second approach, the Latent Indicator Model, both the reliable and unreliable measures are included in the model as indicators of a latent mediator variable. The simulation crossed three mediation effect sizes (small, medium, and large), four sample size conditions (N between 100 and 5000), two missing data rates (small and large), and two estimation models (Auxiliary Variable Model vs. Latent Indicator Model).
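The data-generating side of such a design can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function name, the reliability values, the path coefficients, and the assumption that the reliable measure is missing completely at random are all hypothetical choices made for illustration. It generates X, a true mediator, and Y, then records an inexpensive, unreliable measure of the mediator for everyone and a costly, reliable measure for only a random subsample.

```python
import random


def simulate_two_method(n, a, b, rel_good=0.9, rel_cheap=0.5,
                        missing_rate=0.5, seed=0):
    """Generate one data set for a two-method planned missing design.

    All arguments are illustrative assumptions, not values from the abstract.
    a, b         : standardized X -> M and M -> Y paths (|a|, |b| < 1)
    rel_good     : reliability of the costly mediator measure
    rel_cheap    : reliability of the inexpensive mediator measure
    missing_rate : share of cases in which the costly measure is not collected
    Returns a list of (x, m_cheap, m_good_or_None, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # True mediator and outcome, each scaled to unit variance.
        m_true = a * x + rng.gauss(0, (1 - a ** 2) ** 0.5)
        y = b * m_true + rng.gauss(0, (1 - b ** 2) ** 0.5)
        # Reliability = true variance / observed variance, so with unit
        # true variance the error variance is (1 - rel) / rel.
        m_cheap = m_true + rng.gauss(0, ((1 - rel_cheap) / rel_cheap) ** 0.5)
        m_good = m_true + rng.gauss(0, ((1 - rel_good) / rel_good) ** 0.5)
        # Planned missingness: drop the costly measure completely at random.
        if rng.random() < missing_rate:
            m_good = None
        rows.append((x, m_cheap, m_good, y))
    return rows


if __name__ == "__main__":
    data = simulate_two_method(n=200, a=0.39, b=0.39)
    observed = sum(1 for row in data if row[2] is not None)
    print(f"reliable mediator observed for {observed} of {len(data)} cases")
```

A data set produced this way could then be passed to whichever estimation model is under study; the estimation step itself is outside the scope of this sketch.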

Results: The mediated effect is severely underestimated in the Auxiliary Variable Model, whereas the estimates from the Latent Indicator Model are much less biased. However, the Latent Indicator Model results may themselves be biased because nearly one third of its replications were dropped due to a non-positive definite covariance matrix. For small sample sizes and small mediated effects, both methods were severely underpowered to detect the mediated effect. For example, for small mediated effect sizes with N = 200 and a 50% missing data rate, power for both models ranged from 2.1% to 4.4%, depending on the exact parameter specification simulated. For medium to large effect sizes and large sample sizes, the models were sufficiently powered, although power varied with the exact model specification.

Conclusions: Additional simulation conditions would allow for broader conclusions about the alternative methods. Although these models have limitations, the results are promising for the selection of planned missing data models that maximize power and accuracy. Researchers interested in designing a planned missing data study would be prudent to conduct a simulation study, based on the theorized parameters, to identify a data collection design that minimizes costs and increases the power and accuracy of estimates.
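As a rough illustration of the kind of pre-study simulation recommended here, the following Python sketch estimates power to detect a mediated effect by Monte Carlo. It is deliberately simplified and does not reproduce the models in this abstract: it assumes complete data and a perfectly measured mediator, and it substitutes ordinary least squares with a Sobel z-test for the maximum likelihood estimators. The function names and the path values in the usage example are hypothetical.

```python
import math
import random


def slope_and_resid(x, y):
    """Regress y on x with an intercept; return (slope, residuals, Sxx)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    resid = [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]
    return slope, resid, sxx


def sobel_z(x, m, y):
    """Sobel z statistic for the mediated effect a*b."""
    n = len(x)
    # a path: M regressed on X.
    a_hat, m_res, sxx = slope_and_resid(x, m)
    se_a = math.sqrt(sum(r * r for r in m_res) / (n - 2) / sxx)
    # b path: Y regressed on M controlling for X, obtained by
    # residualizing both M and Y on X and regressing residual on residual.
    _, y_res, _ = slope_and_resid(x, y)
    smm = sum(r * r for r in m_res)
    b_hat = sum(rm * ry for rm, ry in zip(m_res, y_res)) / smm
    sse = sum((ry - b_hat * rm) ** 2 for rm, ry in zip(m_res, y_res))
    se_b = math.sqrt(sse / (n - 3) / smm)
    se_ab = math.sqrt(a_hat ** 2 * se_b ** 2 + b_hat ** 2 * se_a ** 2)
    return a_hat * b_hat / se_ab


def estimate_power(n, a, b, reps=200, seed=1):
    """Share of replications in which |Sobel z| exceeds 1.96."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, math.sqrt(1 - a ** 2)) for xi in x]
        y = [b * mi + rng.gauss(0, math.sqrt(1 - b ** 2)) for mi in m]
        if abs(sobel_z(x, m, y)) > 1.96:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Path values below are illustrative, not taken from the abstract.
    for label, path in [("smaller", 0.14), ("larger", 0.59)]:
        print(label, "paths, N = 200:", estimate_power(200, path, path))
```

Extending a sketch like this to a planned missing data design would mean adding the measurement error and missingness mechanism of interest and replacing the estimator with the intended analysis model.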