Schedule:
Thursday, June 2, 2016
Grand Ballroom B (Hyatt Regency San Francisco)
* noted as presenting author
Wei Wang, PhD, Associate Professor of Biostatistics, University of South Florida, Tampa, FL
Paul Ellis Greenbaum, PhD, Research Professor, University of South Florida, Temple Terrace, FL
Craig E. Henderson, PhD, Associate Professor of Psychology, Sam Houston State University, Huntsville, TX
Chen Henian, PhD, Associate Professor of Biostatistics, University of South Florida, Tampa, FL
Integrative Data Analysis (IDA; Curran & Hussong, 2009) provides a comprehensive approach for pooling multiple studies and analyzing them jointly. Analyses are usually separated into two steps: measurement harmonization and advanced statistical modeling. Evidence is mixed on whether IDA provides sufficient accuracy to answer substantive research questions. Common sources of bias are often more severe in IDA than in single-study analyses, and they can be overlooked. First, discrepancies between the population of interest and the study samples may lead to heavily sample-driven conclusions. Second, even with randomization, balance between the treatment and control groups is hard to maintain from baseline onward, and imbalance may inflate or shrink estimated intervention effects. Third, missing data problems can be compounded when data from multiple studies are combined. In addition, the complexity of IDA often exceeds the computational capacity of existing software packages, so simplification of the statistical model is inevitable, which may introduce further bias into the results. We attempt to incorporate existing statistical bias correction strategies, including multiple imputation (Schafer, 1997), propensity score matching (Rubin, 2006), and entropy balancing (Hainmueller, 2011), into IDA at different stages of implementation. Through extensive Monte Carlo simulation studies, we aim to provide practical guidance on which method to use under different scenarios, based on practicality and effectiveness.
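To make the entropy balancing idea concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It reweights a comparison group so that its weighted covariate means match those of a treated group, while keeping the weights as close to uniform as possible. The function name `entropy_balance`, the toy covariates (age and a baseline score), and the choice to balance only first moments are assumptions made for illustration. A simple fixed-step gradient descent on the dual problem is used here for brevity; it assumes the target means lie strictly inside the range spanned by the comparison group.

```python
import math


def entropy_balance(control, target_means, max_iter=20000, tol=1e-10):
    """Return weights (summing to 1) for the rows of `control` such that the
    weighted covariate means match `target_means`, while staying as close as
    possible (in Kullback-Leibler divergence) to uniform weights.

    Illustrative only: balances first moments and uses plain gradient descent.
    """
    n, k = len(control), len(target_means)
    # Centre covariates at the target, so the constraint is "weighted mean = 0".
    z = [[row[j] - target_means[j] for j in range(k)] for row in control]
    # Conservative step size from an upper bound on the dual's curvature.
    bound = max(sum(v * v for v in row) for row in z)
    step = 1.0 / bound if bound > 0 else 1.0
    lam = [0.0] * k  # one Lagrange multiplier per balanced covariate
    for _ in range(max_iter):
        scores = [sum(l * v for l, v in zip(lam, row)) for row in z]
        top = max(scores)  # subtract the max for numerical stability
        raw = [math.exp(s - top) for s in scores]
        total = sum(raw)
        w = [r / total for r in raw]
        # Gradient of the dual = remaining imbalance in each covariate.
        grad = [sum(w[i] * z[i][j] for i in range(n)) for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        lam = [l - step * g for l, g in zip(lam, grad)]
    return w


# Hypothetical toy data: (age, baseline score) for each participant.
control = [(13, 1.0), (14, 3.0), (15, 2.0), (16, 4.0), (17, 2.5), (15, 3.5)]
treated = [(15, 3.0), (16, 3.5), (16, 2.5), (17, 3.0)]
treated_means = [sum(r[j] for r in treated) / len(treated) for j in range(2)]

weights = entropy_balance(control, treated_means)
balanced_means = [
    sum(w * row[j] for w, row in zip(weights, control)) for j in range(2)
]
print("treated means: ", treated_means)
print("balanced means:", balanced_means)
```

In a pooled analysis, weights of this kind could be passed to the outcome model as case weights. Where in the pipeline that step belongs (before or after harmonization and imputation) is exactly the question the simulations in this abstract address.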
For illustration, this paper applies these methods to re-analyze a previously published IDA project (Greenbaum et al., 2014). In that analysis, we examined sex and ethnicity as moderators of Multidimensional Family Therapy (MDFT) effectiveness for adolescent drug abuse. Data were pooled from five independent MDFT randomized trials to increase power to examine impacts on subgroups. Participants were 646 youth aged 11 to 17 years (M = 15.31, SD = 1.30) receiving treatment for drug use; 19% were female, 14% European American, 35% Hispanic, and 51% African American. Youth were randomized to MDFT or active comparison treatments, which varied by study. Drug use involvement (frequency and consequences) was measured at baseline, 6 months, and 12 months by a four-indicator latent variable. We found that applying entropy balancing at the modeling stage, in conjunction with multiple imputation prior to measurement harmonization, was more effective. In summary, the new approach yielded more stable outcomes, as reflected by smaller variation across estimates from multiple calibration samples. Results confirmed the general effectiveness of MDFT and differences in treatment strength across gender and race/ethnicity groups.