Abstract: Comparison of Two Methods on Propensity Score Analysis in Practice (Society for Prevention Research 25th Annual Meeting)

551 Comparison of Two Methods on Propensity Score Analysis in Practice

Schedule:
Friday, June 2, 2017
Bunker Hill (Hyatt Regency Washington, Washington DC)
* noted as presenting author
Ji Hoon Ryoo, PhD, Assistant Professor, University of Virginia, Charlottesville, VA
Elise Touris Pas, PhD, Assistant Scientist, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD
Rashelle Musci, PhD, Assistant Professor, The Johns Hopkins University, Baltimore, MD
Catherine Bradshaw, PhD, Professor and Associate Dean for Research & Faculty Development, University of Virginia, Charlottesville, VA
Introduction: Guo and Fraser (2015) identified seven different models for estimating treatment effects using propensity score analysis (PSA). Among those seven models, the two most commonly used are propensity score matching models (PSMM) and propensity score weighting models (PSWM). PSMM aims to approximate randomization of treatment status among a subset of schools that can be matched to comparison schools, whereas PSWM removes differences in the characteristics of the treatment and comparison groups across the whole sample. Inconsistencies between the matched and weighted samples have often been observed in practice due to the complexity of the study design. Such inconsistency may arise for several reasons; for example, PSA is conducted under the assumption of ignorability, which may not be justified, or PSA may remove overt bias in the data while failing to remove hidden bias or capture all sources of selection bias. In this paper, we present results from an empirical study showing the inconsistency between PSMM and PSWM using various diagnostic tools, and we recommend a framework for achieving valid PSA in a complex study design.
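
For readers unfamiliar with the two approaches, the following is a minimal Python sketch, not the authors' code: propensity scores are estimated by logistic regression, PSMM is illustrated with simple nearest-neighbor matching, and PSWM with inverse-probability-of-treatment weights. The covariate matrix X and treatment indicator z are hypothetical placeholders.

```python
# Illustrative only: minimal versions of the two approaches, assuming a covariate
# matrix X and a binary treatment indicator z (1 = trained in PBIS, 0 = comparison).
import numpy as np
from sklearn.linear_model import LogisticRegression

def propensity_scores(X, z):
    """Estimate propensity scores via logistic regression of treatment on covariates."""
    model = LogisticRegression(max_iter=1000).fit(X, z)
    return model.predict_proba(X)[:, 1]

def nearest_neighbor_match(ps, z):
    """PSMM sketch: pair each treated unit with the comparison unit nearest in propensity score."""
    treated = np.where(z == 1)[0]
    controls = np.where(z == 0)[0]
    # Matching with replacement, for simplicity; the study's matching procedure may differ.
    return {i: controls[np.argmin(np.abs(ps[controls] - ps[i]))] for i in treated}

def iptw_weights(ps, z, estimand="ATT"):
    """PSWM sketch: inverse-probability-of-treatment weights over the whole sample."""
    if estimand == "ATT":
        # Treated units keep weight 1; comparison units are weighted toward the treated group.
        return np.where(z == 1, 1.0, ps / (1.0 - ps))
    # ATE weights
    return np.where(z == 1, 1.0 / ps, 1.0 / (1.0 - ps))
```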

Method: Data come from elementary, middle, and high schools across the state of Maryland and are part of a study examining the effects of Positive Behavioral Interventions and Supports (PBIS). Two different PSA models (i.e., PSMM and PSWM) were fit separately for elementary schools and secondary (i.e., middle and high) schools to remove selection bias in the samples of schools that chose to be trained in PBIS. We then compared the matched and weighted samples using statistical hypothesis tests, standardized differences, box plots, non-parametric density estimates, and Q-Q plots to assess residual confounding after each PSA.
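
To make one of these diagnostics concrete, the sketch below shows how a standardized difference might be computed. It is illustrative only, assuming a single numeric covariate x, a treatment indicator z, and optional weights w; it is not the balance software used in the study.

```python
# Illustrative only: absolute standardized mean difference for one covariate x,
# treatment indicator z, and optional weights w (e.g., IPTW weights from the PSWM).
import numpy as np

def standardized_difference(x, z, w=None):
    """Absolute standardized mean difference between treated and comparison groups.

    With w=None this describes the raw (or matched) sample; passing weights gives
    the weighted-sample balance. Values below roughly 0.10 are often read as
    indicating adequate balance on that covariate.
    """
    if w is None:
        w = np.ones_like(x, dtype=float)
    t, c = (z == 1), (z == 0)
    m1 = np.average(x[t], weights=w[t])
    m0 = np.average(x[c], weights=w[c])
    v1 = np.average((x[t] - m1) ** 2, weights=w[t])
    v0 = np.average((x[c] - m0) ** 2, weights=w[c])
    return abs(m1 - m0) / np.sqrt((v1 + v0) / 2.0)
```

Computing this quantity covariate by covariate for the matched subsample and for the weighted full sample, and comparing the two sets of values, is one simple way to surface the kind of inconsistency examined here.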

Results: We first demonstrate paradoxical findings indicating inconsistent results from the PSMM and the PSWM. Although the two PSA methods differed in the degree to which distances in selected baseline measures were reduced, both the matched and the weighted samples removed a substantial amount of selection bias. We revisited the variables used in the PSAs to identify their contributions to bias reduction and improvement in efficiency, which helps to explain the roles of strong ignorability and hidden bias in PSA. The procedure yielded subsets of variables that minimize the inconsistency between the PSAs.

Conclusions: It is helpful to have different PSA models that can be used under different circumstances. However, it remains unclear under what circumstances a specific PSA model works better than the others. This study provides a framework for identifying inconsistency that arises from choosing between the PSMM and the PSWM. Furthermore, the framework is also applicable to examining inconsistency among other PSA methods.