One approach has been large-scale efforts to establish methodological common ground across studies so that key data can be harmonized. Paper One describes lessons learned from harmonizing data from the 21 studies that make up the National Institute on Drug Abuse-funded Seek, Test, Treat and Retain (STTR) cohort consortium, which studies substance use with a focus on HIV treatment cascade outcomes. The 11 common domains across the studies included demographic characteristics, criminal justice involvement, HIV risk behaviors, HIV and/or hepatitis C infections, laboratory measures of CD4 cell count and HIV viral load, mental health, socioeconomic status, healthcare access, and substance use. STTR data were harmonized to allow complex analyses of outcomes across diverse populations.
The field has also seen the development and implementation of statistical methods that facilitate integrated data analysis even when measures are not comparable across studies (or time points). Using moderated non-linear factor analysis (MNLFA), it is possible to create a commensurate latent variable across multiple measurement scales as long as the scales have items in common. Paper Two uses simulated data to represent multiple samples arising from independent substance use studies that differ from one another in measurement and in relevant demographic characteristics. The simulated data are pooled and analyzed using MNLFA, and the accuracy of the resulting scores is evaluated. Paper Three applies MNLFA to integrated, publicly available national data to study nicotine dependence in Hispanic teens.
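For readers unfamiliar with MNLFA, the following is a minimal sketch of the general model form for binary items, given here only as orientation. The notation is an assumption and is not drawn from the three papers: y_{ij} is the response of person j to item i, \eta_j is the latent factor (for example, dependence severity), and x_{kj} are moderators such as study membership, age, or sex. Items that appear in more than one study link the scales, while the moderated intercepts and loadings absorb measurement differences across studies and groups.

```latex
% Sketch of a moderated non-linear factor model for binary items.
% All symbols are illustrative assumptions, not notation from the papers.
\begin{align}
  \operatorname{logit} P\left(y_{ij} = 1 \mid \eta_j, \mathbf{x}_j\right)
      &= \nu_{ij} + \lambda_{ij}\,\eta_j
      && \text{(item response model)} \\
  \nu_{ij} &= \nu_{0i} + \sum_{k} \nu_{ki}\, x_{kj}
      && \text{(moderated item intercept)} \\
  \lambda_{ij} &= \lambda_{0i} + \sum_{k} \lambda_{ki}\, x_{kj}
      && \text{(moderated item loading)} \\
  \eta_j &\sim \mathcal{N}\left(\alpha_j, \psi_j\right)
      && \text{(latent factor distribution)} \\
  \alpha_j &= \alpha_0 + \sum_{k} \gamma_k\, x_{kj}
      && \text{(moderated factor mean)} \\
  \psi_j &= \psi_0 \exp\left(\sum_{k} \beta_k\, x_{kj}\right)
      && \text{(moderated factor variance)}
\end{align}
```

Under this form, factor scores estimated for \eta_j are on a common metric across studies, which is what allows the pooled analysis described above.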
CONCLUSIONS. Together, these three papers suggest a path forward for secondary researchers, in which publicly available data can be combined to address past data limitations, leading to a greater understanding of substance abuse within and across sociodemographic groups.