Abstract: Development and Validation of Automatic Text Processing Methods for Title and Abstract Screening (Society for Prevention Research 24th Annual Meeting)

173 Development and Validation of Automatic Text Processing Methods for Title and Abstract Screening

Schedule:
Wednesday, June 1, 2016
Bayview A (Hyatt Regency San Francisco)
* noted as presenting author
J.D. Smith, PhD, Assistant Professor, Northwestern University, Chicago, IL
Carlos Gallo, PhD, Research Assistant Professor, Northwestern University, Chicago, IL
Kaitlyn Egan, B.A., Graduate Student, Baylor University, Waco, TX
C. Hendricks Brown, PhD, Professor, Northwestern University, Chicago, IL
Background. In systematic reviews, a search of the available literature often yields hundreds or even thousands of potentially relevant studies. The next step is to screen the titles and abstracts to identify sources for full-text review and extraction of appropriate data. This step is time-intensive, and therefore costly, when done by human coders, but it can be accomplished effectively and efficiently using automatic text processing methods (Brown et al., 2015; Institute of Medicine [IOM], 2015). The efficient completion of systematic reviews of scientific evidence is critical for keeping pace with the research advances needed to inform policy.

Methods. We outline the technical steps required to export search results into machine-readable text formats that can be searched for key words and phrases appearing in the title or abstract. These steps leverage existing platforms for scientific searches (Web of Science) and bibliographic organization (EndNote) to reduce the number of steps needed to conduct text analysis. We then describe the development of automatic text processing methods using data from a recently completed systematic review of the developmental correlates of childhood obesity. That review was conducted to identify the interrelated ecological risk and protective factors in the progression of obesity that can be targeted with prevention strategies during specific developmental periods. The results of the machine-based text analysis were compared with the results of human screening in terms of specificity and sensitivity.

Results. The systematic search yielded 933 published articles. Human screening of titles and abstracts identified 277 articles for full-text review and extraction of pertinent data. The computational procedures we developed replicated the results of the human coding with acceptable specificity and sensitivity, thus validating the effectiveness of automatic text processing methods for this step of the review process. Further, a comparison of the time required for human and machine screening indicated significant cost savings when using automatic text processing.

Conclusions. Automatic text processing methods are capable of accurately and efficiently screening titles and abstracts. The cost savings alone warrant further investigation, as does the need for a standardized evaluation process to keep pace with changes introduced by recent legislative actions, such as the Affordable Care Act and the Mental Health Parity and Addiction Equity Act (IOM, 2015). The feasibility and next steps for using these methods are discussed.
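
The comparison between machine and human screening described in the Methods can be illustrated with a minimal Python sketch. The abstract does not specify the actual text processing algorithm or key words, so everything here is an assumption made for illustration: the term list `INCLUDE_TERMS`, the simple whole-phrase matching rule in `machine_screen`, and the five toy records (which are invented and are not data from the review). The human decision is treated as the reference standard when computing sensitivity and specificity.

```python
import re

# Hypothetical inclusion terms; the abstract does not list the real key words.
INCLUDE_TERMS = ["obesity", "overweight", "body mass index"]


def machine_screen(title, abstract, terms=INCLUDE_TERMS):
    """Return True if any term occurs as a whole word or phrase
    in the title or abstract (case-insensitive)."""
    text = f"{title} {abstract}".lower()
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)


def sensitivity_specificity(human, machine):
    """Compare machine decisions with human decisions (the reference).

    sensitivity = TP / (TP + FN): share of human-included records
                  that the machine also included.
    specificity = TN / (TN + FP): share of human-excluded records
                  that the machine also excluded.
    """
    pairs = list(zip(human, machine))
    tp = sum(1 for h, m in pairs if h and m)
    fn = sum(1 for h, m in pairs if h and not m)
    tn = sum(1 for h, m in pairs if not h and not m)
    fp = sum(1 for h, m in pairs if not h and m)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy records invented for illustration: (title, abstract, human_include).
records = [
    ("Childhood obesity and sleep", "Risk factors in early development.", True),
    ("Parenting and overweight in preschoolers", "A cohort study.", True),
    ("Physical activity in adolescents", "Weight gain trajectories.", True),
    ("Adult smoking cessation", "A trial of nicotine replacement.", False),
    ("Obesity in laboratory mice", "A genetic model.", False),
]

human = [r[2] for r in records]
machine = [machine_screen(r[0], r[1]) for r in records]
sens, spec = sensitivity_specificity(human, machine)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy set the keyword rule misses one record the human coder included and includes one the human coder excluded, which shows how the two measures capture different kinds of disagreement. A real screening pipeline would likely need a richer rule set or a trained classifier, and would read records from the exported search results instead of a hard-coded list.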