Methods: Our data consist of transcripts and human ratings of 470 sessions of the New Beginnings Program (NBP) for divorcing parents, drawn from an effectiveness trial. First, we trained a Support Vector Machine (SVM) classifier to distinguish high-quality from low-quality sessions, as defined by human coders. The classifier was trained on a corpus rated on two dimensions of quality: whether the intervention group leader (a) provided helpful examples to parents and (b) indicated belief in parents’ ability to use the skills well. Second, we tested the validity of the machine-based ratings against our theoretical model, which posits that when sessions are delivered with quality, parents are more likely to engage in the program and attend the following session.
Results: The machine-based rater correctly classified 94% of sessions as either low or high quality. We present the linguistic patterns, expressed as word frequencies, that helped the algorithm distinguish low-quality from high-quality sessions. We also present quantitative assessments of the predictive validity of the machine-based ratings with respect to parents’ retention and skills practice.
Conclusions: This study presents preliminary evidence that (1) linguistic features suitable for automatic recognition of quality are readily available in session transcripts, and (2) machine-based methods can be used to monitor implementation in community settings. These findings have significant implications for ensuring that EBPs achieve public health impact.
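The classification step described under Methods can be illustrated with a minimal sketch. It assumes bag-of-words counts as features and a bias-free linear SVM trained by subgradient descent on the hinge loss; the abstract does not state the features, kernel, or software actually used, so these are assumptions. The transcripts, labels, stop-word list, and all function names are hypothetical and invented for illustration, and the accuracy printed is training accuracy on toy data, not the study's result.

```python
from collections import Counter

# Hypothetical toy transcripts with labels (+1 = high quality, -1 = low quality).
# Invented for illustration only; NOT data from the study.
SESSIONS = [
    ("for example you could praise your child right after homework", 1),
    ("i know you can use this skill well with your kids", 1),
    ("here is an example from another parent and i believe you can do it", 1),
    ("we are out of time so read the handout later", -1),
    ("next page please just read it at home", -1),
    ("skip the next page and finish the handout later", -1),
]

# Small hypothetical stop-word list so common function words are ignored.
STOPWORDS = {"a", "an", "and", "are", "at", "for", "i", "is", "it",
             "of", "the", "this", "we", "you", "your"}


def word_counts(text):
    """Bag-of-words features: counts of tokens that are not stop words."""
    return Counter(tok for tok in text.lower().split() if tok not in STOPWORDS)


def score(weights, feats):
    """Linear decision value: sum of word weight times word count."""
    return sum(weights.get(word, 0.0) * n for word, n in feats.items())


def train_linear_svm(data, epochs=100, lr=0.1, lam=0.01):
    """Linear SVM (no bias term) via subgradient descent on the hinge loss."""
    weights = {}
    for _ in range(epochs):
        for text, label in data:
            feats = word_counts(text)
            margin = label * score(weights, feats)
            # L2 regularization: shrink every weight slightly at each step.
            for word in weights:
                weights[word] *= 1.0 - lr * lam
            # Hinge loss: update only when the example is inside the margin.
            if margin < 1.0:
                for word, n in feats.items():
                    weights[word] = weights.get(word, 0.0) + lr * label * n
    return weights


def predict(weights, text):
    """Return +1 (high quality) if the decision value is positive, else -1."""
    return 1 if score(weights, word_counts(text)) > 0 else -1


weights = train_linear_svm(SESSIONS)
accuracy = sum(predict(weights, t) == y for t, y in SESSIONS) / len(SESSIONS)

# Words sorted from most low-quality-leaning to most high-quality-leaning.
ranked = sorted(weights, key=weights.get)
print("training accuracy:", accuracy)
print("low-quality cues:", ranked[:3])
print("high-quality cues:", ranked[-3:])
```

In a linear model of this kind, the sign and size of each word's weight show which words push a session toward the high-quality or low-quality class, which is one way word-frequency patterns like those mentioned under Results could be inspected. A real evaluation would report accuracy on held-out sessions rather than on the training data.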