Abstract: Abstract of Distinction: Using a Machine Learning System to Improve Retention in HIV Care (Society for Prevention Research 27th Annual Meeting)

226 Abstract of Distinction: Using a Machine Learning System to Improve Retention in HIV Care

Wednesday, May 29, 2019
Pacific D/L (Hyatt Regency San Francisco)
* noted as presenting author
Arthi Ramachandran, PhD, Postdoctoral Scholar, University of Chicago, Chicago, IL
Avishek Kumar, PhD, Data Scientist, University of Chicago, Chicago, IL
Hannes Koenig, MS, Research Assistant, University of Chicago, Chicago, IL
Christina Sung, MBA, Project Manager, University of Chicago, Chicago, IL
Adolfo De Unanue, PhD, Senior Researcher, University of Chicago, Chicago, IL
Joe Walsh, PhD, Senior Researcher, University of Chicago, Chicago, IL
John Schneider, MD, MPH, Associate Professor of Medicine, University of Chicago Medicine, Chicago, IL
Jessica Ridgway, MD, Assistant Professor of Medicine, University of Chicago, Chicago, IL
Rayid Ghani, MS, Research Associate Professor, University of Chicago, Chicago, IL
Background: Retaining HIV-positive individuals in medical care is critical to arresting onward transmission. However, maintaining quarterly appointments and daily medication for a lifetime is exceedingly difficult due to a complex constellation of factors. Further, minorities are disproportionately impacted and less likely to be retained in care.

Using electronic medical record (EMR) data, we developed a point-of-care, machine-learning-based model that predicts whether an individual patient will drop out of care.

Methods: We developed a scalable machine learning system that uses raw patient medical data to compute risk scores for dropping out of care. For all patients who received HIV care at the University of Chicago from 2008 to 2015, we built features based on a variety of EMR variables, including insurance, appointment attendance, diagnoses, social history, medications, and laboratory tests. These features are complex time-based aggregations of the underlying EMR data, resulting in a total of 1,295 predictive variables. Because interventions are resource-intensive and costly, we selected a model whose predictive performance is tuned to match our capacity for intervention.
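As an illustration of what a time-based aggregation can look like, the following minimal sketch computes trailing-window counts and missed-appointment rates for one patient. The function name `window_features`, the window lengths, the feature names, and the sample history are all hypothetical; the abstract does not specify how the 1,295 variables were constructed.

```python
from datetime import date, timedelta

def window_features(appointments, as_of, windows_days=(90, 180, 365)):
    """Aggregate one patient's appointment history over trailing windows.

    appointments: list of (date, attended) tuples; attended is a bool.
    as_of: prediction date; only appointments strictly before it are
           used, so nothing from the future leaks into the features.
    """
    features = {}
    for days in windows_days:
        start = as_of - timedelta(days=days)
        in_window = [att for d, att in appointments if start <= d < as_of]
        scheduled = len(in_window)
        missed = sum(1 for att in in_window if not att)
        features[f"appts_scheduled_{days}d"] = scheduled
        features[f"appts_missed_{days}d"] = missed
        # Assumption: a patient with no appointments in the window gets 0.0.
        features[f"missed_rate_{days}d"] = missed / scheduled if scheduled else 0.0
    return features

# Made-up history, for illustration only.
history = [
    (date(2014, 1, 10), True),
    (date(2014, 4, 15), False),
    (date(2014, 7, 20), True),
    (date(2014, 10, 5), False),
]
feats = window_features(history, as_of=date(2014, 11, 1))
```

Applying the same pattern to other record types (lab results, diagnoses, insurance changes) and to several window lengths is one way a modest number of raw EMR fields can expand into a large feature set.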

The system explores the performance of a broad range of machine learning methods and hyperparameters including decision trees, gradient boosted decision trees, and random forests. Models were compared with a random baseline and clinically relevant expert rules. Given the diverse nature of the patient population, we audited the inherent bias in our system.
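A bias audit of this kind can be sketched by comparing an error metric across patient groups. The example below uses recall among true drop-outs, expressed as a ratio to a reference group. This metric is an assumption chosen for illustration, since the abstract does not state which measures were audited; the group labels, function names, and records are hypothetical.

```python
from collections import defaultdict

def recall_by_group(records):
    """records: iterable of (group, flagged, dropped_out) tuples.

    Returns, per group, the share of true drop-outs that were flagged.
    Groups with no drop-outs are omitted from the result.
    """
    flagged_dropouts = defaultdict(int)
    dropouts = defaultdict(int)
    for group, flagged, dropped_out in records:
        if dropped_out:
            dropouts[group] += 1
            if flagged:
                flagged_dropouts[group] += 1
    return {g: flagged_dropouts[g] / n for g, n in dropouts.items()}

def disparity(recalls, reference):
    """Ratio of each group's recall to the reference group's recall.

    Assumes the reference group's recall is non-zero.
    """
    ref = recalls[reference]
    return {g: r / ref for g, r in recalls.items()}

# Made-up records, for illustration only.
records = [
    ("A", True, True), ("A", True, True), ("A", False, True),
    ("A", False, True), ("A", True, False),
    ("B", True, True), ("B", False, True), ("B", False, True),
    ("B", False, True), ("B", False, False),
]
recalls = recall_by_group(records)
ratios = disparity(recalls, reference="A")
```

A ratio well below 1 for a group would indicate that its at-risk patients are flagged less often than those in the reference group; running the same computation on the expert rules gives a like-for-like comparison.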

Results: A total of 721 patients received HIV care at our institution over the study period, with approximately 1,500 appointments per year. Of these appointments, 10% were out of care. A random forest model had the highest positive predictive value.
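When intervention capacity is limited, positive predictive value is naturally measured among only the k highest-risk visits, where k is the number of interventions that can be delivered. The sketch below assumes this "precision at top k" reading of the metric; the function name, scores, and labels are made up for illustration and are not study data.

```python
def precision_at_k(scores, labels, k):
    """Positive predictive value among the k highest-scoring visits.

    scores: predicted risk per visit (higher means more at risk).
    labels: 1 if the visit was followed by dropping out of care, else 0.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of visits")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:k]
    return sum(label for _, label in top_k) / k

# Made-up scores and outcomes, for illustration only.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05, 0.83, 0.30, 0.52, 0.22]
labels = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

ppv = precision_at_k(scores, labels, k=3)
# A random baseline that flags k visits uniformly at random has an
# expected precision equal to the overall positive rate.
base_rate = sum(labels) / len(labels)
```

Comparing `ppv` with `base_rate` at the chosen k shows how much a ranking improves on random selection for the same intervention budget.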

Our system is significantly more accurate than the expert heuristics used today, correctly identifying 10-60% more visits with at-risk patients. The most important features in the model are the history of previous appointments, lab test results (both viral load and CD4 counts), and diagnoses. Our machine learning model is also superior to the expert rules because it provides individual-level predictions instead of the coarse group-level predictions currently in use. Further, in our bias audit, our model showed significantly less bias than the expert rules.

Conclusion: Using machine learning methods, we built a predictive model for retention in care that was significantly more accurate and less biased than the expert rules currently in use. To our knowledge, this is the first time a machine learning system has been applied to the problem of retention in care. This model can be implemented to guide retention interventions in real time.