Friday, May 31, 2019
Seacliff C (Hyatt Regency San Francisco)
* noted as presenting author
Background: Two seemingly contradictory trends have emerged in U.S. health care research: an emphasis on big data and the Precision Medicine Initiative (PMI). Big data often implies harvesting large existing datasets to refine clinical practice, an approach that rarely takes nuanced patient factors into account. PMI aims to explore heterogeneity among individuals in symptoms, behavior, biology, and responses to interventions, and often uses small samples or even single cases with intensive longitudinal time series data. For complex and highly heterogeneous illnesses, PMI studies may involve big data because of large numbers of individuals, because high-throughput analyses are needed to address individual-level hypotheses, or both. This study introduces novel techniques and software for PMI-oriented analyses of a large dataset, used to identify each individual’s “triggers” associated with the severity of migraine headaches.
Data: For 90 consecutive days, migraine patients (n=776) reported on headache symptoms and 81 potential migraine triggers, including weather factors. The pattern of triggers was expected to be highly individualized, necessitating individual-level analyses. This required analyzing each trigger separately and adjusting for the false discovery rate (FDR) within individuals. The resulting combinations of triggers and patients required over 55,000 analyses.
Methods: Hierarchical linear modeling (HLM) and time series modeling of each combination of individual and trigger required several steps. First, the functional form of the relationship between time and migraine severity (e.g., linear or quadratic) was determined using likelihood ratio tests. Then, triggers were modeled one at a time, each with the best-fitting autoregressive moving average (ARMA) residual correlation structure, selected by the Akaike information criterion (AIC). The Benjamini and Yekutieli FDR adjustment was applied across triggers. The process was then repeated for each individual.
The PersonAlytics software package was developed to automate these steps.
Results: The sample had a mean age of 42.8 years (SD=12.7), was 88.1% female, and averaged 8.8 migraine days per month (SD=5.5). An average of 21% (SD=5.5%) of non-missing triggers per person were significant before FDR correction, while 9% (SD=3.8%) remained significant after FDR correction. This process would have been daunting, if not impossible, to conduct one analysis at a time. Using the PersonAlytics approach and software, the total run time was less than one hour.
Conclusions: Large-data problems of multiple testing and high throughput can arise even in moderately sized datasets, especially when conducting individual-level analyses. Data science automation tools such as those implemented in PersonAlytics can make the problem manageable.
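Illustration: The within-individual FDR step described under Methods can be sketched as follows. This is a minimal Python sketch of the standard Benjamini and Yekutieli step-up procedure, not code from PersonAlytics. The function names, the dictionary layout, the use of None for a missing trigger, and the alpha level of 0.05 are all assumptions made for illustration, and the p-values in the usage note are invented.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected.

    Standard Benjamini-Yekutieli step-up procedure, which controls the
    false discovery rate under arbitrary dependence between tests.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic-sum correction c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose ordered p-value is at or below k * alpha / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    # Step-up rule: reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def adjust_within_individuals(p_by_person, alpha=0.05):
    """Apply the adjustment across triggers, separately for each person.

    p_by_person maps a (hypothetical) person ID to a dict of
    trigger name -> p-value, with None marking a missing trigger.
    """
    results = {}
    for person, trigger_p in p_by_person.items():
        # Only non-missing triggers count toward m for this person.
        observed = {t: p for t, p in trigger_p.items() if p is not None}
        flags = benjamini_yekutieli(list(observed.values()), alpha)
        results[person] = dict(zip(observed.keys(), flags))
    return results
```

For example, calling adjust_within_individuals({"p1": {"stress": 0.001, "rain": None, "caffeine": 0.5}}) drops the missing trigger and adjusts over the remaining two, flagging only "stress". Because the correction is computed per person, the number of tests m differs between individuals with different amounts of missing data.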