Methods. We tested a “big data” approach to generating measures of alcohol use from Internet searching and social media, evaluating whether these novel measures predict alcohol use reported in a US national survey and whether measures of the alcohol policy environment moderate those relationships. Data sources: We used Google Trends (GT) to estimate relative search volumes (RSVs) for 7 alcohol-related keywords and created alcohol use measures from Twitter posts classified with natural language processing; searching and posting measures were operationalized for 50 US states. Survey reports of alcohol use and sociodemographics were obtained from the Behavioral Risk Factor Surveillance System (BRFSS). US state alcohol policy environments were measured with the Alcohol Policy Scale. All data were from the 2015-16 period. Analyses: Using multivariable logistic regression, we estimated individual odds of past 30-day alcohol use; using linear regression, we modeled the maximum number of drinks consumed on an occasion. In both models, outcomes among BRFSS respondents were predicted by state-level RSVs (each term modeled independently) and, separately, by state-level Twitter metrics, adjusting for sociodemographics and Internet use. Prototypical plots were generated to investigate whether associations between the GT and Twitter alcohol metrics and the BRFSS measures varied with state alcohol policies. All analyses accounted for the complex sampling design used in BRFSS.
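As an illustration of the prototypical-plot idea described in the Methods, the following minimal sketch computes predicted probabilities of past 30-day alcohol use at representative ("prototypical") values of a state-level RSV and a state policy score, assuming a logistic model with an RSV-by-policy interaction term. The coefficient values, the standardized (z-score) scaling, and all function and variable names are hypothetical and chosen for illustration only; they are not estimates or code from the study, and the sketch ignores covariates and the BRFSS sampling design.

```python
import math

# Hypothetical coefficients on the log-odds scale (NOT estimates from the study).
# Both predictors are assumed to be standardized (z-scores) across states.
COEF = {
    "intercept": 0.10,
    "rsv": 0.20,            # assumed main association of search volume with use
    "policy": -0.15,        # assumed main association of policy strength with use
    "rsv_x_policy": -0.10,  # assumed moderation: weaker RSV slope under stronger policy
}


def predicted_probability(rsv_z, policy_z, coef):
    """Predicted probability from a logistic model with an interaction term."""
    log_odds = (
        coef["intercept"]
        + coef["rsv"] * rsv_z
        + coef["policy"] * policy_z
        + coef["rsv_x_policy"] * rsv_z * policy_z
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def prototypical_grid(coef, levels=(-1.0, 1.0)):
    """Predictions at low/high (-1 SD, +1 SD) values of policy and RSV."""
    return {
        (policy_z, rsv_z): predicted_probability(rsv_z, policy_z, coef)
        for policy_z in levels
        for rsv_z in levels
    }


def rsv_effect(grid, policy_z, levels=(-1.0, 1.0)):
    """Change in predicted probability from low to high RSV at one policy level."""
    return grid[(policy_z, levels[1])] - grid[(policy_z, levels[0])]


grid = prototypical_grid(COEF)
weak_policy_effect = rsv_effect(grid, -1.0)
strong_policy_effect = rsv_effect(grid, 1.0)

for (policy_z, rsv_z), p in sorted(grid.items()):
    print(f"policy z={policy_z:+.0f}, RSV z={rsv_z:+.0f}: predicted p={p:.3f}")
print(f"RSV effect under weak policy:   {weak_policy_effect:.3f}")
print(f"RSV effect under strong policy: {strong_policy_effect:.3f}")
```

Plotting the four grid values as two lines (one per policy level) against RSV would give a prototypical plot; with the assumed negative interaction, the line for the stronger policy environment is flatter.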
Results. Analyses of 459,474 BRFSS respondents (47.1% female, mean age 47.3 years), of whom 52.5% reported past 30-day alcohol use, found that all alcohol-related RSVs and Twitter measures were positively associated with past 30-day alcohol use (p<0.05), with the RSV for “beer” showing the largest effect and the RSV for “liquor” the smallest. RSVs for “drinking,” “alcoholic,” “alcoholism,” and “beer,” as well as the Twitter metric, were positively associated with maximum number of drinks on an occasion (p<0.05), whereas the RSV for “wine” was negatively associated with it (p<0.05). Alcohol policies moderated relationships between alcohol-related Internet searching/posting measures and BRFSS measures of alcohol use, with smaller associations between searching/posting and drinking behavior in states with strong policy controls.
Conclusion. Strong positive associations were found between reports of individual alcohol use and state-level measures of alcohol use derived from Internet searching and Twitter. Associations were moderated by the strength of state alcohol policies. Findings support the use of novel “big data” measures derived from Internet searching/posting for monitoring alcohol use behaviors and demonstrate the sensitivity of the approach to the influence of alcohol policies, which is of value to prevention science, including policy evaluation.