The present study addresses this issue by analyzing the spatial accuracy of georeferenced EMA data collected through the Social-Spatial Adolescent Study, a longitudinal study, based in Richmond, Virginia, of neighborhood and social contextual effects on adolescent substance use. EMA surveys were administered to each subject via a text message containing an embedded URL, 3-6 times per day over a four-day period, every other month for two years. To analyze spatial accuracy, we extracted the EMA surveys in which the subject indicated being at home at the time of the survey. We then compared the locations of these ‘home’ EMAs with the ‘true’ home location, derived from conventional address geocoding, for subjects who did not change residence during the study. This resulted in a data set of 3,935 EMA surveys for 88 subjects. We defined the spatial ‘error’ of each EMA as the Euclidean distance between its recorded location and the subject's true home location. For each subject we calculated: 1) the mean and median error, 2) the error associated with the mean and median coordinate positions of the EMA ‘point cloud,’ and 3) the convex hull of the EMA point cloud, its area, and its area-to-perimeter ratio.
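The three per-subject measures listed above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual processing code. It assumes that coordinates are already in a projected planar system with units of feet, so that Euclidean distance is meaningful, and that the 'median coordinate position' is the coordinate-wise median. The function names, the dictionary keys, and the sample coordinates are all hypothetical.

```python
from math import dist, hypot  # math.dist requires Python 3.8+
from statistics import mean, median


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))  # points must be hashable (x, y) tuples
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area_perimeter(hull):
    """Shoelace area and perimeter of a polygon given as ordered vertices."""
    n = len(hull)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def subject_error_summary(home, emas):
    """Summarize spatial error for one subject.

    home: (x, y) of the geocoded home; emas: non-empty list of (x, y) tuples.
    Assumes planar coordinates in a single projected system (e.g. feet).
    """
    errors = [dist(home, p) for p in emas]
    xs = [p[0] for p in emas]
    ys = [p[1] for p in emas]
    mean_center = (mean(xs), mean(ys))
    median_center = (median(xs), median(ys))  # coordinate-wise median
    area, perimeter = hull_area_perimeter(convex_hull(emas))
    return {
        "mean_error": mean(errors),
        "median_error": median(errors),
        "mean_center_error": dist(home, mean_center),
        "median_center_error": dist(home, median_center),
        "hull_area": area,
        "hull_area_perimeter_ratio": area / perimeter if perimeter else 0.0,
    }


# Hypothetical example: five 'home' EMA fixes around a home at the origin.
example = subject_error_summary(
    home=(0.0, 0.0),
    emas=[(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0)],
)
```

In this sketch a single distant fix moves the mean error and the hull area much more than it moves the median error or the median center, which is why the two kinds of summary are reported side by side.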
Results indicate that half of the subjects had a median error of less than 462 feet, and 90% of subjects had a median error of less than 3,600 feet. The distribution of error is highly skewed, however, with a few outliers showing particularly high error values. A similar pattern holds for the distance from the home to the mean center and median center of the EMA point cloud. Preliminary visual inspection of the EMA convex hulls and area-to-perimeter ratios suggests the nature of this error: most EMA locations for a subject cluster together (demonstrating high accuracy and precision), but a subject's point cloud may include one or two spatial outliers. These outliers may be due to error in the GPS data capture or in the EMA response.