METHOD
A simulation study was used to estimate the power, type I error, and confidence interval (CI) coverage for these models. The factors in the factorial design were mean structure, variance structure, effect size, predictor type, and sample size. Mean structure reflected either a linear or an exponential relationship between the predictor and the outcome. Because the distribution of the underlying count is unobserved, several variance structures were evaluated, including homoscedastic, monotonically increasing, and increasing then decreasing variance. Effect sizes of zero, small, medium, and large and sample sizes of 100, 200, 500, and 1000 were examined. A single predictor (either continuous or binary) was used to predict the grouped count outcome.
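As an illustration of the design described above, the following minimal sketch enumerates the simulation conditions. It assumes the five factors are fully crossed and that the variance factor has exactly the three levels named in the text; the label strings, the `FACTORS` dictionary, and the `design_conditions` helper are hypothetical names introduced here, not taken from the study.

```python
from itertools import product

# Factor levels as named in the text. The label strings are illustrative,
# and treating the variance factor as exactly three levels is an assumption.
FACTORS = {
    "mean_structure": ["linear", "exponential"],
    "variance_structure": [
        "homoscedastic",
        "increasing",
        "increasing_then_decreasing",
    ],
    "effect_size": ["zero", "small", "medium", "large"],
    "predictor_type": ["continuous", "binary"],
    "sample_size": [100, 200, 500, 1000],
}


def design_conditions(factors=FACTORS):
    """Return one dict per cell, assuming a fully crossed design."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


conditions = design_conditions()
# Under the fully crossed assumption: 2 * 3 * 4 * 2 * 4 cells.
print(len(conditions))
```

Each dictionary in `conditions` could then drive one batch of simulated data sets, with the zero effect size cells used for type I error and the nonzero cells used for power.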
RESULTS
All regression models produced unbiased estimates of the regression coefficient. Ordinal logistic regression produced type I error, power, and CI coverage rates that were consistently within acceptable limits. Linear regression produced type I error and power that were within acceptable limits, but CI coverage was too low in conditions with an exponential mean structure, particularly with a large effect size and/or a monotonically increasing variance structure. Poisson regression displayed inflated type I error, low power, and low CI coverage rates for nearly all conditions.
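The performance measures reported above can be computed from the replications within a single condition roughly as follows. This is a sketch under stated assumptions: it uses Wald-type 95% intervals (estimate ± 1.96 × standard error) and a two-sided test of a zero coefficient, which may differ from the intervals and tests used in the study. The function name `summarize_replications` and its arguments are hypothetical.

```python
from statistics import fmean


def summarize_replications(estimates, std_errors, true_beta, z_crit=1.96):
    """Summarize replications from one simulation condition.

    Assumes Wald intervals on the coefficient scale; this is an
    illustrative choice, not necessarily the study's procedure.
    """
    rejected = []
    covered = []
    for est, se in zip(estimates, std_errors):
        lower = est - z_crit * se
        upper = est + z_crit * se
        # Two-sided test of H0: beta = 0 (reject when the CI excludes zero).
        rejected.append(lower > 0 or upper < 0)
        # Coverage: does the interval contain the true coefficient?
        covered.append(lower <= true_beta <= upper)
    return {
        "bias": fmean(e - true_beta for e in estimates),
        # Type I error rate when true_beta == 0, power otherwise.
        "rejection_rate": fmean(rejected),
        "ci_coverage": fmean(covered),
    }


# Toy inputs, not study data: four replications under a true effect of zero.
summary = summarize_replications(
    estimates=[0.5, 0.1, -0.3, 0.0],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    true_beta=0.0,
)
print(summary)
```

With a true coefficient of zero, the rejection rate estimates type I error; with a nonzero coefficient, the same quantity estimates power.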
CONCLUSIONS
Based on the statistical performance of the three models, ordinal logistic regression is the preferred method for analyzing grouped count outcomes. Linear regression also performed well, but CI coverage was too low for several conditions with an exponential mean structure; these specific conditions are of particular interest because they reflect conditions commonly observed for counts and frequencies. Comparisons of model fit and tests of model assumptions (e.g., the proportional odds assumption for ordinal logistic regression) are in progress.