Given the common use of ordinal items (e.g., self-report Likert scales) in measures of key mediators and outcomes, the question arises whether it is appropriate to treat such items as continuous and use maximum likelihood (ML) or robust maximum likelihood (MLR) estimation when evaluating longitudinal measurement invariance. Some simulation studies suggest that ML or MLR is acceptable when items have five or more response categories. However, when the observed response distribution is markedly skewed and answers fall mainly on, say, three of the five categories, ML results are prone to bias. In such cases, confirmatory factor analysis (CFA) models that treat the items as ordinal are more appropriate.
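As a concrete illustration (a minimal sketch, not code from this study), an ordinal CFA of this kind can be fit in R with the lavaan package, which estimates thresholds and switches from ML to a weighted least squares estimator (WLSMV) once items are declared ordered. The data frame dat and the item names are hypothetical: three five-category items measured at two waves.

```r
library(lavaan)

# Hypothetical items: y1-y3 measured at waves 1 and 2, each with five categories
items <- c("y1_t1", "y2_t1", "y3_t1", "y1_t2", "y2_t2", "y3_t2")

# Baseline (configural) model: the same factor structure at both waves,
# with residual correlations between repeated administrations of each item
model_baseline <- '
  eta1 =~ y1_t1 + y2_t1 + y3_t1
  eta2 =~ y1_t2 + y2_t2 + y3_t2
  y1_t1 ~~ y1_t2
  y2_t1 ~~ y2_t2
  y3_t1 ~~ y3_t2
'

# Declaring the items as ordered makes lavaan estimate thresholds and
# use WLSMV rather than (robust) ML
fit_baseline <- cfa(model_baseline, data = dat, ordered = items)
summary(fit_baseline, fit.measures = TRUE)
```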
This study uses examples to illustrate how to test longitudinal measurement invariance with CFA for ordinal variables. Three models are compared: a baseline model that assumes a common factor structure over time; a loading invariance model that additionally constrains the factor loadings to be equal across time; and a threshold invariance model that further assumes that, for each item, the threshold for moving from one response category to the next (e.g., from “I somewhat believe this” to “I very much believe this”) is the same over time. We also present a way to gauge the practical significance of violations of invariance: comparing, across different models (e.g., a model assuming loading invariance versus a model assuming threshold invariance), the estimated probabilities of choosing a specific category on an item at a given measurement wave. An R program has been developed to help researchers calculate these predicted probabilities from CFA output.
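Continuing the hypothetical sketch above, the loading and threshold invariance models can be specified in lavaan by attaching equality labels to the relevant parameters; nested models are then compared with a scaled chi-square difference test. Identification details (e.g., freeing wave-2 latent means, variances, and item scale factors once thresholds are equated) are omitted here for brevity.

```r
# Threshold invariance model: labels l2, l3 equate loadings across waves;
# labels a1-a4, b1-b4, c1-c4 equate each item's four thresholds across waves
model_threshold <- '
  eta1 =~ y1_t1 + l2*y2_t1 + l3*y3_t1
  eta2 =~ y1_t2 + l2*y2_t2 + l3*y3_t2
  y1_t1 | a1*t1 + a2*t2 + a3*t3 + a4*t4
  y1_t2 | a1*t1 + a2*t2 + a3*t3 + a4*t4
  y2_t1 | b1*t1 + b2*t2 + b3*t3 + b4*t4
  y2_t2 | b1*t1 + b2*t2 + b3*t3 + b4*t4
  y3_t1 | c1*t1 + c2*t2 + c3*t3 + c4*t4
  y3_t2 | c1*t1 + c2*t2 + c3*t3 + c4*t4
  y1_t1 ~~ y1_t2
  y2_t1 ~~ y2_t2
  y3_t1 ~~ y3_t2
'
fit_threshold <- cfa(model_threshold, data = dat, ordered = items)
lavTestLRT(fit_baseline, fit_threshold)  # scaled chi-square difference test
```

The practical-significance check can likewise be sketched: under the probit link used in ordinal CFA, each model's loading and threshold estimates imply marginal probabilities for every response category, and differences in these implied probabilities across models indicate how much a violation of invariance matters in practice. The function below is an illustrative stand-in for the R program described above, not the program itself, and the numeric estimates are made up.

```r
# Marginal category probabilities implied by a probit ordinal CFA
# (delta parameterization: the latent response variable has variance 1)
category_probs <- function(lambda, tau, alpha = 0, phi = 1) {
  psi2 <- 1 - lambda^2 * phi                       # implied residual variance
  cum  <- pnorm((tau - lambda * alpha) / sqrt(lambda^2 * phi + psi2))
  diff(c(0, cum, 1))                               # P(y = 1), ..., P(y = 5)
}

# Made-up estimates for one item at one wave under two competing models
p_loading   <- category_probs(lambda = 0.80, tau = c(-1.60, -0.70, 0.35, 1.40))
p_threshold <- category_probs(lambda = 0.78, tau = c(-1.55, -0.65, 0.30, 1.45))
round(p_loading - p_threshold, 3)  # small differences = little practical impact
```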