A Comparison on Performance of Tests for Equality of Means of Related Ordinal Data

Vanida Pongsakchat, Nopparat Panngam


Hypothesis tests about more than one population mean from related data have usually relied on parametric tests. Parametric tests require that the data be quantitative and normally distributed. When the data are ordinal Likert-scale responses and/or non-normally distributed, non-parametric tests are an alternative. This research compares the t-test with the Wilcoxon test for testing two population means, and the F-test with the Friedman test for testing three population means. The correlated ordinal data are generated from normal, uniform, left-skewed and right-skewed distributions, with correlation coefficients of 0.50 and 0.70. Sample sizes are 10, 20, 30, 50 and 100 for testing two population means, and 20, 30, 50 and 100 for testing three population means. In the simulation study, when the sample sizes are small (10, 20 and 30 for the two-group tests; 20 and 30 for the three-group tests), the parametric tests (t-test and F-test) perform better than the non-parametric tests (Wilcoxon test and Friedman test) in terms of controlling the type I error rate and power. When the sample sizes are medium or large, both methods perform similarly. The power of both methods increases as the sample size and the correlation increase, and is higher when the distributions are completely different.
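The two-group comparison described above can be illustrated with a minimal Monte Carlo sketch in Python. It is not the authors' procedure. It assumes a 5-point scale obtained by cutting a latent bivariate normal at hypothetical cut points, and the names `correlated_ordinal_pairs`, `rejection_rates`, `cuts` and `shift` are introduced here for illustration only. Note that `rho` is the correlation of the latent normal variables, so the correlation between the resulting ordinal scores will generally be somewhat lower.

```python
import numpy as np
from scipy import stats


def correlated_ordinal_pairs(n, rho, cuts, rng, shift=0.0):
    """Draw n pairs of ordinal scores by discretizing a latent bivariate normal.

    Assumption: scores run from 1 to len(cuts) + 1, and `shift` moves the
    latent mean of the second measurement (shift=0 means equal means).
    """
    cov = [[1.0, rho], [rho, 1.0]]
    latent = rng.multivariate_normal([0.0, shift], cov, size=n)
    # np.digitize returns the interval index of each value; +1 gives 1, 2, ...
    return np.digitize(latent, cuts) + 1


def rejection_rates(n, rho, cuts, shift=0.0, reps=2000, alpha=0.05, seed=0):
    """Estimate how often the paired t-test and the Wilcoxon signed-rank
    test reject at level alpha over `reps` simulated samples."""
    rng = np.random.default_rng(seed)
    reject_t = 0
    reject_w = 0
    for _ in range(reps):
        scores = correlated_ordinal_pairs(n, rho, cuts, rng, shift)
        diff = scores[:, 0] - scores[:, 1]
        if not np.any(diff):
            continue  # all differences are zero: count as no rejection
        p_t = stats.ttest_rel(scores[:, 0], scores[:, 1]).pvalue
        p_w = stats.wilcoxon(diff).pvalue  # zero differences are discarded
        reject_t += int(p_t < alpha)
        reject_w += int(p_w < alpha)
    return reject_t / reps, reject_w / reps


if __name__ == "__main__":
    cuts = [-1.5, -0.5, 0.5, 1.5]  # hypothetical cut points, 5 categories
    # shift=0: both measurements share one distribution, so the rejection
    # rate estimates the type I error rate.
    print("type I error (t, Wilcoxon):", rejection_rates(30, 0.5, cuts, reps=1000))
    # shift>0: the second measurement tends to be higher, so the rejection
    # rate estimates power.
    print("power (t, Wilcoxon):", rejection_rates(30, 0.5, cuts, shift=0.5, reps=1000))
```

Rerunning `rejection_rates` over the sample sizes and correlations listed in the abstract gives a rough feel for how the two tests compare, although skewed or uniform marginals would require different cut points than the symmetric ones assumed here.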


Keywords: t-test, F-test, Wilcoxon test, Friedman test





