Goodness of Fit of Cumulative Logit Models for Ordinal Response Categories and Nominal Explanatory Variables with Two-Factor Interaction
Abstract
The power and goodness of fit of cumulative logit models for ordinal response data with two nominal explanatory variables and their interaction term are investigated. The magnitudes of the goodness-of-fit statistics are calculated: the coefficients of determination (R² analogs), the likelihood ratio statistic GM, AIC (Akaike Information Criterion; Akaike, 1973), and BIC (Bayesian Information Criterion; Schwarz, 1978). The simulations were conducted for multinomial logit models with K = 3 response categories and two random explanatory variables X1 and X2, whose joint distribution is assumed to be multinomial with probabilities π1, π2, π3, and π4 corresponding to (X1, X2) values of (0, 0), (0, 1), (1, 0), and (1, 1), respectively. Three sets of (π1, π2, π3, π4) are studied to represent different distributional shapes, namely (X1, X2) ~ multinomial(0.10, 0.35, 0.45, 0.10), (X1, X2) ~ multinomial(0.50, 0.30, 0.10, 0.10), and (X1, X2) ~ multinomial(0.25, 0.25, 0.25, 0.25). The effects were chosen to be possibly strong: β1 = log 2, β2 = log 3, and β12 ranging from 0.0 to 4.5 in increments of 0.3. Four distributions of the three ordered response categories, corresponding to the (X1, X2) values, were then generated through the models under proportions (p1, p2, p3), namely Y ~ multinomial(p1, p2, p3) with (0.05, 0.20, 0.75), (0.25, 0.50, 0.25), (0.55, 0.20, 0.25), and (0.33, 0.33, 0.33), where p1, p2, and p3 are the proportions of Y = 1, 2, 3, respectively. It follows that the true model intercepts are α1 = log[p1/(1 − p1)] and α2 = log[(p1 + p2)/(1 − p1 − p2)]. Four sample sizes of 600, 800, 1,000, and 1,500 units were used.
Each condition was run for 1,000 repeated simulations using a purpose-built macro program run in Minitab Release 11. The results under the distribution conditions (X1, X2) ~ multinomial(0.10, 0.35, 0.45, 0.10) and Y ~ (0.55, 0.20, 0.25) show that all goodness-of-fit statistics perform better than under the conditions Y ~ (0.25, 0.50, 0.25) and Y ~ (0.33, 0.33, 0.33) in terms of the power of the tests and the means and standard deviations of the goodness-of-fit statistics. The results are similar when (X1, X2) ~ (0.50, 0.30, 0.10, 0.10). However, when the distributions are symmetric, with (X1, X2) ~ (0.25, 0.25, 0.25, 0.25) and Y ~ (0.33, 0.33, 0.33), all statistics indicate a generally much improved model fit. In conclusion, large sample sizes are probably advisable in the analysis of ordinal categorical responses when the distributions of the variables are asymmetric, except when the distribution of the response categories is clearly increasing in order. In addition, model fit tends to improve when a model with an interaction term is used and a correlation structure between the explanatory variables is evident.
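The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Minitab macro. It assumes the parameterization logit P(Y ≤ j | x) = αj + β1·X1 + β2·X2 + β12·X1·X2, with intercepts taken as the logits of the cumulative baseline proportions, and it fits the model by numerical maximum likelihood with SciPy. The function names (simulate, neg_loglik, fit, gof) are hypothetical, and the three R² analogs shown (McFadden, Cox and Snell, Aldrich and Nelson) are chosen because they appear in the reference list; the abstract does not state which analogs were actually computed.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit


def simulate(n, pi, p, beta1, beta2, beta12, rng):
    """Draw (X1, X2, Y) under an assumed cumulative logit model."""
    cells = np.array([(0, 0), (0, 1), (1, 0), (1, 1)])
    x = cells[rng.choice(4, size=n, p=pi)]
    x1, x2 = x[:, 0], x[:, 1]
    # Assumption: intercepts are logits of the cumulative baseline proportions.
    cum = np.cumsum(p)[:-1]
    alpha = np.log(cum / (1 - cum))
    eta = beta1 * x1 + beta2 * x2 + beta12 * x1 * x2
    cdf = expit(alpha[None, :] + eta[:, None])   # P(Y <= j | x), j = 1, 2
    u = rng.random(n)
    y = (u[:, None] > cdf).sum(axis=1)           # categories coded 0, 1, 2
    return x1, x2, y


def neg_loglik(theta, X, y):
    """Negative log-likelihood; theta = (a1, log(a2 - a1), slopes...)."""
    a1 = theta[0]
    a2 = a1 + np.exp(theta[1])                   # keeps a1 < a2
    eta = X @ theta[2:]
    c1, c2 = expit(a1 + eta), expit(a2 + eta)
    probs = np.column_stack([c1, c2 - c1, 1 - c2])
    pr = probs[np.arange(len(y)), y]
    return -np.sum(np.log(np.clip(pr, 1e-300, None)))


def fit(X, y):
    """Return the maximized log-likelihood and the number of parameters."""
    start = np.zeros(2 + X.shape[1])
    start[0] = -1.0
    res = minimize(neg_loglik, start, args=(X, y), method="BFGS")
    return -res.fun, len(start)


def gof(x1, x2, y):
    """Goodness-of-fit summaries for the model with the interaction term."""
    n = len(y)
    X_full = np.column_stack([x1, x2, x1 * x2]).astype(float)
    ll1, k1 = fit(X_full, y)
    ll0, _ = fit(np.empty((n, 0)), y)            # intercept-only model
    g = 2 * (ll1 - ll0)                          # likelihood ratio statistic
    return {
        "G": g,
        "McFadden_R2": 1 - ll1 / ll0,
        "CoxSnell_R2": 1 - np.exp(-g / n),
        "AldrichNelson_R2": g / (g + n),
        "AIC": -2 * ll1 + 2 * k1,
        "BIC": -2 * ll1 + k1 * np.log(n),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x1, x2, y = simulate(600, (0.10, 0.35, 0.45, 0.10), (0.55, 0.20, 0.25),
                         np.log(2), np.log(3), 1.5, rng)
    print(gof(x1, x2, y))
```

Repeating the call 1,000 times per condition and averaging the returned values would give means and standard deviations comparable in spirit to those summarized above. An empirical power estimate could be taken as the fraction of replications in which G exceeds a chi-square critical value, although the abstract does not specify the exact test used.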
References
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (B. N. Petrov and F. Csaki, eds.), Akademiai Kiado, Budapest: 267-281.
Aldrich, J. H. and Nelson, F. D. (1984). Linear Probability, Logit, and Probit Models, Sage, Beverly Hills.
Aitkin, M., Anderson, D., Francis, B., and Hinde, J. (1989). Statistical Modelling in GLIM, Oxford Science Publications, Oxford.
Armstrong, B. G. and Sloan, M. (1989). Ordinal regression models for epidemiologic data. American Journal of Epidemiology, 129: 191-204.
Cole, S. R., Allison, P. D., and Ananth, C. V. (2003). Estimation of cumulative odds ratios. Annals of Epidemiology, 14(3): 172-178.
Cox, D. R. and Snell, E. J. (1989). The Analysis of Binary Data, 2nd ed., Chapman and Hall, London.
Fienberg, S. E. (1980). The Analysis of Cross-Classified Categorical Data, 2nd ed., MIT Press, Cambridge.
Jorgensen, B. (1997). The Theory of Dispersion Models, Chapman and Hall, London.
Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models, Chapman and Hall, London.
Heyde, C. C. (1997). Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation, Springer, New York.
Holtbrugge, W. and Schumacher, M. (1991). A comparison of regression models for the analysis of ordered categorical data. Applied Statistics, 40: 249-259.
Lawal, H. B. (2003). Categorical Data Analysis with SAS and SPSS Applications, Lawrence Erlbaum Associates, Inc., London.
Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73: 13-22.
Lindsey, J. K. (1997). Applying Generalized Linear Models, Springer, New York.
Lipsitz, S. R., Fitzmaurice, G. M., and Molenberghs, G. (1996). Goodness-of-fit tests for ordinal response regression models. Applied Statistics, 45(2): 175-190.
Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics, Cambridge University Press, Cambridge.
McCullagh, P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society, Series B, 42: 109-142.
McCulloch, C. (1997). Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association, 92: 162-170.
McCulloch, C. (2000). Generalized linear models. Journal of the American Statistical Association, 95: 1320-1324.
McCulloch, C. and Searle, S. R. (2001). Generalized, Linear, and Mixed Models, Wiley, New York.
McFadden, D. (1974). The measurement of urban travel demand. Journal of Public Economics, 3: 303-328.
Minitab Reference Manual and User’s Guide, Release 11 for Windows (Windows 3.1 or 3.11, Windows NT, and Windows 95). (1996). Minitab Inc., New York.
Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135: 370-384.
Paul, S. R. and Deng, D. (2000). Goodness of fit of generalized linear models to sparse data. Journal of the Royal Statistical Society, Series B, 62: 323-333.
Peterson, B. L. and Harrell, F. E. (1990). Partial proportional odds models for ordinal response variables. Applied Statistics, 39: 205-217.
Ryan, T. P. (1997). Modern Regression Methods, Wiley, New York.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6: 461-464.
Walker, S. H. and Duncan, D. B. (1967). Estimation of probability of an event as a function of several independent variables. Biometrika, 54: 167-179.