Peer-Reviewed

New Criteria of Model Selection and Model Averaging in Linear Regression Models

Received: 4 September 2014     Accepted: 17 September 2014     Published: 20 October 2014
Abstract

Model selection is an important part of any statistical analysis. Many tools, from both frequentist and Bayesian perspectives, have been suggested for selecting the best model. There is often considerable uncertainty in selecting a particular model as the best approximating model. Model selection uncertainty arises when the same data are used for both model selection and parameter estimation, and estimators of model parameters are often biased when data-based selection has been done. Model averaging of the parameter estimators is therefore used to alleviate this bias by combining information from a set of candidate models. The aim of this paper is twofold. First, new model selection criteria are proposed, based on different averages of AIC, BIC, AICc, and HQC. Second, model averaging is introduced so that its parameter estimators can be compared with those obtained under model selection. Two simulation studies are considered. The first, on model selection, shows that the proposed criteria lie between some of the known criteria (AIC, BIC, AICc, and HQC) and so can be used as new model selection criteria. The second, on model averaging, shows that the model-averaged parameter estimators have less bias and lower predicted mean square error (PMSE) than the parameter estimators obtained under model selection.
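
As a rough illustration of the two ideas in the abstract, the Python sketch below computes AIC, BIC, AICc, and HQC for a few nested linear regression models, forms one combined criterion, and then averages the coefficient estimates across models. Several pieces are assumptions made only for this example and are not taken from the paper: the simulated data, the candidate set, the use of a plain arithmetic mean as the "average" of the four criteria, and the use of AIC-based (Akaike) weights for model averaging. The criteria are written in their usual Gaussian-likelihood form, up to an additive constant.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


def criteria(rss, n, k):
    """Common Gaussian forms of the four criteria (additive constant dropped).

    k counts the regression coefficients plus the error variance.
    """
    base = n * np.log(rss / n)
    aic = base + 2 * k
    return {
        "AIC": aic,
        "BIC": base + k * np.log(n),
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
        "HQC": base + 2 * k * np.log(np.log(n)),
    }


# Simulated data (hypothetical): intercept plus three predictors,
# the last of which has a true coefficient of zero.
rng = np.random.default_rng(0)
n = 60
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X_full @ np.array([1.0, 2.0, 0.5, 0.0]) + rng.normal(size=n)

# Candidate models, given as the columns of X_full they include (assumption).
candidates = [[0, 1], [0, 1, 2], [0, 1, 2, 3]]

rows, betas = [], []
for cols in candidates:
    beta, rss = fit_ols(X_full[:, cols], y)
    crit = criteria(rss, n, k=len(cols) + 1)
    # Assumed combined criterion: arithmetic mean of the four values.
    crit["MEAN"] = float(np.mean([crit[c] for c in ("AIC", "BIC", "AICc", "HQC")]))
    rows.append(crit)
    # Excluded predictors get a coefficient of zero in the averaged estimate.
    full_beta = np.zeros(X_full.shape[1])
    full_beta[cols] = beta
    betas.append(full_beta)

# Akaike weights: w_i proportional to exp(-0.5 * (AIC_i - min AIC)).
aic = np.array([r["AIC"] for r in rows])
weights = np.exp(-0.5 * (aic - aic.min()))
weights /= weights.sum()

beta_avg = weights @ np.array(betas)   # model-averaged coefficients
best = int(np.argmin(aic))             # model picked by AIC alone

for cols, r, w in zip(candidates, rows, weights):
    print(cols, {name: round(float(v), 2) for name, v in r.items()}, "weight", round(float(w), 3))
print("selected-model estimate:", np.round(betas[best], 3))
print("model-averaged estimate:", np.round(beta_avg, 3))
```

By construction, the mean of the four criteria falls between the smallest and largest of them for every model, which is the kind of "in between" behaviour the abstract reports for the proposed criteria. The weights could equally be built from any of the other criteria; AIC is used here only for concreteness.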

Published in American Journal of Theoretical and Applied Statistics (Volume 3, Issue 5)
DOI 10.11648/j.ajtas.20140305.15
Page(s) 148-166
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2014. Published by Science Publishing Group

Keywords

AIC, BIC, AICc, HQC, Kullback-Leibler (K-L) Distance, Model Averaging, Model Selection

Cite This Article
  • APA Style

    Magda Mohamed Mohamed Haggag. (2014). New Criteria of Model Selection and Model Averaging in Linear Regression Models. American Journal of Theoretical and Applied Statistics, 3(5), 148-166. https://doi.org/10.11648/j.ajtas.20140305.15


    ACS Style

    Magda Mohamed Mohamed Haggag. New Criteria of Model Selection and Model Averaging in Linear Regression Models. Am. J. Theor. Appl. Stat. 2014, 3(5), 148-166. doi: 10.11648/j.ajtas.20140305.15


    AMA Style

    Magda Mohamed Mohamed Haggag. New Criteria of Model Selection and Model Averaging in Linear Regression Models. Am J Theor Appl Stat. 2014;3(5):148-166. doi: 10.11648/j.ajtas.20140305.15


  • @article{10.11648/j.ajtas.20140305.15,
      author = {Magda Mohamed Mohamed Haggag},
      title = {New Criteria of Model Selection and Model Averaging in Linear Regression Models},
      journal = {American Journal of Theoretical and Applied Statistics},
      volume = {3},
      number = {5},
      pages = {148-166},
      doi = {10.11648/j.ajtas.20140305.15},
      url = {https://doi.org/10.11648/j.ajtas.20140305.15},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajtas.20140305.15},
     year = {2014}
    }
    


  • TY  - JOUR
    T1  - New Criteria of Model Selection and Model Averaging in Linear Regression Models
    AU  - Magda Mohamed Mohamed Haggag
    Y1  - 2014/10/20
    PY  - 2014
    N1  - https://doi.org/10.11648/j.ajtas.20140305.15
    DO  - 10.11648/j.ajtas.20140305.15
    T2  - American Journal of Theoretical and Applied Statistics
    JF  - American Journal of Theoretical and Applied Statistics
    JO  - American Journal of Theoretical and Applied Statistics
    SP  - 148
    EP  - 166
    PB  - Science Publishing Group
    SN  - 2326-9006
    UR  - https://doi.org/10.11648/j.ajtas.20140305.15
    VL  - 3
    IS  - 5
    ER  - 


Author Information
  • Department of Statistics, Mathematics, and Insurance, Faculty of Commerce, Damanhour University, Egypt
