Year: 2021 Volume: 21 Issue: 3 Page Range: 263 - 272 Text Language: English DOI: 10.21121/eab.960840 Index Date: 09-12-2022

Comparison of Different Estimation Approaches in Rare Events Data

Abstract:
In social science research, one category of the dependent variable may occur hundreds of times less (or more) often than the other. For events such as wars, mass migrations, or coups, the event of interest in a binary variable may have very low prevalence, resulting in low or even zero counts in one or two cells of the 2×2 table of two factors. In such cases an independent variable predicts the dependent variable perfectly or almost perfectly, which leads to the problem known as complete or quasi-complete separation in statistical modelling. This study compares three methods suggested in the literature for handling quasi-complete separation, using a small real dataset: penalized maximum likelihood (Firth-type), exact logistic regression, and Bayesian logistic regression. The methods were compared in terms of odds ratios, standard error estimates of the odds ratios, confidence intervals, and statistical significance. Parameter estimates were obtained under three different models with binary and continuous variables. The results show that all three methods converge in the presence of quasi-complete separation. In conclusion, Bayesian logistic regression estimates tend to be superior to those of the other methods in terms of the estimation of standard errors.
Keywords: Bayesian logistic regression; zero cell count; penalized maximum likelihood; rare events; quasi-complete separation

Document Type: Article Article Type: Research Article Access Type: Open Access
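The zero-cell situation described in the abstract can be illustrated with a small sketch. The 2×2 counts below are hypothetical and are not taken from the article's dataset. With one empty cell, the ordinary odds ratio is infinite and its log-scale standard error is unbounded. Adding 0.5 to every cell gives a finite estimate; for a model with a single binary predictor, this point estimate coincides with the Firth-type penalized estimate. The standard error shown is the Woolf-type approximation and is only an assumption for illustration, not the article's computation.

```python
import math

def odds_ratio(a, b, c, d, correction=0.0):
    """Odds ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]].

    `correction` is a constant added to every cell (0.5 is the usual choice).
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    if b * c == 0:
        # A zero in the denominator means the estimate does not exist as a finite number.
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)

def log_or_se(a, b, c, d, correction=0.0):
    """Woolf-type standard error of the log odds ratio: sqrt of the sum of 1/cell."""
    cells = [x + correction for x in (a, b, c, d)]
    if min(cells) == 0:
        return math.inf
    return math.sqrt(sum(1.0 / x for x in cells))

# Hypothetical counts: rows = factor present / absent, columns = event / no event.
# Every case with the factor present experienced the event (quasi-complete separation).
table = (5, 0, 10, 35)

print("unadjusted OR:", odds_ratio(*table))
print("unadjusted SE(log OR):", log_or_se(*table))
print("OR with 0.5 added:", odds_ratio(*table, correction=0.5))
print("SE(log OR) with 0.5 added:", log_or_se(*table, correction=0.5))
```

The sketch covers only the single-predictor case. Models with several binary and continuous variables, like those compared in the article, need an iterative fit with a dedicated routine.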
APA: Bacaksız E, Koç S (2021). Comparison of Different Estimation Approaches in Rare Events Data. Ege Akademik Bakış, 21(3), 263 - 272. 10.21121/eab.960840
Chicago: Bacaksız Ece, Koç Selçuk. Comparison of Different Estimation Approaches in Rare Events Data. Ege Akademik Bakış 21, no. 3 (2021): 263 - 272. 10.21121/eab.960840
MLA: Bacaksız Ece, Koç Selçuk. Comparison of Different Estimation Approaches in Rare Events Data. Ege Akademik Bakış, vol. 21, no. 3, 2021, pp. 263 - 272. 10.21121/eab.960840
AMA: Bacaksız E, Koç S. Comparison of Different Estimation Approaches in Rare Events Data. Ege Akademik Bakış. 2021; 21(3): 263 - 272. 10.21121/eab.960840
Vancouver: Bacaksız E, Koç S. Comparison of Different Estimation Approaches in Rare Events Data. Ege Akademik Bakış. 2021; 21(3): 263 - 272. 10.21121/eab.960840
IEEE: Bacaksız E, Koç S, "Comparison of Different Estimation Approaches in Rare Events Data," Ege Akademik Bakış, 21, pp. 263 - 272, 2021. 10.21121/eab.960840
ISNAD: Bacaksız, Ece - Koç, Selçuk. "Comparison of Different Estimation Approaches in Rare Events Data". Ege Akademik Bakış 21/3 (2021), 263-272. https://doi.org/10.21121/eab.960840