Year: 2021 Volume: 10 Issue: 4 Page Range: 1510 - 1515 Text Language: English DOI: 10.5455/medscience.2021.09.322 Index Date: 16-05-2022

Classification of stroke with gradient boosting tree using smote-based oversampling method

Abstract:
This study aims to classify stroke with the gradient boosting tree method on an open-access dataset of patients with and without stroke, and to compare the results obtained before and after balancing the data with the Synthetic Minority Over-sampling Technique (SMOTE), an oversampling method. The dataset was obtained from "https://www.kaggle.com/asaumya/healthcare-problem-prediction-stroke-patients". SMOTE was used for data balancing and the gradient boosting tree method for modeling. Model performance was evaluated by specificity, sensitivity, accuracy, positive predictive value, and negative predictive value. With the original version of the dataset, the gradient boosting tree model yielded specificity, sensitivity, accuracy, positive predictive value, and negative predictive value of 0.0887, 0.9772, 0.9339, 0.9544, and 0.1679, respectively. With the SMOTE-applied version of the dataset, the gradient boosting tree model yielded specificity, sensitivity, accuracy, positive predictive value, and negative predictive value of 0.0887, 0.9772, 0.9339, 0.9544, and 0.1679, respectively. When the results were examined, the model built on the SMOTE-applied dataset produced more consistent and realistic results. Researchers are therefore advised to apply dataset balancing methods to obtain more accurate results whenever they encounter an imbalanced dataset.
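The two ideas the abstract relies on, SMOTE-style interpolation of minority samples and the five confusion-matrix metrics, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `smote_sketch` and `binary_metrics`, the toy minority points, and the confusion-matrix counts are all hypothetical and chosen only for illustration. The sketch assumes numeric features and Euclidean distance, and it uses only the Python standard library.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points, each placed on the line segment between
    a minority sample and one of its k nearest minority neighbours.
    This is the core SMOTE idea only, not a full implementation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def binary_metrics(tp, fp, tn, fn):
    """Compute the five metrics named in the abstract from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical minority-class samples with two numeric features.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic minority samples")

# Hypothetical counts for a model trained on imbalanced data: it predicts the
# majority (here, positive) class almost always.
metrics = binary_metrics(tp=950, fp=25, tn=5, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With counts like these, accuracy is high while specificity is low, which is the pattern that makes accuracy alone misleading on an imbalanced dataset and motivates reporting all five metrics. In practice one would likely use a maintained implementation of SMOTE and of gradient boosting instead of this sketch.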
Keywords:

Document Type: Article Article Type: Research Article Access Type: Open Access
APA YAĞIN F, CİCEK I, KÜÇÜKAKÇALI Z (2021). Classification of stroke with gradient boosting tree using smote-based oversampling method. Medicine Science, 10(4), 1510 - 1515. 10.5455/medscience.2021.09.322
Chicago YAĞIN Fatma Hilal,CİCEK Ipek Balikci,KÜÇÜKAKÇALI Zeynep Classification of stroke with gradient boosting tree using smote-based oversampling method. Medicine Science 10, no.4 (2021): 1510 - 1515. 10.5455/medscience.2021.09.322
MLA YAĞIN Fatma Hilal,CİCEK Ipek Balikci,KÜÇÜKAKÇALI Zeynep Classification of stroke with gradient boosting tree using smote-based oversampling method. Medicine Science, vol.10, no.4, 2021, pp. 1510 - 1515. 10.5455/medscience.2021.09.322
AMA YAĞIN F,CİCEK I,KÜÇÜKAKÇALI Z Classification of stroke with gradient boosting tree using smote-based oversampling method. Medicine Science. 2021; 10(4): 1510 - 1515. 10.5455/medscience.2021.09.322
Vancouver YAĞIN F,CİCEK I,KÜÇÜKAKÇALI Z Classification of stroke with gradient boosting tree using smote-based oversampling method. Medicine Science. 2021; 10(4): 1510 - 1515. 10.5455/medscience.2021.09.322
IEEE YAĞIN F,CİCEK I,KÜÇÜKAKÇALI Z "Classification of stroke with gradient boosting tree using smote-based oversampling method." Medicine Science, 10, pp. 1510 - 1515, 2021. 10.5455/medscience.2021.09.322
ISNAD YAĞIN, Fatma Hilal et al. "Classification of stroke with gradient boosting tree using smote-based oversampling method". Medicine Science 10/4 (2021), 1510-1515. https://doi.org/10.5455/medscience.2021.09.322