Year: 2022 Volume: 60 Issue: 3 Page Range: 196 - 203 Text Language: English DOI: 10.4274/haseki.galenos.2022.8440 Index Date: 05-07-2022

Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application

Abstract:
Aim: Breast cancer can be diagnosed with an algorithm or an early-detection model that estimates breast cancer risk from determining factors. In the present study, gradient boosting machine (GBM), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) models were applied and their performances were compared.
Methods: The open-access Breast Cancer Wisconsin Dataset, which includes 10 features of breast tumors and results from 569 patients, was used for this study. The GBM, XGBoost, and LightGBM models for classifying breast cancer were built using repeated stratified K-fold cross-validation. Model performance was evaluated with accuracy, recall, precision, and area under the curve (AUC).
Results: The accuracy, recall, AUC, and precision values obtained from the GBM, XGBoost, and LightGBM models were (93.9%, 93.5%, 0.984, 93.8%), (94.6%, 94%, 0.985, 94.6%), and (95.3%, 94.8%, 0.987, 95.5%), respectively. The LightGBM model therefore achieved the best value on every metric. When the effects of the variables in the dataset on breast cancer were assessed, the five most important factors for the LightGBM model were, in order, concave points mean, texture mean, concavity mean, radius mean, and perimeter mean.
Conclusion: According to the findings of the study, the LightGBM model classified breast cancer more accurately than the other models. Unlike similar studies examining the same dataset, this study reported variable importance for the breast cancer-related variables. Applying the LightGBM approach in the medical field can help doctors make a quick and precise diagnosis.
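The evaluation protocol named in the abstract (repeated stratified K-fold cross-validation scored by accuracy, recall, precision, and AUC) can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the function names, the choice of 5 folds and 3 repeats, the random seed, and the 212/357 class split used in the usage lines are all assumptions, since the abstract does not state them. No model is trained here; the sketch only shows how the folds are formed and how each metric is computed from predictions.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs; every test fold keeps the class ratio.

    n_splits, n_repeats, and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal indices round-robin so each fold gets its share of the class.
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall, precision) for labels in {0, 1}, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision


def roc_auc(y_true, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Usage: 569 labels with an assumed 212 positive / 357 negative split.
labels = [1] * 212 + [0] * 357
splits = list(repeated_stratified_kfold(labels, n_splits=5, n_repeats=3, seed=0))
```

In a full pipeline, each boosting model would be fitted on `train_idx`, scored on `test_idx`, and the per-fold metrics averaged over all folds and repeats. Stratification matters here because the two classes are unequal in size, so an unstratified split could give folds with noticeably different class ratios.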

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Akbulut S, BALIKCI CICEK I, ÇOLAK C (2022). Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application. Haseki Tıp Bülteni, 60(3), 196 - 203. 10.4274/haseki.galenos.2022.8440
Chicago Akbulut Sami,BALIKCI CICEK IPEK,ÇOLAK Cemil Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application. Haseki Tıp Bülteni 60, no.3 (2022): 196 - 203. 10.4274/haseki.galenos.2022.8440
MLA Akbulut Sami,BALIKCI CICEK IPEK,ÇOLAK Cemil Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application. Haseki Tıp Bülteni, vol.60, no.3, 2022, pp.196 - 203. 10.4274/haseki.galenos.2022.8440
AMA Akbulut S,BALIKCI CICEK I,ÇOLAK C Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application. Haseki Tıp Bülteni. 2022; 60(3): 196 - 203. 10.4274/haseki.galenos.2022.8440
Vancouver Akbulut S,BALIKCI CICEK I,ÇOLAK C Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application. Haseki Tıp Bülteni. 2022; 60(3): 196 - 203. 10.4274/haseki.galenos.2022.8440
IEEE Akbulut S,BALIKCI CICEK I,ÇOLAK C "Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application." Haseki Tıp Bülteni, 60, pp.196 - 203, 2022. 10.4274/haseki.galenos.2022.8440
ISNAD Akbulut, Sami et al. "Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application". Haseki Tıp Bülteni 60/3 (2022), 196-203. https://doi.org/10.4274/haseki.galenos.2022.8440