TY  - JOUR
TI  - An investigation of ensemble learning methods in classification problems and an application on non-small-cell lung cancer data
AB  - This study aims to classify death status in non-small-cell lung cancer (NSCLC) using patient records with 24 variables taken from an open-source dataset on a cancer data site. The base classifiers SMO (Sequential Minimal Optimization), K-NN (K-Nearest Neighbor), random forest, and XGBoost (Extreme Gradient Boosting) were used, together with the voting, bagging, boosting, and stacking ensemble learning methods. Model performance was compared in terms of accuracy, specificity, sensitivity, precision, and ROC curve. The random forest, SMO, K-NN, and XGBoost classifiers were evaluated as base classifiers, within the bagging ensemble learning method, and within the boosting ensemble learning method. In addition, the performances of Model 1 (random forest + SMO), Model 2 (XGBoost + K-NN), Model 3 (random forest + K-NN), Model 4 (XGBoost + SMO), Model 5 (SMO + K-NN + random forest), Model 6 (SMO + K-NN + XGBoost), and Model 7 (SMO + K-NN + random forest + XGBoost) were reported across the different metrics. The boosting ensemble learning method with XGBoost provided the highest classification performance, achieving an accuracy of 0.982, a sensitivity of 0.971, a precision of 0.989, a specificity of 0.989, and a ROC curve value of 0.998. Ensemble learning methods are recommended for classification problems in patients with a high prevalence of cancer to achieve successful results.
AU  - ÇOLAK, Cemil
AU  - KIVRAK, Mehmet
DO  - 10.5455/medscience.2021.10.339
PY  - 2022
JO  - Medicine Science
VL  - 11
IS  - 2
SN  - 2147-0634
SP  - 924
EP  - 933
DB  - TRDizin
UR  - http://search/yayin/detay/529902
ER  -