Year: 2023 Volume: 31 Issue: 1 Page Range: 112 - 125 Text Language: English DOI: 10.55730/1300-0632.3974 Index Date: 16-05-2023

Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations

Abstract:
Pancreatic ductal adenocarcinoma (PDAC) is the most common type of pancreatic cancer and accounts for the vast majority of cases. Owing to late diagnosis, the five-year survival rate for PDAC is 9%. Patients diagnosed early survive longer than those diagnosed at a more advanced stage. Biomarkers can play an essential role in the early detection of PDAC by assisting health professionals, and recent studies have applied machine learning and deep learning methods to biomarker data for diagnostic purposes. To increase the survival rates of PDAC patients, early diagnosis with a noninvasive test is a critical need. Our study offers a promising approach for the early detection of PDAC using noninvasive urinary biomarkers and carbohydrate antigen 19-9 (CA19-9). The open-access Kaggle Urinary Biomarkers for Pancreatic Cancer (2020) dataset, consisting of 590 participants, was used. Seven machine learning classifiers were used to detect PDAC: support vector machine (SVM), naive Bayes (NB), k-nearest neighbors (kNN), random forest (RF), light gradient boosting machine (LightGBM), AdaBoost, and gradient boosting classifier (GBC). Both binary and multiclass classification were carried out, and the models were validated with 5-fold and 10-fold cross-validation (CV-5 and CV-10). The study aimed to determine the best machine learning model by comparing model performance in distinguishing healthy controls, patients with pancreatic disorders, and patients with PDAC. Notably, ensemble learning models were more successful in all groups. For healthy controls versus patients with PDAC, the best result was obtained with CV-10 and the GBC model (92.99%, AUC = 0.9761). For patients with pancreatic disorders versus patients with PDAC, the best result was obtained with CV-10 and the LightGBM model (86.37%, AUC = 0.9348). For the three-class problem of healthy controls, pancreatic disorders, and patients with PDAC, the best result was obtained with CV-5 and the GBC model (72.91%, AUC = 0.8733).
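The evaluation protocol described in the abstract (stratified k-fold cross-validation scored by AUC) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the function names (`stratified_k_fold`, `roc_auc`, `cross_validated_auc`) are hypothetical, the data are synthetic rather than the Kaggle dataset, and the one-feature nearest-mean scorer is a stand-in, not one of the seven classifiers used in the study.

```python
import random


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping class proportions roughly equal per fold."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(x, y, k):
    """Mean test-fold AUC of a nearest-class-mean scorer (illustrative stand-in model)."""
    aucs = []
    for train, test in stratified_k_fold(y, k):
        pos_train = [x[i] for i in train if y[i] == 1]
        neg_train = [x[i] for i in train if y[i] == 0]
        mu1 = sum(pos_train) / len(pos_train)
        mu0 = sum(neg_train) / len(neg_train)
        # Higher score means closer to the positive-class mean than to the negative one.
        scores = [abs(x[i] - mu0) - abs(x[i] - mu1) for i in test]
        aucs.append(roc_auc([y[i] for i in test], scores))
    return sum(aucs) / len(aucs)


# Synthetic one-feature data: 100 "controls" and 100 "cases" (not real biomarker values).
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
y = [0] * 100 + [1] * 100

for k in (5, 10):
    print(f"CV-{k}: mean AUC = {cross_validated_auc(x, y, k):.4f}")
```

In practice, a library such as scikit-learn would typically supply the fold splitting, the classifiers, and the metrics; the sketch only makes the mechanics of CV-5 versus CV-10 concrete.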
Keywords: Pancreatic cancer, urine biomarker, machine learning, ensemble learning, classification

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Acer İ, Orhanbulucu F, İçer S, Latifoğlu F (2023). Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations. Turkish Journal of Electrical Engineering and Computer Sciences, 31(1), 112 - 125. 10.55730/1300-0632.3974
Chicago Acer İrem, Orhanbulucu Fırat, İçer Semra, Latifoğlu Fatma. Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations. Turkish Journal of Electrical Engineering and Computer Sciences 31, no. 1 (2023): 112 - 125. 10.55730/1300-0632.3974
MLA Acer İrem, Orhanbulucu Fırat, İçer Semra, Latifoğlu Fatma. Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations. Turkish Journal of Electrical Engineering and Computer Sciences, vol. 31, no. 1, 2023, pp. 112 - 125. 10.55730/1300-0632.3974
AMA Acer İ, Orhanbulucu F, İçer S, Latifoğlu F. Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations. Turkish Journal of Electrical Engineering and Computer Sciences. 2023; 31(1): 112 - 125. 10.55730/1300-0632.3974
Vancouver Acer İ, Orhanbulucu F, İçer S, Latifoğlu F. Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations. Turkish Journal of Electrical Engineering and Computer Sciences. 2023; 31(1): 112 - 125. 10.55730/1300-0632.3974
IEEE Acer İ, Orhanbulucu F, İçer S, Latifoğlu F, "Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations." Turkish Journal of Electrical Engineering and Computer Sciences, 31, pp. 112 - 125, 2023. 10.55730/1300-0632.3974
ISNAD Acer, İrem et al. "Early diagnosis of pancreatic cancer by machine learning methods using urine biomarker combinations". Turkish Journal of Electrical Engineering and Computer Sciences 31/1 (2023), 112-125. https://doi.org/10.55730/1300-0632.3974