Year: 2021 Volume: 12 Issue: 5 Page Range: 757 - 765 Language: English DOI: 10.24012/dumf.1051352 Index Date: 29-07-2022

Effect on model performance of regularization methods

Abstract:
Artificial neural networks with numerous parameters are tremendously powerful machine learning systems. Nonetheless, overfitting is a crucial problem in such networks. Maximizing model accuracy and minimizing loss are important for reducing intra-class differences while maintaining sensitivity to those differences. In this study, the effects of overfitting were investigated for different model architectures on the Wine dataset using Dropout, AlphaDropout, GaussianDropout, batch normalization, layer normalization, activity regularization, and L1 and L2 regularization, together with the change in the loss function when these methods are combined. Combinations that performed well were then examined on different datasets using the same model. The binary cross-entropy loss function was used as the performance metric. According to the results, the combination of layer normalization and activity regularization showed better training and testing performance than the other combinations.
Keywords: Overfitting, Machine learning, Regularization

Document Type: Article Article Type: Research Article Access Type: Open Access
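For concreteness, the best-performing combination the abstract reports (layer normalization followed by activity regularization, evaluated with binary cross-entropy) can be sketched as a small Keras model. This is a minimal illustration, not the authors' published code: the dense layer sizes, regularization strengths, optimizer, and the use of scikit-learn's wine loader with a binarized target are all assumptions made for the example.

import tensorflow as tf
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load wine data and reduce it to a binary task (class 0 vs. the rest)
# so that the binary cross-entropy loss named in the abstract applies.
# The actual dataset and splits used in the paper may differ.
X, y = load_wine(return_X_y=True)
y = (y == 0).astype("float32")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Each hidden block applies layer normalization followed by an activity
# penalty: the combination the study reports as strongest. The other
# methods it compares could be swapped in here instead, e.g.
# tf.keras.layers.Dropout(0.5), AlphaDropout(0.1), GaussianDropout(0.1),
# BatchNormalization(), or
# Dense(..., kernel_regularizer=tf.keras.regularizers.l2(1e-4)).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.LayerNormalization(),
    tf.keras.layers.ActivityRegularization(l1=1e-4, l2=1e-4),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.LayerNormalization(),
    tf.keras.layers.ActivityRegularization(l1=1e-4, l2=1e-4),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",  # the paper's performance metric
              metrics=["accuracy"])
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=50, batch_size=16, verbose=2)

Comparing the training loss against the validation loss reported by fit() is how the gap between training and testing performance, i.e. the overfitting the study measures, would be observed for each regularization combination.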
APA Budak, C., Mençik, V., & Asker, M. E. (2021). Effect on model performance of regularization methods. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, 12(5), 757-765. https://doi.org/10.24012/dumf.1051352
Chicago Budak, Cafer, Vasfiye Mençik, and Mehmet Emin Asker. "Effect on model performance of regularization methods." Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 12, no. 5 (2021): 757-765. https://doi.org/10.24012/dumf.1051352
MLA Budak, Cafer, Vasfiye Mençik, and Mehmet Emin Asker. "Effect on model performance of regularization methods." Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, vol. 12, no. 5, 2021, pp. 757-765. https://doi.org/10.24012/dumf.1051352
AMA Budak C, Mençik V, Asker ME. Effect on model performance of regularization methods. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi. 2021;12(5):757-765. doi:10.24012/dumf.1051352
Vancouver Budak C, Mençik V, Asker ME. Effect on model performance of regularization methods. Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi. 2021;12(5):757-765. doi:10.24012/dumf.1051352
IEEE C. Budak, V. Mençik, and M. E. Asker, "Effect on model performance of regularization methods," Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi, vol. 12, no. 5, pp. 757-765, 2021, doi: 10.24012/dumf.1051352.
ISNAD Budak, Cafer et al. "Effect on model performance of regularization methods". Dicle Üniversitesi Mühendislik Fakültesi Mühendislik Dergisi 12/5 (2021), 757-765. https://doi.org/10.24012/dumf.1051352