Year: 2023 Volume: 6 Issue: 2 Page Range: 140 - 148 Text Language: English DOI: 10.35377/saucis...1309103 Index Date: 06-09-2023

Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market

Abstract:
Advances in technology increase data traffic and data volume day by day, which makes collecting and interpreting data increasingly important. This study aims to analyze car sales data collected with web scraping techniques using machine learning algorithms and to build a price estimation model. The data were collected using Selenium and BeautifulSoup and prepared for analysis through several preprocessing steps. Lasso regression and PCA were used for feature selection and dimensionality reduction, and GridSearchCV was used for hyperparameter tuning. Random Forest, K-Nearest Neighbor, Gradient Boosting, AdaBoost, Support Vector, and XGBoost regression algorithms were applied. The results were evaluated using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). For data set 1, the best model was XGBoost Regression, with R² = 0.973, MSE = 0.026, and RMSE = 0.161. For data set 2, the best model was K-Nearest Neighbor Regression, with R² = 0.978, MSE = 0.021, and RMSE = 0.145.
Keywords: web scraping, machine learning, price prediction
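
The three evaluation metrics named in the abstract are related by simple formulas, and RMSE is the square root of MSE. The sketch below is illustrative only and is not the authors' code: it computes the metrics in plain Python on hypothetical toy values (`y_true`, `y_pred`, and the function name `regression_metrics` are assumptions), then checks that the MSE and RMSE figures reported in the abstract are mutually consistent.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (mse, rmse, r2) for two equal-length sequences of numbers."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1.0 - ss_res / ss_tot  # not defined when all targets are identical
    return mse, rmse, r2

# Hypothetical toy values, not data from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f}")

# Consistency check on the abstract's figures: RMSE should equal sqrt(MSE).
reported = {
    "XGBoost (data set 1)": (0.026, 0.161),
    "K-Nearest Neighbor (data set 2)": (0.021, 0.145),
}
for name, (rep_mse, rep_rmse) in reported.items():
    print(name, round(math.sqrt(rep_mse), 3), rep_rmse)
```

Both reported pairs agree to three decimal places, which suggests the rounded MSE and RMSE values in the abstract were computed on the same target scale.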

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Yılmaz S, Selvi I (2023). Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market. Sakarya University Journal of Computer and Information Sciences (Online), 6(2), 140 - 148. 10.35377/saucis...1309103
Chicago Yılmaz Seda,Selvi Ihsan Hakan Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market. Sakarya University Journal of Computer and Information Sciences (Online) 6, no.2 (2023): 140 - 148. 10.35377/saucis...1309103
MLA Yılmaz Seda, Selvi Ihsan Hakan Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market. Sakarya University Journal of Computer and Information Sciences (Online), vol.6, no.2, 2023, pp. 140 - 148. 10.35377/saucis...1309103
AMA Yılmaz S,Selvi I Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market. Sakarya University Journal of Computer and Information Sciences (Online). 2023; 6(2): 140 - 148. 10.35377/saucis...1309103
Vancouver Yılmaz S,Selvi I Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market. Sakarya University Journal of Computer and Information Sciences (Online). 2023; 6(2): 140 - 148. 10.35377/saucis...1309103
IEEE Yılmaz S, Selvi I "Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market." Sakarya University Journal of Computer and Information Sciences (Online), 6, pp. 140 - 148, 2023. 10.35377/saucis...1309103
ISNAD Yılmaz, Seda - Selvi, Ihsan Hakan. "Price Prediction Using Web Scraping and Machine Learning Algorithms in the Used Car Market". Sakarya University Journal of Computer and Information Sciences (Online) 6/2 (2023), 140-148. https://doi.org/10.35377/saucis...1309103