Year: 2019 Volume: 7 Issue: 2 Page Range: 99 - 103 Text Language: English DOI: https://doi.org/10.18201//ijisae.2019252788 Index Date: 16-01-2020

TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset

Abstract:
A preliminary task in sentiment analysis is to detect the polarity of a text as either positive or negative. Such texts range from movie reviews to customer comments on electronic devices. A polarity detector trained in one domain may not perform well in another domain. In this study, we provide a training and test dataset generator for domain-specific sentiment analysis, with which machine learning methods can be trained without any human labeling effort. To this end, we extract comments and polarity scores from a popular e-commerce website for electronic devices in Turkey. We also translate a well-known sentiment lexicon into Turkish and use it for lexicon-based polarity detection. We compare well-known machine learning methods trained on the automatically labeled dataset with the lexicon-based method for Turkish texts in the electronics domain. The experiments are conducted on the generated evaluation dataset for the e-commerce domain; the test set is a randomly selected 30% of the generated dataset. Our lexicon-based method achieves an F1 score of 0.62, while the three supervised learning methods achieve better results. Among them, logistic regression obtains the highest score when it uses a count vectorizer for feature extraction.
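The steps named in the abstract (automatic labeling from site ratings, a random 30% test split, lexicon-based polarity detection, and F1 evaluation) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy lexicon and its weights, the example reviews, the star-rating thresholds, the tie-breaking rule for a zero score, and all function names are hypothetical.

```python
import random
import re

# Hypothetical toy lexicon (word -> polarity weight). The paper instead
# translates an existing sentiment lexicon into Turkish.
LEXICON = {"harika": 1.0, "iyi": 0.5, "güzel": 0.5,
           "kötü": -0.5, "berbat": -1.0, "bozuk": -0.8}


def label_from_rating(stars, positive_from=4, negative_upto=2):
    """Derive a label from a star rating; thresholds are assumptions."""
    if stars >= positive_from:
        return "pos"
    if stars <= negative_upto:
        return "neg"
    return None  # treated as neutral and dropped in this sketch


def lexicon_polarity(text):
    """Sum lexicon weights over tokens; a score of 0 is assumed negative."""
    tokens = re.findall(r"\w+", text.lower())
    score = sum(LEXICON.get(tok, 0.0) for tok in tokens)
    return "pos" if score > 0 else "neg"


def split_train_test(items, test_ratio=0.3, seed=0):
    """Shuffle and hold out test_ratio of the items as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def f1_score(gold, predicted, positive="pos"):
    """F1 for the positive class from gold and predicted label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (comment, star rating) pairs standing in for scraped reviews.
REVIEWS = [
    ("harika bir telefon", 5), ("ekran çok güzel", 4),
    ("fena değil", 3), ("kötü batarya", 2),
    ("berbat ve bozuk geldi", 1), ("iyi ürün", 5),
    ("kargo kötü", 1), ("güzel ama pahalı", 4),
    ("idare eder", 3), ("berbat", 1),
]

# Label automatically, drop neutral reviews, then hold out 30% for testing.
labeled = [(text, label_from_rating(stars)) for text, stars in REVIEWS]
labeled = [(text, label) for text, label in labeled if label is not None]
train, test = split_train_test(labeled)

gold = [label for _, label in test]
pred = [lexicon_polarity(text) for text, _ in test]
print("lexicon F1 on the held-out split:", f1_score(gold, pred))
```

The supervised side of the comparison (for example, logistic regression over count-based features) would be trained on `train` and scored on the same `test` split; it is omitted here to keep the sketch dependency-free.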

Subjects: Computer Science, Artificial Intelligence
Document Type: Article Article Type: Research Article Access Type: Open Access
APA İnan E, SOYGAZI F, Mostafapour V (2019). TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset. International Journal of Intelligent Systems and Applications in Engineering, 7(2), 99 - 103. https://doi.org/10.18201//ijisae.2019252788
Chicago İnan Emrah,SOYGAZI FATIH,Mostafapour Vahab TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset. International Journal of Intelligent Systems and Applications in Engineering 7, no.2 (2019): 99 - 103. https://doi.org/10.18201//ijisae.2019252788
MLA İnan Emrah,SOYGAZI FATIH,Mostafapour Vahab TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset. International Journal of Intelligent Systems and Applications in Engineering, vol.7, no.2, 2019, pp.99 - 103. https://doi.org/10.18201//ijisae.2019252788
AMA İnan E,SOYGAZI F,Mostafapour V TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset. International Journal of Intelligent Systems and Applications in Engineering. 2019; 7(2): 99 - 103. https://doi.org/10.18201//ijisae.2019252788
Vancouver İnan E,SOYGAZI F,Mostafapour V TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset. International Journal of Intelligent Systems and Applications in Engineering. 2019; 7(2): 99 - 103. https://doi.org/10.18201//ijisae.2019252788
IEEE İnan E,SOYGAZI F,Mostafapour V "TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset." International Journal of Intelligent Systems and Applications in Engineering, 7, pp.99 - 103, 2019. https://doi.org/10.18201//ijisae.2019252788
ISNAD İnan, Emrah et al. "TurkiS: A Turkish Sentiment Analyzer Using Domain-specific Automatic Labelled Dataset". International Journal of Intelligent Systems and Applications in Engineering 7/2 (2019), 99-103. https://doi.org/10.18201//ijisae.2019252788