Year: 2022 Volume: 30 Issue: 4 Page Range: 1404 - 1418 Text Language: English DOI: 10.55730/1300-0632.3856 Index Date: 18-07-2022

Twitter account classification using account metadata: organization vs. individual

Abstract:
Organizations maintain a presence on social media to gain followers and reach a wide audience. Social media-related tasks and applications, such as social media graph construction, sentiment analysis, and bot detection, need to identify the account types of the entities involved. Some applications focus on personal accounts, whereas others need only nonpersonal accounts. This paper addresses the account classification problem using only a minimal amount of data, namely the metadata of the account’s profile. The proposed approach classifies accounts as either organization or individual, in a language-independent manner, without collecting the accounts’ tweet content. The model uses a long short-term memory (LSTM) network to process the textual properties and a fully connected neural network to process the numerical features. We apply our solution to a collection of Twitter accounts, as Twitter is one of the most widely used social networks. Our classifier, based solely on the account metadata, achieves an average accuracy of 97.4% under 7-fold cross-validation. The experiments show that account metadata is a suitable resource for accurately estimating account types.
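The evaluation protocol named in the abstract, average accuracy under 7-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`k_fold_indices`, `cross_validate`), the majority-label stand-in classifier, and the toy labels are all hypothetical, and the sketch uses contiguous folds without the shuffling or stratification a real experiment would normally apply.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the per-fold accuracies of a classifier under k-fold cross-validation."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        # Train on every fold except the held-out one.
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        # Score the trained model on the held-out fold only.
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical stand-in classifier: always predicts the majority training label.
# The paper's model (LSTM plus fully connected network) would take its place.
def train_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, sample):
    return model


# Toy data (assumption): 10 individual and 4 organization accounts.
labels = ["individual"] * 10 + ["organization"] * 4
samples = list(range(len(labels)))

fold_accuracies = cross_validate(samples, labels, 7, train_majority, predict_majority)
average_accuracy = mean(fold_accuracies)
print(fold_accuracies, round(average_accuracy, 3))
```

The reported figure corresponds to `average_accuracy`: each account is held out exactly once, and the seven per-fold accuracies are averaged.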

Document Type: Article Article Type: Research Article Access Type: Open Access
APA Çetinkaya Y, Gürlek M, Toroslu İ, Karagoz P (2022). Twitter account classification using account metadata: organization vs. individual. Turkish Journal of Electrical Engineering and Computer Sciences, 30(4), 1404 - 1418. 10.55730/1300-0632.3856
Chicago Çetinkaya Yusuf Mücahit, Gürlek Mesut, Toroslu İsmail, Karagoz Pinar. Twitter account classification using account metadata: organization vs. individual. Turkish Journal of Electrical Engineering and Computer Sciences 30, no.4 (2022): 1404 - 1418. 10.55730/1300-0632.3856
MLA Çetinkaya Yusuf Mücahit, Gürlek Mesut, Toroslu İsmail, Karagoz Pinar. Twitter account classification using account metadata: organization vs. individual. Turkish Journal of Electrical Engineering and Computer Sciences, vol.30, no.4, 2022, pp.1404 - 1418. 10.55730/1300-0632.3856
AMA Çetinkaya Y, Gürlek M, Toroslu İ, Karagoz P. Twitter account classification using account metadata: organization vs. individual. Turkish Journal of Electrical Engineering and Computer Sciences. 2022; 30(4): 1404 - 1418. 10.55730/1300-0632.3856
Vancouver Çetinkaya Y, Gürlek M, Toroslu İ, Karagoz P. Twitter account classification using account metadata: organization vs. individual. Turkish Journal of Electrical Engineering and Computer Sciences. 2022; 30(4): 1404 - 1418. 10.55730/1300-0632.3856
IEEE Çetinkaya Y, Gürlek M, Toroslu İ, Karagoz P, "Twitter account classification using account metadata: organization vs. individual." Turkish Journal of Electrical Engineering and Computer Sciences, 30, pp.1404 - 1418, 2022. 10.55730/1300-0632.3856
ISNAD Çetinkaya, Yusuf Mücahit et al. "Twitter account classification using account metadata: organization vs. individual". Turkish Journal of Electrical Engineering and Computer Sciences 30/4 (2022), 1404-1418. https://doi.org/10.55730/1300-0632.3856