Turkish sign language recognition based on multistream data fusion

GÜNDÜZ, Cemil; Polat, Hüseyin

doi:10.3906/elk-2005-156

Cemil GÜNDÜZ (Department of Information Systems, Institute of Informatics, Gazi University, Ankara, Turkey)

Hüseyin POLAT (Department of Computer Engineering, Faculty of Technology, Gazi University, Ankara, Turkey)

Turkish Journal of Electrical Engineering and Computer Sciences

Year: 2021 Volume: 29 Issue: 2 Pages: 1171-1186 Text Language: English DOI: 10.3906/elk-2005-156 Index Date: 07-06-2022

Abstract:

Sign languages are nonverbal, visual languages that hearing- or speech-impaired people use to communicate. Besides the hands, other communication channels such as body posture and facial expressions also carry meaning in sign languages. Because the gestures in sign languages vary across countries, the significance of each communication channel also differs from one sign language to another. In this study, a total of 8 data streams (4 RGB, 3 pose, and 1 optical flow) representing the communication channels used in Turkish sign language were analyzed. Inception 3D was used for the RGB and optical flow streams, and LSTM-RNN was used for the pose streams. Experiments were conducted by merging the data streams in different combinations, and a sign language recognition system that merges the most suitable streams through a multistream late fusion mechanism was then proposed. Taken individually, the RGB streams reached accuracies between 28% and 79%, the pose streams between 9% and 50%, and the optical flow stream 78.5%. When the data streams were used in combination, recognition performance was higher than that of any single stream. The proposed system achieves an accuracy of 89.3% on the BosphorusSign General dataset. Multistream data fusion mechanisms have great potential for improving sign language recognition results.
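
The late fusion idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the code below assumes a weighted average of per-stream class probabilities, which is one common form of late fusion and not necessarily the authors' method. The function names, stream names, and scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def late_fusion(
    stream_probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-stream class probabilities by a weighted average.

    Assumption: each stream's model has already produced a probability
    vector over the same set of sign classes.
    """
    if not stream_probs:
        raise ValueError("at least one stream is required")
    if weights is None:
        # Default to equal weights for every stream.
        weights = {name: 1.0 for name in stream_probs}
    num_classes = len(next(iter(stream_probs.values())))
    total_weight = sum(weights[name] for name in stream_probs)
    fused = [0.0] * num_classes
    for name, probs in stream_probs.items():
        if len(probs) != num_classes:
            raise ValueError(f"stream {name!r} has a different number of classes")
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total_weight
    return fused


def predict(fused: List[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three sign classes from three streams.
scores = {
    "rgb_full_frame": [0.50, 0.30, 0.20],
    "pose_body": [0.20, 0.45, 0.35],
    "optical_flow": [0.25, 0.60, 0.15],
}
fused = late_fusion(scores)
label = predict(fused)
```

In this made-up example the RGB stream alone would choose class 0, while the fused scores favor class 1, which shows how combining streams can change the decision of any single stream. Passing a `weights` dictionary would let more reliable streams count for more.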

Document Type: Article Article Type: Research Article Access Type: Open Access

APA	GÜNDÜZ C, Polat H (2021). Turkish sign language recognition based on multistream data fusion. Turkish Journal of Electrical Engineering and Computer Sciences, 29(2), 1171-1186. 10.3906/elk-2005-156
Chicago	GÜNDÜZ Cemil, Polat Hüseyin. Turkish sign language recognition based on multistream data fusion. Turkish Journal of Electrical Engineering and Computer Sciences 29, no. 2 (2021): 1171-1186. 10.3906/elk-2005-156
MLA	GÜNDÜZ Cemil, Polat Hüseyin. Turkish sign language recognition based on multistream data fusion. Turkish Journal of Electrical Engineering and Computer Sciences, vol. 29, no. 2, 2021, pp. 1171-1186. 10.3906/elk-2005-156
AMA	GÜNDÜZ C, Polat H. Turkish sign language recognition based on multistream data fusion. Turkish Journal of Electrical Engineering and Computer Sciences. 2021; 29(2): 1171-1186. 10.3906/elk-2005-156
Vancouver	GÜNDÜZ C, Polat H. Turkish sign language recognition based on multistream data fusion. Turkish Journal of Electrical Engineering and Computer Sciences. 2021; 29(2): 1171-1186. 10.3906/elk-2005-156
IEEE	GÜNDÜZ C, Polat H, "Turkish sign language recognition based on multistream data fusion," Turkish Journal of Electrical Engineering and Computer Sciences, vol. 29, no. 2, pp. 1171-1186, 2021. 10.3906/elk-2005-156
ISNAD	GÜNDÜZ, Cemil - Polat, Hüseyin. "Turkish sign language recognition based on multistream data fusion". Turkish Journal of Electrical Engineering and Computer Sciences 29/2 (2021), 1171-1186. https://doi.org/10.3906/elk-2005-156