Articles | Open Access | https://doi.org/10.37547/ajast/Volume05Issue07-11

Modern Sign Language Recognition Systems

Kayumov Oybek Achilovich, Jizzakh Branch of the National University of Uzbekistan named after Mirzo Ulugbek, Uzbekistan

Abstract

This article is dedicated to the development, technological foundations, and practical applications of modern Sign Language Recognition (SLR) systems. Advanced vision-based systems and resources, in particular MediaPipe Holistic, OpenPose, SignAll, the Sign Language Transformer, and the RWTH-PHOENIX corpus, are analyzed in terms of their algorithmic principles, advantages, and limitations. These systems, built on artificial intelligence and deep learning architectures, enable the spatio-temporal, multimodal, and contextual recognition of sign language glosses.

The MediaPipe system provides real-time detection of facial, body, and hand landmarks, while OpenPose excels at multi-person 2D body pose estimation. The SignAll system integrates NLP components for translating sign language glosses. SLR systems based on the PHOENIX14T corpus, developed at RWTH Aachen University, are considered a benchmark for continuous sign language recognition and translation. In particular, the Transformer-based Sign Language Transformer model enables end-to-end translation of sign language glosses into spoken-language text.
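Before the landmark streams produced by trackers such as MediaPipe Holistic or OpenPose are fed to a recognition model, the keypoints are typically normalized so that the representation does not depend on where the signer stands or how large the hand appears in the frame. The following sketch illustrates one common scheme; the keypoint layout (wrist at index 0) and the normalization choice are illustrative assumptions, not details taken from the article:

```python
# Sketch: make 2D hand landmarks translation- and scale-invariant
# by centering them on the wrist and dividing by the hand span.
# Assumed layout: landmarks[0] is the wrist keypoint.

def normalize_hand(landmarks):
    """landmarks: list of (x, y) pixel coordinates, wrist first.
    Returns coordinates centered on the wrist, scaled to unit span."""
    wx, wy = landmarks[0]
    # Center every keypoint on the wrist.
    centered = [(x - wx, y - wy) for x, y in landmarks]
    # Scale by the largest wrist-to-keypoint distance (the hand span).
    span = max((x * x + y * y) ** 0.5 for x, y in centered[1:]) or 1.0
    return [(x / span, y / span) for x, y in centered]
```

With this preprocessing, the same sign performed closer to or farther from the camera yields near-identical feature vectors, which simplifies the downstream classifier's task.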

The article thoroughly addresses issues such as multimodal signal analysis (gesture, pose, facial expression) for more accurate interpretation of sign movements, the creation of a contextual semantic representation model, real-time processing, and platform integration. Additionally, the practical significance of modern SLR systems in education, communication, and human-computer interaction (HCI) is analyzed.
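One simple way to realize the multimodal analysis described above is early fusion: per-frame hand, pose, and facial feature vectors are concatenated into a single sequence that a temporal model (e.g. a Transformer encoder) can consume. The sketch below illustrates this idea only; the stream names and dimensions are assumptions for illustration, not the article's method:

```python
# Sketch of early multimodal fusion for SLR: frame-aligned hand,
# pose, and face feature streams are concatenated frame by frame.
# Stream contents and dimensionality are hypothetical.

def fuse_streams(hand_seq, pose_seq, face_seq):
    """Each argument: list of per-frame feature vectors (lists of floats),
    all of the same length in frames. Returns one fused vector per frame."""
    if not (len(hand_seq) == len(pose_seq) == len(face_seq)):
        raise ValueError("streams must be frame-aligned")
    return [h + p + f for h, p, f in zip(hand_seq, pose_seq, face_seq)]
```

Late fusion (combining per-stream model outputs instead of raw features) is an equally common alternative; the survey literature cited below discusses the trade-offs between the two.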

Keywords

Sign Language Recognition (SLR), deep learning, vision-based technologies

References

Hu, H., Zhou, W., Li, H., & Li, W. (2023). SignBERT+: Hand-model-aware self-supervised pretraining for sign language understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(5), 5678–5692.

Zuo, Z., Fang, Y., & Wang, S. (2023). MS2SL: Multisource-to-Sign-Language model for synchronized multimodal sign recognition. Computer Vision and Image Understanding, 228, 103610.

Google Research. (2021). MediaPipe Holistic: Simultaneous face, hand, and body pose detection. Retrieved from https://google.github.io/mediapipe

Cao, Z., Hidalgo, G., Simon, T., Wei, S. E., & Sheikh, Y. (2021). OpenPose: Realtime multi-person 2D pose estimation using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1), 172–186.

SignAll Technologies. (2022). SignAll real-time sign language translation system. Retrieved from https://www.signall.us

Koller, O., Zargaran, S., Ney, H., & Bowden, R. (2020). Quantifying translation quality of sign language recognition systems on PHOENIX14T. Proceedings of the European Conference on Computer Vision (ECCV), 477–494.

Saunders, B., Camgoz, N. C., & Bowden, R. (2020). Progressive Transformers for end-to-end sign language production. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12324–12333.

Baltrušaitis, T., Ahuja, C., & Morency, L. P. (2019). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423–443.


How to Cite

Kayumov Oybek Achilovich. (2025). Modern Sign Language Recognition Systems. American Journal of Applied Science and Technology, 5(07), 67–72. https://doi.org/10.37547/ajast/Volume05Issue07-11