
Developing a Question-Answering System in the Uzbek Language Based on the XLM-RoBERTa Model

Khujayarov I.Sh., Department of Information Technology, Samarkand branch of Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Uzbekistan
Ochilov M.M., Department of Artificial Intelligence, Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Uzbekistan
Kholmatov O.A., Department of Artificial Intelligence, Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Uzbekistan

Abstract

This article addresses the adaptation and testing of the XLM-RoBERTa model for question answering in the Uzbek language. In the study, XLM-RoBERTa was fine-tuned on an Uzbek dataset of context, question, and answer pairs, yielding a model that extracts the answer fragment for a user's question from the given context. The ROUGE, EM (Exact Match), and F1 metrics were used to evaluate the model's performance.
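
As a minimal illustration of the approach described above, the sketch below queries an extractive question-answering model through the Hugging Face transformers pipeline and scores its prediction with the EM and F1 metrics named in the abstract. The checkpoint name xlm-roberta-uzbek-qa is a hypothetical placeholder, not the model released by the authors, and the single Uzbek example is invented for demonstration.

    # Minimal sketch: extractive QA inference with a fine-tuned XLM-RoBERTa checkpoint,
    # scored with EM and F1 as described in the abstract.
    from collections import Counter
    from transformers import pipeline

    # "xlm-roberta-uzbek-qa" is a hypothetical placeholder checkpoint name.
    qa = pipeline("question-answering", model="xlm-roberta-uzbek-qa")

    def exact_match(prediction: str, reference: str) -> float:
        # 1.0 if the predicted span equals the reference span (case-insensitive), else 0.0.
        return float(prediction.strip().lower() == reference.strip().lower())

    def f1_score(prediction: str, reference: str) -> float:
        # Token-level F1 between the predicted and reference answer spans.
        pred_tokens = prediction.lower().split()
        ref_tokens = reference.lower().split()
        common = Counter(pred_tokens) & Counter(ref_tokens)
        num_same = sum(common.values())
        if num_same == 0:
            return 0.0
        precision = num_same / len(pred_tokens)
        recall = num_same / len(ref_tokens)
        return 2 * precision * recall / (precision + recall)

    context = "Samarqand O‘zbekistonning eng qadimiy shaharlaridan biri hisoblanadi."
    question = "Samarqand qaysi davlatda joylashgan?"
    pred = qa(question=question, context=context)  # returns {"answer", "score", "start", "end"}
    print(pred["answer"], exact_match(pred["answer"], "O‘zbekistonning"), f1_score(pred["answer"], "O‘zbekistonning"))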

Keywords

XLM-RoBERTa, question-answering system, natural language processing

How to Cite

Khujayarov I.Sh., Ochilov M.M., & Kholmatov O.A. (2025). Developing a Question-Answering System in the Uzbek Language Based on the XLM-RoBERTa Model. American Journal of Applied Science and Technology, 5(12), 89–94. https://doi.org/10.37547/ajast/Volume05Issue12-14