Articles
| Open Access | A Learning-Driven Queuing Framework for Dynamic Workload Management in Cloud Computing
Abstract
Cloud computing has evolved into the dominant paradigm for delivering computational resources, application services, and digital infrastructures across virtually every sector of the global economy. Despite this dominance, the fundamental challenge of how to dynamically allocate tasks and computational workloads across heterogeneous, multi-tenant cloud environments remains largely unresolved at both theoretical and practical levels. Contemporary cloud infrastructures operate under extreme uncertainty arising from fluctuating demand, unpredictable workloads, hardware failures, network variability, and complex virtualized resource sharing. Traditional deterministic or static scheduling mechanisms, even when grounded in rigorous queuing theory, struggle to adapt to such volatility, leading to inefficiencies in service response time, resource utilization, energy consumption, and service-level agreement compliance. Recent advances in deep reinforcement learning have opened new theoretical and operational avenues for addressing these challenges by enabling systems to learn optimal policies from interaction with dynamic environments rather than relying solely on pre-defined rules.
This study develops a comprehensive theoretical and analytical framework that integrates deep Q-learning with classical and modern queuing theory to model and optimize task scheduling in cloud computing centers. Building on foundational work in cloud service performance, queuing networks, and dynamic resource allocation, the article positions learning-based control as a natural evolution of cloud scheduling theory, extending beyond the limitations of static or heuristic-based approaches. A central reference point is the deep Q-learning-driven optimal task scheduling model proposed by Kanikanti et al. (2025), which demonstrated that reinforcement learning guided by queuing feedback can significantly improve task throughput and response time in cloud environments characterized by stochastic arrivals and finite server capacities. Rather than replicating or summarizing this prior work, the present article situates it within a much broader intellectual lineage that spans classical queuing networks, performance modeling, reliability theory, and modern cloud resource management research.
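The queuing feedback referred to above is built from standard analytical quantities. As a concrete illustration (not taken from the cited works; function names and parameters are ours), the Erlang-C formula for an M/M/c queue yields the delay probability and expected queuing time that a scheduler could expose to a learning agent:

```python
from math import factorial

def erlang_c(c, lam, mu):
    """P(an arriving task must wait) in an M/M/c queue (Erlang-C formula).

    c   -- number of identical servers
    lam -- Poisson arrival rate
    mu  -- exponential service rate per server
    """
    a = lam / mu                  # offered load in Erlangs
    rho = a / c                   # per-server utilization
    assert rho < 1, "utilization must be below 1 for a stable queue"
    top = a**c / factorial(c)
    bottom = (1 - rho) * sum(a**k / factorial(k) for k in range(c)) + top
    return top / bottom

def mean_wait(c, lam, mu):
    """Expected time in queue E[Wq] for an M/M/c system."""
    return erlang_c(c, lam, mu) / (c * mu - lam)
```

For a single server with arrival rate 0.5 and service rate 1.0, `erlang_c` reduces to the utilization (0.5) and `mean_wait` gives 1.0 time unit, matching the familiar M/M/1 results. Statistics of this kind are the "queuing feedback" that a reinforcement learning scheduler can condition on.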
The article develops a unified conceptual architecture in which cloud servers, virtual machines, and application tiers are modeled as interconnected queues whose states feed into a reinforcement learning agent responsible for task admission, routing, and scheduling decisions. This approach allows the system to internalize not only instantaneous load conditions but also long-term performance consequences, including congestion propagation, resource contention, and reliability degradation. Extensive theoretical elaboration is provided to explain how deep Q-learning overcomes the curse of dimensionality inherent in multi-server cloud environments, and how queuing-based state representations provide the statistical structure necessary for stable and convergent learning. The results of this conceptual synthesis indicate that learning-enhanced queuing systems can achieve superior stability, lower response time variance, and improved utilization compared to purely analytical or heuristic schedulers, a conclusion that aligns with empirical and analytical trends reported in cloud performance research.
The discussion critically engages with existing performance modeling traditions, highlighting both their enduring relevance and their limitations in the face of modern cloud complexity. It also explores the implications of reinforcement learning-based scheduling for reliability engineering, energy efficiency, and service-level agreement enforcement. By grounding every analytical claim in the established literature while extending it through deep reinforcement learning theory, the article provides a rigorous and forward-looking contribution to the field of cloud computing research.
Keywords
Cloud computing, deep Q-learning, task scheduling
References
Martinello M, Kaaniche M, Kanoun K. Web service availability: impact of error recovery and traffic model. Reliab Eng Syst Saf. 2005;89(1):6–16.
Nguyen BM, Tran D, Nguyen G. Enhancing service capability with multiple finite capacity server queues in cloud data centers. Clust Comput. 2016;19(4):1747–1767.
Kanikanti VSN, Tiwari SK, Nayan V, Suryawanshi S, Chauhan R. Deep Q-Learning Driven Dynamic Optimal Task Scheduling for Cloud Computing Using Optimal Queuing. In: Proceedings of the 2025 International Conference on Computational Intelligence and Knowledge Economy; 2025; 217–222.
Cho Y, Ko YM. Stabilizing the virtual response time in single-server processor sharing queues with slowly time-varying arrival rates. 2018.
Vaquero LM, Rodero-Merino L, Caceres J, Lindner M. A break in the clouds: towards a cloud definition. ACM SIGCOMM Comput Commun Rev. 2008;39:50–55.
Khazaei H, Misic J, Misic VB. A fine-grained performance model of cloud computing centers. IEEE Trans Parallel Distrib Syst. 2013;24(11):2138–2147.
Beloglazov A, Buyya R. Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in Cloud data centers. Concurr Comput Pract Exp. 2012;24(13):1397–1420.
Xiong K, Perros H. Service performance and analysis in cloud computing. IEEE World Conf Serv. 2009;693–700.
Keller M, Karl H. Response time-optimized distributed cloud resource allocation. In: Proceedings of the ACM SIGCOMM Workshop on Distributed Cloud Computing; 2014.
Khazaei H, Misic J, Misic VB. Performance analysis of cloud computing centers using M/G/m/m+r queuing systems. IEEE Trans Parallel Distrib Syst. 2012;23(5):936–943.
Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G, Patterson D, Rabkin A, Stoica I, Zaharia M. A view of cloud computing. Commun ACM. 2010;53(4):50–58.
Liu X, Tong W, Zhi X, Fu Z, WenZhao L. Performance analysis of cloud computing services considering resources sharing among virtual machines. J Supercomput. 2014;69(1):357–374.
RahimiZadeh K, AnaLoui M, Kabiri P, Javadi B. Performance modeling and analysis of virtualized multi-tier applications under dynamic workloads. J Netw Comput Appl. 2015;56:166–187.
Sun P, Wu D, Qiu X, Luo L, Li H. Performance analysis of cloud service considering reliability. IEEE Int Conf Softw Qual Reliab Secur Companion. 2016.
Jackson JR. Networks of waiting lines. Oper Res. 1957;5:518–521.
El Kafhali S, Salah K. Modeling and analysis of performance and energy consumption in cloud data centers. Arabian J Sci Eng. 2018;43:7789–7802.
Ma N, Mark J. Approximation of the mean queue length of an M/G/c queueing system. Oper Res. 1995;43(1):158–165.
Iosup A, Yigitbasi N, Epema D. On the performance variability of production cloud services. IEEE ACM Int Symp Cluster Cloud Grid Comput. 2011;104–113.
Vishwanath KV, Nagappan N. Characterizing cloud computing hardware reliability. Proc ACM Symp Cloud Comput. 2010;193–204.
Vilaplana J, Solsona F, Teixido I. A performance model for scalable cloud computing. Proc Australasian Symp Parallel Distrib Comput. 2015.
Vakilinia S, Ali MM, Qiu D. Modeling of the resource allocation in cloud computing centers. Comput Netw. 2015;91:453–470.
Copyright License
Copyright (c) 2026 Dr. Lorenzo Bianchi

This work is licensed under a Creative Commons Attribution 4.0 International License.