Deep Reinforcement Learning Driven Optimal Queuing and Task Scheduling Architectures for Sustainable Cloud and Flexible Manufacturing Systems (Open Access)
Abstract
The accelerating convergence of cloud computing infrastructures with advanced manufacturing and service-oriented digital ecosystems has produced an unprecedented demand for intelligent task scheduling, energy-aware resource allocation, and adaptive queuing mechanisms. Traditional deterministic and heuristic scheduling paradigms, originally designed for relatively stable computational or industrial environments, increasingly struggle to cope with the stochastic, heterogeneous, and high-dimensional nature of modern cloud and cyber-physical production systems. Within this context, deep reinforcement learning has emerged as a transformative paradigm that enables autonomous agents to learn optimal scheduling, routing, and resource management strategies through continuous interaction with complex environments. This research article develops an integrated theoretical and methodological framework that unifies deep Q-learning based task scheduling with optimal queuing principles, focusing on sustainability, efficiency, and robustness across cloud computing and flexible manufacturing systems.
Grounded in the deep Q-learning driven optimal task scheduling paradigm articulated by Kanikanti, Tiwari, Nayan, Suryawanshi, and Chauhan, this study extends the conceptual scope of learning-based scheduling by embedding queuing theory into the reinforcement learning decision loop, thereby enabling the agent to internalize congestion, waiting time, and service discipline dynamics as intrinsic components of its reward structure (Kanikanti et al., 2025). Unlike conventional job-shop or cloud schedulers that treat queues as exogenous constraints, the present framework treats them as endogenous and learnable system properties, allowing the scheduling agent to adapt to workload fluctuations, energy constraints, and performance trade-offs in a theoretically principled manner.
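The idea of treating queue dynamics as endogenous to the agent can be made concrete with a minimal sketch: a reward function in which waiting time and energy consumption enter the learning objective directly, coupled with a tabular Q-learning update over queue-length states. The function names, weights, and two-action setup below are illustrative assumptions for exposition, not the configuration used by Kanikanti et al. (2025).

```python
def queue_aware_reward(wait_time, energy_used, completed,
                       w_wait=1.0, w_energy=0.5, w_done=2.0):
    """Illustrative reward: congestion (waiting time) and energy use are
    penalized inside the agent's objective rather than imposed as
    external constraints. All weights are hypothetical."""
    return w_done * completed - w_wait * wait_time - w_energy * energy_used


def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step over (queue_length, action) pairs,
    with a toy action set {0, 1} (e.g., dispatch now vs. defer)."""
    best_next = max(Q.get((next_state, a), 0.0) for a in (0, 1))
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q[(state, action)]
```

Because the reward is computed from queue observables, repeated updates of this form let the agent internalize service-discipline dynamics rather than merely reacting to them.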
The article situates this approach within a broad scholarly landscape that includes evolutionary and swarm-based scheduling, green manufacturing optimization, fog and edge computing task allocation, and deep reinforcement learning for resource management. Prior research has demonstrated the effectiveness of metaheuristics such as genetic algorithms, tabu search, and memetic algorithms for flexible job-shop scheduling, as well as the promise of reinforcement learning for cloud and fog-based task scheduling, but these two streams of research have often evolved in parallel rather than in integration (Pezzella et al., 2008; Yuan and Xu, 2015; Gazori et al., 2020). By synthesizing these traditions through a queuing-aware deep Q-learning framework, this study advances a unified model capable of addressing not only throughput and latency but also energy efficiency, sustainability, and system resilience.
Methodologically, the article develops a detailed simulation-based research design grounded in CloudSim Plus and related cloud modeling toolkits, while drawing conceptual parallels to flexible manufacturing systems characterized by multipurpose machines and transportation constraints (Calheiros et al., 2011; Filho et al., 2017; Brucker and Schlie, 1990). Rather than presenting numerical results in tabular form, the study articulates its findings through theoretically grounded, literature-validated interpretive analysis, demonstrating how learning-driven schedulers internalize queue dynamics, reduce energy waste, and achieve superior long-term performance stability compared with static or rule-based approaches.
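The kind of queuing behavior such a simulation study must capture can be illustrated with a toy discrete-event model. The sketch below simulates a single-server M/M/1 queue and reports the mean waiting time; it is a stand-in for the kind of congestion dynamics a CloudSim Plus experiment would model, and the rates and job counts are illustrative assumptions, not the paper's actual experimental configuration.

```python
import random

def simulate_mm1(arrival_rate, service_rate, n_jobs=10000, seed=1):
    """Toy M/M/1 queue: exponential interarrival and service times,
    one server, FIFO discipline. Returns the mean time jobs spend
    waiting before service begins."""
    rng = random.Random(seed)
    t_arrive = 0.0       # arrival time of the current job
    server_free_at = 0.0 # time at which the server next becomes idle
    total_wait = 0.0
    for _ in range(n_jobs):
        t_arrive += rng.expovariate(arrival_rate)
        start = max(t_arrive, server_free_at)   # wait if server is busy
        total_wait += start - t_arrive
        server_free_at = start + rng.expovariate(service_rate)
    return total_wait / n_jobs
```

As utilization rises toward one, mean waiting time grows sharply, which is precisely the nonlinearity that makes queue-aware scheduling policies valuable.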
The discussion section engages deeply with theoretical debates surrounding function approximation, stability, and exploration-exploitation trade-offs in deep reinforcement learning, incorporating insights from foundational work on deep Q-networks and actor-critic architectures (Mnih et al., 2015; Fujimoto et al., 2018). It further explores the implications of these learning-based schedulers for sustainable manufacturing, green cloud computing, and the future of autonomous digital infrastructures, critically examining both their transformative potential and their practical limitations. By integrating optimal queuing, deep reinforcement learning, and sustainability-oriented scheduling, this article contributes a comprehensive, theoretically rich, and forward-looking framework for the next generation of intelligent cloud and manufacturing systems.
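The exploration-exploitation trade-off discussed above is commonly handled with an annealed epsilon-greedy policy, in which the agent explores randomly with probability epsilon and otherwise exploits its current value estimates. The following is a minimal sketch of that standard mechanism; the decay rate and floor are conventional illustrative values, not figures taken from this article.

```python
import random

def epsilon_greedy(Q, state, actions, epsilon, rng=random):
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest estimated value
    (exploit), ties broken in favor of the first action listed."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def decay_epsilon(epsilon, rate=0.995, floor=0.05):
    """Anneal exploration toward a small floor over training, so the
    agent explores broadly early and exploits learned values later."""
    return max(floor, epsilon * rate)
```

Keeping a nonzero exploration floor is one pragmatic answer to the stability concerns raised for function-approximation settings: it prevents the policy from prematurely collapsing onto value estimates that may still be biased.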
Keywords
Deep reinforcement learning, cloud task scheduling, optimal queuing
References
Filho, M. C. S., Oliveira, R. L., Monteiro, C. C., Inacio, P. R. M., and Freire, M. M. (2017). CloudSim Plus: A cloud computing simulation framework pursuing software engineering principles for improved modularity, extensibility and correctness. IFIP/IEEE Symposium on Integrated Network and Service Management (IM).
Pezzella, F., Morganti, G., and Ciaschetti, G. (2008). A genetic algorithm for the flexible job shop scheduling problem. Computers and Operations Research, 35, 3202–3212.
Gao, K., Cao, Z., Zhang, L., Chen, Z., Han, Y., and Pan, Q. (2019). A review on swarm intelligence and evolutionary algorithms for solving flexible job shop scheduling problems. IEEE/CAA Journal of Automatica Sinica, 6, 904–916.
Kanikanti, V. S. N., Tiwari, S. K., Nayan, V., Suryawanshi, S., and Chauhan, R. (2025). Deep Q Learning Driven Dynamic Optimal Task Scheduling for Cloud Computing Using Optimal Queuing. International Conference on Computational Intelligence and Knowledge Economy.
Gazori, P., Rahbari, D., and Nickray, M. (2020). Saving time and cost on the scheduling of fog-based IoT applications using deep reinforcement learning approach. Future Generation Computer Systems, 110, 1098–1115.
Brucker, P., and Schlie, R. (1990). Job shop scheduling with multi-purpose machines. Computing, 45, 369–375.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., and Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518, 529–533.
Malek, J., and Desai, T. N. (2020). A systematic literature review to map literature focus of sustainable manufacturing. Journal of Cleaner Production, 256, 120345.
Peng, Z., Cui, D., Zuo, J., Li, Q., Xu, B., and Lin, W. (2015). Random task scheduling scheme based on reinforcement learning in cloud computing. Cluster Computing, 18, 1595–1607.
Calheiros, R. N., Ranjan, R., Beloglazov, A., De Rose, C. A. F., and Buyya, R. (2011). CloudSim: A toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software Practice and Experience, 41, 23–50.
Fujimoto, S., van Hoof, H., and Meger, D. (2018). Addressing function approximation error in actor-critic methods. International Conference on Machine Learning.
Cui, D., Peng, Z., Xiong, J., Xu, B., and Lin, W. (2017). A reinforcement learning based mixed job scheduler scheme for grid or IaaS cloud. IEEE Transactions on Cloud Computing.
Park, I. B., Huh, J., Kim, J., and Park, J. (2020). A reinforcement learning approach to robust scheduling of semiconductor manufacturing facilities. IEEE Transactions on Automation Science and Engineering, 17, 1420–1431.
Liu, Z., Guo, S., and Wang, L. (2019). Integrated green scheduling optimization of flexible job shop and crane transportation considering comprehensive energy consumption. Journal of Cleaner Production, 211, 765–786.
Momenikorbekandi, A., and Abbod, M. (2023). A novel metaheuristic hybrid parthenogenetic algorithm for job shop scheduling problems applying optimisation model. IEEE Access, 11, 56027–56045.
Yuan, Y., and Xu, H. (2015). Multiobjective flexible job shop scheduling using memetic algorithms. IEEE Transactions on Automation Science and Engineering, 12, 336–353.
Mao, H., Alizadeh, M., Menache, I., and Kandula, S. (2016). Resource management with deep reinforcement learning. ACM Workshop on Hot Topics in Networks.
Karthiban, K., and Raj, J. S. (2020). An efficient green computing fair resource allocation in cloud computing using modified deep reinforcement learning algorithm. Soft Computing, 24, 14933–14942.
Chen, X., Zhang, H., Wu, C., Mao, S., Ji, Y., and Bennis, M. (2019). Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning. IEEE Internet of Things Journal, 6, 1888–1899.
Brandimarte, P. (1993). Routing and scheduling in a flexible job shop by tabu search. Annals of Operations Research, 41, 157–183.
Wang, M., Cui, Y., Wang, X., Xiao, S., and Jiang, J. (2018). Machine learning for networking: Workflow, advances and opportunities. IEEE Network, 32, 92–99.
Gao, J., and Evans, R. (2016). DeepMind AI reduces Google data centre cooling bill by 40 percent. DeepMind Blog.
Copyright License
Copyright (c) 2026 Adrian Kovalenko

This work is licensed under a Creative Commons Attribution 4.0 International License.