A solution to the shortest path problem in communication networks based on Q-learning
Abstract
This paper proposes an application of the Q-learning algorithm to solve the shortest path problem in communication networks, which arises in both wireline and wireless communications. The objective is to minimize the cost incurred in a transmission process, where the cost may refer to transmit power or time delay. To benchmark the proposed method, the existing Dijkstra's, Bellman-Ford's, and Floyd-Warshall's algorithms are included in the comparison. The research method begins with a study of Dijkstra's, Bellman-Ford's, and Floyd-Warshall's algorithms to gain insight into their advantages and disadvantages. To apply Q-learning, the shortest path problem is decomposed into the components of Q-learning, namely the agent, actions, states, environment, and reward, which are then used in the training process. Experimental results show that all methods provide the same solution. The Q-learning method requires only a small number of training episodes, e.g., 35 episodes for the example considered. Furthermore, the Q-learning algorithm yields the paths from all source nodes to the destination node, which is comparable with Floyd-Warshall's algorithm; however, the Q-learning method has lower computational complexity. It can also be concluded from the study results that the Q-learning method can be applied to multiple source-destination pairs.
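As a concrete illustration of the decomposition described above, the following is a minimal Python sketch of Q-learning for the shortest path problem, in which states are nodes, actions are outgoing links, and the reward is the negative link cost. The example graph, link costs, and hyperparameters here are assumptions chosen for illustration only; they are not the network or settings studied in the paper.

```python
import random

# Illustrative weighted directed graph (assumed, not the paper's network):
# node -> {neighbor: link cost}
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'C': 1, 'D': 4},
    'C': {'D': 1},
    'D': {},  # destination node (terminal state)
}
DEST = 'D'
# Assumed hyperparameters: learning rate, discount, exploration rate, episodes
ALPHA, GAMMA, EPS, EPISODES = 0.5, 1.0, 0.2, 500

# Q[s][a]: estimated negative cost-to-go of taking link s -> a
Q = {s: {a: 0.0 for a in nbrs} for s, nbrs in graph.items()}

for _ in range(EPISODES):
    s = random.choice([n for n in graph if n != DEST])  # random source
    while s != DEST:
        # Epsilon-greedy choice among outgoing links of the current node
        if random.random() < EPS:
            a = random.choice(list(graph[s]))
        else:
            a = max(Q[s], key=Q[s].get)
        r = -graph[s][a]          # reward = negative link cost
        nxt = a                   # taking link s -> a moves the agent to node a
        future = max(Q[nxt].values()) if Q[nxt] else 0.0  # 0 at the terminal
        # Standard Q-learning update
        Q[s][a] += ALPHA * (r + GAMMA * future - Q[s][a])
        s = nxt

def greedy_path(src):
    """Trace the greedy policy from src to the destination."""
    path = [src]
    while path[-1] != DEST:
        path.append(max(Q[path[-1]], key=Q[path[-1]].get))
    return path

print(greedy_path('A'))  # e.g. ['A', 'B', 'C', 'D'], total cost 4
```

With the discount factor set to 1 and rewards equal to negative link costs, the greedy policy over the learned Q-table traces the minimum-cost path from any source node to the destination, consistent with the abstract's observation that Q-learning yields the paths from all sources to the destination node.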
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Published manuscripts are the property of their original owners and RMUTSB Academic Journal. The manuscript content reflects the authors' own ideas; it is not the opinion of the journal's committee and is not the responsibility of Rajamangala University of Technology Suvarnabhumi.