Deep Reinforcement Learning for Robust USVs Navigation in Diverse Environmental Scenarios
DOI: https://doi.org/10.59461/ijdiic.v4i4.219

Keywords: Generative Adversarial Imitation Learning, Proximal Policy Optimization, Reinforcement Learning, Unmanned Surface Vehicles, Environmental Factors

Abstract
Collision avoidance is essential for the safe operation of unmanned surface vehicles (USVs) in marine environments. While existing studies have addressed USV collaboration and navigation, they often overlook environmental challenges. In this research, we develop a novel deep reinforcement learning model and train USVs to navigate safely under diverse conditions, such as rain, wind, and their combination, which can disrupt control and increase collision risk. We apply Generative Adversarial Imitation Learning (GAIL) and Proximal Policy Optimization (PPO) to improve the agent's performance using expert demonstrations. The experimental results demonstrate that the proposed framework significantly outperforms the baseline across multiple metrics: the maximum episode length increased from 15 to 350 steps, the cumulative reward improved by 158%, the extrinsic reward increased by 175%, and the number of collisions decreased by 75-78% across all environmental conditions. Moreover, the policy loss stabilized after 1.1 million training steps, confirming efficient convergence. These results show that our model outperforms the baseline in mean reward, episode length, value estimates, and collision reduction. The paper closes with concluding remarks and directions for future work.
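As a rough illustration of the reward shaping a GAIL-plus-PPO framework like this typically relies on (a minimal sketch under standard GAIL assumptions, not the paper's exact implementation; the function names and blending weight are hypothetical): a discriminator D(s, a) in (0, 1) scores how "expert-like" a state-action pair is, and the surrogate imitation reward -log(1 - D(s, a)) is blended with the environment's extrinsic reward (e.g. progress toward goal, collision penalties) before being fed to the PPO update.

```python
import math

def gail_reward(discriminator_score: float) -> float:
    """Imitation reward from a discriminator output in (0, 1).

    Higher discriminator scores (more expert-like behaviour) yield
    higher rewards; eps guards against log(0).
    """
    eps = 1e-8
    return -math.log(max(1.0 - discriminator_score, eps))

def blended_reward(discriminator_score: float,
                   extrinsic_reward: float,
                   imitation_weight: float = 0.5) -> float:
    """Blend imitation and extrinsic rewards (the weight is a tunable knob)."""
    return (imitation_weight * gail_reward(discriminator_score)
            + (1.0 - imitation_weight) * extrinsic_reward)

# Behaviour the discriminator judges expert-like (D near 1) earns more
# imitation reward than behaviour it rejects (D near 0).
assert gail_reward(0.9) > gail_reward(0.1)
```

Blending the two signals lets the agent bootstrap from expert demonstrations early in training while the extrinsic term (e.g. collision penalties under rain and wind disturbances) keeps the policy anchored to the actual task objective.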
License
Copyright (c) 2025 Rahim Ullah, Wajid Ali, Usman Ghanni

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.