Malicious URL detection using machine learning techniques

Authors

  • Mohamed Cherradi, Abdelmalek Essaâdi University (UAE), ENSAH, Tetouan, Morocco
  • Hajar El Mahajer, Abdelmalek Essaâdi University (UAE), FSTT, Tetouan, Morocco

DOI:

https://doi.org/10.59461/ijdiic.v4i2.187

Keywords:

Cybersecurity, Malicious URLs, Machine Learning, Classification

Abstract

With numerous new websites created every day, it is increasingly difficult to tell which are safe and which are dangerous. These websites frequently collect sensitive user data that can be compromised when proper cybersecurity safeguards, such as the effective identification and categorization of dangerous URLs, are absent. To improve cybersecurity, this study builds machine learning models for the detection and categorization of malicious URLs. Our proposal uses decision trees, logistic regression, support vector machines, and Naive Bayes to classify URLs. To improve classification performance, we tune the models' hyperparameters with the Grid Search technique. The results show that Naive Bayes achieves high accuracy (91.9%) and reliable performance in detecting malicious URLs. Implementing the approach as a web service demonstrates its practicality and its natural fit within broader security frameworks. Overall, our approach significantly enhances the detection of unsafe URLs and offers a robust solution to the growing challenges in cybersecurity.
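
The combination of Naive Bayes and Grid Search named in the abstract can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy URLs and labels, the token-based lexical features, the smoothing parameter `alpha` as the tuned hyperparameter, the candidate grid, and the helper names `tokenize`, `NaiveBayes`, `cross_val_accuracy`, and `grid_search`. The study itself may use different features, data, and tooling.

```python
import math
import re
from collections import Counter

# Toy data invented for illustration only; not the paper's dataset.
URLS = [
    "http://example.com/docs/index.html",
    "http://secure-login.example.net/verify/account/update",
    "https://example.org/blog/2024/news",
    "http://192.0.2.10/login/verify/password",
    "https://example.com/products/list",
    "http://free-prize.example.net/claim/verify",
    "https://example.org/about/team",
    "http://account-update.example.net/login/confirm",
]
LABELS = ["benign", "malicious"] * 4


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed lexical features)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing `alpha` (must be > 0)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, urls, labels):
        self.classes = sorted(set(labels))
        self.priors = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for url, label in zip(urls, labels):
            self.counts[label].update(tokenize(url))
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, url):
        def score(c):
            # Log prior plus smoothed log likelihood of each known token.
            denom = self.totals[c] + self.alpha * len(self.vocab)
            return self.priors[c] + sum(
                math.log((self.counts[c][t] + self.alpha) / denom)
                for t in tokenize(url)
                if t in self.vocab  # tokens unseen in training are skipped
            )
        return max(self.classes, key=score)


def cross_val_accuracy(alpha, urls, labels, k=4):
    """Mean accuracy over k folds; sample i is held out in fold i % k."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(urls)) if i % k != fold]
        test = [i for i in range(len(urls)) if i % k == fold]
        model = NaiveBayes(alpha).fit([urls[i] for i in train],
                                      [labels[i] for i in train])
        correct += sum(model.predict(urls[i]) == labels[i] for i in test)
    return correct / len(urls)


def grid_search(alphas, urls, labels, k=4):
    """Score every candidate value and return the best one with all scores."""
    scores = {a: cross_val_accuracy(a, urls, labels, k) for a in alphas}
    return max(scores, key=scores.get), scores


best_alpha, scores = grid_search([0.01, 0.1, 1.0, 10.0], URLS, LABELS)
print("cross-validated accuracy per alpha:", scores)
print("selected alpha:", best_alpha)
```

Grid Search here simply evaluates each candidate value with cross-validation and keeps the highest-scoring one. The same loop would apply to the other classifiers mentioned in the abstract, with their own hyperparameters in place of `alpha`.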

Published

25-05-2025

How to Cite

Mohamed Cherradi, & Hajar El Mahajer. (2025). Malicious URL detection using machine learning techniques. International Journal of Data Informatics and Intelligent Computing, 4(2), 41–52. https://doi.org/10.59461/ijdiic.v4i2.187

Section

Regular Issue