|Authors||F. O. Catak, K. Sahinbas and V. Dörtkardeş|
|Editors||V. Sugumaran, A. K. Luhach and A. Elçi|
|Title||Artificial Intelligence Paradigms for Smart Cyber-Physical Systems - Malicious URL Detection Using Machine Learning|
|Project(s)||Department of Engineering Complex Software Systems|
|Publication Type||Book Chapter|
|Year of Publication||2021|
|Pagination||160 - 180|
With the growth in Internet usage, cybersecurity has become a significant challenge for computer systems. Malicious URLs distribute various kinds of malicious software and attempt to capture user information. Signature-based approaches have often been used to detect such websites, and various security components then attempt to restrict access to the detected malicious URLs. This chapter proposes using host-based and lexical features of URLs to improve the performance of classifiers that detect malicious websites. Random forest and gradient boosting classifiers are applied to build a URL classifier that uses URL string attributes as features. Random forest achieved the highest accuracy, 98.6%. The results show that identifying malicious websites from the URL alone, and classifying them as spam URLs without relying on page content, yields significant resource savings as well as a safer browsing experience for the user.
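To illustrate what "lexical features" of a URL string can look like, the following is a minimal Python sketch that uses only the standard library. The function name `lexical_features` and the particular features chosen (lengths, character counts, an IPv4-host flag) are assumptions made for illustration; the abstract does not list the chapter's actual feature set, and host-based features are not covered here.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad host such as "192.168.0.1" (format check only).
IPV4_PATTERN = re.compile(r"(\d{1,3}\.){3}\d{1,3}")


def lexical_features(url: str) -> dict:
    """Derive simple numeric features from the URL string alone.

    Hypothetical feature set for illustration; no page content is fetched.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "host_is_ipv4": int(IPV4_PATTERN.fullmatch(host) is not None),
        "uses_https": int(parsed.scheme == "https"),
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login.php?id=1", "https://example.com/"):
        print(example, lexical_features(example))
```

Feature dictionaries of this kind could be turned into numeric vectors and passed to a tree-based classifier such as a random forest or gradient boosting model, which is the general setup the abstract describes.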