Project

Detecting malicious shortened URLs using machine learning

As technology advances, people constantly share resources through social media, blogs, and links to websites. URL shortening services such as bit.ly, goo.gl, tinyurl.com, and ow.ly make sharing easier, but they can also introduce unforeseen risks: a seemingly benign short URL can conceal malicious content. For example, a user who visits a malicious website can become a victim of phishing, spamming, social engineering, or drive-by downloads.
This project aims to detect malicious shortened URLs using machine learning techniques. Random Forest is a widely used classification algorithm that often achieves high accuracy because it combines many decision trees, each built with bagging (bootstrap sampling) and randomized split points. The model is trained on a dataset of shortened URLs and their features, and it achieves an accuracy of 96.29% in this project.
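As a rough illustration of the bagging idea behind Random Forest, the sketch below trains many one-split trees (decision stumps) on bootstrap samples and lets them vote. It is a simplified stand-in, not the project's model: a real Random Forest grows full decision trees, and the three features used here (expanded URL length, redirect hops, digit count) and all data values are hypothetical, since the abstract does not list the actual features.

```python
import random

def train_stump(rows, labels, feature_ids):
    """Return the (feature, threshold, polarity) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            for polarity in (1, 0):  # label predicted when row[f] > threshold
                errors = sum(
                    (polarity if row[f] > threshold else 1 - polarity) != label
                    for row, label in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, polarity)
    return best[1:]

def stump_predict(stump, row):
    f, threshold, polarity = stump
    return polarity if row[f] > threshold else 1 - polarity

def train_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per stump
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) row indices with replacement.
        sample = [rng.randrange(len(rows)) for _ in rows]
        feature_ids = rng.sample(range(n_features), k)
        forest.append(train_stump(
            [rows[i] for i in sample],
            [labels[i] for i in sample],
            feature_ids,
        ))
    return forest

def malicious_vote_share(forest, row):
    """Fraction of stumps that vote 'malicious' (label 1) for this row."""
    return sum(stump_predict(stump, row) for stump in forest) / len(forest)

# Hypothetical toy data: [expanded_url_length, redirect_hops, digit_count].
ROWS = [
    [20, 1, 0], [25, 1, 1], [30, 1, 2], [22, 1, 0], [28, 2, 1],      # benign
    [60, 3, 8], [75, 4, 12], [90, 5, 9], [65, 3, 10], [80, 4, 15],   # malicious
]
LABELS = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

forest = train_forest(ROWS, LABELS)
print(malicious_vote_share(forest, [100, 6, 20]))
```

The vote share is one possible way to turn an ensemble's output into a score between 0 and 1; whether the project derives its reported numbers this way is not stated in the abstract.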
A Chrome extension was developed that uses the trained model to detect malicious shortened URLs. When the extension encounters a malicious shortened URL, it shows the user the expanded (original) form of the URL and a risk percentage (which depends on the model's accuracy), and offers the options ‘still load the webpage’ or ‘go back’. The extension acts as a barrier between users and malicious websites, helping them make educated choices to protect their safety and privacy.
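The decision flow described above can be sketched as follows. This is a minimal illustration in Python rather than the extension's actual JavaScript, and everything named here is an assumption: `check_url`, `Verdict`, the `expand` and `score` callables (stand-ins for redirect resolution and the trained model), and the 0.5 threshold. The abstract does not specify how the risk percentage is computed, so the sketch simply reports whatever `score` returns.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple
from urllib.parse import urlparse

# Shortening services named in the abstract; a real list would be longer.
SHORTENER_HOSTS = {"bit.ly", "goo.gl", "tinyurl.com", "ow.ly"}

@dataclass
class Verdict:
    short_url: str
    long_url: str
    risk_percent: float
    choices: Tuple[str, str] = ("still load the webpage", "go back")

def is_shortened(url: str) -> bool:
    """True if the URL's host is a known shortener (requires a scheme such as https://)."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in SHORTENER_HOSTS

def check_url(
    url: str,
    expand: Callable[[str], str],    # hypothetical: resolves the short URL to its target
    score: Callable[[str], float],   # hypothetical: model score in [0, 1] for the target
    threshold: float = 0.5,          # assumed cut-off, not taken from the project
) -> Optional[Verdict]:
    """Return a warning for a risky shortened URL, or None to let the page load."""
    if not is_shortened(url):
        return None
    long_url = expand(url)
    risk = score(long_url)
    if risk < threshold:
        return None
    return Verdict(short_url=url, long_url=long_url, risk_percent=round(risk * 100, 1))

# Usage with stub callables in place of a network lookup and a trained model.
verdict = check_url(
    "https://bit.ly/abc",
    expand=lambda u: "http://example.test/login",
    score=lambda u: 0.75,
)
print(verdict)
```

Passing `expand` and `score` in as parameters keeps the warning logic testable without network access or a trained model.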

Project (M.S., Computer Science)--California State University, Sacramento, 2018.

