Ransomware Auto-Detection In IoT Devices Using Machine Learning
Abstract
The term Internet of Things (often abbreviated IoT) was coined by industry researchers but has emerged into mainstream public view only more recently. The IoT is a massive group of devices containing sensors or actuators connected over wired or wireless networks. IoT has grown rapidly over the past decade and, during that growth, security has been identified as one of its weakest areas. More than six billion devices are estimated to be connected to the Internet today, and over 25 billion are expected to be connected by 2020. IoT and its applications extend to the majority of life's infrastructure, ranging from health and food production to smart cities and urban management. While the efficiency and prevalence of IoT are increasing, security issues remain a significant concern for industry. Internet-connected devices, including those deployed in an IoT architecture, are increasingly targeted by cybercriminals because of their pervasiveness and because compromised devices can be used to further attack the underlying architecture. In the case of ransomware, for example, devices that can store a reasonable amount of data are likely to be targeted. Thus, ensuring the security of IoT nodes against threats such as malware is a topic of ongoing interest. While malware detection and mitigation research is not new, ransomware detection and mitigation remain challenging. Ransomware is a relatively new malware type that attempts to encrypt a compromised device's data using a strong encryption algorithm. The victim then has to pay a ransom (usually in bitcoin) to obtain the password or decryption key. Consequences include temporary or permanent loss of sensitive information, disruption of regular operations, and direct or indirect financial losses. In this paper, we present a machine learning based approach to detect ransomware on IoT devices.
Specifically, our proposed approach outperforms K-Nearest Neighbors, Neural Networks, Support Vector Machine, and Random Forest classifiers in terms of accuracy, recall, and precision.