Enhancing IoT Security: Predicting Password Vulnerability and Providing Dynamic Recommendations using Machine Learning and Large Language Models
The rapid growth of IoT has increased security vulnerabilities, especially from weak passwords. This study aims to develop and validate a machine learning tool to predict password vulnerabilities in smart home IoT devices and provide dynamic recommendations using a Large Language Model (LLM). The research addresses gaps in existing security measures by offering a data-driven model that predicts vulnerabilities and provides real-time, tailored recommendations. Archival data from previous IoT security research, including password cracking attempts, were used to train the model. Testing involved real-world password data and adversarial scenarios, with performance evaluated using accuracy, precision, recall, and F1-score. The findings show significant improvements in recall and F1-score with the Retrieval Augmented Generation (RAG) architecture compared to the baseline, suggesting RAG’s potential in enhancing IoT security. Organizations can use this model to improve their infrastructure’s security, reducing risks from weak passwords.
Introduction
The Internet of Things (IoT) has evolved far beyond its original conceptualization, now deeply embedded in daily activities and business operations across multiple sectors. From healthcare to infrastructure management, IoT devices have brought about unparalleled convenience and efficiency. However, this proliferation comes with significant security challenges. The increasing integration of IoT into critical infrastructures has amplified the urgency of ensuring these systems are secure and resilient against ever-evolving cyber threats [1]. Despite the clear need for stronger security measures, manufacturers of IoT devices are often reluctant to invest in robust security protocols due to cost concerns. Securing IoT devices requires significant investment in encryption technologies, secure hardware, and ongoing software updates [2]. These measures are costly, and manufacturers frequently face pressure to keep prices competitive, leading to compromises in security to reduce production costs [2]. Additionally, IoT devices are typically resource-constrained, with limited processing power and memory, making them inherently more susceptible to attacks [3].
Traditional static security measures, such as simple password hashing or predefined password strength rules, are increasingly inadequate in the face of evolving cyber threats. These methods fail to adapt to dynamic attack vectors and new strategies employed by malicious actors. As such, the need for more intelligent, adaptive security mechanisms has never been greater. Recent literature highlights the need for more adaptive security solutions, such as those incorporating machine learning (ML) and Large Language Models (LLMs), which can analyze real-time data and dynamically adjust to emerging threats [4], [5]. However, the current adoption of these technologies within IoT security frameworks remains limited, leaving a significant gap in the ability to protect against evolving cyber threats [6].
LLMs such as OpenAI’s GPT-4, traditionally used in natural language processing, also have significant potential in the cybersecurity domain. When integrated into a security framework, LLMs can enhance password management systems by providing context-aware recommendations. For instance, an LLM can analyze a password’s structure and suggest improvements based on learned patterns of password weaknesses and strengths. This capability becomes even more powerful when combined with Retrieval-Augmented Generation (RAG) frameworks, which allow the LLM to pull in relevant external knowledge, such as the latest security practices from established frameworks like MITRE ATT&CK.
The combination of ML and RAG creates a robust system that not only predicts password vulnerabilities but also provides dynamic, context-specific suggestions to fortify IoT device security. This approach moves beyond the reactive nature of traditional security measures, instead offering a proactive solution capable of adapting to real-time data and emerging threats. This paper proposes an innovative model that leverages the predictive power of machine learning with the adaptive capabilities of LLMs enhanced by RAG. Specifically, a tool has been developed to (1) Dynamically predict the vulnerability of IoT passwords based on real-time data using a Random Forest Classifier; (2) Generate actionable recommendations for password strengthening, informed by an LLM enhanced with RAG, which draws from a comprehensive repository of security best practices; and (3) Adapt to emerging threats through continuous learning, thereby outperforming static security protocols.
While traditional password security systems may indicate if a password is weak or strong and provide static best practice guidelines, the tool developed in this research is unique in its ability to dynamically assess password vulnerabilities and offer real-time, context-aware recommendations for strengthening passwords [7]. By leveraging machine learning for predictive analysis and LLMs to generate tailored password improvement strategies, this tool addresses the critical gap in current IoT security practices.
Related Works
IoT Security Vulnerability
IoT has ushered in a new era of technological advancement yet remains susceptible to various security vulnerabilities. In pursuit of fulfilling their designated tasks, IoT devices often allocate most of their resources for operational purposes, leaving limited or no resources for security measures [8]. Additionally, Polat [9] argued that the scarcity of security monitoring in most IoT devices, combined with insufficient Identity and Access Management (IAM) controls, renders them highly vulnerable targets for hackers. This resource imbalance and security deficit create an environment conducive to cyberattacks, amplifying the significance of reinforcing the security posture of IoT devices [10].
Confidentiality and privacy are major challenges in the context of IoT, as data collected and stored by IoT devices and systems can be highly sensitive [8]. The heterogeneous nature of IoT devices, along with poor architecture choices and frequent training requirements, poses challenges in configuring effective anomaly detection methods, making security issues a significant concern for both developers and end-users [11]. Pratt and Lulka highlighted the top IoT attacks that may impact smart home IoT users, including Denial of Service (DoS), Brute Force Attacks, Man in the Middle Attacks, Phishing Attacks, and Botnet Attacks, and explained how these attacks can easily exploit the vulnerabilities in IoT devices and networks, disrupting services and compromising user privacy and security [12]. DoS and DDoS attacks are particularly challenging to mitigate because the multiple connection requests overwhelm the target, causing slowdowns, crashes, or shutdowns [13]. The limited resources of IoT devices can exacerbate the impact of these attacks, leading to long-term memory depletion of the relaying nodes [13].
IoT Password Management
The security of smart home IoT devices is a critical aspect that demands attention, as unattended, automated, and poorly managed smart home IoT devices can pose significant risks [14]. Protogerou et al. highlighted the security concern that arises from the use of default passwords, which many users fail to change, leaving devices vulnerable to unauthorized access [15]. Additionally, the limited computing power of IoT devices, driven by the need for cost-effectiveness and energy efficiency, can make them susceptible to compromise [16]. Attackers can exploit these vulnerabilities, potentially recruiting compromised devices to launch further attacks or gain unauthorized access to sensitive information [17].
IoT devices, especially those used for home automation, often incorporate cutting-edge features but are built with inexpensive hardware elements [18]. As a result, cheap firmware and chips may include built-in vulnerabilities that are difficult, if not impossible, to detect by operators and owners [19]. Furthermore, Agazzi [20] explained that many IoT devices are “always-on and online,” constantly communicating with the internet, making them continually exposed to potential malware payloads, further increasing security risks.
Password security is a critical aspect of IoT vulnerability, as weak passwords pose substantial risks. The characteristics of weak passwords, such as simplicity and predictability, make them vulnerable to various attacks [21]. As a result, common vulnerabilities, such as password reuse and lack of password complexity, can lead to unauthorized access and potential data breaches [11]. Further, using weak and easily guessable passwords or relying on known default passwords is a common pitfall in IoT device security [22]. Such practices make the devices vulnerable to security breaches, including more sophisticated attacks such as Distributed Denial of Service (DDoS) [23].
Large Language Models
Large Language Models (LLMs) have revolutionized natural language processing (NLP) by enhancing language understanding and generation. Gholamhosseini et al. [8] analyzed LLM architectures, highlighting their contextual capabilities, while Mahapatra and Garain [24] emphasized the benefits of scaling model size for performance. Dai noted improvements in language comprehension with more parameters and data [25]. Similarly, Souai [26] discussed the integration of LLMs with external knowledge sources to improve their contextual understanding and relevance. Norouzi [27] and Venkatesh [28] explored fine-tuning techniques for domain-specific tasks, improving effectiveness and reliability in fields like healthcare and legal documentation.
The integration of LLMs with other AI systems and their implications for broader applications are also explored in the literature. Alcaraz [29] and Kumar [30] both highlight the potential of combining LLMs with reinforcement learning and other AI techniques to enhance their decision-making capabilities and adaptability. Similarly, Kumar discusses the integration of LLMs with external knowledge sources to improve their contextual understanding and relevance [31]. The integration of these technologies underscores the dynamic and evolving nature of LLM research, demonstrating their potential to revolutionize diverse applications from conversational agents to content generation [31].
Machine Learning in IoT
Machine learning (ML) has significantly advanced across various domains, including predictive modeling, edge computing, and network security. Eloranta and Boman [32] provide a comprehensive overview of ML algorithms, focusing on their application in decision support systems and predictive modeling. They emphasize the necessity of algorithmic efficiency and accuracy, particularly in real-time data processing scenarios. This notion is echoed by Dawood, who explores the application of ML in image and video processing, highlighting the role of convolutional neural networks (CNNs) in enhancing image classification and object detection [33]. Similarly, Haralkar discusses the integration of ML with edge computing, which aligns with the need for efficient, real-time data processing [34]. The trade-offs between model complexity and computational resources are further elaborated by Btd, demonstrating the practical challenges faced when deploying ML models on edge devices [35].
Okoli et al. review ML techniques for threat detection and anomaly identification in network security, highlighting the superiority of ML models over traditional methods [36]. Polemi et al. discuss challenges like model interpretability and overfitting, essential for reliable security systems [37]. Wood emphasizes that advanced ML models enhance threat detection accuracy and reduce false positives [38]. Collectively, these studies show ML’s transformative impact on network security, addressing computational challenges and model refinement needs. Lamberti and Seldon stress optimizing ML models for specific environments [39], [40]. Holla highlights the benefits of deploying ML on edge devices to reduce latency and improve response times [41]. These perspectives together illustrate the evolution of ML technologies to meet diverse application demands.
Problem Statement, Hypothesis Statements, and Research Questions
Problem Statement
The problem to be addressed by this study is the heightened vulnerability of smart home IoT devices to dictionary and brute force attacks due to the absence of tools that can accurately predict password vulnerabilities and provide actionable recommendations for strong passwords [42].
Hypothesis Statement
The utilization of a machine learning model and LLMs significantly improves the prediction and mitigation of dictionary and brute force attacks resulting from weak passwords in smart home IoT devices.
Research Question
How can a machine learning model and LLMs be effectively utilized to predict and mitigate dictionary and brute force attacks resulting from weak passwords on smart home IoT devices?
Methodology
Method
The research method selected for this study is a quantitative design science research (DSR) approach to develop and validate an ML model integrated with a RAG framework aimed at predicting and mitigating password vulnerabilities in IoT devices. The primary objective is to create a dynamic security tool that leverages the predictive capabilities of ML and the adaptive strengths of LLMs, with RAG enhancing the contextual relevance of the recommendations provided to improve password security. In alignment with Hevner’s guidelines for DSR, this research adheres to the iterative cycle of build and evaluate [43]. The process starts with the construction of the ML-RAG framework, followed by rigorous testing to evaluate its effectiveness in addressing the problem of IoT password vulnerability. This artifact represents both a process (dynamic password prediction and recommendation) and a product (a security-enhanced IoT system), reflecting the core principles of DSR.
Population and Sample
The population for this study consists of simulated IoT devices modeled after commonly deployed smart home systems, such as networked cameras, thermostats, and door locks. These devices were selected based on their prevalence in consumer environments and their documented vulnerability to password-based attacks. Additionally, archival data from previously breached IoT devices, particularly those targeted in large-scale attacks like the Mirai botnet, were used to create a realistic dataset for training and testing the machine learning model. The simulated population reflects the diversity of IoT ecosystems, encompassing both low-end, resource-constrained devices with limited security features and more advanced devices equipped with basic security protocols like password salting. This diversity within the population ensures that the model’s predictions and recommendations are applicable across a wide range of IoT devices, addressing both typical vulnerabilities and emerging security challenges.
Artifact Design
The core artifact designed in this research is a security-enhancing framework that integrates an ML model with an LLM enhanced by a RAG framework. The purpose of this artifact is to dynamically predict IoT device password vulnerabilities and provide real-time, context-aware recommendations for improving password strength. The ML component of the artifact uses a Random Forest Classifier, which was chosen for its ability to handle complex, non-linear relationships between input features such as password length, complexity, salting status, and historical breach data [44]. The Random Forest model is trained on a dataset of IoT password breaches, including real-world data from known attacks like Mirai, as well as randomly generated passwords designed to simulate evolving cyber threats. The model is responsible for predicting the likelihood of a password being breached under different attack scenarios, offering insights into the most vulnerable points within the system.
During the data training phase, the Random Forest classifier was fine-tuned on the collected dataset, with the RAG model augmenting the process by providing additional context from the MITRE ATT&CK framework. Techniques such as cross-validation were employed to optimize the model’s performance, ensuring that it generalizes well to unseen data. Cross-validation involved splitting the dataset into multiple folds, training the model on some folds while validating it on others, and averaging the results to obtain a reliable estimate of the model’s performance. By integrating RAG with the Random Forest classifier, the artifact was able to leverage external knowledge from the MITRE ATT&CK framework, providing a more informed and accurate prediction of password crackability.
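The fold-based procedure described above can be illustrated with a minimal sketch that uses only the Python standard library. The function names `k_fold_indices` and `cross_validate`, the shuffling seed, and the `fit_and_score` callback are assumptions introduced for illustration; the source does not state how the folds were actually constructed.

```python
import random
from statistics import mean
from typing import Callable, List, Tuple


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs that cover all samples."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


def cross_validate(n_samples: int, k: int,
                   fit_and_score: Callable[[List[int], List[int]], float]) -> float:
    """Average the score returned by fit_and_score over all k folds."""
    scores = [fit_and_score(train, val) for train, val in k_fold_indices(n_samples, k)]
    return mean(scores)
```

Here `fit_and_score` would train a classifier on the training indices and return a metric computed on the validation indices; averaging over the folds gives the performance estimate described in the text.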
The LLM with RAG integration serves as the artifact’s recommendation engine. Upon identifying a potentially vulnerable password, the LLM is activated to generate real-time suggestions for enhancing the password’s strength. The RAG component augments the LLM by retrieving relevant external information from established cybersecurity frameworks, such as MITRE ATT&CK, ensuring that the recommendations are grounded in the latest security standards and best practices. This combination allows the artifact to dynamically adapt to emerging threats, continuously refining its recommendations based on the most up-to-date knowledge available.
The final component of the artifact is the development of a backend API using Flask. Flask is a lightweight web framework for Python that allows for the easy creation of web applications and APIs [45]. The purpose of the Flask API is to provide organizations with a simple yet powerful interface for testing the security of their passwords. The Flask API was designed with scalability and ease of use in mind. It serves as the gateway through which users can interact with the trained machine learning model. Simply, the API accepts user-inputted passwords and runs them through the model, returning a probability score that indicates the likelihood of the password being breached. If the probability score is high, the API leverages the RAG model to generate suggestions for improving the password’s strength, drawing on the extensive knowledge stored in the MITRE ATT&CK framework.
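The request flow described above (score the password, then request recommendations only when the score is high) can be sketched without any web framework. Everything named here is hypothetical: the 0.5 threshold, the function `assess_password`, and the two toy stand-ins that take the place of the trained Random Forest model and the RAG recommender. The source does not state the cut-off the API uses.

```python
from typing import Callable, Dict, List

BREACH_THRESHOLD = 0.5  # hypothetical cut-off; the source does not state one


def assess_password(
    password: str,
    predict_breach_probability: Callable[[str], float],
    recommend: Callable[[str], List[str]],
    threshold: float = BREACH_THRESHOLD,
) -> Dict[str, object]:
    """Score a password and attach recommendations only when the score is high."""
    score = predict_breach_probability(password)
    response: Dict[str, object] = {
        "breach_probability": round(score, 2),
        "recommendations": [],
    }
    if score >= threshold:
        response["recommendations"] = recommend(password)
    return response


def toy_model(password: str) -> float:
    """Toy scorer (shorter means riskier); a stand-in, not the trained model."""
    return max(0.0, min(1.0, 1.0 - len(password) / 20))


def toy_recommender(password: str) -> List[str]:
    """Fixed advice; a stand-in for the RAG-backed recommender."""
    return ["Increase the length to at least 12 characters."]
```

In a Flask deployment, a route handler would call a function of this shape and serialize the returned dictionary as JSON. The response deliberately omits the submitted password so that it is not echoed back or logged by accident.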
The artifact is designed to be adaptable and scalable across a variety of IoT environments. Its iterative design allows it to evolve over time as new data is fed into the system and as the landscape of IoT security continues to change. The integration of both ML and LLM with RAG ensures that the artifact not only identifies security weaknesses but also provides actionable guidance that evolves with the threat landscape, making it an effective tool for real-time IoT security management. The artifact’s design prioritizes usability and practical application in real-world scenarios, allowing IoT device manufacturers and end-users to benefit from a proactive and responsive approach to password security. This focus on dynamic prediction and adaptation differentiates it from traditional, static security measures, positioning it as a novel contribution to the field of IoT cybersecurity. The end-to-end system architecture for this artifact is organized into three layers: User Layer (Front-End), Processing Layer (Back-End), and Data Layer (Database & Knowledge Sources). Each layer, shown in Fig. 1, plays a critical role in the system’s functionality and interacts with the others to achieve the goal of predicting password vulnerabilities and providing recommendations for improvement.
Fig. 1. Artifact architecture framework, including both front-end and back-end layers.
The front-end layer, which interacts directly with the user, is responsible for collecting input and displaying the results. This layer includes the User Input Interface, which is the component where users input their passwords for analysis, serving as the entry point for the system. The Flask API then acts as the middleware between the user and the back-end system. The user’s password input is sent via the Flask API to the processing layer, where analysis and recommendations take place. Once the back-end processing is complete, the system returns a prediction of password vulnerability and suggestions for improvement. This result is then displayed back to the user, providing them with actionable insights into their password security.
The core computational engine of the system lies in the processing layer, where the machine learning model is used to predict password vulnerabilities and retrieval-augmented generation (RAG) is employed to recommend stronger passwords. This layer ensures robust data processing and security evaluation. The heart of the model, the Random Forest classifier, is trained on the dataset of password breaches. This model predicts the probability of a password being breached based on specific features like length, entropy, and previously known attack techniques, forming a basis of the system’s recommendation to the user.
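As an illustration of the kind of input features mentioned above, the following sketch derives length, an entropy estimate, and character-class flags from a password string. The exact feature set and entropy definition used in the study are not given in the source; the Shannon-entropy formulation and the feature names below are assumptions.

```python
import math
import string
from collections import Counter
from typing import Dict


def shannon_entropy_bits(password: str) -> float:
    """Shannon entropy of the observed character distribution, summed over all characters."""
    if not password:
        return 0.0
    n = len(password)
    counts = Counter(password)
    per_char = sum((c / n) * math.log2(n / c) for c in counts.values())
    return per_char * n


def extract_features(password: str) -> Dict[str, float]:
    """Build a flat numeric feature dictionary suitable for a tabular classifier."""
    return {
        "length": float(len(password)),
        "entropy_bits": shannon_entropy_bits(password),
        "has_lower": float(any(ch in string.ascii_lowercase for ch in password)),
        "has_upper": float(any(ch in string.ascii_uppercase for ch in password)),
        "has_digit": float(any(ch in string.digits for ch in password)),
        "has_symbol": float(any(ch in string.punctuation for ch in password)),
    }
```

This entropy estimate only reflects how evenly characters are distributed within a single password; it does not capture dictionary words or common sequences, which is one reason a learned model over several features is useful.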
After being trained, the model is used to analyze new passwords submitted by users. In addition, the RAG component works in tandem with the machine learning model to enhance its functionality. Once the vulnerability is identified, the RAG model accesses a knowledge base (in this case, the MITRE ATT&CK framework) to recommend specific actions the user can take to strengthen their password. The final component in this layer is the adversarial data testing feature, which is used to simulate sophisticated attack scenarios by slightly modifying the input data to test the model’s resilience. By using adversarial examples (passwords that are intentionally crafted to challenge the model), this process ensures that the model can handle unexpected or novel attacks and maintain accuracy.
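One simple way to produce slightly modified inputs of the kind described above is to swap neighbouring characters. The sketch below is an assumption about how such variants could be generated, not a description of the perturbation method used in the study; the function name `adjacent_swaps` is hypothetical.

```python
from typing import List


def adjacent_swaps(password: str) -> List[str]:
    """Return every variant obtained by swapping two differing neighbouring characters."""
    variants = []
    for i in range(len(password) - 1):
        if password[i] == password[i + 1]:
            continue  # swapping identical characters changes nothing
        chars = list(password)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        variants.append("".join(chars))
    return variants
```

For example, `adjacent_swaps("Pass@1234!")` includes `"Pass@1243!"`. Comparing the model's predictions on a password and on its variants gives a rough view of how sensitive the classifier is to small edits.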
The final layer in the framework is the Data Layer, serving as a database and knowledge source for the processing layer to use when making predictions and generating recommendations. One of the main components in this layer is the password dataset, which contains a record of historical passwords and their entropies, whether they were breached, and the attempts and time taken to breach the passwords. This dataset serves as the training and testing data for the Random Forest Classifier. The second main component is the MITRE ATT&CK framework, a comprehensive, regularly updated knowledge base of attack tactics and techniques used by malicious actors. The framework is leveraged by the RAG model to generate specific, context-aware recommendations on how users can improve their passwords based on known vulnerabilities and attack patterns. Finally, a Facebook AI Similarity Search (FAISS) index is used to store and retrieve password embeddings (vectorized representations of passwords). When a user submits a password, the RAG model uses FAISS to compare the new password with previously known vulnerable passwords to identify similarities and generate recommendations.
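The similarity lookup described above can be illustrated with a brute-force nearest-neighbour search. This sketch does not use FAISS and does not reproduce the embeddings used in the study, which the source does not specify; character-bigram counts with cosine similarity serve as a hypothetical stand-in, and all function names are assumptions.

```python
import math
from collections import Counter
from typing import List, Tuple


def bigram_vector(password: str) -> Counter:
    """Sparse vector of lowercase character-bigram counts (a stand-in embedding)."""
    lowered = password.lower()
    return Counter(lowered[i:i + 2] for i in range(len(lowered) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, known: List[str], top_k: int = 1) -> List[Tuple[str, float]]:
    """Rank known passwords by similarity to the query, highest first."""
    q = bigram_vector(query)
    scored = [(p, cosine_similarity(q, bigram_vector(p))) for p in known]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

An approximate-nearest-neighbour index such as FAISS serves the same purpose as the exhaustive scan in `most_similar`, but scales to far larger collections of vectors.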
Artifact Evaluation
The evaluation of the artifact followed a structured approach designed to assess its predictive capabilities and the effectiveness of its recommendations in mitigating IoT password vulnerabilities. The artifact’s evaluation process included hypothesis testing, adversarial scenario simulations, and the application of standard classification metrics to ensure that it performed reliably under both typical and challenging conditions.
The ML component was evaluated using a supervised learning framework, where the model was trained on a dataset consisting of password breach attempts and then tested on an independent test set. Cross-validation techniques were employed to ensure that the model could generalize effectively to unseen data, minimizing the risk of overfitting. The model’s performance was assessed using metrics such as accuracy, precision, recall, and F1-score, providing a comprehensive measure of its ability to correctly predict password vulnerabilities.
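For reference, the four metrics named above can be computed directly from true and predicted binary labels, where 1 denotes a breached (vulnerable) password. The following is a minimal sketch; the function name and the label encoding are assumptions made for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = breached)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Recall is the most security-relevant of these in this setting, since a false negative corresponds to a vulnerable password that is reported as safe.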
The LLM with RAG integration was evaluated by generating dynamic recommendations after the ML component identified potentially vulnerable passwords. The recommendations were assessed for their contextual relevance, practicality, and alignment with current cybersecurity best practices. The evaluation process included real-time feedback loops, where the LLM provided suggestions, and those suggestions were then tested in controlled environments to determine their effectiveness in improving password strength. This iterative evaluation ensured that the recommendations were both actionable and adaptable to real-world scenarios.
Results, Interpretation and Applications
Results
The artifact evaluation was focused on answering the main research question: How can a machine learning model and LLMs be effectively utilized to predict and mitigate dictionary and brute force attacks resulting from weak passwords on smart home IoT devices? The machine learning model’s performance was assessed using a stratified train/test split and k-fold cross-validation methodology. The data was split into 80% training and 20% testing to ensure a thorough evaluation of the model’s predictive capability. Additionally, 10-fold cross-validation was employed to ensure that the model was not overfitting and had the capacity to generalize well to unseen data. The average precision across the folds was 0.88, indicating that the model was effective at correctly identifying passwords that were likely to be breached. Similarly, the recall was measured at 0.84, highlighting the model’s sensitivity in detecting truly vulnerable passwords. The F1-score, the harmonic mean of precision and recall, was 0.85, providing a balanced indication of the model’s overall performance.
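A stratified 80/20 split keeps the proportion of breached and non-breached passwords roughly equal in the training and test partitions. The sketch below shows one way to do this with the standard library; the function name, the seed, and the rounding rule for per-class test sizes are assumptions, since the source does not describe the splitting code.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with class proportions preserved."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)  # per-class test size
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

With 30 breached and 70 non-breached samples, this yields a test partition of 6 breached and 14 non-breached passwords, mirroring the class balance of the full dataset.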
Upon testing on unseen data, the model achieved an accuracy of 0.89, precision of 0.88, recall of 0.86, and an F1-score of 0.87. These metrics indicate that the model is effective at distinguishing between breached and non-breached passwords in the test dataset. However, a more detailed breakdown reveals that there were some misclassifications, particularly in cases where passwords had medium complexity (e.g., “User$456”) but were still breached due to dictionary and brute force attacks.
To evaluate the robustness of the model, adversarial and out-of-distribution data were introduced during testing. The adversarial data consisted of slightly modified passwords (e.g., “Pass@1234!” changed to “Pass@1243!”) to simulate attempts to bypass security mechanisms. The model demonstrated resilience by maintaining an accuracy of 0.85 under adversarial conditions. This robustness was further supported by the out-of-distribution testing, where passwords not seen during training (e.g., “Sup3rS3cur3Pass!”) were correctly identified as secure or vulnerable with an accuracy of 0.82. The robustness testing results are displayed in Table I. While performance slightly decreased with adversarial and out-of-distribution data, the drop in accuracy was minimal, demonstrating the model’s ability to generalize well to novel inputs.
Testing condition | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
Regular testing | 0.89 | 0.88 | 0.86 | 0.87 |
Adversarial testing | 0.85 | 0.83 | 0.82 | 0.82 |
Out-of-distribution data | 0.84 | 0.81 | 0.79 | 0.80 |
The second phase included testing the model across 20 distinct use cases to evaluate its ability to predict password vulnerability, offer suggestions for improvement, and generate RAG-based analyses based on related cybersecurity techniques. Each scenario demonstrated the model’s flexibility in assessing various password strengths and adapting its feedback accordingly. This section will detail the model’s recommendations for three different use cases.
In the first use case, the password “SecuRe!2021” was evaluated, and the model predicted a low breach probability with a breach likelihood score of 0.12. This suggests that the password is relatively strong, with a well-balanced combination of uppercase letters, numbers, and special characters, adhering to many of the best practices for password creation. The model’s precision (0.95), recall (0.93), and F1-score (0.94) reflect its confidence in this prediction, indicating that the model was effective in distinguishing between vulnerable and secure passwords. No suggestions for improvement were provided, as the password already demonstrates a high level of complexity. Furthermore, the RAG analysis highlighted a similar password, “StrongPass@2020,” which also maintained a high degree of security but offered no related attack techniques. This suggests that passwords of this caliber typically do not fall within the range of common techniques such as dictionary and brute force attacks. The lack of related techniques further supports the assessment that “SecuRe!2021” is a robust password with minimal vulnerability to known attack vectors. Fig. 2 shows the results provided by the Flask API for the low breach probability use case.
Fig. 2. Low breach probability use case.
In the second use case, the model evaluated the password “User$456” and predicted a moderate breach probability with a breach likelihood score of 0.55. The metrics for this use case, precision (0.89), recall (0.85), and F1-score (0.87), suggest a good balance between the model’s ability to identify vulnerable passwords and avoid false positives. However, the higher breach likelihood compared to Use Case 1 indicates that this password is moderately secure but could benefit from further strengthening. The model generated several actionable suggestions to improve the password’s security, including the addition of more random special characters and an increase in password length beyond 10 characters. These recommendations align with best practices for enhancing password strength by increasing entropy and complexity, which makes the password more resistant to dictionary, brute force, and other guessing attacks. The RAG analysis provided additional context by identifying a similar password, “Admin@456,” and highlighting two related techniques: Password Guessing (T1078) and Account Manipulation (T1098). These techniques are commonly associated with weak or moderately secure passwords, further justifying the model’s prediction and the suggestions provided for improving security.
Finally, the third use case assessed the password “Pass1234,” which resulted in a high breach probability prediction with a breach likelihood score of 0.78. The metrics for this scenario—precision (0.87), recall (0.84), and F1-score (0.85)—indicate that the model performed effectively in predicting that this password is highly vulnerable to breaches. The combination of simple patterns and common sequences like “1234” contributed to the model’s assessment of the password as high risk. The model offered multiple suggestions for improving this password, including extending its length to at least 12 characters, incorporating a mix of uppercase letters, numbers, and special characters, and avoiding common words or patterns. These recommendations directly address the weaknesses inherent in “Pass1234,” particularly its reliance on easily guessable patterns. The RAG analysis corroborated these findings by identifying a similar password, “Admin1234,” and pointing to two related techniques: Credential Dumping (T1003) and Brute Force (T1110). Both of these techniques are frequently used by attackers to exploit weak or commonly used passwords, further emphasizing the need for stronger password creation in this scenario.
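The effect of the length and character-class recommendations can be made concrete with a rough estimate of the exhaustive-search space, computed as length times log2 of the alphabet size. This is an upper bound that applies to pure brute force only; it is an illustrative calculation and not part of the study's model, and the function names are assumptions.

```python
import math
import string


def alphabet_size(password: str) -> int:
    """Size of the smallest standard ASCII alphabet that contains the password's characters."""
    size = 0
    if any(ch in string.ascii_lowercase for ch in password):
        size += 26
    if any(ch in string.ascii_uppercase for ch in password):
        size += 26
    if any(ch in string.digits for ch in password):
        size += 10
    if any(ch in string.punctuation for ch in password):
        size += len(string.punctuation)  # 32 printable ASCII symbols
    return size


def brute_force_bits(password: str) -> float:
    """Upper bound on exhaustive-search effort, in bits: length * log2(alphabet size)."""
    size = alphabet_size(password)
    return len(password) * math.log2(size) if size else 0.0
```

Under this estimate, “Pass1234” (8 characters over a 62-symbol alphabet) gives roughly 47.6 bits, while a 12-character password that also uses symbols gives roughly 78.7 bits. The real resistance of “Pass1234” is far lower than its bound suggests, because dictionary attacks try common words and sequences first.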
The model’s performance across these use cases highlights its adaptability and effectiveness in predicting password vulnerabilities and offering targeted suggestions for improvement. For stronger passwords, such as “SecuRe!2021,” the model confidently predicted a low breach probability with minimal feedback, while for more vulnerable passwords, such as “Pass1234,” the model generated multiple actionable recommendations. The use of RAG in these predictions added further context by identifying related attack techniques and similar passwords, providing a comprehensive assessment of each password’s security. Across all use cases, the model demonstrated robust performance metrics, with precision, recall, and F1-scores consistently above 0.80. These results indicate that the model is highly reliable in distinguishing between secure and vulnerable passwords, particularly when combined with the RAG-based analysis for generating improvement suggestions.
Interpretation and Applications
The findings of this study underscore two significant points: (a) the impact that password complexity and security enhancements, such as salting, have on the overall resilience of passwords against cracking attempts and (b) the efficacy of ML and LLMs in predicting and mitigating password breaches. The machine learning model developed during this research was trained on a dataset that encompassed both weak and salted passwords, allowing it to generate predictions on password vulnerability with a high degree of accuracy. By comparing the performance of the model in predicting the breach of weak versus salted passwords, the research was able to provide meaningful insights into the efficacy of various password strengthening techniques.
The collected data serves as a foundational training set for the machine learning model, enabling it to recognize patterns and correlations between the characteristics of passwords and their susceptibility to attacks. Weak passwords, which are typically shorter and less complex, demonstrated a significantly lower resistance to cracking attempts, often requiring only a minimal number of tries before being compromised. In contrast, salted passwords, which involve the addition of random data (salt) to the original password before hashing, showed a much higher level of security. These passwords required exponentially more attempts and substantially more time to crack, illustrating the effectiveness of salting in enhancing password security.
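The salting step described above can be sketched with Python's standard library. The study does not state which hash function or parameters were used, so PBKDF2 with SHA-256, the 16-byte salt, and the iteration count below are assumptions, as are the function names.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_password(
    password: str, salt: Optional[bytes] = None, iterations: int = 100_000
) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # random data added before hashing
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(
    password: str, salt: bytes, expected: bytes, iterations: int = 100_000
) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt_a, digest_a = hash_password("Pass1234")
    salt_b, digest_b = hash_password("Pass1234")
    # The same password yields different digests under different salts.
    print(digest_a != digest_b)
    print(verify_password("Pass1234", salt_a, digest_a))
```

Because each stored hash uses its own salt, a precomputed table of hashes for common passwords cannot be reused across accounts; an attacker has to repeat the guessing work for every salt.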
Because the training dataset included both weak and salted passwords, the model learned to identify not only overtly insecure passwords but also the subtler differences introduced by security measures such as salting. This enabled it to make more accurate predictions about password vulnerability across a range of scenarios, assessing both the immediate risk of weak passwords and the added protection provided by salted passwords.
The model’s ability to predict password vulnerability was further enhanced by the integration of RAG. This feature allowed the model not only to predict the likelihood of password compromise but also to suggest more secure alternatives in real time. The effectiveness of these RAG-generated suggestions was validated through testing, in which the model showed a measurable improvement in both recall and F1-score after RAG was incorporated. Specifically, recall improved by 6% and F1-score by 5% compared with the baseline Random Forest model. This enhancement underscores the potential of retrieval-augmented approaches to address security vulnerabilities dynamically and provide actionable recommendations.
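The retrieval step of such a pipeline can be sketched as follows. This is not the study's implementation: plain string similarity from `difflib` stands in for whatever retrieval method the authors used, the `KNOWLEDGE_BASE` entries are hypothetical apart from “Admin1234” and the two technique identifiers mentioned in the use case, and the prompt wording is invented for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical knowledge base of known weak passwords and related techniques.
KNOWLEDGE_BASE: List[Dict] = [
    {"password": "Admin1234", "techniques": ["T1003", "T1110"]},
    {"password": "qwerty", "techniques": ["T1110"]},
    {"password": "letmein", "techniques": ["T1110"]},
]


def retrieve_similar(password: str, k: int = 1) -> List[Tuple[float, Dict]]:
    """Return the k knowledge-base entries most similar to the password."""
    scored = [
        (SequenceMatcher(None, password.lower(), e["password"].lower()).ratio(), e)
        for e in KNOWLEDGE_BASE
    ]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(password: str, breach_probability: float) -> str:
    """Combine the classifier score with retrieved context for an LLM prompt."""
    score, entry = retrieve_similar(password, k=1)[0]
    return (
        f"Predicted breach probability: {breach_probability:.2f}\n"
        f"Most similar known weak password: {entry['password']} "
        f"(similarity {score:.2f})\n"
        f"Related attack techniques: {', '.join(entry['techniques'])}\n"
        "Suggest specific improvements for the password."
    )


if __name__ == "__main__":
    print(build_prompt("Pass1234", 0.78))
```

The classifier supplies the breach probability, the retrieval step supplies context, and the LLM turns both into recommendations; only the first two parts are sketched here.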
From a practical perspective, the study’s results suggest that organizations can substantially reduce the risk of password breaches by enforcing the use of longer, more complex passwords and implementing salting techniques. Moreover, by integrating a machine learning model like the one developed in this research, organizations could assess the vulnerability of passwords dynamically, offering real-time suggestions for improvement based on the specific characteristics of each password. This approach would significantly improve the security posture of smart home IoT devices and other systems that rely on password-based authentication mechanisms.
Conclusion
This study aimed to develop and evaluate a machine learning model, combined with an LLM, to address security vulnerabilities in smart home IoT devices. It focused on predicting password breaches and providing actionable recommendations for improving password strength. The primary question was how machine learning and LLMs can be used to predict and mitigate dictionary and brute-force attacks on smart home IoT devices. The study’s key contribution is a data-driven tool that predicts password breaches and offers specific, real-time recommendations for improvement.
A quantitative design science research methodology was used, creating a machine learning model leveraging Random Forest classification and the RAG framework. The model was trained on weak and salted passwords and evaluated through cross-validation, performance metrics, and robustness testing with adversarial and out-of-distribution data. It was further assessed in real-world scenarios for predicting password breaches and providing security suggestions based on RAG-generated responses.
The study found that integrating machine learning with LLM-based RAG technologies significantly enhances smart home IoT security. The RAG-enhanced model outperformed the baseline Random Forest model in accuracy, precision, recall, and F1-score, with the most notable gains in recall and F1-score. Robustness testing with adversarial and out-of-distribution data validated the model’s ability to generalize and handle complex scenarios.
In conclusion, this study demonstrates the potential of combining machine learning and LLM technologies to enhance smart home IoT security. The integrated approach effectively predicts password vulnerabilities and provides actionable recommendations, contributing to a more secure IoT ecosystem. As IoT technologies proliferate, the need for dynamic, real-time security solutions will grow. This research lays a strong foundation for further exploration and development of advanced security measures to adapt to the evolving threat landscape in the smart home IoT domain.
References
[1] Hattar M. IoT’s Importance is Growing Rapidly, But Its Security is Still Weak. SecurityWeek; 2022.
[2] Tannenbaum A. Why do IoT companies keep building devices with huge security flaws? Harv Bus Rev. 2018. Available from: https://hbr.org/2017/04/why-do-iot-companies-keep-building-devices-with-huge-security-flaws.
[3] Wong K. IoT device security: the hard (ware) way-CSG @ GovTech-Medium. Medium. 2021. Available from: https://medium.com/csg-govtech/iot-device-security-the-hard-ware-way-8c161bfafe98.
[4] Reddy EPK. Step by Step Process of Feature Engineering for Machine Learning Algorithms in Data Science. Analytics Vidhya; 2021.
[5] Fan W, Ding Y, Ning L, Wang S, Li H, Yin D, et al. A survey on RAG meeting LLMs: towards retrieval-augmented large language models. 2024. Available from: arXiv.org.
[6] Raburn K. Exploring Why Consumers Do Not Use Current Cyber-security Mitigation Strategies for IoT Smart Devices in the Home. ProQuest One Academic; 2022.
[7] Atzori M, Calò E, Caruccio L, Cirillo S, Polese G, Solimando G. Evaluating password strength based on information spread on social networks: a combined approach relying on data reconstruction and generative models. Online Soc Netw Med. 2024;42:100278. doi: 10.1016/j.osnem.2024.100278.
[8] Gholamhosseini L, Sadoughi F, Ahmadi H, Safaei A. Health Internet of Things: strengths, weakness, opportunity, and threats. IEEE Conference Publication | IEEE Xplore, 2019.
[9] Polat G. Security Issues in IoT: Challenges and Countermeasures. ISACA; 2019.
[10] Yu S, Wang G, Liu X, Niu J. Security and privacy in the age of the smart internet of things: an overview from a networking perspective. IEEE Commun Mag. 2018;56(9):14–8.
[11] Mothukuri V, Khare P, Parizi RM, Pouriyeh S, Dehghantanha A, Srivastava G. Federated-learning-based anomaly detection for IoT security attacks. IEEE Internet Things J. 2022;9(4):2545–54. doi: 10.1109/jiot.2021.3077803.
[12] Pratt M, Lulka J. Top 12 IoT security threats and risks to prioritize. TechTarget | IoT Agenda. 2023. Available from: https://www.techtarget.com/iotagenda/tip/5-IoT-security-threats-to-prioritize.
[13] Galeano-Brajones J, Carmona-Murillo J, Valenzuela-Valdés JF, Luna-Valero F. Detection and mitigation of DoS and DDoS attacks in IoT-based stateful SDN: an experimental approach. Sensors. 2020;20(3):816. doi: 10.3390/s20030816.
[14] Touqeer H, Zaman S, Amin R, Hussain M, Al-Turjman F, Bilal M. Smart home security: challenges, issues and solutions at different IoT layers. J Supercomput. 2021;77(12):14053–89. doi: 10.1007/s11227-021-03825-1.
[15] Protogerou A, Kopsacheilis EV, Mpatziakas A, Papachristou K, Theodorou T, Papadopoulos S, et al. Time series network data enabling distributed intelligence—A holistic IoT security platform solution. Electronics. 2022;11(4):529. doi: 10.3390/electronics11040529.
[16] Poremba S. Will weak passwords doom the Internet of Things (IoT)? Security Intelligence. 2022. Available from: https://securityintelligence.com/articles/will-weak-passwords-doom-the-internet-of-things-iot/.
[17] Hammi B, Zeadally S, Khatoun R, Nebhen J. Survey on smart homes: vulnerabilities, risks, and countermeasures. Comput Secur. 2021;117:102677. doi: 10.1016/j.cose.2022.102677.
[18] Hussein N, Nhlabatsi A. Living in the dark: mQTT-based exploitation of IoT security vulnerabilities in ZigBee networks for smart lighting control. IoT. 2022;3(4):450. doi: 10.3390/iot3040024.
[19] Halz T. Why is the Internet of Things so hard to secure? KeyFactor. 2022 Aug 30. Available from: https://www.keyfactor.com/blog/why-is-the-internet-of-things-so-hard-to-secure/.
[20] Agazzi AE. Smart Home, Security Concerns of IoT. Cornell University Library; 2020. Available from: arXiv.org.
[21] Dashlane. 7 ways to determine if you have a strong (or weak) password. Dashlane. 2023 Sep 21. Available from: https://www.dashlane.com/blog/7-ways-to-determine-if-you-have-a-strong-or-weak-password.
[22] Shyamson S. Best 6 IoT Password Management for 2023. Block Survey; 2023.
[23] McKay T. Passwords like ‘1234’ are Still One of the Biggest Threats in IoT/OT. IT Brew; 2023.
[24] Mahapatra J, Garain U. Impact of model size on fine-tuned LLM performance in data-to-text generation: a state-of-the-art investigation. arXiv. 2024. doi: 10.48550/arxiv.2407.14088.
[25] Dai D. Scaling laws for large language models (LLMs). Medium. 2024. Available from: https://medium.com/@derekpengdai/scaling-laws-for-large-language-models-llms-adeed38dc2ba.
[26] Souai W. Fine-Tuning LLM: a Deep dive into advanced techniques for optimal model performance. Medium. 2024. Available from: https://medium.com/ubiai-nlp/fine-tuning-llm-a-deep-dive-into-advanced-techniques-for-optimal-model-performance-289affdfaf61.
[27] Norouzi A. The Ultimate Guide to LLM Fine Tuning: Best Practices & Tools. Lakera; 2023 Sep 13.
[28] Venkatesh L. #3 LLM: reinforcement Learning—GPT. Medium. 2024. Available from: https://luxananda.medium.com/reinforcement-learning-gpt-742016025359.
[29] Alcaraz A. Combining LLMs with other AI tools: one desirable future of intelligent systems. Medium. 2024. Available from: https://medium.com/codex/combining-llms-with-other-ai-tools-one-desirable-future-of-intelligent-systems-a7c747a99c04.
[30] Kumar V. Combining LLMs with Other AI Tools: One Desirable Future of Intelligent Systems. The Association of Data Scientists; 2024.
[31] Susaki M. Enhancing Contextual Understanding of Mistral LLM with External Knowledge Bases. Research Square; 2024.
[32] Eloranta S, Boman M. Predictive Models for Clinical Decision Making: Deep Dives in Practical Machine Learning. National Library of Medicine; 2022 Aug.
[33] Dawood M. Convolutional Neural Networks (CNNs) in image and video processing. Medium. 2023. Available from: https://medium.com/@muhammaddawoodaslam/convolutional-neural-networks-cnns-in-image-and-video-processing-552da8422604.
[34] Haralkar V. Machine learning on edge devices. Medium. 2023. Available from: https://medium.com/@vishwajeet.haralkar20/machine-learning-on-edge-devices-34e1085fa894.
[35] Btd. 100 facts about trade-offs in machine learning. Medium. 2023. Available from: https://baotramduong.medium.com/100-facts-about-trade-offs-in-machine-learning-09a7e7f3589f.
[36] Okoli NUI, Obi NOC, Adewusi Na O, Abrahams NTO. Machine learning in cybersecurity: a review of threat detection and defense mechanisms. World J Adv Res Rev. 2024;21(1):2286–95. doi: 10.30574/wjarr.2024.21.1.0315.
[37] Polemi N, Praça I, Kioskli K, Bécue A. Challenges and efforts in managing AI trustworthiness risks: a state of knowledge. Front Big Data. 2024;7. doi: 10.3389/fdata.2024.1381163.
[38] Wood R. From Threat Detection to Reducing False Positives, ML is Shaping Endpoint Security. Acceleration Economy; 2023.
[39] Lamberti A. Deep Learning model optimization methods. neptune.ai. 2024. Available from: https://neptune.ai/blog/deep-learning-optimization-algorithms.
[40] Seldon. Machine learning optimization–Why is it so important? Seldon. 2023. Available from: https://www.analyticsvidhya.com/blog/2020/11/entropy-a-key-concept-for-all-data-science-beginners/.
[41] Holla R. Edge AI: deploying machine learning models on edge devices. Medium. 2024. Available from: https://medium.com/@rahulholla1/edge-ai-deploying-machine-learning-models-on-edge-devices-cf2033e7c34e.
[42] Weinberg A. Top 7 IoT cyber security vulnerabilities for 2022-FirstPoint. FirstPoint. 2022. Available from: https://www.firstpoint-mg.com/blog/iot-cyber-security-vulnerabilities/.
[43] Hevner AR, March ST, Park J, Ram S. Design science in information systems research. Manag Inform Syst Quart. 2004;28(1):75–105. Available from: https://doi.org/10.5555/2017212.2017217.
[44] Sruthi AR. Understanding Random Forest Algorithm with Examples. Analytics Vidhya; 2024. Available from: https://www.analyticsvidhya.com/blog/2021/06/understanding-random-forest/.
[45] Agrawal R. Rest API | Complete Guide on Rest API with Python and Flask. Analytics Vidhya; 2024.