Abstract
According to the latest Verizon Data Breach Investigations Report (DBIR), credential abuse, including password reuse and
human factors in password creation, remains the leading attack vector. Survey responses show that
most users change their passwords only when they forget them, and 35% of respondents
find mandatory password rotation policies inconvenient. These findings highlight the
importance of combining technical solutions with user-focused education to strengthen
password security. In this research, the “human factor in the creation of usernames and
passwords” is considered a vulnerability, as identifying the patterns or rules used by users
in password generation can significantly reduce the number of possible combinations that
attackers need to try in order to gain access to personal data. The proposed method based on
an LSTM model operates at a character level, detecting recurrent structures and generating
generalized masks that reflect the most common components in password creation. Open
datasets of 31,000 compromised passwords from real-world leaks were used to train the
model, which achieved over 90% test accuracy without signs of overfitting. A new method
of evaluating the individual password creation habits of users and automatically fetching
context-rich keywords from a user’s public web and social media footprint via a keyword-
extraction algorithm is developed, and this approach is incorporated into a web application
that allows clients to fine-tune an LSTM model locally, run it through ONNX, and
carry out all inference on-device, ensuring complete data confidentiality and adherence to
privacy regulations.
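
To make the idea of a "generalized mask" concrete, the sketch below maps each password character to a character class and counts the most frequent resulting masks. It is a simple rule-based illustration, not the LSTM-based method described in the abstract. The hashcat-style tokens (`?l`, `?u`, `?d`, `?s`), the function names, and the ASCII-only class sizes are all assumptions made for this example.

```python
import string
from collections import Counter

# Assumed class sizes for ASCII input: 26 lowercase, 26 uppercase,
# 10 digits, 33 symbols (32 punctuation characters plus space).
CLASS_SIZES = {"?l": 26, "?u": 26, "?d": 10, "?s": 33}


def char_class(ch: str) -> str:
    """Map one character to a hashcat-style class token (assumed notation)."""
    if ch in string.ascii_lowercase:
        return "?l"
    if ch in string.ascii_uppercase:
        return "?u"
    if ch in string.digits:
        return "?d"
    return "?s"  # everything else is treated as a symbol


def password_mask(password: str) -> str:
    """Generalize a password into its sequence of character classes."""
    return "".join(char_class(ch) for ch in password)


def top_masks(passwords, n=3):
    """Return the n most common masks with their counts."""
    return Counter(password_mask(p) for p in passwords).most_common(n)


def keyspace(mask: str) -> int:
    """Number of candidate strings that match a mask."""
    total = 1
    for i in range(0, len(mask), 2):  # each token is two characters long
        total *= CLASS_SIZES[mask[i:i + 2]]
    return total
```

For an 11-character password such as `Summer2024!`, comparing `keyspace(password_mask("Summer2024!"))` with the unconstrained `95 ** 11` (95 printable ASCII characters per position) shows how much knowing the structure narrows the search, which is the reduction in combinations the abstract refers to.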
| Original language | English |
|---|---|
| Journal | Information |
| Volume | 16 |
| Issue number | 8 |
| DOIs | |
| Publication status | Published (VoR) - 31 Jul 2025 |