This project uses machine learning to identify the password creation habits of users. A PCFG (probabilistic context-free grammar) model is generated by training on a list of disclosed plaintext/cracked passwords. In the context of this project, the model is referred to as a ruleset and contains many different parts of the passwords identified during training, along with their associated probabilities.
This stemming can be useful for other cracking tools such as PRINCE, and/or parts of the ruleset can be directly incorporated into more traditional dictionary-based attacks. This project also includes a PCFG guess generator that makes use of this ruleset to generate password guesses in probability order.
This is much more powerful than standard dictionary attacks, and in testing it has cracked passwords with significantly fewer guesses on average than other publicly available methods.
The downside is that generating guesses in probability order is slow: it produces on average 50-100k guesses a second, whereas GPU-based algorithms can produce millions to billions (and up) of guesses a second against fast hashing algorithms.
Therefore, the PCFG guesser is best used against large numbers of salted hashes, or other slow hashing algorithms, where the performance cost of the algorithm is made up for with the accuracy of the guesses.
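To make "guesses in probability order" concrete, here is a minimal sketch in Python. It is not the project's actual generator: the grammar, the category letters (A for letters, D for digits, O for other) and all probabilities are made-up illustration values, and the best-first search with a priority queue is just one simple way to get the ordering.

```python
import heapq
from itertools import count

# Hypothetical toy grammar; none of these values come from a real ruleset.
# A base structure is a sequence of categories with a probability.
BASE_STRUCTURES = {("A", "D"): 0.7, ("A", "D", "O"): 0.3}
# Replacements per category, sorted from most to least probable.
TERMINALS = {
    "A": [("password", 0.6), ("monkey", 0.4)],
    "D": [("123", 0.8), ("1", 0.2)],
    "O": [("!", 0.9), ("#", 0.1)],
}


def probability(structure, indexes):
    """Probability of one guess: base structure times each chosen terminal."""
    p = BASE_STRUCTURES[structure]
    for category, index in zip(structure, indexes):
        p *= TERMINALS[category][index][1]
    return p


def guesses_in_probability_order():
    """Yield (guess, probability) pairs from most to least probable."""
    tie_breaker = count()  # keeps heap entries comparable on equal probability
    heap, seen = [], set()

    def push(structure, indexes):
        if (structure, indexes) not in seen:
            seen.add((structure, indexes))
            entry = (-probability(structure, indexes), next(tie_breaker),
                     structure, indexes)
            heapq.heappush(heap, entry)

    # Start with the most probable terminal for every slot of every structure.
    for structure in BASE_STRUCTURES:
        push(structure, (0,) * len(structure))

    while heap:
        neg_p, _, structure, indexes = heapq.heappop(heap)
        guess = "".join(TERMINALS[c][i][0] for c, i in zip(structure, indexes))
        yield guess, -neg_p
        # Children swap exactly one slot for its next most probable terminal,
        # so a child is never more probable than its parent.
        for pos, index in enumerate(indexes):
            if index + 1 < len(TERMINALS[structure[pos]]):
                push(structure, indexes[:pos] + (index + 1,) + indexes[pos + 1:])


for guess, p in guesses_in_probability_order():
    print(f"{p:.4f}  {guess}")
```

The bookkeeping needed to keep the queue ordered is the kind of overhead that makes this approach slower per guess than simply walking a wordlist.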
In short: A collection of tools to perform research into how humans generate passwords. These can be used to crack password hashes, but also create synthetic passwords (honeywords), or help develop better password strength algorithms.
Requirements & Installation
pip3 install chardet
Quick Start Guide
Training
The default ruleset included in this repo was created by training on a 1 million password subset of the RockYou dataset. Better performance can be achieved by training on the full 32 million password set for RockYou, but that was excluded to keep the download size small.
You can use the default ruleset to start generating passwords without having to train on a new list, but it is recommended to train on a set of passwords that more closely resembles what you are trying to target. If you do create your own ruleset, here is a quick guide:
python3 trainer.py -t INPUT_PASSWORD_LIST -r NEW_RULESET
python3 trainer.py -t INPUT_PASSWORD_LIST -r NEW_RULESET -c 0.6
b. --save_sensitive: If this is specified, sensitive data such as e-mail addresses and full websites which are discovered during training will be saved in the ruleset. While the PCFG guess generator does not currently make use of this data, it is very valuable during a real password cracking attack. This is off by default to make this tool easier to use in an academic setting. Note that even when this is off, there will almost certainly still be PII data saved inside a ruleset, so protect generated rulesets appropriately. Example: python3 trainer.py -t INPUT_PASSWORD_LIST -r NEW_RULESET --save_sensitive
c. --comments: Adds a comment to your ruleset config file. This is useful so you know why and how you generated your ruleset when looking back at it later. Include the comment you want to add in quotes.
Guess Generation
This generates guesses to stdout using a previously trained PCFG ruleset. These guesses can then be piped into any program that you want to make use of them. If no ruleset is specified, the default ruleset DEFAULT will be used. This guide assumes the ruleset being used is NEW_RULESET.
python3 pcfg_guesser.py -r NEW_RULESET
-s SESSION_NAME
python3 pcfg_guesser.py -s SESSION_NAME --load
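Since the guesses arrive as one candidate per line on stdout, any downstream program only needs to read lines from a stream. The sketch below is a hypothetical consumer, not part of this project: the function name `check_guesses` is invented for illustration, and salted SHA-256 stands in for whatever hash format you are actually attacking.

```python
import hashlib
import io


def check_guesses(guess_stream, targets):
    """Try each guess (one per line) against (salt, hex_digest) targets.

    Assumption for illustration: digest = SHA-256(salt + password).
    Returns a dict mapping each cracked target to the guess that matched.
    """
    remaining = set(targets)
    cracked = {}
    for line in guess_stream:
        guess = line.rstrip("\r\n")
        # Every guess must be hashed once per salt, which is why each
        # guess is expensive when there are many salted hashes.
        for salt, digest in list(remaining):
            candidate = hashlib.sha256((salt + guess).encode("utf-8")).hexdigest()
            if candidate == digest:
                cracked[(salt, digest)] = guess
                remaining.discard((salt, digest))
        if not remaining:
            break  # nothing left to crack, stop reading guesses
    return cracked


# Simulate the pipe with an in-memory stream of guesses.
target = ("ab", hashlib.sha256(b"abmonkey1").hexdigest())
fake_pipe = io.StringIO("password123\nmonkey1\nqwerty\n")
print(check_guesses(fake_pipe, [target]))
```

In a real pipeline you would pass `sys.stdin` instead of the in-memory stream and pipe the guesser's output into the script.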
Password Strength Scoring
There are many cases where you may want to estimate the probability of a password being generated by a previously trained ruleset. For example, this could be part of a password strength metric, or used for other research purposes. A sample program has been included to perform this.
python3 password_scorer.py -r NEW_RULESET -i INPUT_LIST
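As a rough illustration of what "the probability of a password being generated by a ruleset" means, the sketch below scores passwords against a tiny hand-written grammar. It does not reproduce the included scorer: the grammar, the probabilities and the `score` function are assumptions made up for this example, and a real ruleset contains many more parts than this toy one.

```python
from itertools import groupby

# Hypothetical toy grammar; none of these values come from a real ruleset.
BASE_STRUCTURES = {("A", "D"): 0.7, ("A", "D", "O"): 0.3}
TERMINALS = {
    "A": {"password": 0.6, "monkey": 0.4},
    "D": {"123": 0.8, "1": 0.2},
    "O": {"!": 0.9, "#": 0.1},
}


def category(ch):
    """Classify a character as alpha (A), digit (D) or other (O)."""
    if ch.isalpha():
        return "A"
    if ch.isdigit():
        return "D"
    return "O"


def score(password):
    """Return the probability of the password under the toy grammar.

    Passwords whose structure or segments were never seen score 0.0.
    """
    # Split into runs of the same category, e.g. "monkey1!" gives
    # [("A", "monkey"), ("D", "1"), ("O", "!")].
    segments = [(cat, "".join(chars))
                for cat, chars in groupby(password, key=category)]
    structure = tuple(cat for cat, _ in segments)
    p = BASE_STRUCTURES.get(structure, 0.0)
    for cat, text in segments:
        p *= TERMINALS[cat].get(text, 0.0)
    return p


for candidate in ["password123", "monkey1!", "hunter2"]:
    print(candidate, score(candidate))
```

A password that the grammar cannot produce scores zero here, which a strength metric would read as "not covered by this ruleset" rather than as proof that the password is strong.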
Prince-Ling Wordlist Generator
**Name:** PRINCE Language Indexed N-Grams (Prince-Ling)
Using Prince-Ling
python3 prince-ling.py -r RULESET_NAME -s SIZE_OF_WORDLIST_TO_CREATE -o OUTPUT_FILENAME
Example Cracking Passwords Using John the Ripper
python3 pcfg_guesser.py -r NEW_RULESET -s SESSION_NAME | ./john --stdin --format=bcrypt PASSWORDS_TO_CRACK.txt