PwnedPasswordsDownloader – Efficient Downloading Of HIBP Password Hashes Using Curl Parallelism

Thanks for HIBP and this downloader. At first I considered using it, but the HIBP Pwned Passwords API is so simple that I wrote a small shell script instead.
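
For readers who have not used it: the range API takes the first five hex characters of a password's SHA-1 hash and returns every known hash suffix for that prefix, one SUFFIX:COUNT pair per line. Below is a minimal sketch of a single lookup, assuming bash, sha1sum and curl are available. The function names are made up for illustration, and this is not the script mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of HIBP or curl.

# Print the uppercase SHA-1 hex digest of the first argument.
sha1_upper() {
  printf '%s' "$1" | sha1sum | cut -d' ' -f1 | tr 'a-f' 'A-F'
}

# Print how often a password appears in HIBP (0 if it is not listed).
hibp_count() {
  local hash prefix suffix
  hash="$(sha1_upper "$1")"
  prefix="${hash:0:5}"   # the only part sent to the API
  suffix="${hash:5}"     # matched locally against the response
  curl -s "https://api.pwnedpasswords.com/range/${prefix}" |
    tr -d '\r' |
    awk -F: -v s="$suffix" 'toupper($1) == s { n = $2 } END { print n + 0 }'
}

# Usage:
#   hibp_count "password"
```

Calling a function like this once per prefix in a plain loop means one request at a time, which is why a sequential script takes so long for the roughly one million ranges.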

My script was painfully slow because it had no parallelism at all, so I started thinking about adding some. That’s when I stumbled on curl’s URL globbing feature.

curl is the Swiss Army knife of HTTP downloading: it supports URL patterns (globbing), massive parallelism, and pretty much every other aspect of HTTP downloads (proxies, HTTP/1.1, HTTP/2 and HTTP/3, all SSL/TLS versions, etc.).

Here’s a single curl command line that downloads the entire HIBP password hash database (16^5 = 1,048,576 range files, one per 5-character hex prefix) into the current working directory:

curl -s --remote-name-all --parallel --parallel-max 150 "https://api.pwnedpasswords.com/range/{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}"

On an always-free Oracle Cloud VM this finished (for me) in 13.5 minutes. 😉
URL globbing and the --remote-name-all option have been around in curl for ages (well over a decade), and the --parallel* options were added in September 2019 (v7.66.0).

So pretty much all “recent” Linux distros already ship a curl version that fully supports this command line.
curl is cross-platform; for example, you can download a Windows version too.
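
If you are unsure whether the installed curl is new enough, a quick check along these lines may help. This is only a sketch: it assumes bash and GNU sort (for the -V version sort), and the helper names are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration.

# Succeeds if version $1 is at least version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Extract the version number from the first line of `curl --version`,
# which starts with "curl <version> ...".
curl_version() {
  curl --version | awk 'NR == 1 { print $2 }'
}

# Usage:
#   version_ge "$(curl_version)" 7.66.0 && echo "parallel transfers supported"
```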
If you don’t have the necessary 30-40 GB of free space for the entire hash dump, you can get away with less by downloading in smaller batches and compressing each batch as soon as it finishes.
Here’s a command that you can fire and forget (i.e. disconnect / log out) on any Linux PC/server and it’ll do the job in batches:

d="$(date "+%F")"
nohup bash -c 'd="'$d'"
chars=(0 1 2 3 4 5 6 7 8 9 A B C D E F)
printf -v joined "%s," "${chars[@]}"
charscomma="${joined%,}"   # "0,1,...,F", used inside the curl URL globs
hibpd="$(pwd)/hibp_${d}"
for c in "${chars[@]}"; do
  prefix="hibp_${d}_${c}"
  dir="${hibpd}/${prefix}"
  mkdir -p "$dir"
  cd "$dir"
  date
  echo "Starting in $dir with prefix $c ..."
  # One batch: all 65536 ranges that start with $c.
  # -w "%{url}\n" logs each fetched URL (the url variable needs curl 7.75.0+).
  curl -s --remote-name-all --parallel --parallel-max 150 -w "%{url}\n" "https://api.pwnedpasswords.com/range/${c}{$charscomma}{$charscomma}{$charscomma}{$charscomma}"
  cd "$hibpd"
  # Compress the finished batch, then delete the raw files.
  BZIP2=-9 tar cjf "${prefix}.tar.bz2" "$prefix" && rm -r "$prefix"
done
echo "Finished"
date' > "hibp_$d.log" 2>&1 &

This doesn’t support suspend and resume of the download job (other HIBP downloaders do), but since it finishes pretty quickly (if you have a good enough internet connection), I don’t see any reason for this feature.
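
Since there is no resume, it may be worth checking that a run actually fetched everything. A minimal sketch, assuming the files are still uncompressed (for the batch variant that means checking a batch before it is archived) and that every range was saved under its 5-character prefix as the file name. The helper name is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count files under a directory whose name is exactly
# five hex characters, i.e. one saved range response each.
count_range_files() {
  find "${1:-.}" -type f | grep -Eic '/[0-9a-f]{5}$'
}

# Expected totals: 16^5 for the full download, 16^4 for a one-prefix batch.
expected_full=$((16 ** 5))
expected_batch=$((16 ** 4))

# Usage:
#   [ "$(count_range_files .)" -eq "$expected_full" ] && echo "complete"
```

This only counts files; it does not detect empty or truncated responses.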

You can easily assemble the server responses into a single ~38 GB file with the following command line (on Linux). Each response line has the form SUFFIX:COUNT, so the 5-character file name is prepended to restore the full hash:

find . -type f -print | grep -Eia '/[0-9a-f]{5}$' | xargs -r -d '\n' awk -F: '{ sub(/\r$/,""); print substr(FILENAME, length(FILENAME)-4, 5) $1 ":" $2 }' > hibp_all.txt
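
To see what the awk step does to a single response file, here is a small self-contained sketch. The file name 5BAA6 and the suffix are the real split of the SHA-1 of "password", but the count of 3 is invented for the example, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Prefix every SUFFIX:COUNT line of the given range files with the file name
# (the 5-character prefix), yielding full 40-character hashes.
join_prefix() {
  awk -F: '{ sub(/\r$/, ""); print substr(FILENAME, length(FILENAME) - 4, 5) $1 ":" $2 }' "$@"
}

# Build one fake response file in a temporary directory and convert it.
demo() {
  local tmp
  tmp="$(mktemp -d)"
  # Made-up count; the \r\n mimics the CRLF line endings that the awk
  # program strips.
  printf '1E4C9B93F3F0682250B6CF8331B7EE68FD8:3\r\n' > "$tmp/5BAA6"
  join_prefix "$tmp/5BAA6"
  rm -r "$tmp"
}

# Usage:
#   demo
```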

You can sort it numerically by the second field (the occurrence count) to get the most “popular” hashes:

sort -t: -k2 -rn hibp_all.txt | head -n100

To get just the hashes of that top 100, without the counts:

sort -t: -k2 -rn hibp_all.txt | head -n100 | cut -d: -f1
Varshini

Varshini is a Cyber Security expert in Threat Analysis, Vulnerability Assessment, and Research. Passionate about staying ahead of emerging Threats and Technologies.
