Homoglyphs is used to get similar letters, convert to ASCII, detect possible languages and UTF-8 group. Also can say python library for getting it and converting to ASCII.
It’s smarter version of confusable_homoglyphs:
Also Read Whatsapp_Automation : Collection Of APIs Interact With WhatsApp Running In An Android Emulator
sudo pip install homoglyphs
Importing:
import homoglyphs as hg
#detect
hg.Languages.detect('w')
# {'pl', 'da', 'nl', 'fi', 'cz', 'sr', 'pt', 'it', 'en', 'es', 'sk', 'de', 'fr', 'ro'}
hg.Languages.detect('т')
# {'mk', 'ru', 'be', 'bg', 'sr'}
hg.Languages.detect('.')
# set()
# get alphabet for languages
hg.Languages.get_alphabet(['ru'])
# {'в', 'Ё', 'К', 'Т', ..., 'Р', 'З', 'Э'}
Categories — (aliases from ISO 15924).
#detect
hg.Categories.detect('w')
# 'LATIN'
hg.Categories.detect('т')
# 'CYRILLIC'
hg.Categories.detect('.')
# 'COMMON'
# get alphabet for categories
hg.Categories.get_alphabet(['CYRILLIC'])
# {'ӗ', 'Ԍ', 'Ґ', 'Я', ..., 'Э', 'ԕ', 'ӻ'}
Get it:
# get homoglyphs (latin alphabet initialized by default)
hg.Homoglyphs().get_combinations('q')
# ['q', '𝐪', '𝑞', '𝒒', '𝓆', '𝓺', '𝔮', '𝕢', '𝖖', '𝗊', '𝗾', '𝘲', '𝙦', '𝚚']
Alphabet loading:
# load alphabet on init by categories
homoglyphs = hg.Homoglyphs(categories=('LATIN', 'COMMON', 'CYRILLIC')) # alphabet loaded here
homoglyphs.get_combinations('гы')
# ['rы', 'гы', 'ꭇы', 'ꭈы', '𝐫ы', '𝑟ы', '𝒓ы', '𝓇ы', '𝓻ы', '𝔯ы', '𝕣ы', '𝖗ы', '𝗋ы', '𝗿ы', '𝘳ы', '𝙧ы', '𝚛ы']
# load alphabet on init by languages
homoglyphs = hg.Homoglyphs(languages={'ru', 'en'}) # alphabet will be loaded here
homoglyphs.get_combinations('гы')
# ['rы', 'гы']
# manual set alphabet on init # eng rus
homoglyphs = hg.Homoglyphs(alphabet='abc абс')
homoglyphs.get_combinations('с')
# ['c', 'с']
# load alphabet on demand
homoglyphs = hg.Homoglyphs(languages={'en'}, strategy=hg.STRATEGY_LOAD)
# ^ alphabet will be loaded here for "en" language
homoglyphs.get_combinations('гы')
# ^ alphabet will be loaded here for "ru" language
# ['rы', 'гы']
You can combine categories
, languages
, alphabet
and any strategies as you want.
homoglyphs = hg.Homoglyphs(languages={'en'}, strategy=hg.STRATEGY_LOAD)
# convert
homoglyphs.to_ascii('тест')
# ['tect']
homoglyphs.to_ascii('ХР123.') # this is cyrillic "х" and "р"
# ['XP123.', 'XPI23.', 'XPl23.']
# string with chars which can't be converted by default will be ignored
homoglyphs.to_ascii('лол')
# []
# you can set strategy for removing not converted non-ASCII chars from result
homoglyphs = hg.Homoglyphs(
languages={'en'},
strategy=hg.STRATEGY_LOAD,
ascii_strategy=hg.STRATEGY_REMOVE,
)
homoglyphs.to_ascii('лол')
# ['o']
Docker is a powerful open-source containerization platform that allows developers to build, test, and deploy…
Docker is one of the most widely used containerization platforms. But there may come a…
Introduction Google Dorking is a technique where advanced search operators are used to uncover information…
Introduction In cybersecurity and IT operations, logging fundamentals form the backbone of monitoring, forensics, and…
What is Networking? Networking brings together devices like computers, servers, routers, and switches so they…
Introduction In the world of Open Source Intelligence (OSINT), anonymity and operational security (OPSEC) are…