Homoglyphs is used to get similar letters, convert to ASCII, detect possible languages and UTF-8 group. Also can say python library for getting it and converting to ASCII.
It’s smarter version of confusable_homoglyphs:
Also Read Whatsapp_Automation : Collection Of APIs Interact With WhatsApp Running In An Android Emulator
sudo pip install homoglyphs
Importing:
import homoglyphs as hg #detect
hg.Languages.detect('w')
# {'pl', 'da', 'nl', 'fi', 'cz', 'sr', 'pt', 'it', 'en', 'es', 'sk', 'de', 'fr', 'ro'}
hg.Languages.detect('т')
# {'mk', 'ru', 'be', 'bg', 'sr'}
hg.Languages.detect('.')
# set()
# get alphabet for languages
hg.Languages.get_alphabet(['ru'])
# {'в', 'Ё', 'К', 'Т', ..., 'Р', 'З', 'Э'} Categories — (aliases from ISO 15924).
#detect
hg.Categories.detect('w')
# 'LATIN'
hg.Categories.detect('т')
# 'CYRILLIC'
hg.Categories.detect('.')
# 'COMMON'
# get alphabet for categories
hg.Categories.get_alphabet(['CYRILLIC'])
# {'ӗ', 'Ԍ', 'Ґ', 'Я', ..., 'Э', 'ԕ', 'ӻ'} Get it:
# get homoglyphs (latin alphabet initialized by default)
hg.Homoglyphs().get_combinations('q')
# ['q', '𝐪', '𝑞', '𝒒', '𝓆', '𝓺', '𝔮', '𝕢', '𝖖', '𝗊', '𝗾', '𝘲', '𝙦', '𝚚'] Alphabet loading:
# load alphabet on init by categories
homoglyphs = hg.Homoglyphs(categories=('LATIN', 'COMMON', 'CYRILLIC')) # alphabet loaded here
homoglyphs.get_combinations('гы')
# ['rы', 'гы', 'ꭇы', 'ꭈы', '𝐫ы', '𝑟ы', '𝒓ы', '𝓇ы', '𝓻ы', '𝔯ы', '𝕣ы', '𝖗ы', '𝗋ы', '𝗿ы', '𝘳ы', '𝙧ы', '𝚛ы']
# load alphabet on init by languages
homoglyphs = hg.Homoglyphs(languages={'ru', 'en'}) # alphabet will be loaded here
homoglyphs.get_combinations('гы')
# ['rы', 'гы']
# manual set alphabet on init # eng rus
homoglyphs = hg.Homoglyphs(alphabet='abc абс')
homoglyphs.get_combinations('с')
# ['c', 'с']
# load alphabet on demand
homoglyphs = hg.Homoglyphs(languages={'en'}, strategy=hg.STRATEGY_LOAD)
# ^ alphabet will be loaded here for "en" language
homoglyphs.get_combinations('гы')
# ^ alphabet will be loaded here for "ru" language
# ['rы', 'гы'] You can combine categories, languages, alphabet and any strategies as you want.
homoglyphs = hg.Homoglyphs(languages={'en'}, strategy=hg.STRATEGY_LOAD)
# convert
homoglyphs.to_ascii('тест')
# ['tect']
homoglyphs.to_ascii('ХР123.') # this is cyrillic "х" and "р"
# ['XP123.', 'XPI23.', 'XPl23.']
# string with chars which can't be converted by default will be ignored
homoglyphs.to_ascii('лол')
# []
# you can set strategy for removing not converted non-ASCII chars from result
homoglyphs = hg.Homoglyphs(
languages={'en'},
strategy=hg.STRATEGY_LOAD,
ascii_strategy=hg.STRATEGY_REMOVE,
)
homoglyphs.to_ascii('лол')
# ['o'] Java remains one of the most widely used programming platforms for servers, enterprise applications, Android…
Ubuntu users often download software directly from developer websites instead of using the default app…
Installing Ubuntu 26.04 LTS is only the first step toward building a smooth, secure, and…
What is a Software Supply Chain Attack? A software supply chain attack occurs when a…
When people ask how UDP works, the simplest answer is this: UDP sends data quickly…
Endpoint Detection and Response (EDR) solutions have become a cornerstone of modern cybersecurity, designed to…