2018-07-20
Python’s self-explanatory unicodedata module gives you access to the Unicode Character Database and, through it, to the properties of every character.
Look up a character by name with lookup:

>>> import unicodedata
>>> unicodedata.lookup('RIGHT SQUARE BRACKET')
']'
>>> three_wise_monkeys = ["SEE-NO-EVIL MONKEY", "HEAR-NO-EVIL MONKEY", "SPEAK-NO-EVIL MONKEY"]
>>> ''.join(map(unicodedata.lookup, three_wise_monkeys))
'🙈🙉🙊'

Get a character’s name with name:

>>> unicodedata.name(u'~')
'TILDE'

Get the category of a character:

>>> unicodedata.category(u'X')
'Lu'  # L = letter, u = uppercase

The unicodedata module also makes it easy to normalize Unicode strings (remove accents, etc.).
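
As a rough illustration of that kind of normalization, here is a minimal sketch of accent removal. It assumes the usual approach of decomposing the string with unicodedata.normalize and then dropping the combining marks reported by unicodedata.combining. The helper name strip_accents is made up for this example and is not part of the module.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining marks (accents) removed.

    strip_accents is a hypothetical helper for illustration only.
    """
    # NFKD splits each accented character into a base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    # Combining marks have a non-zero canonical combining class,
    # so keeping only the zero-class characters drops the accents.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents('crème brûlée'))  # creme brulee
```

Note that NFKD also applies compatibility mappings (for example, it expands ligatures), so 'NFD' may be the better choice when only the accents should change.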