Rezha Julio

The Hard Coded Chemist

Unicode Character Database at Your Hand

2018-07-20
Python’s self-explanatory unicodedata module gives you access to the Unicode Character Database and, through it, to the properties of every character.

Look up a character by name with lookup:

    >>> import unicodedata
    >>> unicodedata.lookup('RIGHT SQUARE BRACKET')
    ']'
    >>> three_wise_monkeys = ["SEE-NO-EVIL MONKEY", "HEAR-NO-EVIL MONKEY", "SPEAK-NO-EVIL MONKEY"]
    >>> ''.join(map(unicodedata.lookup, three_wise_monkeys))
    '🙈🙉🙊'

Get a character’s name with name:

    >>> unicodedata.name(u'~')
    'TILDE'

Get the category of a character:

    >>> unicodedata.category(u'X')
    'Lu'  # L = letter, u = uppercase

The unicodedata module also makes it easy to normalize Unicode strings (remove accents, etc.).
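One possible way to do the accent removal mentioned above is sketched here. It is only an illustration: strip_accents is a hypothetical helper name, not something unicodedata provides, and the choice of the NFD form is an assumption. The idea is to decompose each accented character into a base character plus combining marks, then drop the marks.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: remove accents by dropping combining marks."""
    # NFD splits a character such as 'è' into 'e' plus a combining grave accent.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters and non-zero for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Crème brûlée"))  # Creme brulee
```

Characters that have no canonical decomposition, such as 'ø' or 'ß', pass through this sketch unchanged, so it is not a general transliteration tool.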