Rezha Julio

The Hard Coded Chemist

Unicode Character Database at Your Hand

Python’s self-explanatory unicodedata module gives you access to the Unicode Character Database and, through it, to every character’s properties.

Look up a character by name with lookup:

    >>> import unicodedata
    >>> unicodedata.lookup('RIGHT SQUARE BRACKET')
    ']'
    >>> three_wise_monkeys = ["SEE-NO-EVIL MONKEY", "HEAR-NO-EVIL MONKEY", "SPEAK-NO-EVIL MONKEY"]
    >>> ''.join(map(unicodedata.lookup, three_wise_monkeys))
    '🙈🙉🙊'

Get a character’s name with name:

    >>> unicodedata.name('~')
    'TILDE'

Get the category of a character:

    >>> unicodedata.category(u'X')  # L = letter, u = uppercase
    'Lu'

The unicodedata module also makes it easy to normalize Unicode strings (remove accents, etc.).
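As a rough illustration of the accent removal mentioned above, here is a minimal sketch. It assumes NFD decomposition followed by dropping combining marks, which is one common approach and not necessarily the one the full post uses; the helper name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks (accents) removed.

    Hypothetical helper: decomposes each character with NFD, then
    drops every code point that has a non-zero combining class.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("crème brûlée"))  # prints: creme brulee
```

Characters that have no canonical decomposition (for example 'ø' or 'ß') pass through unchanged, so this is only a partial ASCII-folding strategy.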