Simple patterns like [a-z] or \d no longer cut the mustard, partly because Unicode is such a large character set, and partly because the same character with a diacritic can be written in more than one way (as a single precomposed code point, or as a base letter followed by a combining mark). Now that Unicode matters, regular expressions are full of land mines.
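To make both problems concrete, here is a minimal sketch using Python 3's standard `re` and `unicodedata` modules. The sample strings are illustrative assumptions, not taken from any particular data set, and other regex engines may treat `\w` and `\d` differently from what is shown here.

```python
import re
import unicodedata

# Two spellings of the same visible word (sample data, chosen for illustration).
composed = "caf\u00e9"      # 'é' as one precomposed code point, U+00E9
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent, U+0301

# They look the same on screen but are different code point sequences.
same_raw = composed == decomposed

# An ASCII-only class rejects the accented letter outright.
ascii_match = re.fullmatch(r"[a-z]+", composed)

# In Python's re module, \w accepts the precomposed letter but not the
# combining mark, so the two spellings give different results.
word_composed = re.fullmatch(r"\w+", composed)
word_decomposed = re.fullmatch(r"\w+", decomposed)

# Normalizing to NFC first makes the two spellings compare and match alike.
normalized = unicodedata.normalize("NFC", decomposed)
word_normalized = re.fullmatch(r"\w+", normalized)

# For str patterns, \d matches any Unicode decimal digit, not just 0-9,
# unless the re.ASCII flag is given.
arabic_indic = "\u0663\u0664"  # ARABIC-INDIC DIGIT THREE and FOUR
digits_unicode = re.fullmatch(r"\d+", arabic_indic)
digits_ascii = re.fullmatch(r"\d+", arabic_indic, flags=re.ASCII)

print(same_raw)                      # False
print(ascii_match)                   # None
print(word_composed is not None)     # True
print(word_decomposed)               # None
print(word_normalized is not None)   # True
print(digits_unicode is not None)    # True
print(digits_ascii)                  # None
```

Normalizing input before matching is only one possible mitigation. Whether NFC, NFD, or a regex engine with grapheme cluster support is the better fit depends on the data and the engine in use.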