[xquery-talk] Convert diacritics to low-ascii
Geert Josten
geert.josten at daidalos.nl
Tue Jun 21 05:58:15 PDT 2011
Thanks!
I now see that I just didn't look closely enough. The XQuery engine was returning "abcdëf"; I hadn't noticed the 'e' in front of the &#x..;
:)
-----Original message-----
From: Martin Honnen [mailto:Martin.Honnen at gmx.de]
Sent: Tuesday 21 June 2011 14:55
To: Geert Josten
Subject: Re: [xquery-talk] Convert diacritics to low-ascii
Geert Josten wrote:
> Works just fine in XQuery too. But have to admit that it looks a bit
> funny to me. Replace something with nothing and still end up with all
> characters? Can anyone explain what this \p{M} is matching? Unicode
> spec isn't making it much clearer to me.. :-P
Well when you do e.g.
normalize-unicode('äé', 'NFD')
you get a string with four characters 'a', ' ̈', 'e', and '́'. And the
replace(normalize-unicode('abcdëf', 'NFD'), '[\p{M}]', '')
removes the combining marks (the second and fourth characters of the string above).
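
As a rough sketch of the two steps (the variable names $input and $decomposed are illustrative, not from the thread, and the input is built from code points only so the example does not depend on file encoding), the decomposition and the removal can be looked at separately:

```xquery
xquery version "1.0";

(: Illustrative input: U+00E4 (a with diaeresis) and U+00E9 (e with acute) :)
let $input := codepoints-to-string((228, 233))

(: NFD splits each precomposed letter into base letter + combining mark :)
let $decomposed := normalize-unicode($input, 'NFD')

return (
  (: code points after decomposition: 97, 776, 101, 769,
     i.e. 'a', U+0308, 'e', U+0301 :)
  string-to-codepoints($decomposed),

  (: \p{M} matches characters in the Unicode "Mark" categories,
     which include the combining diacritics; the base letters remain :)
  replace($decomposed, '\p{M}', '')
)
```

The last expression should leave just "ae", since only the two combining marks are matched and replaced with the empty string.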
--
Martin Honnen --- MVP Data Platform Development
http://msmvps.com/blogs/martin_honnen/