[xquery-talk] Convert diacritics to low-ascii

Geert Josten geert.josten at daidalos.nl
Tue Jun 21 05:33:11 PDT 2011


Thanks Andy!

It works just fine in XQuery too. But I have to admit that it looks a bit funny to me: you replace something with nothing and still end up with all the characters? Can anyone explain what this \p{M} is matching? The Unicode spec isn't making it much clearer to me. :-P

Kind regards,
Geert

-----Original message-----
From: Houghton,Andrew [mailto:houghtoa at oclc.org] 
Sent: Tuesday 21 June 2011 14:18
To: Geert Josten
Subject: Re: [xquery-talk] Convert diacritics to low-ascii

If you are using XSLT 2.0, first convert the string to Unicode normalization form NFD or NFKD with normalize-unicode(), then use a regex replace() with a Unicode category escape to remove the diacritics. For example: 

<xsl:variable name="text" as="xsd:string" select="replace(normalize-unicode('abcdëf', 'NFD'), '[\p{M}]', '')" /> 
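
The same idea can be sketched in XQuery. This is only an illustration under a few assumptions: the processor supports the NFD form (only NFC is required of normalize-unicode, the other forms are optional), and local:strip-diacritics is a hypothetical helper name that does not come from this thread. The first expression shows what the regex has to work with: NFD splits 'ë' (U+00EB) into a plain 'e' (U+0065) followed by a combining diaeresis (U+0308), and \p{M} matches only characters in the Unicode Mark categories, so the base letter survives.

```xquery
xquery version "1.0";

(: Hypothetical helper, not from the thread: decompose to NFD,
   then delete every character in a Unicode Mark category. :)
declare function local:strip-diacritics($s as xs:string) as xs:string {
  replace(normalize-unicode($s, 'NFD'), '\p{M}', '')
};

(: &#xEB; is 'ë'; a character reference avoids source-encoding issues.
   After NFD it is two codepoints: the base letter and the combining mark. :)
string-to-codepoints(normalize-unicode('&#xEB;', 'NFD')),  (: 101, 776 :)

(: Only the combining mark is removed, so the base letters remain. :)
local:strip-diacritics('abcd&#xEB;f')                      (: abcdef :)
```

Note that this only helps for characters that have a canonical decomposition. Letters such as 'ø' (U+00F8) do not decompose into a base letter plus a mark, so they pass through unchanged and would need an explicit mapping, for instance with translate().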

Hope that helps, Andy


----- Original Message -----
From: talk-bounces at x-query.com <talk-bounces at x-query.com>
To: talk at x-query.com <talk at x-query.com>
Sent: Tue Jun 21 07:32:06 2011
Subject: [xquery-talk] Convert diacritics to low-ascii

Hi,

Does anyone know a simple trick to convert characters like é and ä to their low-ascii counterparts?

Kind regards,
Geert

_______________________________________________
talk at x-query.com
http://x-query.com/mailman/listinfo/talk




