Setting locale for java.util.regex at runtime

Alex Polite · Jun 5, 2004

"\w" doesn't match wordchars outside of [A-Za-z].

I suppose that this is in some way controlled by locale.

Is there any way to change this locale setting at runtime?

alex

Alan Moore · Jun 5, 2004

The java.util.regex package is not locale-sensitive at all. The
character-class shorthands (\w, \d, \s) and POSIX character classes
(\p{Alpha}, \p{Digit}, etc.) only ever match ASCII characters. If you
want to match non-ASCII characters, you have to use Unicode blocks
like \p{InGreek}, or categories like \p{IsLetter} (which can be
shortened to \pL).
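
To make the difference concrete, here is a minimal sketch of the
default behaviour next to the Unicode-aware alternatives. The class
name, helper method and sample words are made up for illustration. The
last pattern uses the Pattern.UNICODE_CHARACTER_CLASS flag, which did
not exist when this thread was written; it assumes Java 7 or later.

```java
import java.util.regex.Pattern;

public class UnicodeWordDemo {
    // Default \w: ASCII letters, digits and underscore only.
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");
    // Unicode general category "Letter"; \pL is the short form of \p{L}.
    static final Pattern ANY_LETTERS = Pattern.compile("\\pL+");
    // Characters from the Unicode Greek block.
    static final Pattern GREEK_BLOCK = Pattern.compile("\\p{InGreek}+");
    // Assumption: Java 7+. This flag makes \w, \d and \s Unicode-aware.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean matchesFully(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of source encoding.
        String swedish = "sm\u00f6rg\u00e5sbord";          // smorgasbord with o-umlaut and a-ring
        String greek = "\u03bb\u03cc\u03b3\u03bf\u03c2";   // the Greek word "logos"

        System.out.println(matchesFully(ASCII_WORD, swedish));   // false
        System.out.println(matchesFully(ANY_LETTERS, swedish));  // true
        System.out.println(matchesFully(ANY_LETTERS, greek));    // true
        System.out.println(matchesFully(GREEK_BLOCK, greek));    // true
        System.out.println(matchesFully(GREEK_BLOCK, swedish));  // false
        System.out.println(matchesFully(UNICODE_WORD, swedish)); // true
    }
}
```

None of these patterns depends on the default locale, so nothing has to
be set at runtime; the choice is made in the pattern itself (or in the
compile flags).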

Oddly enough, the word-boundary construct, \b, works with *all*
Unicode letters and digits, not just the ASCII ones. That makes sense
when I think about how frustrating it would be if it didn't, but it
makes it seem that much stranger that \w, \d and \s are limited to the
ASCII range.
