Making \w catch Swedish characters (setlocale)


David Hag

Hi,

I'm a bit puzzled. I'm stuck at the point where one has to deal with
different character sets, and it sucks.

Here goes my problem:

I want to do lots of transformations on Swedish texts, using s// and
m// etc. However, perl's \w, for example, does not cover any of the
Swedish chars "å", "ä" or "ö", and it also misses accented chars such
as "á". This is a REAL problem for me. I could use nasty workarounds
like defining a set of characters to look for each time it is needed
(like: [A-Za-zÅÄÖåäöéèáÀà...]), but that is just crazy.
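For comparison with the hand-written character class, here is a minimal sketch of one alternative. It assumes the input arrives as ISO-8859-1 bytes; the sample string and variable names are made up for illustration. Once the bytes are decoded into Perl character strings, \w can follow Unicode rules and should cover "å", "ä", "ö" and the accented letters without listing them.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample input: ISO-8859-1 bytes, written with hex escapes
# so that the script itself stays pure ASCII.
my $bytes = "sm\xF6rg\xE5sbord och k\xF6ttbullar p\xE5 caf\xE9";

# decode() turns the raw bytes into a Perl character string.
my $text = decode('ISO-8859-1', $bytes);

# The /u modifier (Perl 5.14 or later) asks for Unicode rules, so \w
# matches letters beyond A-Z and a-z.
my @words = $text =~ /(\w+)/gu;

printf "%d words found\n", scalar @words;
```

When printing the words again, the output would need to be encoded back to ISO-8859-1 (for example with Encode's encode function) if the terminal or file expects that character set.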

As far as I understand it should be possible to tell perl that I want
to use a specific locale, and that e.g. \w then should know about the
correct char set. First of all, my locale is already
Swedish-ISO-8859-1 on my Linux system. I tried to use the POSIX
module:

use POSIX qw(setlocale LC_ALL);

setlocale(LC_ALL, "sv_SE.iso-8859-1");


No complaints from perl, but "m/\w+/" still misses any word
containing "åäö"...
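A likely cause, sketched below: perl's regex engine only consults the locale inside the scope of the "use locale" pragma, so calling setlocale on its own does not change what \w matches. Also, setlocale does not raise an error when the locale is missing; it just returns undef, so "no complaints" does not prove the locale was set. The sketch assumes the locale is installed under the name sv_SE.ISO-8859-1 (the exact name varies between systems) and that the text is ISO-8859-1 bytes; the sample string is made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use locale;    # without this pragma, \w ignores the locale set below

# setlocale returns undef on failure instead of raising an error,
# so the return value has to be checked explicitly.
# The locale name is an assumption; "locale -a" lists what is installed.
my $locale = setlocale(LC_CTYPE, "sv_SE.ISO-8859-1");

my @words;
if (defined $locale) {
    # Hypothetical sample: "smorgasbord" with o-umlaut and a-ring
    # as ISO-8859-1 bytes.
    my $text = "sm\xF6rg\xE5sbord";
    @words = $text =~ /(\w+)/g;
    printf "locale %s: %d word(s) matched\n", $locale, scalar @words;
}
else {
    warn "could not set locale sv_SE.ISO-8859-1; is it installed?\n";
}
```

If the locale really is in effect, the sample should come back as a single word rather than being split at "ö" and "å". Since the environment is already set to a Swedish locale, "use locale;" on its own may be enough, because perl picks up the environment's locale at startup.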

Any tips on how to solve this in a general way will be very much
appreciated!

Thanks,

David
 