Problems with utf8, locale and regex

Discussion in 'Perl' started by Thore Harald Høye, Dec 5, 2007.

  1. I have made this testcase:
    -----------------------
    #!/usr/bin/perl
    #use locale;
    #use encoding 'iso-8859-1';
    use utf8;
    binmode(STDOUT, ":utf8");

    print "\\x{00D8}:\n";
    test("\x{00D8}");

    print "\nØ:\n";
    test("Ø");

    sub test {
        my $chr = shift;
        print "ord: " . ord($chr) . ", '$chr', lc: " . lc($chr) . "\n";
        print "isutf8: " . utf8::is_utf8($chr) . "\n";
        $chr =~ /$chr/i && print "Caseinsensitive matches\n";
        $chr =~ /$chr/ && print "Casesensitive matches\n";
    }

    -----------------------

    The weirdest thing here is that if "use locale" is enabled, the
    case-insensitive match in the last test() call fails. Without "use
    encoding", it works for the first version (which does not get the
    utf8 flag), but not for the last. Without "use locale", both work.

    If I run the program with "use encoding.." enabled, both versions
    have the utf8 flag, and both fail. The output is also printed in
    ISO-8859-1, even though binmode() is called afterwards.
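
The role of the utf8 flag described above can be isolated in a minimal sketch. It assumes that neither "use locale" nor "use encoding" is in effect, and it writes the characters as escapes so the result does not depend on how the file is saved. The variable names $upper and $lower are illustrative only and are not part of the original testcase. It is not a diagnosis of the failure, only a way to check how lc() and /i behave once both strings carry the flag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

binmode(STDOUT, ":encoding(UTF-8)");

# Escapes keep the sketch independent of the source file encoding.
my $upper = "\x{00D8}";    # Ø, initially stored without the utf8 flag
my $lower = "\x{00F8}";    # ø

# Assumption: no "use locale" and no "use encoding" in effect.
# utf8::upgrade() switches each string to the internal UTF-8
# representation, so Perl applies Unicode case rules to it.
utf8::upgrade($upper);
utf8::upgrade($lower);

print "isutf8: ", (utf8::is_utf8($upper) ? 1 : 0), "\n";
print "lc gives the lowercase letter\n" if lc($upper) eq $lower;
print "Caseinsensitive matches\n"       if $lower =~ /\Q$upper\E/i;
```

Comparing this run against one with "use locale" added may help narrow down which pragma changes the matching behaviour.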

    It doesn't seem to matter what the locale is. I have tried no_NO.UTF-8
    and en_US.UTF-8. lc($chr) works in both cases, but arrays are only
    sorted correctly with no_NO.
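
For the sorting remark, a possible way to get Norwegian ordering without relying on the process locale is sketched below. This swaps in a different technique from the "use locale" approach in the testcase: it uses the Unicode::Collate::Locale module with the 'nb' locale, and assumes a Perl recent enough to bundle that module and a source file saved as UTF-8. The word list is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                          # source literals below are UTF-8
use Unicode::Collate::Locale;

binmode(STDOUT, ":encoding(UTF-8)");

# Hypothetical sample data, not taken from the original post.
my @words = ("ål", "zebra", "øl", "ære", "apple");

# 'nb' selects Norwegian Bokmål tailoring, independent of LC_COLLATE.
my $collator = Unicode::Collate::Locale->new(locale => 'nb');
my @sorted   = $collator->sort(@words);

print join(", ", @sorted), "\n";
```

Because the collator carries its own rules, the result should not change between no_NO.UTF-8 and en_US.UTF-8 environments.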

    The file is saved as UTF-8.
    Thore Harald Høye, Dec 5, 2007