binmode blues

Discussion in 'Perl Misc' started by jidanni, Sep 11, 2008.

  1. jidanni

    jidanni Guest

    What is wrong with
    $ cat normalize
    #!/usr/bin/perl
    binmode STDIN, ":utf8"; binmode STDOUT, ":utf8";
    use Unicode::Normalize q(decompose);
    local $/; $_=<>; print decompose($_);

    which is causing these to differ?:
    $ normalize 4.htm |head -c 11|od -x|head -n 1
    0000000 2020 cc61 c282 c280 20a2 0061
    $ normalize < 4.htm |head -c 11|od -x|head -n 1
    0000000 2020 80e2 20a2 b0e5 e88e 00a6

    I.e., why must I remember to use a "<" to get proper results?
     
    jidanni, Sep 11, 2008
    #1

  2. jidanni

    brian d foy Guest

    > which is causing these to differ?:
    > $ normalize 4.htm |head -c 11|od -x|head -n 1
    > 0000000 2020 cc61 c282 c280 20a2 0061

    This one is reading from ARGV. What happens when you binmode that
    filehandle too?

    > $ normalize < 4.htm |head -c 11|od -x|head -n 1
    > 0000000 2020 80e2 20a2 b0e5 e88e 00a6

    This one is reading from STDIN.
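
To make the difference between the two handles concrete, here is a minimal sketch that reads the same bytes once without a decoding layer and once with one. It assumes the input contains the UTF-8 bytes E2 80 A2 (U+2022 BULLET), which is what the second od dump appears to show; the scratch file and the helper name normalized_hex are hypothetical, not part of the original script.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);
use Unicode::Normalize qw(decompose);

# Scratch file holding the UTF-8 bytes of U+2022 BULLET (assumed sample data).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xE2\x80\xA2";
close $tmp or die "close $path: $!";

# Hypothetical helper: slurp the file through the given open mode,
# decompose the text, and return the UTF-8 encoded result as hex bytes.
sub normalized_hex {
    my ($mode) = @_;
    open my $fh, $mode, $path or die "open $path: $!";
    my $text = do { local $/; <$fh> };
    close $fh;
    return join ' ', unpack '(H2)*', encode('UTF-8', decompose($text));
}

my $without_layer = normalized_hex('<:raw');              # like the ARGV case
my $with_layer    = normalized_hex('<:encoding(UTF-8)');  # like the STDIN case

print "no decoding layer: $without_layer\n";
print "with layer:        $with_layer\n";
```

Without the layer, each input byte is treated as a separate Latin-1 character, so E2 becomes "a" plus a combining circumflex and the other two bytes are re-encoded individually. That gives 61 cc 82 c2 80 c2 a2, which lines up with the first od dump once the byte pairs printed by od -x are swapped back.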
     
    brian d foy, Sep 11, 2008
    #2

  3. 23:03:38 38 [0:0]$ perl -wle '
    $x = <>; system qq|ls -l /proc/$$/fd|' /etc/passwd
    Name "main::x" used only once: possible typo at -e line 1.
    total 0
    lrwx------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /dev/pts/1
    lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
    lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
    lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> /etc/passwd
    ^^^^^^^^^^^^^^^^
    lr-x------ 1 whynot whynot 64 2008-09-11 23:31 4 -> pipe:[4517507]
    l-wx------ 1 whynot whynot 64 2008-09-11 23:31 5 -> pipe:[4517507]
    23:31:10 39 [0:0]$ perl -wle '
    $x = <>; system qq|ls -l /proc/$$/fd|' </etc/passwd
    Name "main::x" used only once: possible typo at -e line 1.
    total 0
    lr-x------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /etc/passwd
    ^^^^^^^^^^^^^^^^
    lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
    lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
    lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> pipe:[4517541]
     
    Eric Pozharski, Sep 11, 2008
    #3
  4. jidanni

    Ben Morrow Guest

    Global filehandles are a little odd in Perl. You want

    for (\*STDIN, \*STDOUT, \*ARGV) {
        binmode $_, ":utf8";
    }

    except you *really* don't want to be using the :utf8 layer. Use
    :encoding(utf8) instead: it's much safer in the face of invalid data.
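
A minimal sketch of applying the layer and then confirming it took effect, using the built-in PerlIO::get_layers. One assumption worth checking on your own perl: binmode returns false for a handle that is not open, and ARGV is only opened by the first `<>`, so a binmode on ARGV issued before any read may not stick. The variable names are illustrative, and :encoding(UTF-8) is used here as the strict spelling of the layer.

```perl
use strict;
use warnings;

# STDIN and STDOUT are open from the start, so the layer can be pushed now.
for my $fh (\*STDIN, \*STDOUT) {
    binmode $fh, ':encoding(UTF-8)' or warn "binmode failed: $!";
}

# PerlIO::get_layers lists the layers currently on a handle.
my @stdout_layers = PerlIO::get_layers(*STDOUT);

# No <> has run yet, so ARGV is not open. Check the return value
# rather than assuming the layer was applied.
my $argv_ok = do {
    no warnings qw(unopened closed);
    binmode ARGV, ':encoding(UTF-8)';
};

print "STDOUT layers: @stdout_layers\n";
print 'binmode on ARGV before the first read ',
    ($argv_ok ? 'succeeded' : 'failed'), "\n";
```

If the ARGV call reports failure, the layer has to come from somewhere else, for example by setting default layers before `<>` opens the file.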

    Ben
     
    Ben Morrow, Sep 12, 2008
    #4
  5. And you may want to also include ARGVOUT in there? Of course you
    *could* do it like this instead:

    use open qw/:std :utf8/;

    or, with the safer layer Ben recommends:

    use open qw/:std :encoding(utf8)/;
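
A minimal sketch of the pragma route, assuming the default layers declared by `use open` are also picked up by the files that `<>` opens. The scratch file, the sample character, and the local @ARGV stand-in are illustrative only, and :encoding(UTF-8) is the strict spelling of the layer rather than the exact string quoted above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Unicode::Normalize qw(decompose);

# Scratch input: UTF-8 bytes of U+00E9 (e with acute), assumed sample data.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xC3\xA9";
close $tmp or die "close $path: $!";

my $text;
{
    # Lexically scoped default layers for handles opened in this block;
    # :std also applies them to STDIN, STDOUT and STDERR.
    use open qw(:std :encoding(UTF-8));

    local @ARGV = ($path);   # stand-in for "normalize 4.htm"
    local $/;                # slurp mode
    $text = <>;
}

my $decomposed = decompose($text);
printf "read %d character(s), decomposed into %d\n",
    length($text), length($decomposed);
```

With the layer in place the two input bytes arrive as a single character, which then decomposes into "e" plus a combining acute accent, so no separate binmode call on ARGV is needed.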



    John
     
    John W. Krahn, Sep 12, 2008
    #5
