binmode blues

jidanni

What is wrong with
$ cat normalize
#!/usr/bin/perl
binmode STDIN, ":utf8"; binmode STDOUT, ":utf8";
use Unicode::Normalize q(decompose);
local $/; $_=<>; print decompose($_);

which is causing these to differ?:
$ normalize 4.htm |head -c 11|od -x|head -n 1
0000000 2020 cc61 c282 c280 20a2 0061
$ normalize < 4.htm |head -c 11|od -x|head -n 1
0000000 2020 80e2 20a2 b0e5 e88e 00a6

I.e., why must I remember to use a "<" to get proper results?
 
brian d foy

which is causing these to differ?:
$ normalize 4.htm |head -c 11|od -x|head -n 1
0000000 2020 cc61 c282 c280 20a2 0061

This one is reading from ARGV. What happens when you bin mode that
filehandle too?
$ normalize < 4.htm |head -c 11|od -x|head -n 1
0000000 2020 80e2 20a2 b0e5 e88e 00a6

This one is reading from STDIN.
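To make the STDIN/ARGV distinction concrete, here is a minimal sketch. The scratch file and the variable names are made up for illustration; it fakes the "normalize 4.htm" call by setting @ARGV locally. It sets a layer on STDIN, tries the same on ARGV before anything has been read, and then slurps the file through <>.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file holding the two UTF-8 bytes of U+00E9 (illustrative data only).
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xC3\xA9";
close $tmp or die "close: $!";

# This only changes STDIN; it says nothing about files named in @ARGV.
binmode STDIN, ':encoding(UTF-8)';

# ARGV is not open until <> first reads, so binmode has nothing to act on yet.
my $argv_ok = do { no warnings 'unopened'; binmode ARGV, ':encoding(UTF-8)' };

# Simulate "normalize file": <> opens the file through the ARGV handle.
my $data = do { local @ARGV = ($path); local $/; <> };

printf "binmode on ARGV before reading: %s\n", $argv_ok ? 'ok' : 'failed';
printf "characters read via <>: %d\n", length $data;
```

The file comes back as two undecoded bytes rather than one character, which matches the symptom in the od output. It also suggests checking the return value of binmode when trying it on ARGV.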
 
Eric Pozharski

What is wrong with
$ cat normalize
#!/usr/bin/perl
binmode STDIN, ":utf8"; binmode STDOUT, ":utf8";
use Unicode::Normalize q(decompose);
local $/; $_=<>; print decompose($_);
which is causing these to differ?:
$ normalize 4.htm |head -c 11|od -x|head -n 1
0000000 2020 cc61 c282 c280 20a2 0061
$ normalize < 4.htm |head -c 11|od -x|head -n 1
0000000 2020 80e2 20a2 b0e5 e88e 00a6
I.e., why must I remember to use a "<" to get proper results?

23:03:38 38 [0:0]$ perl -wle '
$x = <>; system qq|ls -l /proc/$$/fd|' /etc/passwd
Name "main::x" used only once: possible typo at -e line 1.
total 0
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> /etc/passwd
^^^^^^^^^^^^^^^^
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 4 -> pipe:[4517507]
l-wx------ 1 whynot whynot 64 2008-09-11 23:31 5 -> pipe:[4517507]
23:31:10 39 [0:0]$ perl -wle '
$x = <>; system qq|ls -l /proc/$$/fd|' </etc/passwd
Name "main::x" used only once: possible typo at -e line 1.
total 0
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /etc/passwd
^^^^^^^^^^^^^^^^
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> pipe:[4517541]
 
Ben Morrow

Quoth (e-mail address removed):
bdf> This one is reading from ARGV. What happens when you bin mode that
bdf> filehandle too?

I see. OK, so what should I write instead of just
binmode STDIN, ":utf8"; binmode STDOUT, ":utf8";
so that it will work no matter if my program is called with any of
$ myprogram < file1
$ myprogram file1
$ myprogram file1 file2 file3 ...
$ myprogram file1 file2 file3 < file4
Ah, for(STDIN,STDOUT,@ARGV){binmode $_, ":utf8"}... shot down real fast:
Bareword "STDIN" not allowed while "strict subs"... as we see, yes I don't
know what I'm doing.

Global filehandles are a little odd in Perl. You want

for (\*STDIN, \*STDOUT, \*ARGV) {
    binmode $_, ":utf8";
}

except you *really* don't want to be using the :utf8 layer. Use
:encoding(utf8) instead: it's much safer in the face of invalid data.
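As a rough illustration of what "invalid data" means here, the sketch below checks byte strings with a strict decode. It calls Encode directly rather than going through a PerlIO layer, so it is a related technique and not what the layer does internally. The helper name is_valid_utf8 is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: true if $bytes decodes as strict UTF-8.
# FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps the
# input string unmodified.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

for my $sample ("caf\xC3\xA9", "caf\xFF") {
    printf "%-12s %s\n", unpack('H*', $sample),
        is_valid_utf8($sample) ? 'valid' : 'invalid';
}
```

The first sample is well-formed UTF-8. The second contains the byte 0xFF, which never appears in UTF-8, so the strict decode rejects it.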

Ben
 
John W. Krahn

Ben said:
Quoth (e-mail address removed):

Global filehandles are a little odd in Perl. You want

for (\*STDIN, \*STDOUT, \*ARGV) {
    binmode $_, ":utf8";
}

And you may want to also include ARGVOUT in there? Of course you
*could* do it like this instead:

use open qw/:std :utf8/;

except you *really* don't want to be using the :utf8 layer. Use
:encoding(utf8) instead: it's much safer in the face of invalid data.

use open qw/:std :encoding(utf8)/;
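For reference, a minimal sketch of the original script rewritten around the pragma. It assumes, as I understand the pragma, that the default layers also apply to the files that <> opens from @ARGV. The scratch file and the local @ARGV stand in for a real "normalize 4.htm" call and are only there so the example runs on its own. It spells the encoding UTF-8, the strict variant.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use open qw/:std :encoding(UTF-8)/;   # STDIN/STDOUT/STDERR plus later opens
use File::Temp qw(tempfile);
use Unicode::Normalize qw(decompose);

# Scratch input: the two UTF-8 bytes of U+00E9, written raw.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "\xC3\xA9";
close $tmp or die "close: $!";

# Pretend the script was started as "normalize $path" and slurp via <>.
my $text       = do { local @ARGV = ($path); local $/; <> };
my $decomposed = decompose($text);

printf "characters read: %d\n", length $text;
printf "characters after decompose: %d\n", length $decomposed;
```

In a real script the scratch-file part goes away and the body is just the slurp and print, as in the original post. Note that when file names are given on the command line, <> reads only those files, so in "myprogram file1 file2 file3 < file4" the redirected file4 is never read.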



John
 
