split is not convenient

A. Sinan Unur

Joost Diepenmaat said:
As has been mentioned somewhere above, that's simply incorrect code in
most circumstances. while (<handle>) { ... } is a special case and if
you really must use a named variable the construct should be

while (defined(my $var = <>)) { ... }

As far as I know, the magic test for defined in a while loop is the same
whether you are reading into a $_ or any lexical variable. Here is a
short test:

C:\DOCUME~1\asu1\LOCALS~1\Temp\m> cat m.txt
0
1

2
3

4
C:\DOCUME~1\asu1\LOCALS~1\Temp\m> xxd m.txt
0000000: 300a 310a 0a32 0a33 0a0a 340a 0a 0.1..2.3..4..

C:\DOCUME~1\asu1\LOCALS~1\Temp\m> cat m.pl
#!/usr/bin/perl

use strict;
use warnings;

my $name = 'm.txt';

my %dispatch = (
    without_defined => \&without_defined,
    with_defined    => \&with_defined,
    without_var     => \&without_var,
);

for my $case ( sort keys %dispatch ) {
    print "$case:\n";

    open my $f, '<', $name
        or die "Cannot open '$name': $!";
    $dispatch{ $case }->( $f );
    close $f
        or die "Cannot close '$name': $!";
}

sub without_defined {
    my $f = shift;

    while ( my $line = <$f> ) {
        print decorate($line);
    }
}

sub with_defined {
    my $f = shift;

    while ( defined( my $line = <$f> ) ) {
        print decorate($line);
    }
}

sub without_var {
    my $f = shift;

    while ( <$f> ) {
        print decorate($_);
    }
}

sub decorate {
    my ($s) = @_;
    $s =~ s/\n/</;
    return "'$s'\n";
}

__END__

C:\DOCUME~1\asu1\LOCALS~1\Temp\m> m
Without var assignment in while:
'0<'
'1<'
'<'
'2<'
'3<'
'<'
'4<'
'<'
With 'defined' in while:
'0<'
'1<'
'<'
'2<'
'3<'
'<'
'4<'
'<'
Without 'defined' in while:
'0<'
'1<'
'<'
'2<'
'3<'
'<'
'4<'
'<'
 
Ben Morrow

Quoth Joost Diepenmaat said:
As has been mentioned somewhere above, that's simply incorrect code in
most circumstances. while (<handle>) { ... } is a special case and if you
really must use a named variable the construct should be

while (defined(my $var = <>)) { ... }

while (my $var = <HANDLE>) gets the same special-case treatment:

~% perl -MO=Deparse -e'while (my $x = <HANDLE>) { 1 }'
while (defined(my $x = <HANDLE>)) {
'???';
}
-e syntax OK

Ben
 
Charlton Wilbur

JE> On the other hand a lot of the elegance of Perl comes from its
JE> default "work space". It is so ingrained in the Perl world
JE> that I would almost call it a prerequisite for programming in
JE> Perl. If you don't know how to take advantage of $_ and @_
JE> then maybe you shouldn't program in Perl or maintain Perl
JE> code.

I would add, here, that if you don't know when to *not* take advantage
of $_ and @_, you shouldn't program in Perl or maintain Perl code,
either -- and that knowing *when* and *whether* to use a feature of a
programming language is far more important than knowing *how* to use
that feature.

Charlton
 
xhoster

Todd said:
Hi all,

Seems nobody here mentioned the magic that
split ' ', expr
is same as
split /\s+/, expr

We haven't mentioned it because it isn't true. Read the docs more
carefully.
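
A minimal sketch of the difference, assuming a made-up input string with leading and trailing whitespace (the variable names are only for illustration). The special pattern ' ' skips leading whitespace, /\s+/ keeps an empty leading field, and / / splits on single space characters only:

```perl
use strict;
use warnings;

# Hypothetical sample input: leading and trailing blanks, one tab in the middle.
my $s = "  alpha beta\tgamma  ";

my @awk_mode = split ' ',   $s;   # special case: leading whitespace is skipped
my @regex    = split /\s+/, $s;   # leading whitespace produces an empty first field
my @space    = split / /,   $s;   # splits on each single space, tab is left alone

printf "split ' '   : %d fields: [%s]\n", scalar @awk_mode, join( '|', @awk_mode );
printf "split /\\s+/ : %d fields: [%s]\n", scalar @regex,    join( '|', @regex );
printf "split / /   : %d fields: [%s]\n", scalar @space,    join( '|', @space );
```

In all three cases trailing empty fields are dropped, which is the default when no limit is given.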

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
xhoster

Mark Seger said:
Maybe I've just been programming for too many years and looked at too
much code written by people who like to take shortcuts, but I think your
cases 1/2 should never be used! In fact I avoid any syntax that has a
default such as "while (<>)" which is why I always use "while
($var=<>)".

You are missing a defined, and maybe a my, I would think.
After all, you never know when someone is going to insert a
line of code that blows away your default with a different one. In
other words it's all about supportability!

I write thousands of Perl programs every year. I make no pretense that I
will even be able to find 99% of them six months from now, much less
support them. When writing code I know will be in use for a very long time,
I do make an effort to make it sustainable. But avoiding reasonable
defaults is not part of that effort.
It's also common for someone to pick up a piece of code in a language
they're not familiar with and try to figure out what it does with
minimal effort and having something that is not obvious is a real pain.

If you want people who don't know Perl to be able to understand your programs,
then don't program in Perl in the first place. Anyway, it isn't like
"split ' ', $_;" is somehow more obvious than "split;". ' ' is the same
as / /, right? Oh, you mean it isn't?

Are people that lazy that typing a few extra characters to make what
they're doing more explicit too much of a burden? How is this any
different from commenting code? Or don't people do that any more either?

I always love commented code, like this:

## increment the variable named "foo"
$foo++;

Xho

 
Mark Seger

A. Sinan Unur said:
As far as I know, the magic test for defined in a while loop is the same
whether you are reading into a $_ or any lexical variable. Here is a
short test:


Am I going blind or is the output from all 3 examples above the same?
In any event this discussion caused me to go back and read the
documentation since nobody actually said why you needed 'defined' and
the only explanation I could find is that you can't distinguish between
an undef, 0 or '' in a boolean!

I guess the reason I've been lucky up until now is that when leaving off
the 'defined' and reading a file I can never read in a 0 or '' since
each string will have a \n at the end, but I can also appreciate why
including the 'defined' would be good form even if you know it won't
affect your particular situation - or is there something else I'm
missing here? In any event I would continue to argue that's why I think
it's always good form to include the control variable but I'm also sure
I'll never win that argument with this crowd. :cool:
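
To make the boolean point concrete, here is a small sketch (the helper name describe and the sample values are made up for illustration). It shows which values are false in a boolean test and which of those are still defined:

```perl
use strict;
use warnings;

# Report whether a value is defined and whether it is true in boolean context.
sub describe {
    my ($v) = @_;
    my $definedness = defined $v ? 'defined' : 'undefined';
    my $truth       = $v         ? 'true'    : 'false';
    return "$definedness, $truth";
}

# Label/value pairs; "0\n" is what a line read from a file normally looks like.
my @samples = (
    [ 'undef'  => undef ],
    [ q{''}    => '' ],
    [ q{0}     => 0 ],
    [ q{'0'}   => '0' ],
    [ q{"0\n"} => "0\n" ],
    [ q{'00'}  => '00' ],
    [ q{'0.0'} => '0.0' ],
);

printf "%-8s %s\n", $_->[0], describe( $_->[1] ) for @samples;
```

Only undef is undefined; '', 0 and '0' are defined but false, which is the case a bare boolean test cannot tell apart from end of file.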

-mark
 
Mark Seger

It's also common for someone to pick up a piece of code in a language
If you want people who don't know Perl to be able to understand your programs,
then don't program in Perl in the first place.

I hope you're not serious. I would suspect many people find themselves
periodically having to look at code in languages they're not familiar
with when there is a problem that needs to be addressed. Sometimes that
code is easy to follow and other times it is not and it can be a
nightmare trying to figure out what's going on. I've had people look at
my perl code who aren't perl programmers yet they're usually able to
figure out what it's doing and in some cases submit patches.
I always love commented code, like this:

## increment the variable named "foo"
$foo++;
at least we can agree on that. 9-)
-mark
 
John W. Krahn

Mark said:
Am I going blind or is the output from all 3 examples above the same?

Yes, because modern versions of perl automagically check for undefined
values.
In any event this discussion caused me to go back and read the
documentation since nobody actually said why you needed 'defined' and
the only explanation I could find is that you can't distinguish between
an undef, 0 or '' in a boolean!

I guess the reason I've been lucky up until now is that when leaving off
the 'defined' and reading a file I can never read in a 0 or '' since
each string will have a \n at the end, but I can also appreciate why
including the 'defined' would be good form even if you know it won't
affect your particular situation - or is there something else I'm
missing here?

Some "text" files may not contain a final newline and if the last
character is "0" the last line read will be false but defined. If the
Input Record Separator is set to the length 1 ($/ = \1;) then any "0"
character read in will be false but defined.
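
A sketch of both cases, assuming in-memory filehandles in place of real files (the sub name read_records and the sample data are hypothetical). Because perl adds the defined() test to this loop form, every record is still read, including the ones that are false:

```perl
use strict;
use warnings;

# Read all records from an in-memory "file" with the plain while-assignment form.
# $sep is the value for $/: "\n" for lines, \1 for one-character records.
sub read_records {
    my ( $text, $sep ) = @_;
    local $/ = $sep;
    open my $fh, '<', \$text
        or die "Cannot open in-memory handle: $!";
    my @records;
    while ( my $rec = <$fh> ) {    # compiled as defined( my $rec = <$fh> )
        push @records, $rec;
    }
    close $fh
        or die "Cannot close in-memory handle: $!";
    return @records;
}

my @lines = read_records( "1\n2\n0", "\n" );    # no newline after the final "0"
my @chars = read_records( "100",     \1 );      # one character per record

printf "lines read: %d, last one is %s\n",
    scalar @lines, ( $lines[-1] ? 'true' : 'false but defined' );
printf "chars read: %d (%s)\n", scalar @chars, join ',', @chars;
```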



John
 
Mark Seger

John said:
Yes, because modern versions of perl automagically check for undefined
values.


Some "text" files may not contain a final newline and if the last
character is "0" the last line read will be false but defined. If the
Input Record Separator is set to the length 1 ($/ = \1;) then any "0"
character read in will be false but defined.

excellent example, so I tried it! I built a file consisting of 3 lines,
the last one not containing a terminator. However, I would have
expected my script to just print 1 & 2 but for the 'while' to exit when
it read the 1 character 0 at the end but it didn't. I even printed out
the length of the string to convince myself there weren't any trailing
chars. any thoughts?

[root@cag-dl380-01 collectl]# cat x
1
2
0[root@cag-dl380-01 collectl]# cat test.pl
#!/usr/bin/perl -w
open FILE, "<x" or die;

-mark
 
xhoster

Mark Seger said:
excellent example, so I tried it! I built a file consisting of 3 lines,
the last one not containing a terminator. However, I would have
expected my script to just print 1 & 2 but for the 'while' to exit when
it read the 1 character 0 at the end but it didn't. I even printed out
the length of the string to convince myself there weren't any trailing
chars. any thoughts?

Sufficiently modern Perls automatically test for defined
in certain while (<>)-like constructs. But this is one of those shortcuts,
so presumably you will want to spell it out explicitly.

Xho

 
John W. Krahn

Mark said:
Some "text" files may not contain a final newline and if the last
character is "0" the last line read will be false but defined. If the
Input Record Separator is set to the length 1 ($/ = \1;) then any "0"
character read in will be false but defined.

excellent example, so I tried it! I built a file consisting of 3 lines,
the last one not containing a terminator. However, I would have
expected my script to just print 1 & 2 but for the 'while' to exit when
it read the 1 character 0 at the end but it didn't. I even printed out
the length of the string to convince myself there weren't any trailing
chars. any thoughts?

[root@cag-dl380-01 collectl]# cat x
1
2
0[root@cag-dl380-01 collectl]# cat test.pl
#!/usr/bin/perl -w
open FILE, "<x" or die;

You wrote:

while ($a=<FILE>) {

but perl interprets that as:

while (defined($a=<FILE>)) {

so that example will not work on modern versions of Perl.

You could bypass the built-in defined() test by using a compound
expression:

$ echo -n "1
2
<< Len: 2

$ echo -n "1
2
0
3
<< Len: 1
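
As a rough sketch of the compound-expression idea (this is not the original one-liner; the sub name read_until_false and the in-memory input are assumptions), adding "&& 1" to the condition keeps perl from inserting defined(), so the loop stops at a bare "0". Perl may warn that the value of the <HANDLE> construct can be "0" for this form, so the sketch turns the 'misc' warning category off locally:

```perl
use strict;
use warnings;

# Read lines until the first FALSE value, not until end of file.
sub read_until_false {
    my ($text) = @_;
    open my $fh, '<', \$text
        or die "Cannot open in-memory handle: $!";
    my @seen;
    my $line;
    {
        no warnings 'misc';    # silence the "can be 0; test with defined()" hint
        while ( ( $line = <$fh> ) && 1 ) {    # compound condition: no implicit defined()
            push @seen, $line;
        }
    }
    close $fh
        or die "Cannot close in-memory handle: $!";
    return @seen;
}

my @seen = read_until_false("1\n2\n0");    # final "0" has no newline
printf "read %d of 3 lines before the loop gave up\n", scalar @seen;
```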



John
 
Mark Seger

John said:
Mark said:
John said:
Some "text" files may not contain a final newline and if the last
character is "0" the last line read will be false but defined. If the
Input Record Separator is set to the length 1 ($/ = \1;) then any "0"
character read in will be false but defined.
excellent example, so I tried it! I built a file consisting of 3 lines,
the last one not containing a terminator. However, I would have
expected my script to just print 1 & 2 but for the 'while' to exit when
it read the 1 character 0 at the end but it didn't. I even printed out
the length of the string to convince myself there weren't any trailing
chars. any thoughts?

[root@cag-dl380-01 collectl]# cat x
1
2
0[root@cag-dl380-01 collectl]# cat test.pl
#!/usr/bin/perl -w
open FILE, "<x" or die;
while ($a= said:
1 << Len: 2
2 << Len: 2
0<< Len: 1

You wrote:

while ($a=<FILE>) {

but perl interprets that as:

while (defined($a=<FILE>)) {

so that example will not work on modern versions of Perl.

You could bypass the built-in defined() test by using a compound
expression:

$ echo -n "1
2
<< Len: 2

$ echo -n "1
2
0
3
<< Len: 1



John

cool... so to [hopefully] close this topic out could someone explain to
me why using an explicit assignment, even with 'defined', is different
than not using any assignment? I still don't get it...
-mark
 
Ben Morrow

Quoth Mark Seger said:
cool... so to [hopefully] close this topic out could someone explain to
me why using an explicit assignment, even with 'defined', is different
than not using any assignment? I still don't get it...

while (<>)

is a shortcut for

while($_ = <>)

which is a shortcut for

while( defined($_ = <>) )

which is a shortcut for

while( defined($_ = <ARGV>) )

where ARGV is a magic filehandle that reads from the files specified on
the commandline or from stdin. If you run a small example with perl
-MO=Deparse it will expand these expressions for you.
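
The same expansion can be inspected from inside a script with the core B::Deparse module. This is only a sketch: the two anonymous subs are throwaway examples, and the exact text Deparse prints for the read (for instance "readline STDIN" versus "<STDIN>") varies between perl versions:

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# Loop that relies on the implicit $_ assignment.
my $implicit = $deparse->coderef2text( sub { while (<STDIN>) { print } } );

# Loop that assigns to a lexical variable without writing defined().
my $lexical = $deparse->coderef2text(
    sub { while ( my $line = <STDIN> ) { print $line } }
);

# The subs are only compiled, never called, so nothing is read from STDIN.
print "implicit \$_ form:\n$implicit\n\n";
print "lexical form:\n$lexical\n";
```

Both deparsed bodies should show the defined() wrapper that perl added at compile time.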

Ben
 
Mark Seger

Ben said:
Quoth Mark Seger said:
cool... so to [hopefully] close this topic out could someone explain to
me why using an explicit assignment, even with 'defined', is different
than not using any assignment? I still don't get it...

while (<>)

is a shortcut for

while($_ = <>)

which is a shortcut for

while( defined($_ = <>) )

which is a shortcut for

while( defined($_ = <ARGV>) )

where ARGV is a magic filehandle that reads from the files specified on
the commandline or from stdin. If you run a small example with perl
-MO=Deparse it will expand these expressions for you.
thanks, but I thought someone had said earlier that there was a
different behavior using $_ and explicitly using my own variable and I
still don't get that as it seems to me they're one and the same, unless
of course $_ has special properties I don't know about. and that's
always possible...
-mark
 
A. Sinan Unur

Mark Seger said:
thanks, but I thought someone had said earlier that there was a
different behavior using $_ and explicitly using my own variable and I
still don't get that as it seems to me they're one and the same, unless
of course $_ has special properties I don't know about. and that's
always possible...

That was claimed by Joost Diepenmaat <[email protected]> in

You misinterpreted my refutation by example in
Message-ID: <[email protected]>

As an aside,

Some "text" files may not contain a final newline and if the last
character is "0" the last line read will be false but defined. If the
Input Record Separator is set to the length 1 ($/ = \1;) then any "0"
character read in will be false but defined.

Strictly speaking, if the final newline is omitted, the file is not a
text file, but I do understand that we encounter the missing final
newline in files that are otherwise intended to be text files.

Sinan
 
Tad J McClellan

Mark Seger said:
thanks, but I thought someone had said earlier that there was a
different behavior using $_ and explicitly using my own variable


$_ is a package (global) variable.

global variables are "bad", in general.

and I
still don't get that as it seems to me they're one and the same,


Perl has two completely separate sets of variables, package variables
and lexical variables.

$_ is in one set, and a my()'d variable is in the other set.


See:

"Coping with Scoping":

http://perl.plover.com/FAQs/Namespaces.html
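
A small sketch of why that matters in practice (both sub names and the sample text are made up for illustration). A helper that loops with while (<$fh>) writes to the global $_ and leaves it undefined at end of file, while a helper that reads into a my() variable leaves the caller's $_ alone:

```perl
use strict;
use warnings;

# Counts lines using the implicit assignment to the package variable $_.
sub count_lines_global {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my $n = 0;
    while (<$fh>) { $n++ }    # no local $_, so the caller's $_ is overwritten
    close $fh or die "Cannot close in-memory handle: $!";
    return $n;
}

# Counts lines using a lexical variable; $_ is never touched.
sub count_lines_lexical {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my $n = 0;
    while ( defined( my $line = <$fh> ) ) { $n++ }
    close $fh or die "Cannot close in-memory handle: $!";
    return $n;
}

$_ = 'caller data';
count_lines_lexical("a\nb\n");
my $after_lexical = $_;    # still 'caller data'

count_lines_global("a\nb\n");
my $after_global = $_;     # now undef: the last read at end of file set it

print 'after lexical helper: ', ( defined $after_lexical ? $after_lexical : 'undef' ), "\n";
print 'after global helper:  ', ( defined $after_global  ? $after_global  : 'undef' ), "\n";
```

Writing local $_; at the top of the first helper would be another way to protect the caller.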

unless
of course $_ has special properties I don't know about.


See above. :)
 
Tad J McClellan

Joost Diepenmaat said:
As has been mentioned somewhere above, that's simply incorrect code in
most circumstances.


In what way is it incorrect?

while (<handle>) { ... } is a special case and if you
really must use a named variable the construct should be

while (defined(my $var = <>)) { ... }


Perl adds the defined() test for you whether you type it or not.
 
John W. Krahn

A. Sinan Unur said:
As an aside,



Strictly speaking, if the final newline is omitted, the file is not a
text file, but I do understand that we encounter the missing final
newline in files that are otherwise intended to be text files.

Which is why I put "text" inside quotes. (It's hard to do air-quotes on
usenet.)

:)

John
 
